I came across an interesting scenario a few days back: a customer had added a CUC secondary server back to the cluster, but it was not working as it should.
The customer had two CUC servers in the cluster in a load-balanced configuration, with half of the ports on unity-02 (subscriber) and the other half on unity-01 (publisher). The secondary had failed because of a hardware fault and they had to rebuild it from scratch. The problem they were facing was that Unity was not taking any calls, because the first 80 ports were on unity-02 and for some reason that server was not happy.
I checked the replication status first and found this:
admin:utils dbreplication runtimestate
DB and Replication Services: ALL RUNNING
Cluster Replication State: Replication status command started at: 2012-10-18-11-44
Replication status command COMPLETED 1 tables checked out of 425
Processing Table: typedberrors with 982 records
No Errors or Mismatches found.
Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2012_10_23_11_44_00.out' to see the details
DB Version: ccm7_1_2_20000_2
Number of replicated tables: 425
Cluster Detailed View from PUB (2 Servers):
                          PING         REPLICATION  REPL.  DBver&  REPL.  REPLICATION SETUP
SERVER-NAME  IP ADDRESS   (msec) RPC?  STATUS       QUEUE  TABLES  LOOP?  (RTMT) & details
-----------  ------------ ------ ----  -----------  -----  ------- -----  -----------------
UNITY-01     x.x.0.35     0.049  Yes   Connected    0      match   N/A    (2) PUB Setup Completed
unity-02     x.x.0.36     0.254  Yes   Off-Line     N/A    0       N/A    (4) Setup Completed
admin:file view activelog cm/trace/dbl/sdi/ReplicationStatus.2012_10_23_11_44_00.out
SERVER                      ID STATE  STATUS QUEUE CONNECTION CHANGED
g_unity_01_ccm7_1_2_20000_2  2 Active Local      0
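The secondary shows a replication setup state of (4), while the publisher shows (2), which is the healthy value. A (4) means replication setup did not succeed, so this is not good! Note also that the detailed status file lists only the publisher.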
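If you need to check this on more than a couple of nodes, the same reading of the table can be automated. Below is a minimal Python sketch that assumes you have already captured the `utils dbreplication runtimestate` output as text; the `unhealthy_servers` function and its regular expression are my own illustration, not a Cisco tool, and the pattern is based only on the two rows shown above.

```python
import re

# Hypothetical pattern for one server row of the "Cluster Detailed View":
# name, IP, ping (msec), RPC?, status, then the RTMT state in parentheses.
ROW = re.compile(
    r"^(?P<name>\S+)\s+\S+\s+[\d.]+\s+\S+\s+\S+.*\((?P<state>\d)\)"
)

def unhealthy_servers(output: str) -> list[tuple[str, int]]:
    """Return (server, state) pairs whose RTMT setup state is not 2."""
    bad = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and int(match.group("state")) != 2:
            bad.append((match.group("name"), int(match.group("state"))))
    return bad

# Rows copied from the output above.
sample = """\
UNITY-01 x.x.0.35 0.049 Yes Connected 0 match N/A (2) PUB Setup Completed
unity-02 x.x.0.36 0.254 Yes Off-Line N/A 0 N/A (4) Setup Completed
"""

print(unhealthy_servers(sample))  # [('unity-02', 4)]
```

Header and separator lines are skipped because their third column is not a number, so only real server rows are inspected.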
I then found the secondary server complaining about some CDR records. A quick search on Cisco's site turned up a well-known defect: bug ID CSCta15666 for CUC 7.1.2.20000-140.
– – –
In the CDR define logs (file list activelog /cm/trace/dbl/*):
We got exception in Cdr define
Ignoreable exception occurred will continue. Value:92
In the CDR output broadcast logs (file list activelog /cm/trace/dbl/*):
Error 17 while doing cdr check, will cdr delete
Time taken to do cdr check[1.92180991173]
Exception from cdr delete e.value  e.msg[Error executing [su -c 'ulimit -c 0;cdr delete server g_nhbl_vo_cl1fs02_ccm7_1_2_10000_16' - informix] returned ]
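To check whether a collected log shows the same symptoms, a simple substring search is enough. This is only a rough Python sketch: the signature strings are copied from the excerpts above, while the `matching_signatures` name is hypothetical, and it assumes you have already downloaded the log text from the server.

```python
# Signatures copied from the log excerpts above. Matching them only
# suggests the same symptoms; it does not confirm the defect.
SIGNATURES = (
    "We got exception in Cdr define",
    "Error 17 while doing cdr check",
    "Exception from cdr delete",
)

def matching_signatures(log_text: str) -> list[str]:
    """Return the known signatures that appear in the given log text."""
    return [sig for sig in SIGNATURES if sig in log_text]

log_text = (
    "We got exception in Cdr define\n"
    "Ignoreable exception occurred will continue. Value:92\n"
)
print(matching_signatures(log_text))  # ['We got exception in Cdr define']
```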
The steps taken to fix this were the following:
– utils dbreplication stop all on publisher
– utils dbreplication dropadmindb on both servers
– utils dbreplication forcedatasyncsub on subscriber
– utils dbreplication reset all
– rebooted the subscriber
After this, database replication was working again.