Tried resetting the 4.5 protocol flag at the start of the rollback (to 4.4), and
this time the rollback succeeded. For the upgrade, I set the 4.5 protocol flag
at the campaign completion stage.

Another observation: during the failed attempt (when the ticket was raised), it
was observed that even though the 4.5 protocol flag was not reset at the start
of the rollback, its value was already reset at the failure timestamp.


---

** [tickets:#1002] FileSync assertion in immnd resulted in smf mw-rollback 
failure(4.4 - 4.5)**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Thu Aug 21, 2014 06:02 AM UTC by Hrishikesh
**Last Updated:** Thu Aug 21, 2014 09:11 AM UTC
**Owner:** nobody

Setup: SLES 64-bit, 4 nodes
Changeset: 5608 on branch opensaf-4.5.x, along with patches for #938, #994 and
           #997.
           5044 on branch opensaf-4.4.x

Test case: Middleware upgrade between 4.4 - 4.5

Test procedure:
1. Cluster is up and running on 4.4; OpensafImm_Upgrade_4.5.xml was merged
(immcfg -f, XML taken from 4.5) after setting NoStdFlags for the schema change.
<immadm -o 1 -p opensafImmNostdFlags:SA_UINT32_T:1 
opensafImm=opensafImm,safApp=safImmService>

2. Upgrade was triggered to 4.5
3. After upgrade is successful, rollback is triggered to bring back the cluster 
to 4.4.
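
As a rough sketch, step 1 above could be scripted as follows. The immadm and immcfg invocations are the ones quoted in the procedure; the function name, the default XML path, and the dry-run printing are assumptions made for illustration. The script only builds and prints the commands, it does not run them.

```python
import shlex

# IMM object that carries opensafImmNostdFlags (as quoted in step 1).
IMM_OBJECT = "opensafImm=opensafImm,safApp=safImmService"


def schema_upgrade_commands(xml_path="OpensafImm_Upgrade_4.5.xml"):
    """Return the step 1 commands in the order the procedure describes.

    xml_path is a hypothetical location; point it at the XML taken from 4.5.
    """
    return [
        # Set opensafImmNostdFlags to 1 before the schema change.
        ["immadm", "-o", "1", "-p",
         "opensafImmNostdFlags:SA_UINT32_T:1", IMM_OBJECT],
        # Merge the upgrade XML into the running 4.4 cluster.
        ["immcfg", "-f", xml_path],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in schema_upgrade_commands():
        print(shlex.join(argv))
```

On a real system controller the printed lines could be executed by hand or passed to subprocess.run; that part is deliberately left out here.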

Failure description: Steps 1 and 2 of the procedure were successful. After
rollback was triggered, SC-1 and SC-2 were rolled back successfully to 4.4. At
the end of the rollback of SC-2, a "finalizeSync: Assertion" error was observed
on PL-3 and PL-4, as shown in the log snippet below.

After crossing the limit of 10 immnd restarts, SU failover was triggered on
PL-3 and PL-4, and the nodes went for reboot.
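
The escalation described above can be pictured with a very small model. This is a simplification for illustration only, not AMF's actual logic: the limit of 10 comes from this report, while the function name and the returned labels are assumptions.

```python
MAX_RESTARTS = 10  # restart limit quoted in this report


def recovery_action(restart_count, max_restarts=MAX_RESTARTS):
    """Return the recovery for a component that has faulted restart_count times.

    Simplified assumption: restarts up to the limit are handled as component
    restarts, and crossing the limit escalates to SU failover.
    """
    if restart_count <= max_restarts:
        return "componentRestart"
    return "suFailover"


if __name__ == "__main__":
    for count in (1, 10, 11):
        print(count, recovery_action(count))
```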

As PL-3 and PL-4 never rejoined the cluster, SMF failed its campaign after
timing out while waiting for the nodes to join.
SC-1 Active
SC-2 Standby

syslog snippet:
===============
Aug 20 20:47:35 SLES2-3 osafimmnd[2269]: ER ccb->mState:11  !=  ol->ccbState:9 for CCB:16
Aug 20 20:47:35 SLES2-3 osafimmnd[2269]: ImmModel.cc:16878: finalizeSync: Assertion 'ccb->mState == (ImmCcbState) ol->ccbState' failed.
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' component restart probation timer started (timeout: 60000000000 ns)
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO Restarting a component of 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO 'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Aug 20 20:47:35 SLES2-3 osafimmnd[2842]: Started

Aug 20 20:47:39 SLES2-3 osafimmnd[2842]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: NO SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: ER Can not sync Ccb that is active
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: ER Unexpected local error 21 in finalizeSync for sync client - aborting
Aug 20 20:47:40 SLES2-3 osafamfnd[2289]: NO Restarting a component of 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 2)
Aug 20 20:47:40 SLES2-3 osafamfnd[2289]: NO 'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Aug 20 20:47:40 SLES2-3 osafimmnd[2862]: Started
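
For anyone triaging a longer syslog from PL-3 or PL-4, a small parser along these lines may help pull out the immnd error lines and the highest component restart count. It is only a sketch: the regular expressions are derived from the message shapes in the snippet above, the function and variable names are made up, and it assumes each log entry sits on a single line (not wrapped as in the mail).

```python
import re

# Patterns derived from the messages in the snippet above.
RESTART_RE = re.compile(r"\(comp restart count: (\d+)\)")
IMMND_ERROR_RE = re.compile(r"osafimmnd\[(\d+)\]: ER (.*)")


def summarize(lines):
    """Return (highest restart count, [(immnd pid, error text), ...])."""
    highest_restart = 0
    errors = []
    for line in lines:
        match = RESTART_RE.search(line)
        if match:
            highest_restart = max(highest_restart, int(match.group(1)))
        match = IMMND_ERROR_RE.search(line)
        if match:
            errors.append((int(match.group(1)), match.group(2).strip()))
    return highest_restart, errors


# A few entries copied from the snippet, one entry per line.
SAMPLE = [
    "Aug 20 20:47:35 SLES2-3 osafimmnd[2269]: ER ccb->mState:11  !=  "
    "ol->ccbState:9 for CCB:16",
    "Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO Restarting a component of "
    "'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)",
    "Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: ER Can not sync Ccb that is active",
    "Aug 20 20:47:40 SLES2-3 osafamfnd[2289]: NO Restarting a component of "
    "'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 2)",
]

if __name__ == "__main__":
    restarts, errors = summarize(SAMPLE)
    print("highest restart count:", restarts)
    for pid, text in errors:
        print(pid, text)
```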


