Tried resetting the 4.5 protocol flag at the start of the rollback (to 4.4), and
this time the rollback succeeded. For the upgrade, I set the 4.5 protocol flag
at the campaign completion stage.
Another observation: during the failed attempt (when the ticket was raised), it
was observed that even though the 4.5 protocol flag was not reset at the start
of the rollback, its value was already reset at the failure timestamp.
---
** [tickets:#1002] FileSync assertion in immnd resulted in smf mw-rollback
failure(4.4 - 4.5)**
**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Thu Aug 21, 2014 06:02 AM UTC by Hrishikesh
**Last Updated:** Thu Aug 21, 2014 09:11 AM UTC
**Owner:** nobody
Setup: SLES 64-bit, 4 nodes
Changeset: 5608, branch opensaf-4.5.x, along with patches for #938, #994 and #997.
5044, branch opensaf-4.4.x
Test case: Middleware upgrade between 4.4 - 4.5
Test procedure:
1. With the cluster up and running on 4.4, OpensafImm_Upgrade_4.5.xml is merged
(immcfg -f, XML taken from 4.5) after setting NoStdFlags for the schema change.
<immadm -o 1 -p opensafImmNostdFlags:SA_UINT32_T:1
opensafImm=opensafImm,safApp=safImmService>
2. Upgrade was triggered to 4.5
3. After upgrade is successful, rollback is triggered to bring back the cluster
to 4.4.
Failure description: Steps 1 and 2 of the procedure were successful. After the
rollback was triggered, SC-1 and SC-2 were rolled back successfully to 4.4. At
the end of the rollback of SC-2, a "finalizeSync: Assertion" error was observed
on PL-3 and PL-4, as shown in the log snippet below.
After the limit of 10 immnd restarts was exceeded, SU failover was triggered on
PL-3 and PL-4, and the nodes rebooted.
Because PL-3 and PL-4 never rejoined the cluster, SMF failed its campaign after
timing out while waiting for the nodes to join.
SC-1 Active
SC-2 Standby
syslog snippet:
===============
Aug 20 20:47:35 SLES2-3 osafimmnd[2269]: ER ccb->mState:11 != ol->ccbState:9
for CCB:16
Aug 20 20:47:35 SLES2-3 osafimmnd[2269]: ImmModel.cc:16878: finalizeSync:
Assertion 'ccb->mState == (ImmCcbState) ol->ccbState' failed.
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' component restart probation timer
started (timeout: 60000000000 ns)
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO Restarting a component of
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Aug 20 20:47:35 SLES2-3 osafamfnd[2289]: NO
'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Aug 20 20:47:35 SLES2-3 osafimmnd[2842]: Started
Aug 20 20:47:39 SLES2-3 osafimmnd[2842]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: ER Can not sync Ccb that is active
Aug 20 20:47:40 SLES2-3 osafimmnd[2842]: ER Unexpected local error 21 in
finalizeSync for sync client - aborting
Aug 20 20:47:40 SLES2-3 osafamfnd[2289]: NO Restarting a component of
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 2)
Aug 20 20:47:40 SLES2-3 osafamfnd[2289]: NO
'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Aug 20 20:47:40 SLES2-3 osafimmnd[2862]: Started
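
For triage, the restart loop in the snippet above can be summarized with a small script. This is only a rough sketch, not part of OpenSAF: the function name `summarize` and both regular expressions are assumptions derived from the log lines quoted in this ticket. Because the lines here are wrapped by the mail client, an error message that continues on the next line is captured only up to the wrap.

```python
import re

# Hypothetical patterns, derived only from the syslog lines quoted in this ticket.
# amfnd restart notice, e.g. "(comp restart count: 2)".
RESTART_RE = re.compile(r"comp restart count:\s*(\d+)")
# immnd error line, e.g. "osafimmnd[2269]: ER <message>".
IMMND_ER_RE = re.compile(r"osafimmnd\[(\d+)\]: ER (.*)")


def summarize(lines):
    """Return (highest restart count seen, list of (pid, message) immnd errors)."""
    max_restarts = 0
    errors = []
    for line in lines:
        m = RESTART_RE.search(line)
        if m:
            max_restarts = max(max_restarts, int(m.group(1)))
        m = IMMND_ER_RE.search(line)
        if m:
            errors.append((int(m.group(1)), m.group(2).strip()))
    return max_restarts, errors
```

Passing it the lines of a saved syslog excerpt (for example `summarize(open(path))`, with `path` pointing at the excerpt) gives the highest component restart count reached and the immnd errors logged by each immnd process ID, which makes it easier to see how close a node was to the restart limit mentioned in the failure description.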