Hi,

The log entries below show that the link between the two nodes was lost. Please check the link between the two nodes.

Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service in NOACTIVE state
Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)
Jun 20 17:01:54 fedora1 osaffmd[7745]: Role: STANDBY, Node Down for node id: 2020f
Jun 20 17:01:54 fedora1 osaffmd[7745]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Received Node Down for Active peer

/Neel.

On Thursday 20 June 2013 08:06 PM, Aditya Sahay wrote:
> Hi,
>
> I have been trying to start OpenSAF in 2N (Redundancy mode) on Fedora 17
> (linux-3.4.44 kernel). I started one node first which started normally as
> ACTIVE, after which I started the other node which initially started normally
> as STANDBY. However, after sometime, the STANDBY node IMMD loses contact with
> the peer IMMD and switches to ACTIVE mode. Both the nodes, then, continue to
> run separately as ACTIVE. The debug messages are as follows:
>
> Jun 20 17:01:40 fedora1 osafdtmd[7709]: Started
> Jun 20 17:01:40 fedora1 osafrded[7729]: Started
> Jun 20 17:01:40 fedora1 osafrded[7729]: rde@2020f has active state => Standby role
> Jun 20 17:01:40 fedora1 osaffmd[7745]: Started
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Started
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: Started
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: Director Service is up
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: NODE STATE-> IMM_NODE_ISOLATED
> Jun 20 17:01:41 fedora1 osafimmd[7761]: Ruling epoch noted as:3 on IMMD standby
> Jun 20 17:01:41 fedora1 osafimmd[7761]: IMMND coord at 2020f
> Jun 20 17:01:41 fedora1 osafimmnd[7778]: NODE STATE-> IMM_NODE_W_AVAILABLE
> Jun 20 17:01:41 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: NODE STATE-> IMM_NODE_FULLY_AVAILABLE 1900
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: RepositoryInitModeT is SA_IMM_INIT_FROM_FILE
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: Epoch set to 3 in ImmModel
> Jun 20 17:01:46 fedora1 osafimmd[7761]: SBY: New Epoch for IMMND process at node 2020f old epoch: 2 new epoch:3
> Jun 20 17:01:46 fedora1 osafimmd[7761]: IMMND coord at 2020f
> Jun 20 17:01:46 fedora1 osafimmd[7761]: SBY: New Epoch for IMMND process at node 2010f old epoch: 0 new epoch:3
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY
> Jun 20 17:01:46 fedora1 osaflogd[7800]: Started
> Jun 20 17:01:46 fedora1 osafntfd[7817]: Started
> Jun 20 17:01:46 fedora1 osafclmd[7834]: Started
> Jun 20 17:01:46 fedora1 osafclmna[7851]: Started
> Jun 20 17:01:46 fedora1 osafclmna[7851]: safNode=fedora1,safCluster=myClmCluster Joined cluster, nodeid=2010f
> Jun 20 17:01:46 fedora1 osafamfd[7867]: Started
> Jun 20 17:01:47 fedora1 osafimmnd[7778]: Implementer (applier) connected: 6 (@safAmfService2010f) <7, 2010f>
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: Started
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED => INSTANTIATING
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=2N,safApp=OpenSAF' Presence State UNINSTANTIATED => INSTANTIATING
> Jun 20 17:01:47 fedora1 osafamfwd[7948]: Started
> Jun 20 17:01:48 fedora1 osafckptnd[7988]: Started
> Jun 20 17:01:48 fedora1 osafevtd[8008]: Started
> Jun 20 17:01:48 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING => INSTANTIATED
> Jun 20 17:01:48 fedora1 osafckptnd[7988]: cpnd amf hlth chk start failed
> Jun 20 17:01:48 fedora1 osafckptd[8042]: Started
> Jun 20 17:01:48 fedora1 osafckptd[8042]: cpd health check start failed
> Jun 20 17:01:48 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=2N,safApp=OpenSAF' Presence State INSTANTIATING => INSTANTIATED
> Jun 20 17:01:50 fedora1 osafamfnd[7885]: Assigning 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
> Jun 20 17:01:50 fedora1 osafamfnd[7885]: Assigned 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
> Jun 20 17:01:51 fedora1 osafamfd[7867]: Cold sync complete!
> Jun 20 17:01:52 fedora1 osafamfnd[7885]: Assigning 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:52 fedora1 osafamfnd[7885]: Assigned 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service in NOACTIVE state
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)
> Jun 20 17:01:54 fedora1 osaffmd[7745]: Role: STANDBY, Node Down for node id: 2020f
> Jun 20 17:01:54 fedora1 osaffmd[7745]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Received Node Down for Active peer
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: DISCARD DUPLICATE FEVS message:1034
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Error code 2 returned for message type 57 - ignoring
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: DISCARD DUPLICATE FEVS message:1035
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Error code 2 returned for message type 57 - ignoring
> Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMND DOWN on active controller f2 detected at standby immd!! f1. Possible failover
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Skipping re-send of fevs message 1034 since it has recently been resent.
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Skipping re-send of fevs message 1035 since it has recently been resent.
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Global discard node received for nodeId:2020f pid:7664
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 5 <0, 2020f(down)> (safEvtService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 4 <0, 2020f(down)> (safCheckPointService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 3 <0, 2020f(down)> (safAmfService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 2 <0, 2020f(down)> (safClmService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 1 <0, 2020f(down)> (safLogService)
> Jun 20 17:01:54 fedora1 opensaf_reboot: Rebooting remote node in the absence of PLM is outside the scope of OpenSAF
> Jun 20 17:01:54 fedora1 osafrded[7729]: rde_rde_set_role: role set to 1
> Jun 20 17:01:54 fedora1 osafimmd[7761]: ACTIVE request
> Jun 20 17:01:54 fedora1 osaflogd[7800]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafntfd[7817]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafclmd[7834]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafamfd[7867]: FAILOVER StandBy --> Active
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service Is NEWACTIVE state
> Jun 20 17:01:54 fedora1 osafimmd[7761]: New coord elected, resides at 2010f
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: This IMMND is now the NEW Coord
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafamfnd[7885]: Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer connected: 10 (safCheckPointService) <232, 2010f>
>
> Thanks and Regards,
> Aditya Sahay
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Opensaf-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-users
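
For anyone who wants to pull the link-loss entries out of a longer syslog, here is a minimal Python sketch. It is not part of OpenSAF. The marker strings are copied from the log lines quoted in this thread, and the names `LINK_LOSS_MARKERS`, `find_link_loss` and `node_id_hex` are made up for illustration. It assumes the plain syslog layout shown above (timestamp, host, process[pid], message).

```python
import re

# Marker strings copied from the log excerpt in this thread (assumed to be
# enough to spot a lost peer; extend the tuple for other symptoms).
LINK_LOSS_MARKERS = (
    "NCSMDS_RED_DOWN",
    "Node Down for node id",
    "Received Node Down for Active peer",
)

# Plain syslog layout as seen above: "Jun 20 17:01:54 host proc[pid]: message".
# The "[pid]" part is optional because some entries (opensaf_reboot) lack it.
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[\w.-]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)

SAMPLE = [
    "Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service in NOACTIVE state",
    "Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)",
    "Jun 20 17:01:54 fedora1 osaffmd[7745]: Role: STANDBY, Node Down for node id: 2020f",
    "Jun 20 17:01:54 fedora1 osaffmd[7745]: Rebooting OpenSAF NodeId = 131599 EE Name = , "
    "Reason: Received Node Down for Active peer",
]


def find_link_loss(lines):
    """Return (timestamp, process, message) for entries that contain a marker."""
    hits = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if match and any(marker in match["msg"] for marker in LINK_LOSS_MARKERS):
            hits.append((match["ts"], match["proc"], match["msg"]))
    return hits


def node_id_hex(decimal_id):
    """Convert a decimal NodeId (as printed by osaffmd) to lowercase hex."""
    return format(decimal_id, "x")


if __name__ == "__main__":
    for ts, proc, msg in find_link_loss(SAMPLE):
        print(ts, proc, msg)
    print("NodeId 131599 in hex:", node_id_hex(131599))
```

The hex conversion is there because osaffmd prints the node id both ways: NodeId = 131599 in the "Rebooting OpenSAF" entry is the same node as 2020f in the "Node Down" entry, i.e. the peer that fedora1 (2010f) lost contact with.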
