Hi,
The log lines below show that the link to the peer node was lost. Please
check the link between the two nodes.

Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service in NOACTIVE state
Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)
Jun 20 17:01:54 fedora1 osaffmd[7745]: Role: STANDBY, Node Down for node id: 2020f
Jun 20 17:01:54 fedora1 osaffmd[7745]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Received Node Down for Active peer
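
As a rough first check of the link, something like the sketch below can be
run on the standby node to see whether a TCP connection to the peer can be
opened at all. This is only an illustration, not an OpenSAF tool: the helper
name can_connect is made up for this example, and the peer address and port
are assumptions you have to replace with your own values (since osafdtmd is
running, the port would be whatever is configured for DTM in your dtmd.conf;
6700 is a commonly seen value, but please verify it on your nodes).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within timeout.

    Hypothetical helper for a quick reachability test between two nodes.
    It only proves that a TCP handshake succeeds; it says nothing about
    intermittent loss on the link.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, can_connect("192.0.2.10", 6700) returns True only if the peer
accepts the connection; both the address and the port here are placeholders.
Because the loss in the log shows up several seconds after startup, running
the check repeatedly over a period of time is more telling than a single try.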

/Neel.


On Thursday 20 June 2013 08:06 PM, Aditya Sahay wrote:
> Hi,
>
> I have been trying to start OpenSAF in 2N redundancy mode on Fedora 17
> (linux-3.4.44 kernel). I started one node first, which came up normally as
> ACTIVE, and then started the other node, which initially came up normally
> as STANDBY. However, after some time, the IMMD on the STANDBY node loses
> contact with the peer IMMD and switches to ACTIVE. Both nodes then continue
> to run separately as ACTIVE. The debug messages are as follows:
>
> Jun 20 17:01:40 fedora1 osafdtmd[7709]: Started
> Jun 20 17:01:40 fedora1 osafrded[7729]: Started
> Jun 20 17:01:40 fedora1 osafrded[7729]: rde@2020f has active state => Standby 
> role
> Jun 20 17:01:40 fedora1 osaffmd[7745]: Started
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Started
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: Started
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: Director Service is up
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_ANONYMOUS 
> --> IMM_SERVER_CLUSTER_WAITING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: 
> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: SERVER STATE: 
> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Jun 20 17:01:40 fedora1 osafimmnd[7778]: NODE STATE-> IMM_NODE_ISOLATED
> Jun 20 17:01:41 fedora1 osafimmd[7761]: Ruling epoch noted as:3 on IMMD 
> standby
> Jun 20 17:01:41 fedora1 osafimmd[7761]: IMMND coord at 2020f
> Jun 20 17:01:41 fedora1 osafimmnd[7778]: NODE STATE-> IMM_NODE_W_AVAILABLE
> Jun 20 17:01:41 fedora1 osafimmnd[7778]: SERVER STATE: 
> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: NODE STATE-> 
> IMM_NODE_FULLY_AVAILABLE 1900
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: RepositoryInitModeT is 
> SA_IMM_INIT_FROM_FILE
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: Epoch set to 3 in ImmModel
> Jun 20 17:01:46 fedora1 osafimmd[7761]: SBY: New Epoch for IMMND process at 
> node 2020f old epoch: 2  new epoch:3
> Jun 20 17:01:46 fedora1 osafimmd[7761]: IMMND coord at 2020f
> Jun 20 17:01:46 fedora1 osafimmd[7761]: SBY: New Epoch for IMMND process at 
> node 2010f old epoch: 0  new epoch:3
> Jun 20 17:01:46 fedora1 osafimmnd[7778]: SERVER STATE: IMM_SERVER_SYNC_CLIENT 
> --> IMM SERVER READY
> Jun 20 17:01:46 fedora1 osaflogd[7800]: Started
> Jun 20 17:01:46 fedora1 osafntfd[7817]: Started
> Jun 20 17:01:46 fedora1 osafclmd[7834]: Started
> Jun 20 17:01:46 fedora1 osafclmna[7851]: Started
> Jun 20 17:01:46 fedora1 osafclmna[7851]: 
> safNode=fedora1,safCluster=myClmCluster Joined cluster, nodeid=2010f
> Jun 20 17:01:46 fedora1 osafamfd[7867]: Started
> Jun 20 17:01:47 fedora1 osafimmnd[7778]: Implementer (applier) connected: 6 
> (@safAmfService2010f) <7, 2010f>
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: Started
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: 
> 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED => 
> INSTANTIATING
> Jun 20 17:01:47 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=2N,safApp=OpenSAF' 
> Presence State UNINSTANTIATED => INSTANTIATING
> Jun 20 17:01:47 fedora1 osafamfwd[7948]: Started
> Jun 20 17:01:48 fedora1 osafckptnd[7988]: Started
> Jun 20 17:01:48 fedora1 osafevtd[8008]: Started
> Jun 20 17:01:48 fedora1 osafamfnd[7885]: 
> 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING => 
> INSTANTIATED
> Jun 20 17:01:48 fedora1 osafckptnd[7988]: cpnd amf hlth chk start failed
> Jun 20 17:01:48 fedora1 osafckptd[8042]: Started
> Jun 20 17:01:48 fedora1 osafckptd[8042]: cpd health check start failed
> Jun 20 17:01:48 fedora1 osafamfnd[7885]: 'safSu=SC-1,safSg=2N,safApp=OpenSAF' 
> Presence State INSTANTIATING => INSTANTIATED
> Jun 20 17:01:50 fedora1 osafamfnd[7885]: Assigning 
> 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 
> 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
> Jun 20 17:01:50 fedora1 osafamfnd[7885]: Assigned 
> 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 
> 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
> Jun 20 17:01:51 fedora1 osafamfd[7867]: Cold sync complete!
> Jun 20 17:01:52 fedora1 osafamfnd[7885]: Assigning 
> 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:52 fedora1 osafamfnd[7885]: Assigned 
> 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service in NOACTIVE state
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMD lost contact with peer IMMD 
> (NCSMDS_RED_DOWN)
> Jun 20 17:01:54 fedora1 osaffmd[7745]: Role: STANDBY, Node Down for node id: 
> 2020f
> Jun 20 17:01:54 fedora1 osaffmd[7745]: Rebooting OpenSAF NodeId = 131599 EE 
> Name = , Reason: Received Node Down for Active peer
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: DISCARD DUPLICATE FEVS message:1034
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Error code 2 returned for message 
> type 57 - ignoring
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: DISCARD DUPLICATE FEVS message:1035
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Error code 2 returned for message 
> type 57 - ignoring
> Jun 20 17:01:54 fedora1 osafimmd[7761]: IMMND DOWN on active controller f2 
> detected at standby immd!! f1. Possible failover
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Skipping re-send of fevs message 1034 
> since it has recently been resent.
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Skipping re-send of fevs message 1035 
> since it has recently been resent.
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Global discard node received for 
> nodeId:2020f pid:7664
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 5 <0, 
> 2020f(down)> (safEvtService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 4 <0, 
> 2020f(down)> (safCheckPointService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 3 <0, 
> 2020f(down)> (safAmfService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 2 <0, 
> 2020f(down)> (safClmService)
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer disconnected 1 <0, 
> 2020f(down)> (safLogService)
> Jun 20 17:01:54 fedora1 opensaf_reboot: Rebooting remote node in the absence 
> of PLM is outside the scope of OpenSAF
> Jun 20 17:01:54 fedora1 osafrded[7729]: rde_rde_set_role: role set to 1
> Jun 20 17:01:54 fedora1 osafimmd[7761]: ACTIVE request
> Jun 20 17:01:54 fedora1 osaflogd[7800]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafntfd[7817]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafclmd[7834]: ACTIVE request
> Jun 20 17:01:54 fedora1 osafamfd[7867]: FAILOVER StandBy --> Active
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Director Service Is NEWACTIVE state
> Jun 20 17:01:54 fedora1 osafimmd[7761]: New coord elected, resides at 2010f
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: This IMMND is now the NEW Coord
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafimmd[7761]: Received IMMD service event
> Jun 20 17:01:54 fedora1 osafamfnd[7885]: Assigning 
> 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Jun 20 17:01:54 fedora1 osafimmnd[7778]: Implementer connected: 10 
> (safCheckPointService) <232, 2010f>
>
> Thanks and Regards,
> Aditya  Sahay
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Opensaf-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-users