We have a situation where /var/log/messages reported that the standby 
controller lost contact with a payload node, but OpenSAF did not reboot 
either node.

There were 4 nodes in the cluster: controller2 was the active controller, 
controller1 was standby, and payload1 and payload2 were payload nodes.

The /var/log/messages files on the two nodes contained:

Sep 15 05:51:44 controller1 osafdtmd[22897]: NO Lost contact with 'payload2'
Sep 15 05:51:44 payload2 osafdtmd[5726]: NO Lost contact with 'controller1'

There was no later log entry showing that they "Established contact with" each 
other.
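
A check like this can be scripted across the collected syslog files. The 
following is only a minimal sketch: it assumes the osafdtmd line layout shown 
above and that an "Established contact with" entry uses the same layout. The 
function name and the sample list are hypothetical, for illustration.

```python
import re

# Matches osafdtmd lines of the form shown above, e.g.
# "Sep 15 05:51:44 controller1 osafdtmd[22897]: NO Lost contact with 'payload2'"
# Assumption: "Established contact with" entries follow the same layout.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+osafdtmd\[\d+\]:\s+\w+\s+"
    r"(?P<event>Lost|Established) contact with '(?P<peer>[^']+)'"
)


def unresolved_losses(lines):
    """Return {(host, peer): timestamp} for losses with no later re-establishment."""
    pending = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not an osafdtmd contact message
        key = (m["host"], m["peer"])
        if m["event"] == "Lost":
            pending[key] = m["ts"]
        else:
            # A later "Established" entry clears the pending loss.
            pending.pop(key, None)
    return pending


sample = [
    "Sep 15 05:51:44 controller1 osafdtmd[22897]: NO Lost contact with 'payload2'",
    "Sep 15 05:51:44 payload2 osafdtmd[5726]: NO Lost contact with 'controller1'",
]
for (host, peer), ts in unresolved_losses(sample).items():
    print(f"{ts} {host} -> {peer}: no later 'Established contact with' entry")
```

In practice the lines would come from the merged /var/log/messages files of 
all nodes, sorted by time, rather than from an inline list.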

Meanwhile, the "amf-state node all" command on controller1 reported that 
payload2 was UNLOCKED and ENABLED:

safAmfNode=payload2,safAmfCluster=MyCLuster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)

On controller2, the active controller, mds.log reported some errors:

Sep 15  5:55:26.472907 <14923> ERR    |MDS_SND_RCV: Timeout or Error occured
Sep 15  5:55:26.473186 <14923> ERR    |MDS_SND_RCV: Timeout occured on red sndrsp message from svc_id=19, to svc_id=19
Sep 15  5:55:26.473299 <14923> ERR    |MDS_SND_RCV: Adest=<0x00000000,1>
Sep 15  5:55:26.473361 <14923> ERR    |MDS_SND_RCV: Anchor=<0x00020a0f,23043>
Sep 15  5:55:26.942531 <14923> ERR    |MDS_SND_RCV: Sync entry doesnt exits

0x00020a0f is the node ID of the standby controller, controller1, and 23043 is 
the PID of its amfd process.
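
For reference, the fields can be pulled out of the Anchor line with a small 
helper. This is just a sketch: it assumes the "Anchor=<hex,decimal>" layout 
seen in the mds.log excerpt and follows the interpretation above (first field 
is the node ID, second is the PID). The function name is hypothetical.

```python
import re

# Assumed layout, taken from the mds.log excerpt: Anchor=<0x<hex node id>,<pid>>
ANCHOR_RE = re.compile(r"Anchor=<0x(?P<node>[0-9a-fA-F]+),(?P<pid>\d+)>")


def parse_anchor(line):
    """Return (node_id, pid) from an mds.log Anchor line, or None if absent."""
    m = ANCHOR_RE.search(line)
    if m is None:
        return None
    return int(m["node"], 16), int(m["pid"])


line = "Sep 15  5:55:26.473361 <14923> ERR    |MDS_SND_RCV: Anchor=<0x00020a0f,23043>"
node_id, pid = parse_anchor(line)
print(f"node id 0x{node_id:08x}, pid {pid}")
```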

What could cause this problem? If a network problem caused this scenario, what 
could we do to make sure the cluster returns to a good condition?

Thanks!
Shu Wang




_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
