I am testing controller fail-over. It does not work very well. I get this
in the syslog of the standby:

Oct 31 10:13:11 SC_2_1 kernel: TIPC: Lost contact with <1.1.47>
Oct 31 10:13:13 SC_2_1 ncs_scap: NCS_AvSv: Card going for reboot
-safComp=CompT_MQD,safSu=SuT_NCS_CNTLR,safNode=SC_2_1 faulted due to 6
-rcvr=9
Oct 31 10:13:20 SC_2_1 kernel: drbd0: PingAck did not arrive in time.

As you can see, I get the MQD error at a time when I have no disk
partition, since DRBD has not performed a fail-over yet. The result
is that the fail-over does not work: the active reboots and becomes
active again, and the standby reboots and becomes standby again.
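
To make the ordering explicit, here is a minimal sketch that computes the
offsets between the three syslog entries quoted above. The helper names
(parse_time, seconds_since_first) are my own and not part of OpenSAF or
DRBD, and the year is an arbitrary placeholder because syslog timestamps
do not carry one.

```python
from datetime import datetime

# The three syslog entries quoted above (first line of each entry only).
LOG = [
    "Oct 31 10:13:11 SC_2_1 kernel: TIPC: Lost contact with <1.1.47>",
    "Oct 31 10:13:13 SC_2_1 ncs_scap: NCS_AvSv: Card going for reboot",
    "Oct 31 10:13:20 SC_2_1 kernel: drbd0: PingAck did not arrive in time.",
]


def parse_time(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is a placeholder."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_since_first(lines):
    """Return each entry's offset in seconds from the first entry."""
    times = [parse_time(line) for line in lines]
    return [int((t - times[0]).total_seconds()) for t in times]


offsets = seconds_since_first(LOG)
for off, line in zip(offsets, LOG):
    # Drop the timestamp, host and process name; keep the message text.
    print(f"+{off:2d}s  {line.split(': ', 1)[1]}")
```

The reboot decision is logged 2 seconds after TIPC loses contact, while
DRBD does not report the missing PingAck until 9 seconds after.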

So what does the part "faulted due to 6 -rcvr=9" mean?

There is nothing in the MQD log files. No core dump, nothing. I have
seen similar problems in other processes, such as MAS and DTS, when
testing fail-over, so I guess there is a general problem.

Could the problem be that OpenSAF fails over before DRBD fails over?

Thanks,
Hans
