I am testing controller fail-over. It does not work very well. I get this in the syslog of the standby:

Oct 31 10:13:11 SC_2_1 kernel: TIPC: Lost contact with <1.1.47>
Oct 31 10:13:13 SC_2_1 ncs_scap: NCS_AvSv: Card going for reboot -safComp=CompT_MQD,safSu=SuT_NCS_CNTLR,safNode=SC_2_1 faulted due to 6 -rcvr=9
Oct 31 10:13:20 SC_2_1 kernel: drbd0: PingAck did not arrive in time.

As you can see, I get the MQD error at a time when I have no disk partition, since DRBD has not performed a fail-over yet. The result is that the fail-over does not work: the active reboots and becomes active again, and the standby reboots and becomes standby again.

So what does the part "faulted due to 6 -rcvr=9" mean? There is nothing in the MQD log files. No core dump, nothing.

I have seen similar problems in other processes, such as MAS and DTS, when testing fail-over, so I guess there is a general problem. Could the problem be that OpenSAF fails over before DRBD fails over?

Thanks,
Hans
