Is RDE the entity that discovers that the peer has disappeared and reports that to AMF?
Is it using a protocol over its TCP/IP connections for this purpose?

Thanks,
Hans

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Hans Feldt
> Sent: 31 October 2007 10:45
> To: [email protected]
> Subject: [Users] Help with controller fail-over issues?
>
> I am testing controller fail-over. It does not work very well.
> I get this in the syslog of the standby:
>
> Oct 31 10:13:11 SC_2_1 kernel: TIPC: Lost contact with <1.1.47>
> Oct 31 10:13:13 SC_2_1 ncs_scap: NCS_AvSv: Card going for reboot -safComp=CompT_MQD,safSu=SuT_NCS_CNTLR,safNode=SC_2_1 faulted due to 6 -rcvr=9
> Oct 31 10:13:20 SC_2_1 kernel: drbd0: PingAck did not arrive in time.
>
> As you can see, I get the MQD error at a time when I have no
> disk partition, since DRBD has not performed a fail-over yet.
> The result is that the fail-over does not work: the
> active reboots and becomes active again, and the standby reboots
> and becomes standby again.
>
> So what does the part "faulted due to 6 -rcvr=9" mean?
>
> There is nothing in the MQD log files. No core dump, nothing.
> I have seen similar problems in other processes such as MAS and
> DTS when testing fail-over, so I guess there is a general problem.
>
> Could the problem be that OpenSAF fails over before DRBD fails over?
>
> Thanks,
> Hans
> _______________________________________________
> Users mailing list
> [email protected]
> http://list.opensaf.org/maillist/listinfo/users
