Hi Lars

I wish I could not believe it either :)
The services were running fine, but the heartbeat service itself was stopped at some point and I don't understand why. I mean literally the heartbeat service: the services that heartbeat is supposed to control were still running fine and active on the same node.

I am using heartbeat 2.1.3-3.el5.centos.

Thanks for anything you guys can think of.

Lars Marowsky-Bree wrote:
On 2008-09-10T18:12:06, Jeffery Soo <[EMAIL PROTECTED]> wrote:

Both nodes don't have any firewall enabled and after 6 days, heartbeat suddenly stopped itself. There is no indication of this in the logs and also the other heartbeat node didn't seem to detect this (that the active node's heartbeat was stopped).

I really don't believe that. Can you please specify what you mean with
"stopped" - the services were stopped or the heartbeat processes were
down on the node?

Heartbeat logs all resource operations. If there are no logs, heartbeat
didn't do it.

And the peer not noticing that the heartbeat processes on the other node
went down is ... exceptionally unlikely, as the node then would stop
sending messages on the network, which is hard to miss.

What versions are you running?


Regards,
    Lars


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
