Hi Lars,
I wish I could not believe it either :)
The services were running fine, but heartbeat itself was stopped at some
point and I don't understand why. I mean literally the heartbeat
processes: the services that heartbeat is supposed to control were still
running and active on the same node.
I am using version 2.1.3-3.el5.centos.
Thanks for anything you guys can think of.
Lars Marowsky-Bree wrote:
> On 2008-09-10T18:12:06, Jeffery Soo <[EMAIL PROTECTED]> wrote:
> > Neither node has a firewall enabled, and after 6 days heartbeat
> > suddenly stopped itself.
> > There is no indication of this in the logs, and the other heartbeat
> > node didn't seem to detect that the active node's heartbeat had
> > stopped.
> I really don't believe that. Can you please specify what you mean by
> "stopped" - were the services stopped, or were the heartbeat processes
> down on the node?
> Heartbeat logs all resource operations. If there are no logs, heartbeat
> didn't do it.
> And the peer not noticing that the heartbeat processes on the other node
> went down is ... exceptionally unlikely, as the node then would stop
> sending messages on the network, which is hard to miss.
> What versions are you running?
> Regards,
> Lars
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems