On 4 January 2011 13:47, Igor Chudov <ichu...@gmail.com> wrote:
> Further reading indicates that heartbeat itself sets a limit for itself
> every so often.
>
> Then it exceeds the limit (probably due to a bug). I am sure that's why
> whoever wrote heartbeat set a CPU limit instead of fixing their bugs.
>
> Then it dies with SIGXCPU, leaving everything in an extremely messy state,
> leading to split brain and destruction of shared resources (DRBD data).
>
> I was trying to be a little patient. A little forgiving. I must say that my
> patience is rapidly running out.
>
> I absolutely cannot use this "solution" as the basis of a high-reliability
> cluster, because it is the opposite of reliability.
>
> We have an old cluster that works very well with heartbeat V1. But it is
> getting old, the disks are wearing out, the fans are not getting newer, etc.
> I set up a new cluster in summer, but never fully trusted it, and it looks
> like I will not be able to trust it. We never completed a switchover.
>
> At this point I feel rather desperate. Perhaps I should give "pacemaker"
> another go. I really have no idea and I am running out of options.
>
> i
>
> On Tue, Jan 4, 2011 at 7:32 AM, Igor Chudov <ichu...@g.mail.com> wrote:
>
>> A few weeks ago I reported that heartbeat died on one of the cluster
>> machines, due to SIGXCPU.
>>
>> Well, it happened again. Heartbeat died, and now both machines had the
>> shared IP address up. What a god-awful mess!!!
>>
>> Now they have split brain and the whole nine yards!
>>
>> I looked at /proc/<heartbeat_pid>/limits and found:
>>
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              43                   unlimited            seconds
>>
>> So, this process somehow has a limit set for it.
>>
>> Does anyone have ANY clue who would set a limit for this process??? WTF?
>> Does it do it for itself or what?
>>
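
For anyone who wants to check the same thing on their own nodes, here is a minimal Python sketch that pulls the "Max cpu time" row out of /proc/<pid>/limits. The names parse_limits and cpu_limit are my own and purely illustrative, and the regex assumes every data row ends in two values that are each a number or "unlimited", optionally followed by a unit, as in the output quoted above. It only reports the limit; it says nothing about who set it.

```python
import re

# A data row looks like: "Max cpu time   43   unlimited   seconds".
# The header row ("Limit  Soft Limit  Hard Limit  Units") does not match.
_ROW = re.compile(r"^(.*?)\s+(unlimited|\d+)\s+(unlimited|\d+)\s*(\S*)\s*$")


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits.

    Returns a dict mapping the limit name to a (soft, hard, units) tuple of
    strings. Rows without a unit (e.g. "Max nice priority") get units "".
    """
    limits = {}
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            name, soft, hard, units = match.groups()
            limits[name] = (soft, hard, units)
    return limits


def cpu_limit(pid="self"):
    """Return the (soft, hard, units) CPU time limit for a pid, or None.

    Linux only: reads /proc/<pid>/limits. Reading another user's process
    may require root.
    """
    with open(f"/proc/{pid}/limits") as handle:
        return parse_limits(handle.read()).get("Max cpu time")
```

Fed the two rows quoted above, parse_limits returns ('43', 'unlimited', 'seconds') for "Max cpu time". A soft limit that is a number while the hard limit is "unlimited" is what you would expect to see if something lowered only the soft limit after the process started.
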

I cannot answer your question, but it would help if you mentioned which
version of heartbeat and which resource manager you are using. Perhaps
provide a copy of your heartbeat configuration as well.

Is heartbeat using too much CPU? It should be pretty much idle relative to
the rest of the system. If it is not, it is worth finding out why.

Regards,
Steve
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems