On 4 January 2011 13:47, Igor Chudov <ichu...@gmail.com> wrote:
> Further reading indicates that heartbeat itself sets a limit for itself
> every so often.
>
> Then it exceeds the limit (probably due to a bug). I am sure that's why
> whoever wrote heartbeat set a CPU limit, instead of fixing their bugs.
>
> Then it dies with SIGXCPU, leaving everything in an extremely messy state,
> leading to split brain, destruction of shared resources (DRBD data).
>
> I was trying to be a little patient. A little forgiving. I must say that my
> patience is rapidly running out.
>
> I absolutely cannot use this "solution" as a basis of a high reliability
> cluster, because it is the opposite of reliability.
>
> We have an old cluster that works very well with heartbeat V1. But it is
> getting old, the disks are wearing out, the fans are not getting newer, etc.
> I set up a new cluster in summer, but never fully trusted it, and it looks
> like I will not be able to trust it. We never completed a switchover.
>
> At this point I feel rather desperate. Perhaps I should give "pacemaker"
> another go. I really have no idea and I am running out of options.
>
> i
>
> On Tue, Jan 4, 2011 at 7:32 AM, Igor Chudov <ichu...@g.mail.com> wrote:
>
>> A few weeks ago I reported that heartbeat died on one of the cluster machines,
>> due to SIGXCPU.
>>
>> Well, it happened again. Heartbeat died, now both machines had the shared
>> IP address up, what a god awful mess!!!
>>
>> Now they have split brain and the whole nine yards!
>>
>> I looked at /proc/<heartbeat_pid>/limits and found:
>>
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              43                   unlimited            seconds
>>
>>
>> So, this process somehow has a limit set for it.
>>
>> Does anyone have ANY clue who would set a limit for this process??? WTF?
>> Does it do it for itself or what?
>>

I cannot answer your question, but I suspect it might be useful if you
mentioned which version of heartbeat and what resource manager you are
using. Perhaps provide a copy of your heartbeat configuration.

Is heartbeat using too much CPU? It should be pretty much idle
relative to the rest of the system. If it is not, it is worth finding
out why.
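
One rough way to check this is to compare the CPU time the process has
already consumed with the soft limit shown in /proc/<pid>/limits. The
Python sketch below is only an illustration: the function names are made
up for this example, it assumes a Linux /proc layout as described in
proc(5), and it assumes the columns in the limits file are separated by
at least two spaces.

```python
import os
import re


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("Limit"):
            continue  # skip blank lines and the header row
        # Assumption: columns are separated by runs of 2+ spaces, while
        # limit names such as "Max cpu time" contain only single spaces.
        fields = re.split(r"\s{2,}", line)
        if len(fields) < 3:
            continue
        units = fields[3] if len(fields) > 3 else ""
        limits[fields[0]] = (fields[1], fields[2], units)
    return limits


def cpu_seconds(stat_text, ticks_per_second):
    """Return user + system CPU seconds from the text of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces,
    # so only parse what follows the last ')'.
    rest = stat_text.rpartition(")")[2].split()
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15 in proc(5)
    return (utime + stime) / ticks_per_second


def report(pid):
    """Print CPU time used so far next to the CPU limits of one process."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard, _ = parse_limits(f.read())["Max cpu time"]
    with open(f"/proc/{pid}/stat") as f:
        used = cpu_seconds(f.read(), os.sysconf("SC_CLK_TCK"))
    print(f"pid {pid}: used {used:.1f}s CPU, soft limit {soft}, hard limit {hard}")
```

Calling report() with the heartbeat PID every few minutes should show
how quickly CPU time accumulates and whether the soft limit changes
between runs.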

Regards,
Steve
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
