Re: [Pacemaker] High load issues

2010-02-10 Thread Andrew Beekhof
On Fri, Feb 5, 2010 at 12:35 PM, Dominik Klein wrote:
> Just for the record: heartbeat (3.0.2) was not able to recover either.
>
> It also manages to see a failure on the dead node but fails to recover.

What is "it" in this instance?

If $good sent a message to $bad and it didn't get a response and that's
how Pacemaker found out that $bad was bad, then I'd agree that it's a
Pacemaker bug.
But that's not what is happening. Corosync is telling Pacemaker that
$bad is gone, but only after $good sends a message.

It shouldn't take Pacemaker sending a cluster message for
(corosync|heartbeat) to notice that comms are down.

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


Re: [Pacemaker] High load issues

2010-02-05 Thread Dominik Klein
Just for the record: heartbeat (3.0.2) was not able to recover either.

It also manages to see a failure on the dead node but fails to recover.

Regards
Dominik



Re: [Pacemaker] High load issues

2010-02-05 Thread Michael Schwartzkopff
Regarding that high load problem:

This should be addressed by the system health feature that was / will be
introduced into Pacemaker. There are three problems, in order of relevance:
- The problem of automatic failback is not solved. See bug 2269 in bugzilla.
- Very few system health agents are available.
- A meta-agent that bundles other health agents is needed to beautify the GUI
and CRM output (a rough sketch follows below).
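
A very rough sketch of what such a meta-agent could look like. The
individual check commands and the "#health-meta" attribute name are
made up for illustration; crm_attribute is simply the stock Pacemaker
CLI for publishing node attributes:

#!/usr/bin/env python
# Hypothetical meta health agent: run several individual health checks
# and publish one aggregated node attribute for the CRM/GUI to display.
import subprocess

# (command, weight) pairs; the checker commands are purely illustrative
CHECKS = [
    (["check_cpu_load"], 1),
    (["check_free_memory"], 1),
]

def run_checks():
    score = 0
    for cmd, weight in CHECKS:
        try:
            if subprocess.call(cmd) != 0:   # non-zero exit = unhealthy
                score -= weight
        except OSError:                     # missing checker = unhealthy
            score -= weight
    return score

def publish(score):
    # green/red mirrors the values used by the system health feature;
    # the attribute name "#health-meta" is an assumption.
    value = "green" if score == 0 else "red"
    subprocess.call(["crm_attribute", "--name", "#health-meta",
                     "--update", value, "--lifetime", "forever"])

if __name__ == "__main__":
    publish(run_checks())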

Greetings,

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42



Re: [Pacemaker] High load issues

2010-02-05 Thread Dejan Muhamedagic
Hi,

On Fri, Feb 05, 2010 at 08:59:50AM +0100, Dominik Klein wrote:
> > But generally I believe this test case is invalid.
> 
> I might agree here that this test case does not necessarily reproduce
> what happened on my production system (unfortunately I do not know for
> sure what happened there, the dev who caused this just tells me he used
> some stupid sql statement and even executed it several times in
> parallel), but I do not think the test case is invalid. If there is an
> OOM situation on a node and therefore the local pacemaker can't do its
> job anymore (I base this statement on the various lrmd "cannot allocate
> memory" logs), this is a case the cluster should be able to recover from.

Yes, I'd say the cluster should be able to deal with a node which
is in just about any state. This time, or at least so it seems, the
problem was that corosync ran as a realtime process and crmd did not.
Perhaps corosync should watch the local processes, i.e. have
some kind of IPC heartbeat ...
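
Just to illustrate that idea (this is not corosync code; the socket
path, the timeouts and the escalation are made up): a local watchdog
that pings the monitored daemon over a Unix socket and treats a missed
reply as "process wedged":

#!/usr/bin/env python
# Sketch of an "IPC heartbeat": ping the monitored daemon (e.g. crmd)
# over a Unix socket; a missed reply within DEADLINE means the daemon
# is wedged and the node should be treated as unhealthy.
import os, socket, sys, time

SOCK_PATH = "/var/run/ipc-heartbeat.sock"   # made-up path
INTERVAL = 2.0                              # seconds between pings
DEADLINE = 5.0                              # max wait for a reply

def watchdog():
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(SOCK_PATH)
    srv.listen(1)
    conn, _ = srv.accept()        # the monitored daemon connects once
    conn.settimeout(DEADLINE)
    while True:
        conn.sendall(b"ping\n")
        try:
            reply = conn.recv(16)
        except socket.timeout:
            reply = b""
        if reply != b"pong\n":
            sys.stderr.write("client missed IPC heartbeat, escalating\n")
            return 1              # here corosync could declare the node bad
        time.sleep(INTERVAL)

if __name__ == "__main__":
    sys.exit(watchdog())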

> What I saw while doing this test was that the bad node discovered
> failures on the running ip and mysql resources, scheduled the recovery,
> but never managed to recover.
> 
> I think it was lmb who suggested "periodic health-checks" on the
> pacemaker layer. If pacemaker on $good had periodically tried to talk to
> pacemaker on $bad, then it might have seen that $bad does not respond
> and might have done something about it. Just my theory though.

... or the higher level heartbeats as you suggested here. There
is still, however, a problem with false positives. At any rate,
the user should have a way to specify when a node is not usable
anymore.

Thanks,

Dejan

> Opinions?
> 
> Regards
> Dominik
> 



Re: [Pacemaker] High load issues

2010-02-05 Thread Dominik Klein
> But generally I believe this test case is invalid.

I might agree here that this test case does not necessarily reproduce
what happened on my production system (unfortunately I do not know for
sure what happened there, the dev who caused this just tells me he used
some stupid sql statement and even executed it several times in
parallel), but I do not think the test case is invalid. If there is an
OOM situation on a node and therefore the local pacemaker can't do its
job anymore (I base this statement on the various lrmd "cannot allocate
memory" logs), this is a case the cluster should be able to recover from.

What I saw while doing this test was that the bad node discovered
failures on the running ip and mysql resources, scheduled the recovery,
but never managed to recover.

I think it was lmb who suggested "periodic health-checks" on the
pacemaker layer. If pacemaker on $good had periodically tried to talk to
pacemaker on $bad, then it might have seen that $bad does not respond
and might have done something about it. Just my theory though.
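
A very rough sketch of that theory; the port, the message format and
the escalation via stonith_admin are assumptions, not anything
pacemaker actually does today:

#!/usr/bin/env python
# Sketch of periodic health-checks at the pacemaker layer: $good keeps
# asking $bad for an application-level reply and escalates after
# several consecutive misses.
import socket, subprocess, time

BAD_NODE = "bad-node"    # hypothetical peer hostname
PORT = 7788              # made-up health-check port
MISS_LIMIT = 3
TIMEOUT = 3.0

def peer_responds(host):
    try:
        s = socket.create_connection((host, PORT), TIMEOUT)
        s.settimeout(TIMEOUT)
        s.sendall(b"are-you-alive\n")
        alive = s.recv(16).startswith(b"yes")
        s.close()
        return alive
    except (socket.error, socket.timeout):
        return False

misses = 0
while True:
    misses = 0 if peer_responds(BAD_NODE) else misses + 1
    if misses >= MISS_LIMIT:
        # stonith_admin is real, but using it here as the escalation
        # path is just part of the sketch.
        subprocess.call(["stonith_admin", "--fence", BAD_NODE])
        misses = 0
    time.sleep(5)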

Opinions?

Regards
Dominik



Re: [Pacemaker] High load issues

2010-02-04 Thread Steven Dake
On Thu, 2010-02-04 at 16:09 +0100, Dominik Klein wrote:
> Hi people,
> 
> I'll take the risk of annoying you, but I really think this should not
> be forgotten.
> 
> If there is high load on a node, the cluster seems to have problems
> recovering from that. I'd expect the cluster to recognize that a node is
> unresponsive, stonith it and start services elsewhere.
> 
> By unresponsive I mean not being able to use the cluster's service, not
> being able to ssh into the node.
> 
> I am not sure whether this is an issue of pacemaker (iiuc, beekhof seems
> to think it is not) or corosync (iiuc, sdake seems to think it is not)
> or maybe a configuration/thinking thing on my side (which might just be).
> 
> Anyway, attached you will find a hb_report which covers the startup of
> the cluster nodes, then what it does when there is high load and no
> memory left. Then I killed the load producing things and almost
> immediately, the cluster cleaned up things.
> 
> I had at least expected that, after I saw "FAILED" status in crm_mon
> and after the configured stop timeouts (120s max in my case), the
> failover would happen, but it did not.
> 
> What I did to produce load:
> * run several "md5sum $file" on 1gig files
> * run several heavy sql statements on large tables
> * saturate(?) the nic using netcat -l on the busy node and netcat -w fed
> by /dev/urandom on another node
> * start a forkbomb script which does "while (true); do bash $0; done;"
> 
> Used versions:
> corosync 1.2.0
> pacemaker 1.0.7
> 64 bit packages from clusterlabs for opensuse 11.1
> 

The forkbomb triggers an OOM situation.  In Linux, when OOM happens,
all bets are really off as to what will occur.  I expect that the system
would work properly without the forkbomb.  Could you try that?

Corosync actually works quite well in OOM situations and usually doesn't
detect this as a failure unless the oom killer blows away the corosync
process.  To corosync, the node is fully operational (because it is
designed to work in an OOM situation).

Detecting memory overcommit and doing something about it may be
something we should do with Corosync.

But generally I believe this test case is invalid.  A system should be
properly sized memory-wise to handle the applications that are intended
to run on it.  It really sounds like a deployment issue if the systems
don't contain the appropriate RAM to run the applications.

I believe there is a way of setting affinity in the OOM killer, but it's
been 4 years since I've worked on the kernel full-time so I don't know
the details.  One option is to set the affinity so it always tries to blow
away the corosync process.  Then you would get fencing in this
condition.
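
For what it's worth, on 2.6-era kernels that knob is
/proc/<pid>/oom_adj (range -17 to 15, where 15 makes a process the
preferred victim and -17 exempts it); newer kernels use oom_score_adj
instead. A sketch of biasing the OOM killer toward corosync in this
way (an assumption-laden illustration, not a recommendation):

#!/usr/bin/env python
# Sketch: make corosync the preferred OOM-killer victim so that under
# memory pressure corosync dies, the node drops out of membership and
# the survivors fence it.  Uses the 2.6-era /proc/<pid>/oom_adj knob.
import subprocess

def set_oom_adj(process_name, value):
    pids = subprocess.check_output(["pidof", process_name]).split()
    for pid in pids:
        with open("/proc/%s/oom_adj" % pid.decode(), "w") as f:
            f.write(str(value))

if __name__ == "__main__":
    set_oom_adj("corosync", 15)   # use -17 instead to protect corosync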

Regards
-steve

> If you need more information, want me to try patches, whatever, please
> let me know.
> 
> Regards
> Dominik

