Re: [Pacemaker] High load issues
On Fri, Feb 5, 2010 at 12:35 PM, Dominik Klein wrote:
> Just for the record: heartbeat (3.0.2) was not able to recover either.
>
> It also manages to see a failure on the dead node but fails to recover.

What is "it" in this instance?

If $good sent a message to $bad and didn't get a response, and that's how
Pacemaker found out that $bad was bad, then I'd agree that it's a Pacemaker
bug. But that's not what is happening. Corosync is telling Pacemaker that
$bad is gone, but only after $good sends a message.

It shouldn't take Pacemaker sending a cluster message for
(corosync|heartbeat) to notice that comms are down.
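[The point above about membership-layer detection: corosync's totem
protocol is supposed to notice a dead peer on its own, via the token
timeout, without any Pacemaker traffic at all. For reference, the relevant
knobs in corosync.conf look like the sketch below; the values are
illustrative, not the reporter's actual configuration.]

# Illustrative totem settings only -- not the reporter's configuration.
totem {
        version: 2
        # Milliseconds without the token circulating before a token
        # loss is declared.
        token: 3000
        # Token retransmits attempted before the node is removed from
        # membership (at which point Pacemaker is notified).
        token_retransmits_before_loss_const: 10
}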
Re: [Pacemaker] High load issues
Just for the record: heartbeat (3.0.2) was not able to recover either.

It also manages to see a failure on the dead node but fails to recover.

Regards
Dominik
Re: [Pacemaker] High load issues
Regarding that high load problem: this should be addressed by the system
health feature that was / will be introduced into Pacemaker. The main
problems, in order of relevance:

- The problem of automatic failback is not solved. See bug 2269 in bugzilla.
- There are very few system health agents available.
- A meta-agent is needed that bundles other health agents, to beautify the
  GUI and CRM output.

Greetings,

--
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
Mob: +49 - 174 - 343 28 75
Mail: mi...@multinet.de
Web: www.multinet.de
Registered office: 85630 Grasbrunn
Court of registration: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens
---
PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
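[To make the mechanism Michael describes concrete: a hypothetical setup
using the health agents could look like the sketch below.
ocf:pacemaker:HealthCPU and the node-health-strategy property are taken
from later Pacemaker releases; the values are illustrative.]

# Sketch only: ocf:pacemaker:HealthCPU sets a #health-cpu node attribute;
# node-health-strategy tells the policy engine how to react to it.
crm configure primitive health-cpu ocf:pacemaker:HealthCPU \
        op monitor interval=60s
crm configure clone health-clone health-cpu
# Migrate all resources off any node whose health attribute turns "red".
crm configure property node-health-strategy=migrate-on-red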
Re: [Pacemaker] High load issues
Hi,

On Fri, Feb 05, 2010 at 08:59:50AM +0100, Dominik Klein wrote:
> > But generally I believe this test case is invalid.
>
> I might agree here that this test case does not necessarily reproduce
> what happened on my production system (unfortunately I do not know for
> sure what happened there, the dev who caused this just tells me he used
> some stupid sql statement and even executed it several times in
> parallel), but I do not think the test case is invalid. If there is an
> OOM situation on a node and therefore the local pacemaker can't do its
> job anymore (I base this statement on the various lrmd "cannot allocate
> memory" logs), this is a case the cluster should be able to recover from.

Yes, I'd say the cluster should be able to deal with a node which is in
just about any state. This time, at least it seems so, the problem was
that corosync ran as a realtime process and crmd did not. Perhaps corosync
should watch the local processes, i.e. have some kind of IPC heartbeat ...

> What I saw while doing this test was that the bad node discovered
> failures on the running ip and mysql resources, scheduled the recovery,
> but never managed to recover.
>
> I think it was lmb who suggested "periodic health-checks" on the
> pacemaker layer. If pacemaker on $good had periodically tried to talk to
> pacemaker on $bad, then it might have seen that $bad does not respond
> and might have done something about it. Just my theory though.

... or the higher level heartbeats as you suggested here. There is still,
however, a problem with false positives. At any rate, the user should have
a way to specify when a node is not usable anymore.

Thanks,

Dejan
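[The scheduling-class mismatch Dejan describes can be checked from a
shell. A sketch, assuming chrt from util-linux is installed and that the
process names match:]

#!/bin/sh
# Sketch: compare the scheduling classes of corosync and crmd.
# A SCHED_RR/SCHED_FIFO corosync keeps getting CPU under heavy load,
# while a SCHED_OTHER crmd can be starved -- matching the failure
# mode described above.
for name in corosync crmd; do
        pid=$(pidof -s "$name") || { echo "$name not running"; continue; }
        echo "== $name (pid $pid) =="
        chrt -p "$pid"    # prints current scheduling policy and priority
done
# A debatable workaround would be to give crmd a realtime slice too,
# e.g. "chrt -r -p 10 $(pidof -s crmd)" -- purely illustrative; whether
# crmd should run realtime is exactly what is being discussed here.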
Re: [Pacemaker] High load issues
> But generally I believe this test case is invalid.

I might agree here that this test case does not necessarily reproduce what
happened on my production system (unfortunately I do not know for sure what
happened there, the dev who caused this just tells me he used some stupid
sql statement and even executed it several times in parallel), but I do not
think the test case is invalid. If there is an OOM situation on a node and
therefore the local pacemaker can't do its job anymore (I base this
statement on the various lrmd "cannot allocate memory" logs), this is a
case the cluster should be able to recover from.

What I saw while doing this test was that the bad node discovered failures
on the running ip and mysql resources, scheduled the recovery, but never
managed to recover.

I think it was lmb who suggested "periodic health-checks" on the pacemaker
layer. If pacemaker on $good had periodically tried to talk to pacemaker on
$bad, then it might have seen that $bad does not respond and might have
done something about it. Just my theory though.

Opinions?

Regards
Dominik
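[As a rough illustration of that theory, a higher-level heartbeat could be
approximated outside the cluster stack. A sketch, assuming crmadmin from
Pacemaker; the peer name and the reaction are assumptions, not an existing
feature:]

#!/bin/sh
# Sketch of a higher-level heartbeat: poll the peer's crmd and log when
# it stops answering. The reaction is deliberately just a log message;
# fencing from a script like this would need safeguards against the
# false positives mentioned elsewhere in this thread.
PEER=bad-node        # hypothetical peer hostname
INTERVAL=30          # seconds between checks
while true; do
        # crmadmin -S asks the crmd on the given node for its status.
        if ! crmadmin -S "$PEER" >/dev/null 2>&1; then
                logger -t crmd-watch "crmd on $PEER did not respond"
                # A real implementation might escalate to fencing here
                # after several consecutive misses.
        fi
        sleep "$INTERVAL"
done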
Re: [Pacemaker] High load issues
On Thu, 2010-02-04 at 16:09 +0100, Dominik Klein wrote:
> Hi people,
>
> I'll take the risk of annoying you, but I really think this should not
> be forgotten.
>
> If there is high load on a node, the cluster seems to have problems
> recovering from that. I'd expect the cluster to recognize that a node is
> unresponsive, stonith it and start services elsewhere.
>
> By unresponsive I mean not being able to use the cluster's service, not
> being able to ssh into the node.
>
> I am not sure whether this is an issue of pacemaker (iiuc, beekhof seems
> to think it is not) or corosync (iiuc, sdake seems to think it is not)
> or maybe a configuration/thinking thing on my side (which might just be).
>
> Anyway, attached you will find a hb_report which covers the startup of
> the cluster nodes, then what it does when there is high load and no
> memory left. Then I killed the load-producing things and almost
> immediately, the cluster cleaned things up.
>
> I had at least expected that after I saw "FAILED" status in crm_mon,
> after the configured timeouts for stop (120s max in my case), the
> failover would happen, but it did not.
>
> What I did to produce load:
> * run several "md5sum $file" on 1 GB files
> * run several heavy sql statements on large tables
> * saturate(?) the nic using netcat -l on the busy node and netcat -w fed
>   by /dev/urandom on another node
> * start a forkbomb script which does "while (true); do bash $0; done;"
>
> Used versions:
> corosync 1.2.0
> pacemaker 1.0.7
> 64 bit packages from clusterlabs for opensuse 11.1

The forkbomb triggers an OOM situation. In Linux, when OOM happens, really
all bets are off as to what will occur. I expect that the system would work
properly without the forkbomb. Could you try that?

Corosync actually works quite well in OOM situations and usually doesn't
detect this as a failure unless the oom killer blows away the corosync
process. To corosync, the node is fully operational (because it is designed
to work in an OOM situation). Detecting memory overcommit and doing
something about it may be something we should do with Corosync.

But generally I believe this test case is invalid. A system should be
properly sized memory-wise to handle the applications that are intended to
run on it. It really sounds like a deployment issue if the systems don't
contain the appropriate RAM to run the applications.

I believe there is a way of setting affinity in the OOM killer, but it's
been 4 years since I've worked on the kernel fulltime so I don't know the
details. One option is to set the affinity so that the OOM killer always
tries to blow away the corosync process first. Then you would get fencing
in this condition.

Regards
-steve

> If you need more information, want me to try patches, whatever, please
> let me know.
>
> Regards
> Dominik
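[The OOM-killer "affinity" Steve remembers corresponds to the per-process
/proc tunables. A sketch, assuming root; oom_adj is the interface on
kernels of this era, oom_score_adj its later replacement, and the values
are illustrative rather than a tested recommendation:]

#!/bin/sh
# Sketch: bias the OOM killer toward corosync so that, when memory runs
# out, corosync dies first, the node drops out of membership, and the
# peers fence it -- turning a hung node into a clean failover.
pid=$(pidof -s corosync) || { echo "corosync not running" >&2; exit 1; }

# oom_adj ranges from -17 (never kill) to 15 (kill first); newer kernels
# use oom_score_adj (-1000..1000) instead.
if [ -w "/proc/$pid/oom_score_adj" ]; then
        echo 1000 > "/proc/$pid/oom_score_adj"
else
        echo 15 > "/proc/$pid/oom_adj"
fi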