On 23/04/2013, at 1:50 AM, Greg Woods <wo...@ucar.edu> wrote:

> On Mon, 2013-04-22 at 10:12 +1000, Andrew Beekhof wrote:
>> On Saturday, April 20, 2013, Greg Woods wrote:
>>
>>> Often one of the
>>> nodes gets stuck at "Stopping HA Services"
>>
>> That means pacemaker is waiting for one of your resources to stop.
>> Do you have anything that would take a long time (or fail to stop)?
>
> Not that I am aware of. But some things that came up during this
> weekend's powerdown make me think that some of the stop actions are
> failing, because setting the stop-all-resources=true property sometimes
> caused nodes to be fenced.

In that case, almost certainly stop failures are the cause.

> I always dread having to try and find useful information in the
> voluminous Pacemaker/Heartbeat logs,

They're getting better.
People now complain there isn't enough detail in syslog which is probably a good sign.

> but I'll have to try. Of course,
> this doesn't happen on the test clusters, and it is hard to debug it
> when reproducing it requires creating a service outage on a production
> cluster.
>
> --Greg
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems