On Tue, Nov 02, 2010 at 01:28:09PM +0100, Pavlos Parissis wrote:
> On 2 November 2010 13:18, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
>
> > Hi,
> >
> > On Tue, Nov 02, 2010 at 01:09:02PM +0100, Pavlos Parissis wrote:
> > > On 2 November 2010 13:02, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> > > [...snip...]
> > >
> > > > > > Definitely not. If you do the monitor action from the command
> > > > > > line does that also return the unexpected exit code:
> > > > > >
> > > > >
> > > > > from the code I pasted you can see it returned 1.
> > > >
> > > > There is a difference. stonith-ng (stonithd) is a daemon that
> > > > runs a perl script (fencing_legacy) which invokes stonith which
> > > > then invokes the plugin. A problem can occur in any of these
> > > > components. It's important to find out where.
> > > >
> > > > > > # stonith -t external/rackpdu community="empisteftiko"
> > > > > > names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" ... -lS
> > > > > >
> > > > > > Which pacemaker release do you run? I couldn't reproduce this
> > > > > > with a recent Pacemaker.
> > > > > >
> > > > >
> > > > > that it was on 1.1.3 and now I run 1.0.9.
> > > > > Do you want me to run the test on 1.0.9?
> > > >
> > > > Yes, please. 1.0.9 is still running the old, and well tested,
> > > > stonithd, so the result could be different.
> > > >
> > >
> > > I have the pdu off because it stopped working anymore! As a result the
> > > resource is stopped.
> > > But I did the test I see that even rackpdu returns 1 on status stonithd
> > > reports 256
> >
> > Ah, I understand what's going on now. It's a bug in the interface
> > to external plugins which was exposed by stonith-ng. It has been
> > fixed in August. The fix is here (in hg.linux-ha.org/glue):
> >
> > changeset:   2427:b7df127fc09e
> > user:        Dejan Muhamedagic <de...@hello-penguin.com>
> > date:        Thu Aug 12 14:01:10 2010 +0200
> > summary:     High: stonith: external: interpret properly exit codes from
> > external stonith plugins (bnc#630357)
> >
> > There hasn't been a glue release since then, but there should be
> > one fairly soon. Note that this affects only Pacemaker 1.1.
> >
> > Thanks,
> >
> > Dejan
> >
>
> Does this bug have to do anything with PE ignoring monitor failure?

The PE doesn't ignore the failure because it doesn't see it. The
exit code 256 is actually encoded as 0, so as far as the crmd and
PE are concerned, everything is OK.

Thanks,

Dejan

> Pavlos

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
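
A minimal sketch of how a plugin exit code of 1 can show up as 256 and then
be seen as 0. It assumes the traditional Unix wait-status layout (exit code
in the high byte of a 16-bit status) and assumes the 256 reported by stonithd
is such a raw status that is later cut down to 8 bits. The thread does not
show the actual glue code, so the function names below are hypothetical and
only illustrate the arithmetic.

```python
def raw_wait_status(exit_code: int) -> int:
    """Traditional wait()-style status for a normal exit: code in the high byte."""
    return (exit_code & 0xFF) << 8


def decode_exit_code(status: int) -> int:
    """Equivalent of WEXITSTATUS for a normally exited child."""
    return (status >> 8) & 0xFF


def as_process_exit_code(value: int) -> int:
    """A process exit code keeps only the low 8 bits of the value."""
    return value & 0xFF


plugin_exit = 1                          # the status action reports a failure
status = raw_wait_status(plugin_exit)    # raw status, not yet decoded
misread = as_process_exit_code(status)   # raw status passed on as an exit code
decoded = decode_exit_code(status)       # raw status decoded first

print(status, misread, decoded)          # prints "256 0 1"
```

Passing the raw status along without decoding it is what would make a failed
status check look like success to anything that only sees the low 8 bits.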