Re: [ClusterLabs] multiple action= lines sent to STDIN of fencing agents - why?

2015-10-15 Thread Jan Pokorný
On 15/10/15 12:25 +0100, Adam Spiers wrote:
> I inserted some debugging into fencing.py and found that stonithd
> sends stuff like this to STDIN of the fencing agents it forks:
> 
> action=list
> param1=value1
> param2=value2
> param3=value3
> action=list
> 
> where paramX and valueX come from the configuration of the primitive
> for the fencing agent.
> 
> As a corollary, if the primitive for the fencing agent has 'action'
> defined as one of its parameters, this means that there will be three
> 'action=' lines, and the middle one could have a different value to
> the two sandwiching it.
> 
> When I first saw this, I had an extended #wtf moment and thought it
> was a bug.  But on closer inspection, it seems very deliberate, e.g.
> 
>   https://github.com/ClusterLabs/pacemaker/commit/bfd620645f151b71fafafa279969e9d8bd0fd74f
> 
> The "regardless of what the admin configured" comment suggests to me
> that there is an underlying assumption that any fencing agent will
> ensure that if the same parameter is duplicated on STDIN, the final
> value will override any previous ones.  And indeed fencing.py ensures
> this, but presumably it is possible to write agents which don't use
> fencing.py.
> 
> Is my understanding correct?  If so:
> 
> 1) Is the first 'action=' line intended in order to set some kind of
>    default action, in the case that the admin didn't configure the
>    primitive with an 'action=' parameter *and* _action wasn't one of
>    list/status/monitor/metadata?  In what circumstances would this
>    happen?
> 
> 2) Is this assumption of the agents always being order-sensitive
>    (i.e. last value always wins) documented anywhere?  The best
>    documentation on the API I could find was here:
> 
>    https://fedorahosted.org/cluster/wiki/FenceAgentAPI
> 
>    but it doesn't mention this.

Agreed that this implicit and undocumented arrangement smells fishy.

In fact, I raised this point within the RH team three months back, but
there wasn't any feedback at that time.  I haven't filed a bug because
it's unclear to me what's intentional and what's not between the two
interacting sides.  Back then, I was proposing that fencing.py should,
at least, emit a message about subsequent parameter overrides.
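
To illustrate the kind of message meant here, a minimal sketch of a
last-value-wins parser that warns on each override.  This is not the
actual fencing.py code: parse_stdin_options is a hypothetical name, and
the key=value input format is assumed from the sample quoted above.

```python
import io
import sys


def parse_stdin_options(stream, log=sys.stderr):
    """Parse key=value lines; the last value seen for a key wins.

    Hypothetical helper for illustration, not part of fencing.py.
    """
    options = {}
    for raw in stream:
        line = raw.strip()
        # Skip blank lines and anything that is not key=value.
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip()
        # Warn when a later line changes a value seen earlier.
        if name in options and options[name] != value:
            print("warning: parameter %r overridden: %r -> %r"
                  % (name, options[name], value), file=log)
        options[name] = value
    return options
```

With the input action=list / port=node1 / action=reboot / action=list,
this would return action=list and log two overrides (list to reboot,
then reboot back to list).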

Note that the "port" parameter (or whatever is the correct way to
specify a victim in the particular case) is affected as well.

-- 
Jan (Poki)


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] multiple action= lines sent to STDIN of fencing agents - why?

2015-10-15 Thread Ken Gaillot
On 10/15/2015 06:25 AM, Adam Spiers wrote:
> I inserted some debugging into fencing.py and found that stonithd
> sends stuff like this to STDIN of the fencing agents it forks:
> 
> action=list
> param1=value1
> param2=value2
> param3=value3
> action=list
> 
> where paramX and valueX come from the configuration of the primitive
> for the fencing agent.
> 
> As a corollary, if the primitive for the fencing agent has 'action'
> defined as one of its parameters, this means that there will be three
> 'action=' lines, and the middle one could have a different value to
> the two sandwiching it.
> 
> When I first saw this, I had an extended #wtf moment and thought it
> was a bug.  But on closer inspection, it seems very deliberate, e.g.
> 
>   https://github.com/ClusterLabs/pacemaker/commit/bfd620645f151b71fafafa279969e9d8bd0fd74f
> 
> The "regardless of what the admin configured" comment suggests to me
> that there is an underlying assumption that any fencing agent will
> ensure that if the same parameter is duplicated on STDIN, the final
> value will override any previous ones.  And indeed fencing.py ensures
> this, but presumably it is possible to write agents which don't use
> fencing.py.
> 
> Is my understanding correct?  If so:

Yes, good sleuthing.

> 1) Is the first 'action=' line intended in order to set some kind of
>    default action, in the case that the admin didn't configure the
>    primitive with an 'action=' parameter *and* _action wasn't one of
>    list/status/monitor/metadata?  In what circumstances would this
>    happen?

The first action line is usually the only one.

Ideally, admins don't configure "action" as a parameter of a fence
device. They either specify nothing (in which case the cluster does what
it thinks should be done -- reboot, off, etc.), or they specify
pcmk_*_action to override the cluster's choice. For example,
pcmk_reboot_action=off tells the cluster to actually send the fence
agent "action=off" when a reboot is desired. (Perhaps the admin prefers
that flaky nodes stay down until investigated, or the fence device
doesn't handle reboots well.)

So the first action line is the result of that. If the admin configured
a pcmk_*_action for the requested action, the agent will get that,
otherwise it gets the requested action.

Second, any parameters in the device configuration are copied to the
agent. So if the admin did specify "action" there, it will get copied
(as a second instance, possibly different from the first).

But that would override *all* requested actions, which is a bad idea. No
one wants a recurring monitor action to shoot a node! :) So the last
step is a failsafe: if the admin did supply an "action", the original
line is re-sent when the requested action was informational
(list/status/monitor/metadata) and not a "real" fencing action (off/reboot).
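
The three steps above can be sketched roughly as follows. This is an
illustration in Python, not pacemaker's actual code: build_agent_input
and its arguments are hypothetical names, and action_override stands in
for the value of a configured pcmk_*_action, if any. It follows the
description literally, re-sending the line only when the admin
configured "action"; the trace quoted earlier shows a trailing line
with no "action" among the parameters, so the real code may re-send
more often.

```python
INFORMATIONAL = {"list", "status", "monitor", "metadata"}


def build_agent_input(requested_action, device_params, action_override=None):
    """Return the key=value lines sent to the agent's STDIN (sketch only)."""
    # Step 1: the cluster's choice, or the admin's pcmk_*_action override.
    action = action_override or requested_action
    lines = ["action=%s" % action]

    # Step 2: copy every configured device parameter, "action" included.
    for name, value in device_params.items():
        lines.append("%s=%s" % (name, value))

    # Step 3: failsafe, so an admin-supplied "action" cannot turn an
    # informational request into a real fencing action.
    if "action" in device_params and requested_action in INFORMATIONAL:
        lines.append("action=%s" % action)
    return lines
```

For a monitor request against a device configured with action=off,
this yields action=monitor, the device parameters (including
action=off), then action=monitor again. For a reboot request the
configured action=off would be the last line, and so would win in a
last-value-wins agent.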

> 2) Is this assumption of the agents always being order-sensitive
>    (i.e. last value always wins) documented anywhere?  The best
>    documentation on the API I could find was here:
> 
>    https://fedorahosted.org/cluster/wiki/FenceAgentAPI
> 
>    but it doesn't mention this.

Good point. It would be a good idea to add that to the API since it's
established practice, but it probably would also be a good idea for
pacemaker to send only the final value of any parameter.
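
As a rough sketch of what "send only the final value" could mean, the
lines could be collapsed before they are written to the agent. This is
Python for illustration only; final_values is a hypothetical name and
does not correspond to anything in pacemaker.

```python
def final_values(lines):
    """Collapse duplicate key=value lines so only the last value remains."""
    final = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        # A later occurrence replaces the value; the key keeps the
        # position where it was first seen.
        final[name] = value
    return ["%s=%s" % item for item in final.items()]
```

An agent would then see each parameter exactly once, so it would no
longer matter whether the agent is order-sensitive.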
