On 04/10/2018 08:48 AM, Jehan-Guillaume de Rorthais wrote:
> On Mon, 09 Apr 2018 17:59:26 -0500
> Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On Tue, 2018-04-10 at 00:02 +0200, Jehan-Guillaume de Rorthais wrote:
>>> On Tue, 03 Apr 2018 17:35:43 -0500
>>> Ken Gaillot <kgail...@redhat.com> wrote:
>>>
>>>> On Tue, 2018-04-03 at 21:46 +0200, Klaus Wenninger wrote:
>>>>> On 04/03/2018 05:43 PM, Ken Gaillot wrote:
>>>>>> On Tue, 2018-04-03 at 07:36 +0200, Klaus Wenninger wrote:
>>>>>>> On 04/02/2018 04:02 PM, Ken Gaillot wrote:
>>>>>>>> On Mon, 2018-04-02 at 10:54 +0200, Jehan-Guillaume de
>>>>>>>> Rorthais wrote:
>>> [...]
>>>>>>> -inf constraints like that should effectively prevent
>>>>>>> stonith actions from being executed on those nodes.
>>>>>> It shouldn't ...
>>>>>>
>>>>>> Pacemaker respects target-role=Started/Stopped for controlling
>>>>>> execution of fence devices, but location (or even whether the
>>>>>> device is "running" at all) only affects monitors, not
>>>>>> execution.
>>>>>>
>>>>>>> Though there are a few issues with location constraints
>>>>>>> and stonith devices.
>>>>>>>
>>>>>>> When stonithd brings up the devices from the cib, it runs
>>>>>>> the parts of pengine that fully evaluate these constraints,
>>>>>>> and it would disable the stonith device if the resource is
>>>>>>> unrunnable on that node.
>>>>>> That should be true only for target-role, not everything that
>>>>>> affects runnability
>>>>> cib_device_update bails out via a removal of the device if:
>>>>> - role == stopped
>>>>> - node not in the allowed_nodes list of the stonith resource
>>>>> - weight is negative
>>>>>
>>>>> Wouldn't that include a -inf rule for a node?
>>>> Well, I'll be ... I thought I understood what was going on
>>>> there. :-) You're right.
>>>>
>>>> I've frequently seen it recommended to ban fence devices from
>>>> their target when using one device per target. Perhaps it would
>>>> be better to give a lower (but positive) score on the target
>>>> compared to the other node(s), so it can be used when no other
>>>> nodes are available.
>>> Wait, you mean a fencing resource can be triggered from its own
>>> target? What happens then? Node suicide, and all the cluster
>>> nodes are shut down?
>>>
>>> Thanks,
>> A node can fence itself, though it will be the cluster's last
>> resort when no other node can. It doesn't necessarily imply all
>> other nodes are shut down ...
> Indeed, sorry I wasn't clear enough: I was talking about a fencing
> race situation.

Shouldn't fencing races - even when suicide is involved - be
prevented by one partition not having quorum? They should be an
issue only with the 2-node feature enabled.
Which scenario did you have in mind?

>
>> there may be other nodes up, but they are not allowed to
>> execute the relevant fence device for whatever reason.
> In such a situation, how can the other nodes confirm that the node
> fenced itself?
Basically I see 2 cases:

- sbd with watchdog-fencing, where the other nodes assume the
  suicide to be successful after a certain time
- basically, if a node is able to commit suicide (while part of a
  quorate partition), I would expect it to come back online after
  the reboot, telling the cluster that the resources are down

Regards,
Klaus

>
>> But of course there might be no other nodes up, in which case,
>> yes, the cluster dies (the idea being that the node is known to
>> be malfunctioning, so stop it from possibly corrupting data).
> This makes sense to me.
>
> Thanks,
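
For reference, the watchdog-fencing case above hinges on the
stonith-watchdog-timeout cluster property: after that much time, the
rest of the cluster assumes the node's watchdog has fired and treats
the suicide as confirmed. A minimal sketch (the 10s value is only an
example; it has to comfortably exceed the watchdog timeout sbd is
configured with, e.g. SBD_WATCHDOG_TIMEOUT in /etc/sysconfig/sbd):

    # assumes sbd is set up on every node with SBD_WATCHDOG_TIMEOUT=5;
    # a common rule of thumb is twice the sbd watchdog timeout
    pcs property set stonith-watchdog-timeout=10s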
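
And going back to Ken's suggestion earlier in the thread: instead of
banning a fence device from its own target with a -INF constraint,
give the target a lower but still positive score. A rough sketch in
pcs syntax (the device and node names fence_node1/node1/node2 are
made up, and the scores are illustrative):

    # fence_node1 is the device that fences node1;
    # prefer running it on the other node ...
    pcs constraint location fence_node1 prefers node2=100
    # ... but keep the target itself runnable as a last resort
    pcs constraint location fence_node1 prefers node1=10

With only positive scores, cib_device_update does not remove the
device on node1, so it stays usable when node2 is unavailable.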