On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote:
> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> > 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> > > On Mon, 17 Aug 2020 10:19:45 -0500
> > > Ken Gaillot <kgail...@redhat.com> wrote:
> > >
> > > > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> > > > > Thanks to all your suggestions, I now have the systems with
> > > > > stonith configured on ipmi.
> > > >
> > > > A word of caution: if the IPMI is on-board -- i.e. it shares
> > > > the same power supply as the computer -- power becomes a single
> > > > point of failure. If the node loses power, the other node can't
> > > > fence because the IPMI is also down, and the cluster can't
> > > > recover.
> > > >
> > > > Some on-board IPMI controllers can share an Ethernet port with
> > > > the main computer, which would be a similar situation.
> > > >
> > > > It's best to have a backup fencing method when using IPMI as
> > > > the primary fencing method. An example would be an intelligent
> > > > power switch or sbd.
> > >
> > > How would SBD be useful in this scenario? The poison pill will
> > > not be swallowed by the dead node... Is it just to wait for the
> > > watchdog timeout?
> > >
> >
> > A node is expected to commit suicide if SBD loses access to the
> > shared block device. So either the node swallowed the poison pill
> > and died, or the node died because it realized it was impossible to
> > see the poison pill, or the node was dead already. After the
> > watchdog timeout (twice the watchdog timeout for safety) we assume
> > the node is dead.
>
> Yes, this way a suicide via watchdog will be triggered if there are
> issues with the disk. This is why it is important to have a reliable
> watchdog with SBD even when using poison pill. As this alone would
> make a single shared disk a SPOF, when running with pacemaker
> integration (the default) a node with SBD will survive despite losing
> the disk as long as it has quorum and pacemaker looks healthy. As
> corosync-quorum in 2-node mode obviously won't be fit for this
> purpose, SBD will switch to checking for the presence of both nodes
> if the 2-node flag is set.
>
> Sorry for the lengthy explanation, but the full picture is required
> to understand why it is sufficiently reliable and useful if
> configured correctly.
>
> Klaus
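
For readers who prefer code to prose, the survival rule described above can be sketched roughly as below. This is only an illustration of the logic as stated in this thread, not sbd's actual implementation: `NodeView`, `must_self_fence` and `assume_dead_after` are hypothetical names, and the fields are a simplified guess at what the daemon observes.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """Hypothetical snapshot of what sbd can observe on the local node."""
    disk_reachable: bool     # can the shared block device still be read?
    poison_pill_seen: bool   # was a fence request found for this node?
    has_quorum: bool         # quorum as reported locally
    pacemaker_healthy: bool  # pacemaker looks healthy
    two_node_flag: bool      # the 2-node flag mentioned in the thread
    peers_present: int       # number of other nodes currently visible


def must_self_fence(v: NodeView) -> bool:
    """Return True if the node should let the watchdog reset it."""
    if v.disk_reachable:
        # Disk is fine: die only if someone left a poison pill for us.
        return v.poison_pill_seen
    # Disk is lost. With pacemaker integration the node may still survive.
    if not v.pacemaker_healthy:
        return True
    if v.two_node_flag:
        # Quorum is not meaningful in 2-node mode, so require that
        # the peer is actually present instead.
        return v.peers_present < 1
    return not v.has_quorum


def assume_dead_after(watchdog_timeout_s: float) -> float:
    """Seconds the surviving side waits before assuming the peer is dead
    (twice the watchdog timeout, as suggested above, for safety)."""
    return 2 * watchdog_timeout_s
```

For example, a node in a 2-node cluster that loses the disk while its peer is gone gets `must_self_fence(...) == True`, while the same node with the peer still visible and pacemaker healthy keeps running.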

What I'm not sure about is how watchdog-only sbd would behave as a
fallback method for a regular fence device. Will the cluster wait for
the sbd timeout no matter what, or only if the regular fencing fails,
or ...?
-- 
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/