Fencing requires some mechanism, outside the nodes themselves, that can terminate the nodes. Typically, IPMI (iLO, iRMC, RSA, DRAC, etc) is used for this. Alternatively, switched PDUs are common. If you don't have these but do have a watchdog timer on your nodes, SBD (storage-based death) can work.

You can use 'fence_<device> <options> -o status' at the command line to figure out what will work with your hardware. Once you can call 'fence_foo ... -o status' and get the status of each node, translating that into a pacemaker configuration is pretty simple. That's when you enable stonith.
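For example, with IPMI the check might look something like this (the agent, addresses, and credentials below are just placeholders for your own values; 'pcs stonith describe fence_ipmilan' will show the parameter names your version expects):

    # fence_ipmilan --ip=192.168.122.201 --username=admin --password=secret -o status

Once that reports the correct power status for each node, the pacemaker side is roughly:

    # pcs stonith create fence_one fence_ipmilan pcmk_host_list="one" ip=192.168.122.201 username=admin password=secret
    # pcs stonith create fence_two fence_ipmilan pcmk_host_list="two" ip=192.168.122.202 username=admin password=secret
    # pcs property set stonith-enabled=true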

Once stonith is set up and working in pacemaker (i.e. you can crash a node and the peer reboots it), go to DRBD and set 'fencing resource-and-stonith;' (which tells DRBD to block on a communication failure with the peer and request a fence), and then set up the fence handler ('crm-fence-peer.sh') and the unfence handler ('crm-unfence-peer.sh'). I am going from memory, so check the man page to verify the syntax.
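From memory, the drbd.conf side ends up looking roughly like this (DRBD 8.4 layout, with 'r0' as an example resource name; verify the section and handler names against drbd.conf(5) for your version):

    resource r0 {
        disk {
            fencing resource-and-stonith;
        }
        handlers {
            fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
            after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
    }

The fence handler places a constraint that keeps the peer from being promoted while it is suspect, and the unfence handler removes that constraint once the nodes are back in sync.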

With all this done, if either pacemaker/corosync or DRBD loses contact with the peer, it will block and fence. Only after the peer has been confirmed terminated will IO resume. This way, split-brain becomes effectively impossible.

digimer

On 2019-04-17 5:17 p.m., JCA wrote:
Here is what I did:

# pcs stonith create disk_fencing fence_scsi pcmk_host_list="one two" pcmk_monitor_action="metadata" pcmk_reboot_action="off" devices="/dev/disk/by-id/ata-VBOX_HARDDISK_VBaaa429e4-514e8ecb" meta provides="unfencing"

where ata-VBOX-... corresponds to the device where I have the partition that is shared between both nodes in my cluster. The command completes without any errors (that I can see) and after that I have

# pcs status
Cluster name: ClusterOne
Stack: corosync
Current DC: one (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Wed Apr 17 14:35:25 2019
Last change: Wed Apr 17 14:11:14 2019 by root via cibadmin on one

2 nodes configured
5 resources configured

Online: [ one two ]

Full list of resources:

 MyCluster    (ocf::myapp:myapp-script):    Stopped
 Master/Slave Set: DrbdDataClone [DrbdData]
     Stopped: [ one two ]
 DrbdFS    (ocf::heartbeat:Filesystem):    Stopped
 disk_fencing    (stonith:fence_scsi):    Stopped

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Things stay that way indefinitely, until I set stonith-enabled to false - at which point all the resources above get started immediately.

Obviously, I am missing something big here. But, what is it?


On Wed, Apr 17, 2019 at 2:59 PM Adam Budziński <budzinski.a...@gmail.com> wrote:

    You did not configure any fencing device.

    On Wed, Apr 17, 2019 at 22:51, JCA <1.41...@gmail.com> wrote:

        I am trying to get fencing working, as described in the
        "Clusters from Scratch" guide, and I am stymied at the get-go :-(

        The document mentions a property named stonith-enabled. When I
        was trying to get my first cluster going, I noticed that my
        resources would start only when this property is set to false,
        by means of

            # pcs property set stonith-enabled=false

        Otherwise, all the resources remain stopped.

        I created a fencing resource for the partition that I am
        sharing across the nodes, by means of DRBD. This works
        fine - but I still have the same problem as above - i.e. when
        stonith-enabled is set to true, all the resources get stopped,
        and remain in that state.

        I am very confused here. Can anybody point me in the right
        direction out of this conundrum?



_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
