Hello all,

I tested several versions of the DRBD kernel driver. The results are as follows.
- drbd-8.4.11-1   - OK (worked as expected)
- drbd-9.0.15-1   - NG
- drbd-9.0.19-0rc2 - NG

Regards,
--
Takashi Sogabe

-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Takashi Sogabe
Sent: Monday, June 24, 2019 4:27 PM
To: [email protected]
Subject: [DRBD-user] Resource Level Fencing Issue: DRBD9 with pacemaker/corosync

Hello all,

I encountered an issue when I tested the behavior of resource-level fencing with pacemaker/corosync. It seems something is wrong with the handling of fencing events in a quiescent (no-write) situation.

I am a newbie to DRBD, so I followed the procedure from the LINBIT documentation [1]. It would be great if the DRBD folks could point out the misconfiguration if my configuration is wrong. Or are there any potential issues around there?

App packages:
- OS
  - Ubuntu 18.04 LTS
- DRBD packages
  - drbd-utils - 9.10.0-1ppa1~bionic1
  - drbd-dkms - 9.0.19~rc1-1ppa1~bionic1
- Pacemaker / Corosync
  - pacemaker - 1.1.18-0ubuntu1.1
  - corosync - 2.4.3-0ubuntu1.1

Network topology:

   +--------------+   <- (Redundancy for resource level fencing)
   |              |
 node1          node2
   |              |
   +-------+------+   <- (DRBD data, NFS data)
           |
         node3

- node1
  - DRBD Primary
  - NFS Server
- node2
  - DRBD Secondary
  - NFS Server
- node3
  - NFS Client

Description of the Issue:

1.
When I disconnected the connections between node1 and node2 by using iptables, fencing was invoked as expected.

# node1 syslog output:
Jun 24 05:51:22 node1 kernel: [17025.778428] drbd r0/0 drbd0 node2: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Jun 24 05:51:22 node1 kernel: [17025.810749] drbd r0 node2: helper command: /sbin/drbdadm fence-peer
Jun 24 05:51:23 node1 kernel: [17026.624983] drbd r0/0 drbd0 node2: pdsk( DUnknown -> Outdated )

# node2 syslog output:
Jun 24 05:51:22 node2 kernel: [17020.463472] drbd r0/0 drbd0 node1: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )

2.
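(For reference, the disconnection in step 1 can be done with iptables rules along these lines; this is a reconstructed sketch, not the literal commands from the test. The port matches the r0 resource configuration below.)

```shell
# Block DRBD replication traffic (TCP port 7789, per the r0 resource)
# between node1 and node2, in both directions. Run on node1; the peer
# address 192.168.1.12 is node2's DRBD address from the r0 config.
iptables -A INPUT  -p tcp -s 192.168.1.12 --dport 7789 -j DROP
iptables -A OUTPUT -p tcp -d 192.168.1.12 --dport 7789 -j DROP

# To recover the connection later, delete the same rules:
iptables -D INPUT  -p tcp -s 192.168.1.12 --dport 7789 -j DROP
iptables -D OUTPUT -p tcp -d 192.168.1.12 --dport 7789 -j DROP
```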
After I recovered the connections, the corresponding unfencing was not invoked, because the state of pdsk (at node1) transitioned from 'Outdated' to 'UpToDate' directly. [2] describes that 'after-resync-target' is called on a resync target when a node's state changes from Inconsistent to Consistent when a resync finishes. So it seems natural that nothing happens here.

# node1 syslog output:
Jun 24 05:53:19 node1 kernel: [17143.117883] drbd r0/0 drbd0 node2: pdsk( Outdated -> UpToDate ) repl( Off -> Established )

# node2 syslog output:
Jun 24 05:53:19 node2 kernel: [17137.789972] drbd r0/0 drbd0 node1: pdsk( DUnknown -> UpToDate ) repl( Off -> Established )

Note that if something was written to the DRBD disk after '1.', unfencing was invoked as follows.

1'.
# node1 syslog output:
Jun 24 05:56:32 node1 kernel: [17336.114786] drbd r0/0 drbd0 node2: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
Jun 24 05:56:32 node1 kernel: [17336.146599] drbd r0 node2: helper command: /sbin/drbdadm fence-peer
Jun 24 05:56:32 node1 kernel: [17336.186206] drbd r0/0 drbd0 node2: pdsk( DUnknown -> Outdated )

# node2 syslog output:
Jun 24 05:56:32 node2 kernel: [17330.741001] drbd r0/0 drbd0 node1: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )

2'.
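(For the "something was written" case in 1', any small write that dirties a block on the DRBD device is enough; a sketch, using the /mnt/drbd mount point from the pacemaker configuration below. The exact commands are reconstructed, not the literal ones from the test.)

```shell
# On node1 (the Primary), write a small file to the mounted DRBD-backed
# filesystem and flush it, so DRBD marks at least one block out-of-sync
# while the peer is disconnected:
echo test > /mnt/drbd/fencing-test
sync

# The out-of-sync counters can then be inspected with:
drbdsetup status r0 --statistics
```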
# node1 syslog output:
Jun 24 06:01:54 node1 kernel: [17658.236522] drbd r0/0 drbd0 node2: pdsk( Outdated -> Consistent ) repl( Off -> WFBitMapS )
Jun 24 06:01:54 node1 kernel: [17658.237470] drbd r0/0 drbd0 node2: pdsk( Consistent -> Outdated )
Jun 24 06:01:54 node1 kernel: [17658.239750] drbd r0/0 drbd0 node2: pdsk( Outdated -> Inconsistent ) repl( WFBitMapS -> SyncSource )
Jun 24 06:01:54 node1 kernel: [17658.307340] drbd r0/0 drbd0 node2: pdsk( Inconsistent -> UpToDate ) repl( SyncSource -> Established )
Jun 24 06:01:54 node1 kernel: [17658.307445] drbd r0 node2: helper command: /sbin/drbdadm unfence-peer

# node2 syslog output:
Jun 24 06:01:54 node2 kernel: [17652.870284] drbd r0/0 drbd0 node1: pdsk( DUnknown -> UpToDate ) repl( Off -> WFBitMapT )
Jun 24 06:01:54 node2 kernel: [17652.908634] drbd r0/0 drbd0: disk( Outdated -> Inconsistent )
Jun 24 06:01:54 node2 kernel: [17652.908636] drbd r0/0 drbd0 node1: repl( WFBitMapT -> SyncTarget )
Jun 24 06:01:54 node2 kernel: [17652.918096] drbd r0/0 drbd0: disk( Inconsistent -> UpToDate )
Jun 24 06:01:54 node2 kernel: [17652.918098] drbd r0/0 drbd0 node1: repl( SyncTarget -> Established )
Jun 24 06:01:54 node2 kernel: [17652.918221] drbd r0/0 drbd0 node1: helper command: /sbin/drbdadm after-resync-target

Pacemaker Configuration (crm interactive):
----
sudo crm configure
primitive drbd_nfs ocf:linbit:drbd \
  params drbd_resource="r0" \
  op monitor interval="29s" role="Master" \
  op monitor interval="31s" role="Slave"
ms ms_drbd_nfs drbd_nfs \
  meta master-max="1" master-node-max="1" \
  clone-max="2" clone-node-max="1" \
  notify="true"
primitive fs_nfs ocf:heartbeat:Filesystem \
  params device="/dev/drbd/by-res/r0/0" \
  directory="/mnt/drbd" fstype="xfs"
primitive ip_nfs ocf:heartbeat:IPaddr2 \
  params ip="192.168.1.100" nic="enp0s8"
primitive nfsd lsb:nfs-kernel-server
group nfs fs_nfs ip_nfs nfsd
colocation nfs_on_drbd \
  inf: nfs ms_drbd_nfs:Master
order nfs_after_drbd \
  inf: ms_drbd_nfs:promote nfs:start
commit
exit
----

DRBD Configuration (/etc/drbd.d/r0.res):
----
resource r0 {
  protocol C;

  # for resource level fencing
  net {
    fencing resource-only;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.9.sh";
  }

  # device /dev/drbd0;
  disk /dev/local/r0;
  meta-disk internal;
  on node1 {
    address 192.168.1.11:7789;
  }
  on node2 {
    address 192.168.1.12:7789;
  }
}
----

Corosync Configuration (/etc/corosync/corosync.conf):
----
totem {
  version: 2
  cluster_name: drbd
  secauth: off
  transport: udpu
  rrp_mode: active
}
nodelist {
  node {
    ring0_addr: 192.168.1.11
    ring1_addr: 172.24.1.11
    nodeid: 1
  }
  node {
    ring0_addr: 192.168.1.12
    ring1_addr: 172.24.1.12
    nodeid: 2
  }
}
quorum {
  provider: corosync_votequorum
  two_node: 1
}
----

[1] "14. Integrating DRBD with Pacemaker clusters", The DRBD9 and LINSTOR User's Guide, https://docs.linbit.com/docs/users-guide-9.0/#ch-pacemaker
[2] Section "handlers", Parameter "after-resync-target cmd", https://docs.linbit.com/man/v9/drbd-conf-5/#Section_handlers_Parameters

Regards,
--
Takashi Sogabe <[email protected]>

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
