Hello,

I've recently upgraded my Pacemaker + Corosync cluster to Pacemaker 1.1.8 and 
Corosync 2.1.0. One of the resources I have is a DRBD resource (DRBD 8.3.11). 
After the upgrade, I noticed that crm-fence-peer.sh returned the following 
error state (defined here: 
http://www.drbd.org/users-guide/s-fence-peer.html) whenever I disrupted the 
DRBD replication link, even though the location constraint was being 
successfully added to the CIB:

Connection to the peer node failed, peer could not be reached. (exit code 5)

I discovered that my CIB XML does not contain the "ha" attribute for the 
<node_state> tag, which crm-fence-peer.sh uses to determine if the peer node is 
reachable:
    if ! echo "$state_lines" | grep -v -F uname=\"$DRBD_PEER\" | grep -q 'ha="active"'; then

This ha attribute appears to still be used in the latest version of DRBD as 
well:
http://git.linbit.com/gitweb.cgi?p=drbd-8.3.git;a=blob;f=scripts/crm-fence-peer.sh;hb=HEAD

I updated the script to check the crmd attribute instead (the same 
modification has to be repeated everywhere ha="active" is used):

    if ! echo "$state_lines" | grep -v -F uname=\"$DRBD_PEER\" | grep -q 'crmd="online"'; then
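
For anyone who wants the same check to work both before and after the 
upgrade, here is a minimal sketch of a helper that accepts either attribute. 
The function name other_node_state_ok and the sample <node_state> lines are 
made up for illustration and are not part of crm-fence-peer.sh; the filtering 
simply mirrors the grep pipeline above.

```shell
#!/bin/sh
# Hypothetical helper, not part of crm-fence-peer.sh.
# $1: <node_state> lines, one per line; $2: peer node name.
# Succeeds when a <node_state> line other than the peer's carries
# ha="active" (the attribute the stock script checks) or crmd="online"
# (the attribute used in the modified check above).
other_node_state_ok() {
    state_lines=$1
    peer=$2
    printf '%s\n' "$state_lines" \
        | grep -v -F "uname=\"$peer\"" \
        | grep -q -E 'ha="active"|crmd="online"'
}

# Made-up sample input, for illustration only.
sample='<node_state id="1" uname="alpha" crmd="online"/>
<node_state id="2" uname="beta" crmd="offline"/>'

if other_node_state_ok "$sample" beta; then
    echo "check passed"
else
    echo "check failed"
fi
```

Matching both attributes in one grep -E avoids maintaining two copies of the 
script, but I have only reasoned about it against sample lines like the ones 
above, not against a real CIB.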

After making this change, crm-fence-peer.sh returns the expected exit code (4):

Peer’s disk state was successfully set to Outdated (or was Outdated to begin 
with).

Does crm-fence-peer.sh also run "drbdadm outdate res" on the peer node (it 
doesn't seem to be working for me), or is that unnecessary in this scenario 
because the CIB constraint will prevent the DRBD resource from starting 
elsewhere?

Thanks,

Andrew
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
