Dear Community,
I have a Pacemaker cluster with two cluster nodes (each with two network interfaces), two remote nodes, and a single fabric fencing agent for the whole cluster.
nodelist {
    node {
        ring0_addr: xxx
        ring1_addr: xxx
        name: jangcluster-srv-1
        nodeid: 1
    }
    node {
        ring0_addr: xxx
        ring1_addr: xxx
        name: jangcluster-srv-2
        nodeid: 2
    }
}
Node List:
* Online: [ jangcluster-srv-1 jangcluster-srv-2 ]
* RemoteOnline: [ jangcluster-srv-3 jangcluster-srv-4 ]
Full List of Resources:
* GPFS-Fence (stonith:fence_gpfs): Started jangcluster-srv-1
* jangcluster-srv-3 (ocf::pacemaker:remote): Started jangcluster-srv-1
* jangcluster-srv-4 (ocf::pacemaker:remote): Started jangcluster-srv-2
node 1: jangcluster-srv-1 \
        attributes ethmonitor-eth1=1
node 2: jangcluster-srv-2 \
        attributes ethmonitor-eth1=1
node jangcluster-srv-3:remote \
        attributes ethmonitor-eth1=1
node jangcluster-srv-4:remote \
        attributes ethmonitor-eth1=1
primitive GPFS-Fence stonith:fence_gpfs \
        params ipaddr=jangcluster-srv-1 pcmk_host_list=" jangcluster-srv-1 jangcluster-srv-2 jangcluster-srv-3 jangcluster-srv-4" secure=true \
        op monitor interval=10s timeout=500s \
        op off interval=0 \
        meta is-managed=true
primitive NIC_eth1 ethmonitor \
        params interface=eth1 repeat_count=4 repeat_interval=4 link_status_only=true \
        op monitor timeout=30s interval=4 \
        op start timeout=60s interval=0s \
        op stop interval=0s timeout=20s
location prefer-node-jangcluster-srv-3 jangcluster-srv-3 100: jangcluster-srv-1
location prefer-node-jangcluster-srv-4 jangcluster-srv-4 100: jangcluster-srv-2
location prefer-node-jangcluster-srv-3-2 jangcluster-srv-3 50: jangcluster-srv-2
location prefer-node-jangcluster-srv-4-2 jangcluster-srv-4 50: jangcluster-srv-1
I noticed that a remote node gets fenced when the cluster node it's connected to gets fenced or experiences a network failure.
For example, when I disconnected srv-2 from the rest of the cluster,
I expected that remote node jangcluster-srv-4 would fail over to srv-1, given my location constraints,
but srv-4 was fenced along with srv-2 instead of failing over.
How can I configure the cluster so that the remote node srv-4 fails over instead of being fenced?
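For instance, would setting a reconnect_interval on the remote connection resources make the cluster keep retrying (and relocating) the connection rather than fencing the remote node? A sketch of what I mean, in crmsh syntax with my resource names (the values are placeholders, not something I have tested):

```
primitive jangcluster-srv-4 ocf:pacemaker:remote \
        params server=jangcluster-srv-4 reconnect_interval=60s \
        op monitor interval=30s timeout=30s
```

Or is a different mechanism (constraints, fencing options) the right way to handle this case?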
Thank you
Janghyuk Boo.
_______________________________________________
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/