Dear Community,
I have a Pacemaker cluster with two cluster nodes (each with two network interfaces), two remote nodes, and a single fabric fencing agent for the whole cluster.
nodelist {
    node {
        ring0_addr: xxx
        ring1_addr: xxx
        name: jangcluster-srv-1
        nodeid: 1
    }
    node {
        ring0_addr: xxx
        ring1_addr: xxx
        name: jangcluster-srv-2
        nodeid: 2
    }
}
Node List:
* Online: [ jangcluster-srv-1 jangcluster-srv-2 ]
* RemoteOnline: [ jangcluster-srv-3 jangcluster-srv-4 ]
Full List of Resources:
* GPFS-Fence (stonith:fence_gpfs): Started jangcluster-srv-1
* jangcluster-srv-3 (ocf::pacemaker:remote): Started jangcluster-srv-1
* jangcluster-srv-4 (ocf::pacemaker:remote): Started jangcluster-srv-2
node 1: jangcluster-srv-1 \
        attributes ethmonitor-eth1=1
node 2: jangcluster-srv-2 \
        attributes ethmonitor-eth1=1
node jangcluster-srv-3:remote \
        attributes ethmonitor-eth1=1
node jangcluster-srv-4:remote \
        attributes ethmonitor-eth1=1
primitive GPFS-Fence stonith:fence_gpfs \
        params ipaddr=jangcluster-srv-1 pcmk_host_list=" jangcluster-srv-1 jangcluster-srv-2 jangcluster-srv-3 jangcluster-srv-4" secure=true \
        op monitor interval=10s timeout=500s \
        op off interval=0 \
        meta is-managed=true
primitive NIC_eth1 ethmonitor \
        params interface=eth1 repeat_count=4 repeat_interval=4 link_status_only=true \
        op monitor timeout=30s interval=4 \
        op start timeout=60s interval=0s \
        op stop interval=0s timeout=20s
location prefer-node-jangcluster-srv-3 jangcluster-srv-3 100: jangcluster-srv-1
location prefer-node-jangcluster-srv-4 jangcluster-srv-4 100: jangcluster-srv-2
location prefer-node-jangcluster-srv-3-2 jangcluster-srv-3 50: jangcluster-srv-2
location prefer-node-jangcluster-srv-4-2 jangcluster-srv-4 50: jangcluster-srv-1
I noticed that a remote node gets fenced when the cluster node it's connected to gets fenced or experiences a network failure.
For example, when I disconnected srv-2 from the rest of the cluster,
I expected that remote node jangcluster-srv-4 would fail over to srv-1, given my location constraints,
but srv-4 was fenced along with srv-2 instead of failing over.
How can I configure the cluster so that the remote node srv-4 fails over instead of being fenced?
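For instance, would setting a reconnect_interval on the remote connection resources make the cluster keep retrying (and relocating) the connection rather than fencing the remote node? A sketch of what I mean, in crmsh syntax with my resource names (the values are placeholders, not something I have tested):

```
primitive jangcluster-srv-4 ocf:pacemaker:remote \
        params server=jangcluster-srv-4 reconnect_interval=60s \
        op monitor interval=30s timeout=30s
```

Or is a different mechanism (constraints, fencing options) the right way to handle this case?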
Thank you
Janghyuk Boo.
_______________________________________________
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/