Hi, I've been trying to work out whether it is possible to leave a resource on the cluster node it is currently on, and only move it to another node once a dependent resource has started. This is all using Red Hat's implementation in RHEL...
Ok, that might sound like gibberish... The cluster config I'm trying to build starts out with a basic ping:

    pcs resource create MyGw ocf:pacemaker:ping host_list=192.168.1.254 failure_score=1 meta migration-threshold=1
    pcs resource clone MyGw globally-unique=true

Let's assume there are three nodes, node1, node2, and node3, so the above gets pacemaker running MyGw-clone on node1, node2, and node3. (3 nodes makes it more interesting ;) All good.

Now let's add a VIP into the mix and set it to run where MyGw-clone is running:

    pcs resource create VIP ocf:heartbeat:IPaddr2 ip=192.168.1.1 nic=eno3 cidr_netmask=24
    pcs constraint colocation add VIP with MyGw-clone

All good. Now comes the fun part. I want to run my own app on one node only, and only where the gateway is:

    pcs resource create App ocf:internal:app
    pcs constraint colocation add App with VIP

If the VIP is on node1 and MyGw-clone is on node1 and running successfully, then so too will App be. The problem starts when I unplug eno3 (not the same NIC as is used for the cluster management). As soon as the ping fails, MyGw-clone stops on node1 and this forces both VIP and App onto another node.

The problem is that pacemaker will eventually expire the failure and then decide to restart MyGw-clone on node1 and, at the same time, stop VIP & App on node2. It then tries to start both VIP and App on node1.

What I'd really like to happen is for VIP & App to only move back to node1 if and when MyGw-clone is in the "Started" state (i.e. after a successful ping), i.e. to only do the "move" of VIP & App after the recovery of MyGw:0 has been successful.

If I set "failure-timeout" to 0 then both App & VIP stay put, but the cluster never again tests whether MyGw:0 is healthy until I do a "pcs resource cleanup".

I've tried colocation rules, but I wasn't any more successful with those than with the basic constraint configuration (assuming I got them right).

I suppose another way to go about this would be to run another cloned resource that mimics the ping and automatically runs a "pcs resource cleanup MyGw-clone" if it notices the clone is down on a node where the ping would succeed. But is there a cleaner way?

Thanks, D.
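P.S. For concreteness, here is a minimal sketch of the kind of rule-based variant I mean (untested, and the details may well be wrong). It assumes the default "pingd" node attribute that ocf:pacemaker:ping writes; with a single host in host_list and the default multiplier, that attribute is 1 while the gateway answers and 0 otherwise. The idea would be to drop the "VIP with MyGw-clone" colocation, add stickiness so VIP & App stay where they are, and ban them from any node whose ping attribute is missing or zero:

    # keep resources where they are unless a rule forbids the node
    pcs resource defaults resource-stickiness=100

    # ban VIP (and, via the existing colocation, App) from nodes where the
    # gateway ping attribute is absent or shows the gateway unreachable
    pcs constraint location VIP rule score=-INFINITY not_defined pingd or pingd lt 1

With stickiness and no positive node preference, VIP & App would simply stay on node2 after a failover rather than move back to node1, which may or may not be what I actually want.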