Hi, I've been trying to work out whether it is possible to leave a resource on the cluster node it is currently on, and only move it to another node once a dependent resource has started. This is all using Red Hat's implementation in RHEL...
Ok, that might sound like gibberish... The cluster config I'm trying to build starts out with a basic ping:

    pcs resource create MyGw ocf:pacemaker:ping host_list=192.168.1.254 failure_score=1 meta migration-threshold=1
    pcs resource clone MyGw globally-unique=true

Let's assume there are three nodes, node1, node2, and node3, so the above gets pacemaker running MyGw-clone on node1, node2, and node3. (3 nodes makes it more interesting ;) All good.

Now let's add a VIP into the mix and set it to run where MyGw-clone is running:

    pcs resource create VIP ocf:heartbeat:IPaddr2 ip=192.168.1.1 nic=eno3 cidr_netmask=24
    pcs constraint colocation add VIP with MyGw-clone

All good. Now comes the fun part. I want to run my own app on one node only, and only where the gateway is:

    pcs resource create App ocf:internal:app
    pcs constraint colocation add App with VIP

If the VIP is on node1 and MyGw-clone is on node1 and running successfully, then so too will App be. The problem starts when I unplug eno3 (not the same NIC as is used for the cluster management). As soon as the ping fails, MyGw-clone stops on node1 and this forces both VIP and App onto another node.

The problem is that pacemaker will eventually expire the failure and then decide to restart MyGw-clone on node1 and, at the same time, stop VIP & App on node2. It then tries to start both VIP and App on node1.

What I'd really like to happen is for VIP & App to only move back to node1 if and when MyGw-clone is in the "Started" state (i.e. after a successful ping), i.e. to only do the "move" of VIP & App after the recovery of MyGw:0 has been successful.

If I set "failure-timeout" to 0 then both App & VIP stay put, but the cluster never again tests whether MyGw:0 is healthy until I do a "pcs resource cleanup".

I've tried colocation rules, but I wasn't any more successful with those than with the basic constraint configuration (assuming I got them right).

I suppose another way to go about this would be to run another cloned resource that mimics the ping and automatically runs a "pcs resource cleanup MyGw-clone" if it notices the clone is down on a node where the ping would succeed. But is there a cleaner way?

Thanks, D.
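P.S. For concreteness, here is a minimal sketch of the kind of rule-based variant I mean (untested, and the details may well be wrong). It assumes the default "pingd" node attribute that ocf:pacemaker:ping writes; with a single host in host_list and the default multiplier, that attribute is 1 while the gateway answers and 0 otherwise. The idea would be to drop the "VIP with MyGw-clone" colocation, add stickiness so VIP & App stay where they are, and ban them from any node whose ping attribute is missing or zero:

    # keep resources where they are unless a rule forbids the node
    pcs resource defaults resource-stickiness=100

    # ban VIP (and, via the existing colocation, App) from nodes where the
    # gateway ping attribute is absent or shows the gateway unreachable
    pcs constraint location VIP rule score=-INFINITY not_defined pingd or pingd lt 1

With stickiness and no positive node preference, VIP & App would simply stay on node2 after a failover rather than move back to node1, which may or may not be what I actually want.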