I have a simple resource defined:

[root@ha-d1 ~]# pcs resource show dmz1
 Resource: dmz1 (class=ocf provider=internal type=ip-address)
  Attributes: address=172.16.10.192 monitor_link=true
  Meta Attrs: migration-threshold=3 failure-timeout=30s
  Operations: monitor interval=7s (dmz1-monitor-interval-7s)
This is a custom resource that provides an Ethernet alias on one of the interfaces on our system. I can unplug the cable on either node and failover occurs as expected; 30s after re-plugging it I can repeat the exercise on the opposite node and failover again happens as expected.

However, if I unplug the cable from both nodes, the failcount goes up, and the 30s failure-timeout does not reset the failcounts, meaning that Pacemaker never tries to start the failed resource again.

Full list of resources:

 Resource Group: network
     inif       (ocf::internal:ip.sh):  Started ha-d1.dev.com
     outif      (ocf::internal:ip.sh):  Started ha-d2.dev.com
     dmz1       (ocf::internal:ip.sh):  Stopped
 Master/Slave Set: DRBDMaster [DRBDSlave]
     Masters: [ ha-d1.dev.com ]
     Slaves: [ ha-d2.dev.com ]
 Resource Group: filesystem
     DRBDFS     (ocf::heartbeat:Filesystem):    Stopped
 Resource Group: application
     service_failover   (ocf::internal:service_failover):       Stopped

Failcounts for dmz1:
 ha-d1.dev.com: 4
 ha-d2.dev.com: 4

Is there any way to automatically recover from this scenario, other than setting an obnoxiously high migration-threshold?

--
Sam Gardner
Software Engineer
Trustwave | SMART SECURITY ON DEMAND
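For reference, the failcounts above can be inspected and cleared by hand, which at least recovers the cluster until the underlying question is answered; a minimal sketch, assuming pcs 0.9-era syntax and Pacemaker's default 15-minute cluster-recheck-interval:

```shell
# Show the per-node failcounts Pacemaker is tracking for dmz1:
pcs resource failcount show dmz1

# Reset them manually so Pacemaker will attempt to start the
# resource again (a full "pcs resource cleanup dmz1" also works):
pcs resource failcount reset dmz1

# failure-timeout expiry is only evaluated when the cluster rechecks
# its state; lowering cluster-recheck-interval makes an expired
# failure-timeout take effect sooner than the 15-minute default:
pcs property set cluster-recheck-interval=60
```

These commands assume a running Pacemaker cluster with pcs installed; the interval value of 60 seconds is only illustrative.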
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org