I am experimenting with Pacemaker for high availability on some load balancers. I was able to successfully get two CentOS 6.9 machines (scahadev01da and scahadev01db) to form a cluster, and the shared IP was assigned to scahadev01da. I simulated a failure by halting the primary; the secondary eventually noticed and brought up the shared IP on its eth0. So far, so good.
A problem arises when the primary comes back up and, for some reason, each node thinks the other is offline. This leads to both nodes adding the same shared IP to their own eth0. I probably do not need to tell you the mischief that could cause if these were production servers.

I tried restarting cman, pcsd and pacemaker on both machines, with no effect on the situation. I've found several mentions of this problem via search engines, but I've been unable to find out how to fix it. Any help is appreciated.

Both nodes have quorum disabled in /etc/sysconfig/cman:

CMAN_QUORUM_TIMEOUT=0

#------------------------------------------------
Node 1

scahadev01da# sudo pcs status
Cluster name: scahadev01d
Stack: cman
Current DC: scahadev01da (version 1.1.15-5.el6-e174ec8) - partition WITHOUT quorum
Last updated: Mon Jul 31 10:43:54 2017
Last change: Mon Jul 31 10:30:46 2017 by root via cibadmin on scahadev01da

2 nodes and 1 resource configured

Online: [ scahadev01da ]
OFFLINE: [ scahadev01db ]

Full list of resources:

 VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01da

Daemon Status:
  cman: active/enabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

#------------------------------------------------
Node 2

scahadev01db ~]$ sudo pcs status
Cluster name: scahadev01d
Stack: cman
Current DC: scahadev01db (version 1.1.15-5.el6-e174ec8) - partition WITHOUT quorum
Last updated: Mon Jul 31 10:43:47 2017
Last change: Sat Jul 29 13:45:15 2017 by root via cibadmin on scahadev01da

2 nodes and 1 resource configured

Online: [ scahadev01db ]
OFFLINE: [ scahadev01da ]

Full list of resources:

 VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01db

Daemon Status:
  cman: active/enabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

-- 
Stephen Carville

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
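
For anyone trying to reproduce the diagnosis, the symptom in the two status listings above (each node reports the other as OFFLINE, and both report VirtualIP as started locally) could be checked mechanically. The following is only a rough sketch: the function names parse_pcs_status and split_brain are hypothetical, the sample text is abbreviated from the output in this post, and the regular expressions assume the `pcs status` layout shown here, which may differ in other pcs versions.

```python
import re

# Abbreviated from the `pcs status` output in this post (one view per node).
NODE_A_VIEW = """
Online: [ scahadev01da ]
OFFLINE: [ scahadev01db ]
Full list of resources:
 VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01da
"""
NODE_B_VIEW = """
Online: [ scahadev01db ]
OFFLINE: [ scahadev01da ]
Full list of resources:
 VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01db
"""

def parse_pcs_status(text):
    """Extract node membership and started resources from `pcs status` text."""
    def names(label):
        # Matches lines such as "Online: [ nodeA nodeB ]".
        m = re.search(rf"^{label}:\s*\[\s*(.*?)\s*\]", text, re.MULTILINE)
        return set(m.group(1).split()) if m else set()

    # Matches lines such as " VirtualIP (ocf::heartbeat:IPaddr2): Started nodeA".
    started = dict(
        re.findall(r"^\s*(\S+)\s+\(\S+\):\s+Started\s+(\S+)", text, re.MULTILINE)
    )
    return {"online": names("Online"), "offline": names("OFFLINE"), "started": started}

def split_brain(views):
    """views maps a node name to the status parsed from that node's own output.

    Returns (mutual_offline, duplicated): whether every node lists every other
    node as OFFLINE, and which resources are reported started on more than one node.
    """
    holders = {}
    for view in views.values():
        for resource, node in view["started"].items():
            holders.setdefault(resource, set()).add(node)
    duplicated = {r: sorted(n) for r, n in holders.items() if len(n) > 1}
    mutual_offline = all(
        other in views[node]["offline"]
        for node in views
        for other in views
        if other != node
    )
    return mutual_offline, duplicated

views = {
    "scahadev01da": parse_pcs_status(NODE_A_VIEW),
    "scahadev01db": parse_pcs_status(NODE_B_VIEW),
}
mutual_offline, duplicated = split_brain(views)
print(mutual_offline, duplicated)
```

This only confirms the symptom from the text of the two listings; it does not say why the nodes fail to see each other.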