Thanks a lot. We also thought about using fencing (stonith), but the production cluster runs in the cloud: node1 and node2 are virtual machines without any hardware fencing devices. We looked in the direction of SBD, but as far as we understand its use is not justified without shared storage in a two-node cluster: http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit

Are there any ways to do fencing in our situation?

We have also found another workaround specific to our setup: use DR instead of NAT in IPVS. With DR, even if both servers are active at the same time, it does not matter which of them serves a client connection, because the web servers reply to the client directly. Is this workaround viable? A sketch of what we mean is below.
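For illustration, a minimal ldirectord.cf sketch of the DR setup we have in mind (the addresses, ports and health-check values here are placeholders, not our real configuration):

  # /etc/ha.d/ldirectord.cf -- DR ("gate") instead of NAT ("masq")
  checktimeout=10
  checkinterval=5
  quiescent=no

  # Virtual service on the cluster-managed VIP
  virtual=10.0.0.100:80
          real=10.0.0.11:80 gate
          real=10.0.0.12:80 gate
          service=http
          request="check.html"
          receive="OK"
          scheduler=wrr
          protocol=tcp
          checktype=negotiate

Each real server would also carry the VIP on a loopback alias with ARP suppression (arp_ignore=1, arp_announce=2), as is usual for LVS-DR.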
Kind regards,
Vladimir Pavlov

------------------------------

Message: 2
Date: Tue, 28 Jun 2016 18:53:38 +0300
From: "Pavlov, Vladimir" <vladimir.pav...@tns-global.ru>
To: "'Users@clusterlabs.org'" <Users@clusterlabs.org>
Subject: [ClusterLabs] Default Behavior
Message-ID: <b38b34ec5621e34dabce13e8b18936e6033f0b17c...@exserv.gallup.tns>
Content-Type: text/plain; charset="koi8-r"

Hello!

We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with the resources IPaddr2 and ldirectord.

Cluster properties:

  cluster-infrastructure: cman
  dc-version: 1.1.11-97629de
  no-quorum-policy: ignore
  stonith-enabled: false

The cluster was configured following this documentation:
http://clusterlabs.org/quickstart-redhat-6.html

Recently there was a communication failure between the cluster nodes, and the behavior was like this:

- During the network failure, each server became the Master.
- After the network was restored, one node killed the Pacemaker services on the second node.
- The second node was no longer part of the cluster, but all of its resources remained active (ldirectord, ipvs, IP address). That is, both nodes continued to be active.

We decided to build a test stand and replay the situation, but with the current version of Pacemaker from the CentOS repos the cluster behaves differently:

- During the network failure, each server became the Master.
- After the network was restored, all resources were stopped.
- Then the resources were started on only one node. This behavior seems more logical.

Current cluster properties on the test stand:

  cluster-infrastructure: cman
  dc-version: 1.1.14-8.el6-70404b0
  have-watchdog: false
  no-quorum-policy: ignore
  stonith-enabled: false

Has the behavior of the cluster changed in the new version, or was the accident not fully emulated?

Thank you.

Kind regards,
Vladimir Pavlov

------------------------------

Message: 3
Date: Tue, 28 Jun 2016 12:07:36 -0500
From: Ken Gaillot <kgail...@redhat.com>
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] Default Behavior
Message-ID: <5772aed8.6060...@redhat.com>
Content-Type: text/plain; charset=UTF-8

On 06/28/2016 10:53 AM, Pavlov, Vladimir wrote:
> Hello!
>
> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with
> the resources IPaddr2 and ldirectord.
>
> [...cluster properties and documentation link snipped...]
>
> Recently there was a communication failure between the cluster nodes,
> and the behavior was like this:
>
> - During the network failure, each server became the Master.
> - After the network was restored, one node killed the Pacemaker
>   services on the second node.
> - The second node was no longer part of the cluster, but all of its
>   resources remained active (ldirectord, ipvs, IP address). That is,
>   both nodes continued to be active.
>
> We decided to build a test stand and replay the situation, but with the
> current version of Pacemaker from the CentOS repos the cluster behaves
> differently:
>
> - During the network failure, each server became the Master.
> - After the network was restored, all resources were stopped.
> - Then the resources were started on only one node. This behavior seems
>   more logical.
>
> [...test-stand properties snipped...]
>
> Has the behavior of the cluster changed in the new version, or was the
> accident not fully emulated?

If I understand your description correctly, the situations were not identical. The difference I see is that, in the original case, the second node was not responding to the cluster even after the network was restored, so the cluster could not communicate with it to carry out the behavior you observed on the test stand.

Fencing (stonith) is the cluster's only recovery mechanism in such a case. When the network splits, or a node becomes unresponsive, the cluster can only safely recover resources if it can ensure the other node is powered off. Pacemaker supports both physical fencing devices, such as an intelligent power switch, and hardware watchdog devices for self-fencing using sbd.
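As a sketch only (the fence agent and its parameters here are hypothetical examples, not a recommendation for your platform), defining and enabling a fence device looks something like this:

  # Example fence device for node2; substitute an agent and
  # credentials that match your environment.
  pcs stonith create fence-node2 fence_vmware_soap \
      ipaddr=vcenter.example.com login=admin passwd=secret \
      port=node2-vm pcmk_host_list=node2

  # Re-enable fencing once every node is covered by a device.
  pcs property set stonith-enabled=true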
> Thank you.
>
> Kind regards,
>
> Vladimir Pavlov

------------------------------

Message: 4
Date: Tue, 28 Jun 2016 16:51:50 -0400
From: Digimer <li...@alteeve.ca>
To: Cluster Labs - All topics related to open-source clustering welcomed
        <users@clusterlabs.org>
Subject: Re: [ClusterLabs] Default Behavior
Message-ID: <0021409c-86ba-7ef6-875f-0defd3fc9...@alteeve.ca>
Content-Type: text/plain; charset=UTF-8

On 28/06/16 11:53 AM, Pavlov, Vladimir wrote:
> Hello!
>
> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with
> the resources IPaddr2 and ldirectord.
>
> Cluster properties:
>
>   cluster-infrastructure: cman
>   dc-version: 1.1.11-97629de
>   no-quorum-policy: ignore
>   stonith-enabled: false

You need fencing to be enabled and configured. This is always true, but particularly so on RHEL 6 because it uses the cman plugin. Please configure and test stonith, and then repeat your tests to see if the behavior is more predictable.
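Once a device is defined, test it from the command line before trusting it in production (the node name below is just an example):

  # Ask the cluster's fencer to reboot the peer through the
  # configured device.
  stonith_admin --reboot node2

If that reliably takes down the target when run from either node, the cluster can recover safely from the kind of split you described.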
> [...rest of the original message snipped...]
>
> Kind regards,
>
> Vladimir Pavlov

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org