Thanks a lot. We also thought about using fencing (stonith), but the production cluster runs in the cloud: node1 and node2 are virtual machines without any hardware fencing devices. We looked in the direction of SBD, but as far as we understand its use is not justified without shared storage in a two-node cluster: http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit

Are there any ways to do fencing in our situation?

We have also found another workaround specific to our setup: use DR instead of NAT in IPVS. With DR, even if both servers are active at the same time, it does not matter which of them serves a client connection, because the web servers reply to the client directly. Is this workaround viable? A sketch of what we mean is below.
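For illustration, a minimal ldirectord.cf sketch of the DR setup we have in mind (the addresses, ports and health-check values here are placeholders, not our real configuration):

  # /etc/ha.d/ldirectord.cf -- DR ("gate") instead of NAT ("masq")
  checktimeout=10
  checkinterval=5
  quiescent=no

  # Virtual service on the cluster-managed VIP
  virtual=10.0.0.100:80
          real=10.0.0.11:80 gate
          real=10.0.0.12:80 gate
          service=http
          request="check.html"
          receive="OK"
          scheduler=wrr
          protocol=tcp
          checktype=negotiate

Each real server would also carry the VIP on a loopback alias with ARP suppression (arp_ignore=1, arp_announce=2), as is usual for LVS-DR.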
Kind regards,
Vladimir Pavlov

------------------------------

Message: 2
Date: Tue, 28 Jun 2016 18:53:38 +0300
From: "Pavlov, Vladimir" <vladimir.pav...@tns-global.ru>
To: "'Users@clusterlabs.org'" <Users@clusterlabs.org>
Subject: [ClusterLabs] Default Behavior
Message-ID: <b38b34ec5621e34dabce13e8b18936e6033f0b17c...@exserv.gallup.tns>
Content-Type: text/plain; charset="koi8-r"

Hello!

We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with the resources IPaddr2 and ldirectord.

Cluster properties:

  cluster-infrastructure: cman
  dc-version: 1.1.11-97629de
  no-quorum-policy: ignore
  stonith-enabled: false

The cluster was configured following this documentation:
http://clusterlabs.org/quickstart-redhat-6.html

Recently there was a communication failure between the cluster nodes, and the behavior was like this:

- During the network failure, each server became the Master.
- After the network was restored, one node killed the Pacemaker services on the second node.
- The second node was no longer part of the cluster, but all of its resources remained active (ldirectord, ipvs, IP address). That is, both nodes continued to be active.

We decided to build a test stand and replay the situation, but with the current version of Pacemaker from the CentOS repos the cluster behaves differently:

- During the network failure, each server became the Master.
- After the network was restored, all resources were stopped.
- Then the resources were started on only one node. This behavior seems more logical.

Current cluster properties on the test stand:

  cluster-infrastructure: cman
  dc-version: 1.1.14-8.el6-70404b0
  have-watchdog: false
  no-quorum-policy: ignore
  stonith-enabled: false

Has the behavior of the cluster changed in the new version, or was the accident not fully emulated?

Thank you.

Kind regards,
Vladimir Pavlov

------------------------------

Message: 3
Date: Tue, 28 Jun 2016 12:07:36 -0500
From: Ken Gaillot <kgail...@redhat.com>
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] Default Behavior
Message-ID: <5772aed8.6060...@redhat.com>
Content-Type: text/plain; charset=UTF-8

On 06/28/2016 10:53 AM, Pavlov, Vladimir wrote:
> Hello!
>
> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with
> the resources IPaddr2 and ldirectord.
>
> [...cluster properties and documentation link snipped...]
>
> Recently there was a communication failure between the cluster nodes,
> and the behavior was like this:
>
> - During the network failure, each server became the Master.
> - After the network was restored, one node killed the Pacemaker
>   services on the second node.
> - The second node was no longer part of the cluster, but all of its
>   resources remained active (ldirectord, ipvs, IP address). That is,
>   both nodes continued to be active.
>
> We decided to build a test stand and replay the situation, but with the
> current version of Pacemaker from the CentOS repos the cluster behaves
> differently:
>
> - During the network failure, each server became the Master.
> - After the network was restored, all resources were stopped.
> - Then the resources were started on only one node. This behavior seems
>   more logical.
>
> [...test-stand properties snipped...]
>
> Has the behavior of the cluster changed in the new version, or was the
> accident not fully emulated?

If I understand your description correctly, the situations were not identical. The difference I see is that, in the original case, the second node was not responding to the cluster even after the network was restored, so the cluster could not communicate with it to carry out the behavior you observed on the test stand.

Fencing (stonith) is the cluster's only recovery mechanism in such a case. When the network splits, or a node becomes unresponsive, the cluster can only safely recover resources if it can ensure the other node is powered off. Pacemaker supports both physical fencing devices, such as an intelligent power switch, and hardware watchdog devices for self-fencing using sbd.
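As a sketch only (the fence agent and its parameters here are hypothetical examples, not a recommendation for your platform), defining and enabling a fence device looks something like this:

  # Example fence device for node2; substitute an agent and
  # credentials that match your environment.
  pcs stonith create fence-node2 fence_vmware_soap \
      ipaddr=vcenter.example.com login=admin passwd=secret \
      port=node2-vm pcmk_host_list=node2

  # Re-enable fencing once every node is covered by a device.
  pcs property set stonith-enabled=true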
> Thank you.
>
> Kind regards,
>
> Vladimir Pavlov

------------------------------

Message: 4
Date: Tue, 28 Jun 2016 16:51:50 -0400
From: Digimer <li...@alteeve.ca>
To: Cluster Labs - All topics related to open-source clustering welcomed
        <users@clusterlabs.org>
Subject: Re: [ClusterLabs] Default Behavior
Message-ID: <0021409c-86ba-7ef6-875f-0defd3fc9...@alteeve.ca>
Content-Type: text/plain; charset=UTF-8

On 28/06/16 11:53 AM, Pavlov, Vladimir wrote:
> Hello!
>
> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7) with
> the resources IPaddr2 and ldirectord.
>
> Cluster properties:
>
>   cluster-infrastructure: cman
>   dc-version: 1.1.11-97629de
>   no-quorum-policy: ignore
>   stonith-enabled: false

You need fencing to be enabled and configured. This is always true, but particularly so on RHEL 6 because it uses the cman plugin. Please configure and test stonith, and then repeat your tests to see if the behavior is more predictable.
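Once a device is defined, test it from the command line before trusting it in production (the node name below is just an example):

  # Ask the cluster's fencer to reboot the peer through the
  # configured device.
  stonith_admin --reboot node2

If that reliably takes down the target when run from either node, the cluster can recover safely from the kind of split you described.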
> [...rest of the original message snipped...]
>
> Kind regards,
>
> Vladimir Pavlov

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org