Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Nikhil Utane
Hmm. I will then work towards bringing this in. Thanks for your input.

On Wed, Jun 22, 2016 at 10:44 AM, Digimer wrote:

> On 22/06/16 01:07 AM, Nikhil Utane wrote:
> > I don't get it. Pacemaker + Corosync is providing me with a lot of
> > functionality.
> > For example, if we leave out the split-brain condition for a while, it
> > provides:
> > 1) Discovery and cluster formation
> > 2) Synchronization of data
> > 3) Heartbeat mechanism
> > 4) Swift failover of the resource
> > 5) Guarantee that one resource will be started on only 1 node
> >
> > So in the case of a normal failover we need the basic functionality of
> > the resource being migrated to a standby node.
> > And it gives us all of that.
> > So I don't agree that it needs to be as black and white as you say. Our
> > solution has different requirements than a typical HA solution. But that
> > is only true for now. In the future we might have to implement all of
> > these things, so in that sense Pacemaker gives us a good framework that
> > we can extend.
> >
> > BTW, we are not even using a virtual IP resource, which again I believe
> > is something that everyone employs.
> > Because of the nature of the service a small glitch is going to happen
> > anyway, so using virtual IPs does not give us any real benefit.
> > And with regard to the question of why we even have a standby rather
> > than keeping every node active all the time: a two-node cluster is one
> > possible configuration, but the main requirement is to support N + 1, so
> > the standby node doesn't know which active node it has to take over from
> > until a failover occurs.
> >
> > Your comments, however, have made me reconsider using fencing. It was
> > not that we didn't want to do it.
> > I just felt it might not be needed. So I'll definitely explore this
> > further.
>
> It is needed, and it is that black and white. Ask yourself, for your
> particular installation: can I run X in two places at the same time
> without coordination?
>
> If the answer is "yes", then just do that and be done with it.
>
> If the answer is "no", then you need fencing to allow pacemaker to know
> the state of all nodes (otherwise, the ability to coordinate is lost).
>
> I've never once seen a valid HA setup where fencing was not needed. I
> don't claim to be the best by any means, but I've been around long
> enough to say this with some confidence.
>
> digimer
>
> > Thanks everyone for the comments.
> >
> > -Regards
> > Nikhil
> >
> > On Tue, Jun 21, 2016 at 10:17 PM, Digimer wrote:
> >
> > On 21/06/16 10:57 AM, Dmitri Maziuk wrote:
> > > On 2016-06-20 17:19, Digimer wrote:
> > >
> > >> Nikhil indicated that they could switch where traffic went up-stream
> > >> without issue, if I understood properly.
> > >
> > > They have an interesting setup, but that notwithstanding: if split
> > > brain happens, some clients will connect to the "old master" and some
> > > to the "new master", depending on ARP updates. If there's a shared
> > > resource unavailable on one node, clients going there will error out.
> > > The other ones will not. It will work for some clients.
> > >
> > > Cf. both nodes going into a stonith deathmatch and killing each other:
> > > the service is now not available for all clients. What I don't get is
> > > the blanket assertion that this is "more highly" available than
> > > option #1.
> > >
> > > Dimitri
> >
> > As I've explained many times (here and on IRC);
> >
> > If you don't need to coordinate services/access, you don't need HA.
> >
> > If you do need to coordinate services/access, you need fencing.
> >
> > So if Nikhil really believes s/he doesn't need fencing and that
> > split-brains are OK, then drop HA. If that is not the case, then s/he
> > needs to implement fencing in pacemaker. It's pretty much that simple.
> >

Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
On 22/06/16 01:09 AM, Nikhil Utane wrote:
> We are not using a virtual IP. There is a separate discovery mechanism
> between the server and the client. The client will reach out to the new
> server only if it is incommunicado with the old one.

That's fine, but it really doesn't change anything. Whether you're using
a shared IP, shared storage or something else, it's all the same to
pacemaker in the end.

> On Tue, Jun 21, 2016 at 8:27 PM, Dmitri Maziuk wrote:
> 
> On 2016-06-20 17:19, Digimer wrote:
> 
> Nikhil indicated that they could switch where traffic went up-stream
> without issue, if I understood properly.
> 
> 
> They have an interesting setup, but that notwithstanding: if split
> brain happens, some clients will connect to the "old master" and some
> to the "new master", depending on ARP updates. If there's a shared
> resource unavailable on one node, clients going there will error out.
> The other ones will not. It will work for some clients.
> 
> Cf. both nodes going into a stonith deathmatch and killing each other:
> the service is now not available for all clients. What I don't get
> is the blanket assertion that this is "more highly" available than
> option #1.
> 
> Dimitri
> 
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Nikhil Utane
I don't get it. Pacemaker + Corosync is providing me with a lot of
functionality.
For example, if we leave out the split-brain condition for a while, it
provides:
1) Discovery and cluster formation
2) Synchronization of data
3) Heartbeat mechanism
4) Swift failover of the resource
5) Guarantee that one resource will be started on only 1 node

So in the case of a normal failover we need the basic functionality of the
resource being migrated to a standby node.
And it gives us all of that.
So I don't agree that it needs to be as black and white as you say. Our
solution has different requirements than a typical HA solution. But that is
only true for now. In the future we might have to implement all of these
things, so in that sense Pacemaker gives us a good framework that we can
extend.

BTW, we are not even using a virtual IP resource, which again I believe is
something that everyone employs.
Because of the nature of the service a small glitch is going to happen
anyway, so using virtual IPs does not give us any real benefit.
And with regard to the question of why we even have a standby rather than
keeping every node active all the time: a two-node cluster is one possible
configuration, but the main requirement is to support N + 1, so the standby
node doesn't know which active node it has to take over from until a
failover occurs.

Your comments, however, have made me reconsider using fencing. It was not
that we didn't want to do it.
I just felt it might not be needed. So I'll definitely explore this
further.

Thanks everyone for the comments.

-Regards
Nikhil

On Tue, Jun 21, 2016 at 10:17 PM, Digimer wrote:

> On 21/06/16 10:57 AM, Dmitri Maziuk wrote:
> > On 2016-06-20 17:19, Digimer wrote:
> >
> >> Nikhil indicated that they could switch where traffic went up-stream
> >> without issue, if I understood properly.
> >
> > They have an interesting setup, but that notwithstanding: if split
> > brain happens, some clients will connect to the "old master" and some
> > to the "new master", depending on ARP updates. If there's a shared
> > resource unavailable on one node, clients going there will error out.
> > The other ones will not. It will work for some clients.
> >
> > Cf. both nodes going into a stonith deathmatch and killing each other:
> > the service is now not available for all clients. What I don't get is
> > the blanket assertion that this is "more highly" available than option #1.
> >
> > Dimitri
>
> As I've explained many times (here and on IRC);
>
> If you don't need to coordinate services/access, you don't need HA.
>
> If you do need to coordinate services/access, you need fencing.
>
> So if Nikhil really believes s/he doesn't need fencing and that
> split-brains are OK, then drop HA. If that is not the case, then s/he
> needs to implement fencing in pacemaker. It's pretty much that simple.
>


[ClusterLabs] Pacemaker 1.1.15 released

2016-06-21 Thread Ken Gaillot
ClusterLabs is proud to announce the latest release of the Pacemaker
cluster resource manager, version 1.1.15. The source code is available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.15

The most significant enhancements since version 1.1.14 are:

* A new "alerts" section of the CIB allows you to configure scripts that
will be called after significant cluster events. Sample scripts are
installed in /usr/share/pacemaker/alerts.
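
For a quick idea of the shape of the new section, here is a minimal sketch
of a CIB alerts block. The ids, the choice of the file-based sample script
and the log path are illustrative, not taken from the announcement:

<alerts>
  <!-- run this script after significant cluster events
       (node, resource and fencing events) -->
  <alert id="alert_sample"
         path="/usr/share/pacemaker/alerts/alert_file.sh.sample">
    <!-- the recipient value is passed to the script; here, a file to append to -->
    <recipient id="alert_sample_log" value="/var/log/pacemaker_alerts.log"/>
  </alert>
</alerts>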

* A new pcmk_action_limit option for fence devices allows multiple fence
actions to be executed concurrently. It defaults to 1 to preserve
existing behavior (i.e. serial execution of fence actions).
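
As an illustrative sketch (the device name, fence agent and chosen value are
made up, and other required device parameters are omitted), the option is
set like any other fence-device attribute:

<primitive id="fence_ipmi_node1" class="stonith" type="fence_ipmilan">
  <instance_attributes id="fence_ipmi_node1-attrs">
    <!-- allow up to 2 fence actions on this device to run concurrently;
         1 keeps the old serial behavior -->
    <nvpair id="fence_ipmi_node1-action-limit"
            name="pcmk_action_limit" value="2"/>
  </instance_attributes>
</primitive>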

* Pacemaker Remote support has been improved. Most noticeably, if
pacemaker_remote is stopped without disabling the remote resource first,
any resources will be moved off the node (previously, the node would get
fenced). This allows easier software updates on remote nodes, since
updates often involve restarting the daemon.

* You may notice some files have moved from the pacemaker package to
pacemaker-cli, including most ocf:pacemaker resource agents, the
logrotate configuration, the XML schemas and the SNMP MIB. This allows
Pacemaker Remote nodes to work better when the full pacemaker package is
not installed.

* Have you ever wondered why a resource is not starting when you think
it should? crm_mon will now show why a resource is stopped, for example
because it is unmanaged or disabled in the configuration.

* In 1.1.14, the controld resource agent was modified to return a
monitor error when DLM is in the "wait fencing" state. This turned out
to be too aggressive, resulting in fencing the monitored node
unnecessarily if a slow fencing operation against another node was in
progress. The agent now does additional checking to determine whether to
return an error or not.

* Four significant regressions have been fixed: compressed CIBs larger
than 1MB are again supported (a regression since 1.1.14), fenced but unseen
nodes are properly not marked as unclean (also since 1.1.14),
have-watchdog is detected properly rather than always being true (also since
1.1.14), and failures of multiple-level monitor checks should again cause
the resource to fail (since 1.1.10).

As usual, the release includes many bugfixes and minor enhancements. For
a more detailed list of changes, see the change log:

https://github.com/ClusterLabs/pacemaker/blob/1.1/ChangeLog

Everyone is encouraged to download, compile and test the new release. We
do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Andrew Beekhof, Bin Liu, Christian Schneider, Christoph Berg,
David Shane Holden, Ferenc Wágner, Gao Yan, Hideo Yamauchi, Jan Pokorný,
Ken Gaillot, Klaus Wenninger, Kostiantyn Ponomarenko, Kristoffer
Grönlund, Lars Ellenberg, Michal Koutný, Nakahira Kazutomo, Oyvind
Albrigtsen, Ruben Kerkhof, and Yusuke Iida. Apologies if I have
overlooked anyone.
-- 
Ken Gaillot 



Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Ken Gaillot
On 06/20/2016 11:33 PM, Nikhil Utane wrote:
> Let me give the full picture about our solution. It will then make it
> easy to have the discussion.
> 
> We are looking at providing N + 1 redundancy for our application servers,
> i.e. 1 standby for up to N active servers (currently N <= 5). Each server
> will have some unique configuration. The standby will store the
> configuration of all the active servers such that whichever server goes
> down, the standby loads that particular configuration and becomes active.
> The server that went down will then become the standby.
> We have bundled all the configuration that every server has into a
> resource such that during failover the resource is moved to the newly
> active server, and that way it takes up the personality of the server
> that went down. To put it differently, every active server has a
> 'unique' resource that is started by Pacemaker, whereas the standby has none.
> 
> Our servers do not write anything to an external database; all the
> writing is done to the CIB file under the resource that it is currently
> managing. We also have some clients that connect to the active servers
> (1 client can connect to only 1 server, 1 server can have multiple
> clients) and provide service to end-users. Now the reason I say that
> split-brain is not an issue for us is because the clients can only connect
> to 1 of the active servers at any given time (we have to handle the case
> that all clients move together and do not get distributed). So even if
> two servers become active with the same personality, the clients can only
> connect to 1 of them. (The initial plan was to configure quorum, but later
> I was told that service availability is of utmost importance, and since the
> impact of split-brain is limited, we are thinking of doing away with it.)
> 
> Now the concern I have is that once the split is resolved, I would have 2
> actives, each having its own view of the resource, trying to synchronize
> the CIB. At this point I want the one that has the clients attached to
> it to win.
> I am thinking I can implement a monitor function that can bring down the
> resource if it doesn't find any clients attached to it within a given
> period of time. But to understand the Pacemaker behavior, what exactly
> would happen if the same resource is found to be active on two nodes
> after recovery?
> 
> -Thanks
> Nikhil

In general, monitor actions should not change the state of the service
in any way.

Pacemaker's behavior when finding multiple instances of a resource
running when there should be only one is configurable via the
multiple-active property:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes

By default, it stops all the instances, and then starts one instance.
The alternatives are to stop all the instances and leave them stopped,
or to unmanage the resource (i.e. refuse to stop or start it).
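
As a minimal sketch (the resource id and the Dummy agent are placeholders,
not from the original post), setting multiple-active as a resource meta
attribute in the CIB might look like:

<primitive id="server1_config" class="ocf" provider="heartbeat" type="Dummy">
  <meta_attributes id="server1_config-meta">
    <!-- "block" = unmanage the resource if it is ever found active on
         more than one node; the default is "stop_start" -->
    <nvpair id="server1_config-multiple-active"
            name="multiple-active" value="block"/>
  </meta_attributes>
</primitive>

With a higher-level tool this is typically a one-liner, something along the
lines of "pcs resource meta server1_config multiple-active=block", though
the exact syntax depends on the tool and version you use.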



Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
On 21/06/16 01:27 PM, Dimitri Maziuk wrote:
> On 06/21/2016 12:13 PM, Andrei Borzenkov wrote:
> 
>> You should not run pacemaker without some sort of fencing. This need not
>> be network-controlled power socket (and tiebreaker is not directly
>> related to fencing).
> 
> Yes, it can be a sysadmin-controlled power socket. It has to be a power
> socket; if you don't trust me, read Dejan's list of fencing devices.

You can now use redundant and complex fencing configurations in pacemaker.

Our company always uses this setup:

IPMI as the primary fence method (when it works, we can trust 'off'
100%, but it draws power from the host and is thus vulnerable).

A pair of switched PDUs as the backup fence method (when it works, you are
confident that the outlets are open, but you have to make sure the
cables are in the right place; however, it is entirely external to the
target).

> Tiebreaking is directly related to figuring out which of the two nodes
> is to be fenced, because neither of them can tell on its own.

See my comment on 'delay="15"'. You do NOT need a 3-node cluster/tiebreaker.
We've run nothing but 2-node clusters for years all over North America, and
we've heard of people running our system globally. With the above fence
setup and a proper delay, it has never once been a problem.
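
For illustration only, here is a sketch of how an IPMI-first, PDU-second
arrangement with a delay on the preferred node might be expressed in the
CIB. Node and device names are made up, and required parameters such as
addresses, credentials and PDU outlet numbers are omitted:

<configuration>
  <resources>
    <!-- Primary method: IPMI. delay="15" gives node1 a head start in a
         fence race; omit the delay on the other node's IPMI device. -->
    <primitive id="fence_ipmi_node1" class="stonith" type="fence_ipmilan">
      <instance_attributes id="fence_ipmi_node1-attrs">
        <nvpair id="fence_ipmi_node1-delay" name="delay" value="15"/>
      </instance_attributes>
    </primitive>
    <!-- Backup method: two switched PDU outlets feeding node1's power supplies -->
    <primitive id="fence_pdu1_node1" class="stonith" type="fence_apc_snmp"/>
    <primitive id="fence_pdu2_node1" class="stonith" type="fence_apc_snmp"/>
  </resources>
  <fencing-topology>
    <!-- try IPMI first; only if that fails, cut power via both PDU outlets -->
    <fencing-level id="fl-node1-1" target="node1" index="1"
                   devices="fence_ipmi_node1"/>
    <fencing-level id="fl-node1-2" target="node1" index="2"
                   devices="fence_pdu1_node1,fence_pdu2_node1"/>
  </fencing-topology>
</configuration>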

>> I fail to see how heartbeat makes any difference here, sorry.
> 
> A third node and a remote-controlled PDU were not a requirement for
> haresources mode. If I wanted to run it so that when it breaks I get to
> keep the pieces, I could.

You technically can in pacemaker, too, but it's dumb in any HA
environment. As soon as you make assumptions, you open up the chance of
being wrong.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Andrei Borzenkov
On 21.06.2016 20:05, Dimitri Maziuk wrote:
> On 06/21/2016 11:47 AM, Digimer wrote:
> 
>> If you don't need to coordinate services/access, you don't need HA.
>>
>> If you do need to coordinate services/access, you need fencing.
> 
> So what you're saying is we *cannot* run a pacemaker cluster without a
> tiebreaker node *and* a network-controlled power socket.
> 

You should not run pacemaker without some sort of fencing. This need not
be network-controlled power socket (and tiebreaker is not directly
related to fencing).

If you do not care about fencing, why not simply start services on both
nodes at boot time and be done with it?

> I knew that, actually, that's why I hung on to heartbeat for as long as

I fail to see how heartbeat makes any difference here, sorry.

> I could. It'd be nice to have it spelled out in bold at the start of
> every "explained from scratch" document on clusterlabs.org for the young
> players.
> 






Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Dimitri Maziuk
On 06/21/2016 11:47 AM, Digimer wrote:

> If you don't need to coordinate services/access, you don't need HA.
> 
> If you do need to coordinate services/access, you need fencing.

So what you're saying is we *cannot* run a pacemaker cluster without a
tiebreaker node *and* a network-controlled power socket.

I knew that, actually, that's why I hung on to heartbeat for as long as
I could. It'd be nice to have it spelled out in bold at the start of
every "explained from scratch" document on clusterlabs.org for the young
players.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
On 21/06/16 10:57 AM, Dmitri Maziuk wrote:
> On 2016-06-20 17:19, Digimer wrote:
> 
>> Nikhil indicated that they could switch where traffic went up-stream
>> without issue, if I understood properly.
> 
> They have an interesting setup, but that notwithstanding: if split
> brain happens, some clients will connect to the "old master" and some
> to the "new master", depending on ARP updates. If there's a shared
> resource unavailable on one node, clients going there will error out.
> The other ones will not. It will work for some clients.
> 
> Cf. both nodes going into a stonith deathmatch and killing each other: the
> service is now not available for all clients. What I don't get is the
> blanket assertion that this is "more highly" available than option #1.
> 
> Dimitri

As I've explained many times (here and on IRC);

If you don't need to coordinate services/access, you don't need HA.

If you do need to coordinate services/access, you need fencing.

So if Nikhil really believes s/he doesn't need fencing and that
split-brains are OK, then drop HA. If that is not the case, then s/he
needs to implement fencing in pacemaker. It's pretty much that simple.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] Node is silently unfenced if transition is very long

2016-06-21 Thread Digimer
On 21/06/16 12:19 PM, Ken Gaillot wrote:
> On 06/17/2016 07:05 AM, Vladislav Bogdanov wrote:
>> 03.05.2016 01:14, Ken Gaillot wrote:
>>> On 04/19/2016 10:47 AM, Vladislav Bogdanov wrote:
>>>> Hi,
>>>>
>>>> Just found an issue where a node is silently unfenced.
>>>>
>>>> That is a quite large setup (2 cluster nodes and 8 remote ones) with
>>>> plenty of slowly starting resources (a Lustre filesystem).
>>>>
>>>> Fencing was initiated due to a resource stop failure.
>>>> Lustre often starts very slowly due to internal recovery, and some such
>>>> resources were starting in the transition where another resource
>>>> failed to stop.
>>>> And, as the transition did not finish within the time specified by the
>>>> "failure-timeout" (set to 9 min) and was not aborted, that stop
>>>> failure was successfully cleaned.
>>>> There were transition aborts due to attribute changes after that
>>>> stop failure happened, but fencing
>>>> was not initiated for some reason.
>>>
>>> Unfortunately, that makes sense with the current code. Failure timeout
>>> changes the node attribute, which aborts the transition, which causes a
>>> recalculation based on the new state, and the fencing is no longer
>>
>> Ken, could this one be considered for a fix before 1.1.15 is released?
> 
> I'm planning to release 1.1.15 later today, and this won't make it in.
> 
> We do have several important open issues, including this one, but I
> don't want them to delay the release of the many fixes that are ready to
> go. I would only hold for a significant issue introduced this cycle, and
> none of the known issues appear to qualify.

I wonder if it would be worth appending a "known bugs/TODO" list to the
release announcements? Partly as a "heads-up" and partly as a way to
show folks what might be coming in .x+1.

>> I was just hit by the same issue in a completely different setup.
>> Two-node cluster: one node fails to stop a resource and is fenced.
>> Right after that, the second node fails to activate a clvm volume (different
>> story, need to investigate) and then fails to stop it. The node is scheduled
>> to be fenced, but it cannot be because the first node hasn't come up yet.
>> Any cleanup (automatic or manual) of a resource that failed to stop clears
>> the node state, removing the "unclean" state from the node. That is probably
>> not what I would expect (resource cleanup acting as a node unfence)...
>> Honestly, this potentially leads to data corruption...
>>
>> Also (probably not related), there was one more resource stop failure (in
>> that case a timeout) prior to the failed stop mentioned above, and that stop
>> timeout did not lead to fencing by itself.
>>
>> I have logs (but not pe-inputs/traces/blackboxes) from both nodes, so
>> any additional information from them can be easily provided.
>>
>> Best regards,
>> Vladislav


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] Node is silently unfenced if transition is very long

2016-06-21 Thread Ken Gaillot
On 06/17/2016 07:05 AM, Vladislav Bogdanov wrote:
> 03.05.2016 01:14, Ken Gaillot wrote:
>> On 04/19/2016 10:47 AM, Vladislav Bogdanov wrote:
>>> Hi,
>>>
>>> Just found an issue where a node is silently unfenced.
>>>
>>> That is a quite large setup (2 cluster nodes and 8 remote ones) with
>>> plenty of slowly starting resources (a Lustre filesystem).
>>>
>>> Fencing was initiated due to a resource stop failure.
>>> Lustre often starts very slowly due to internal recovery, and some such
>>> resources were starting in the transition where another resource
>>> failed to stop.
>>> And, as the transition did not finish within the time specified by the
>>> "failure-timeout" (set to 9 min) and was not aborted, that stop
>>> failure was successfully cleaned.
>>> There were transition aborts due to attribute changes after that
>>> stop failure happened, but fencing
>>> was not initiated for some reason.
>>
>> Unfortunately, that makes sense with the current code. Failure timeout
>> changes the node attribute, which aborts the transition, which causes a
>> recalculation based on the new state, and the fencing is no longer
> 
> Ken, could this one be considered for a fix before 1.1.15 is released?

I'm planning to release 1.1.15 later today, and this won't make it in.

We do have several important open issues, including this one, but I
don't want them to delay the release of the many fixes that are ready to
go. I would only hold for a significant issue introduced this cycle, and
none of the known issues appear to qualify.

> I was just hit by the same issue in a completely different setup.
> Two-node cluster: one node fails to stop a resource and is fenced.
> Right after that, the second node fails to activate a clvm volume (different
> story, need to investigate) and then fails to stop it. The node is scheduled
> to be fenced, but it cannot be because the first node hasn't come up yet.
> Any cleanup (automatic or manual) of a resource that failed to stop clears
> the node state, removing the "unclean" state from the node. That is probably
> not what I would expect (resource cleanup acting as a node unfence)...
> Honestly, this potentially leads to data corruption...
> 
> Also (probably not related), there was one more resource stop failure (in
> that case a timeout) prior to the failed stop mentioned above, and that stop
> timeout did not lead to fencing by itself.
> 
> I have logs (but not pe-inputs/traces/blackboxes) from both nodes, so
> any additional information from them can be easily provided.
> 
> Best regards,
> Vladislav



Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Dmitri Maziuk

On 2016-06-20 17:19, Digimer wrote:


> Nikhil indicated that they could switch where traffic went up-stream
> without issue, if I understood properly.


They have an interesting setup, but that notwithstanding: if split
brain happens, some clients will connect to the "old master" and some to
the "new master", depending on ARP updates. If there's a shared resource
unavailable on one node, clients going there will error out. The other
ones will not. It will work for some clients.

Cf. both nodes going into a stonith deathmatch and killing each other: the
service is now not available for all clients. What I don't get is the
blanket assertion that this is "more highly" available than option #1.


Dimitri
