Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-18 Thread Lentes, Bernd
- On Oct 16, 2017, at 10:57 PM, kgaillot kgail...@redhat.com wrote:


>> from the Changelog:
>> 
>> Changes since Pacemaker-1.1.15
>>   ...
>>   + pengine: do not fence a node in maintenance mode if it shuts down
>> cleanly
>>   ...
>> 
>> just saying ... may or may not be what you are seeing.
>> 
>> Short term "workaround" may be to do things differently.
>> Maybe just set the cluster wide maintenance mode, not per node?
> 
> Sounds right.
> 
> Another thing to keep in mind is that even if pacemaker doesn't fence
> the node, if you use DLM, DLM might fence the node (it doesn't know
> about or respect any pacemaker maintenance/unmanaged settings).
> 
> I'd stop pacemaker before stopping corosync, in any case. In
> maintenance mode, that should be fine. I don't think a running
> pacemaker would be able to reconnect to corosync after corosync comes
> back.
> 

As Ulrich already mentioned, the SUSE openais init script is responsible for
both pacemaker and corosync.

I have DLM in combination with cLVM; maybe that's the culprit. I will test
stopping the DLM and cLVM resources before entering maintenance and stopping
corosync; maybe then the node is not fenced.
I'm thinking of no longer using DLM in conjunction with cLVM and a SAN. I read
an article (http://www.admin-magazine.com/Articles/Live-Migration , see chapter
"The Weakest Link")
saying that DLM is tricky and not completely stable. It mentioned that Bastian
Blank, who seems to be a Debian maintainer, deactivated cLVM in the
Debian kernel. But the article is from 2013, so I'm not sure it still applies.
Maybe DRBD without a SAN, and therefore without DLM, would be the better solution.
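
The test described above could be scripted roughly as follows. This is a sketch only, not a verified procedure: the resource names clvm-clone and dlm-clone and the node name node1 are placeholders that do not come from this thread, and the helper names and the DRY_RUN switch are made up for the example. By default the script only prints the commands. As far as I know, "crm resource stop" returns before the resource has actually stopped, and stopping a cloned resource this way stops it on every node, so anything that depends on cLVM would go down as well.

```shell
#!/bin/sh
# Sketch: stop DLM-dependent resources, then put the node into maintenance,
# then stop the cluster stack. DRY_RUN=1 (default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

prepare_node() {
    node="$1"
    shift
    # Stop the given resources in the order listed (cLVM before DLM,
    # since cLVM depends on DLM). Resource names are placeholders.
    for rsc in "$@"; do
        run crm resource stop "$rsc"
    done
    # Assumption: in a real run you would wait here until the resources
    # are reported as stopped before continuing.
    run crm node maintenance "$node"
    # Must be executed on the node that is being maintained.
    run /etc/init.d/openais stop
}

# Hypothetical node and resource names.
prepare_node node1 clvm-clone dlm-clone
```

Whether the node still gets fenced after this is exactly what the test is meant to find out.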


Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons Enhsen
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-18 Thread Lentes, Bernd


- On Oct 16, 2017, at 9:27 PM, Digimer li...@alteeve.ca wrote:


> 
> I understood what you meant about it getting fenced after stopping
> corosync. What I am not clear on is if you are stopping corosync on the
> normal node, or the node that is in maintenance mode.
> 
> In either case, as I understand it, maintenance mode doesn't stop
> pacemaker, so it can still react to the sudden loss of membership.
> 
> I wonder; Why are you stopping corosync? If you want to stop the node,
> why not stop pacemaker entirely first?
> 

I ran "/etc/init.d/openais stop" on the node I had put in maintenance via "crm
node maintenance "

I think on my SLES 11 SP4 the init script from openais is responsible for both: 
cluster (pacemaker) and communication (openais/corosync).
I didn't find a dedicated init script for pacemaker.


Bernd
 



Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 21:49 +0200, Lars Ellenberg wrote:
> On Mon, Oct 16, 2017 at 09:20:52PM +0200, Lentes, Bernd wrote:
> > - On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> > > On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> > > > i have the following behavior: I put a node in maintenance
> > > > mode, afterwards stop
> > > > corosync on that node with /etc/init.d/openais stop.
> > > > This node is immediately fenced. Is that expected behavior ? I
> > > > thought putting a
> > > > node into maintenance does mean the cluster does not care
> > > > anymore about that
> > > > node.
> > OS is SLES 11 SP4. That's not the most recent one.
> > Pacemaker is 1.1.12.
> > I didn't plan to remove the node, but to do some maintenance on it.
> > 
> > If i put the node in standby, then i can invoke
> > "/etc/init.d/openais
> > stop" without that node getting fenced.
> > But then all resources on that node are stopped/migrated. If i
> > don't
> > want that, i thought maintenance is the right way.
> > Am i wrong ?
> > 
> > Ah, I just saw that I wasn't completely clear. The node is fenced
> > after
> > stopping openais, not after putting it into maintenance.
> > I did that via "crm node maintenance "
> 
> from the Changelog:
> 
> Changes since Pacemaker-1.1.15
>   ...
>   + pengine: do not fence a node in maintenance mode if it shuts down
> cleanly
>   ...
> 
> just saying ... may or may not be what you are seeing.
> 
> Short term "workaround" may be to do things differently.
> Maybe just set the cluster wide maintenance mode, not per node?

Sounds right.

Another thing to keep in mind is that even if pacemaker doesn't fence
the node, if you use DLM, DLM might fence the node (it doesn't know
about or respect any pacemaker maintenance/unmanaged settings).

I'd stop pacemaker before stopping corosync, in any case. In
maintenance mode, that should be fine. I don't think a running
pacemaker would be able to reconnect to corosync after corosync comes
back.
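
For illustration, the cluster-wide variant suggested in the quoted text might look like the sketch below. It assumes crmsh is installed and uses the /etc/init.d/openais script from the original question, which per other messages in this thread handles both pacemaker and corosync on SLES 11. The function names and the DRY_RUN switch are made up for this example; by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Sketch: cluster-wide maintenance mode around a stop/start of the stack.
# DRY_RUN=1 (default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

begin_maintenance() {
    # Cluster-wide property: the cluster stops managing and monitoring
    # resources on all nodes.
    run crm configure property maintenance-mode=true
    # Run on the node to be maintained; stops the whole cluster stack.
    run /etc/init.d/openais stop
}

end_maintenance() {
    run /etc/init.d/openais start
    # Assumption: wait until the node has rejoined the cluster
    # (for example by checking crm_mon) before clearing the property.
    run crm configure property maintenance-mode=false
}

begin_maintenance
```

As noted above, DLM may still fence the node regardless of this setting.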

> What are you really trying to do,
> what is the reason you need it in maintenance-mode
> and stop pacemaker/corosync/openais/the clusterstack,
> but do not want to stop/migrate off the resources,
> as would be done with "standby"?
> 
-- 
Ken Gaillot 



Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Digimer
On 2017-10-16 03:20 PM, Lentes, Bernd wrote:
> 
> 
> - On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> 
>> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> i have the following behavior: I put a node in maintenance mode, afterwards 
>>> stop
>>> corosync on that node with /etc/init.d/openais stop.
>>> This node is immediately fenced. Is that expected behavior ? I thought 
>>> putting a
>>> node into maintenance does mean the cluster does not care anymore about that
>>> node.
>>>
>>> OS on my nodes is SLES 11 SP4.
>>>
>>> Thanks.
>>>
>>>
>>> Bernd
>>
>> Well, if you stop corosync, it would appear to leave gracefully from
>> corosync's perspective so the other node should know that it didn't
>> fail. However, and I am not a pacemaker expert, I would guess that
>> pacemaker just saw the membership change that it wasn't expecting and
>> invoked a fence.
>>
>> If you plan to remove a node, it is probably best to stop pacemaker,
>> then stop corosync.
>>
>> Also, 'openais' is old. Is this an old cluster? Corosync came out of
>> the openais project.
> 
> Well, OS is SLES 11 SP4. That's not the most recent one.
> Pacemaker is 1.1.12. I didn't plan to remove the node, but to do some
> maintenance on it.
> 
> If i put the node in standby, then i can invoke "/etc/init.d/openais stop" 
> without that node getting fenced.
> But then all resources on that node are stopped/migrated. If i don't want 
> that, i thought maintenance is the right way.
> Am i wrong ?
> 
> Ah, I just saw that I wasn't completely clear. The node is fenced after
> stopping openais, not after putting it into maintenance.
> I did that via "crm node maintenance "
> 
> Bernd

I understood what you meant about it getting fenced after stopping
corosync. What I am not clear on is if you are stopping corosync on the
normal node, or the node that is in maintenance mode.

In either case, as I understand it, maintenance mode doesn't stop
pacemaker, so it can still react to the sudden loss of membership.

I wonder; Why are you stopping corosync? If you want to stop the node,
why not stop pacemaker entirely first?

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould



Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd



- On Oct 16, 2017, at 7:37 PM, emmanuel segura emi2f...@gmail.com wrote:

> I put a node in maintenance mode?

> do you mean you put the cluster in maintenance mode

I ran "crm node maintenance ". As I understand it, that puts
the node in maintenance mode.


Bernd
 



Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd


- On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:

> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>> Hi,
>> 
>> i have the following behavior: I put a node in maintenance mode, afterwards 
>> stop
>> corosync on that node with /etc/init.d/openais stop.
>> This node is immediately fenced. Is that expected behavior ? I thought 
>> putting a
>> node into maintenance does mean the cluster does not care anymore about that
>> node.
>> 
>> OS on my nodes is SLES 11 SP4.
>> 
>> Thanks.
>> 
>> 
>> Bernd
> 
> Well, if you stop corosync, it would appear to leave gracefully from
> corosync's perspective so the other node should know that it didn't
> fail. However, and I am not a pacemaker expert, I would guess that
> pacemaker just saw the membership change that it wasn't expecting and
> invoked a fence.
> 
> If you plan to remove a node, it is probably best to stop pacemaker,
> then stop corosync.
> 
> Also, 'openais' is old. Is this an old cluster? Corosync came out of
> the openais project.

Well, the OS is SLES 11 SP4. That's not the most recent one.
Pacemaker is 1.1.12. I didn't plan to remove the node, but to do some
maintenance on it.

If I put the node in standby, I can invoke "/etc/init.d/openais stop"
without that node getting fenced.
But then all resources on that node are stopped/migrated. If I don't want that,
I thought maintenance was the right way.
Am I wrong?
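
To make the two options concrete, here is a small sketch of the crmsh commands involved. The helper name set_node_mode, the node name node1 and the DRY_RUN switch are made up for illustration; by default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: the node-level crmsh commands behind "standby" and "maintenance".
# DRY_RUN=1 (default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

set_node_mode() {
    node="$1"
    mode="$2"
    case "$mode" in
        # Resources on the node are stopped or migrated away.
        standby)     run crm node standby "$node" ;;
        # Leave standby again.
        online)      run crm node online "$node" ;;
        # Resources keep running but are no longer managed.
        maintenance) run crm node maintenance "$node" ;;
        # Leave maintenance again.
        ready)       run crm node ready "$node" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

# Hypothetical node name.
set_node_mode node1 maintenance
```

The sketch says nothing about fencing; it only shows which command belongs to which mode.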

Ah, I just saw that I wasn't completely clear. The node is fenced after stopping
openais, not after putting it into maintenance.
I did that via "crm node maintenance "

Bernd
 



Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread emmanuel segura
"I put a node in maintenance mode"?

Do you mean you put the cluster in maintenance mode?

2017-10-16 19:24 GMT+02:00 Lentes, Bernd :

> Hi,
>
> i have the following behavior: I put a node in maintenance mode,
> afterwards stop corosync on that node with /etc/init.d/openais stop.
> This node is immediately fenced. Is that expected behavior ? I thought
> putting a node into maintenance does mean the cluster does not care anymore
> about that node.
>
> OS on my nodes is SLES 11 SP4.
>
> Thanks.
>
>
> Bernd
>
> --
> Bernd Lentes
>
> Systemadministration
> institute of developmental genetics
> Gebäude 35.34 - Raum 208
> HelmholtzZentrum München
> bernd.len...@helmholtz-muenchen.de
> phone: +49 (0)89 3187 1241
> fax: +49 (0)89 3187 2294
>
> no backup - no mercy
>
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^


Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Digimer
On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> Hi,
> 
> i have the following behavior: I put a node in maintenance mode, afterwards 
> stop corosync on that node with /etc/init.d/openais stop.
> This node is immediately fenced. Is that expected behavior ? I thought 
> putting a node into maintenance does mean the cluster does not care anymore 
> about that node.
> 
> OS on my nodes is SLES 11 SP4.
> 
> Thanks.
> 
> 
> Bernd

Well, if you stop corosync, it would appear to leave gracefully from
corosync's perspective so the other node should know that it didn't
fail. However, and I am not a pacemaker expert, I would guess that
pacemaker just saw the membership change that it wasn't expecting and
invoked a fence.

If you plan to remove a node, it is probably best to stop pacemaker,
then stop corosync.

Also, 'openais' is old. Is this an old cluster? Corosync came out of
the openais project.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould



[ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd
Hi,

I see the following behavior: I put a node in maintenance mode, then
stop corosync on that node with /etc/init.d/openais stop.
The node is immediately fenced. Is that expected behavior? I thought putting
a node into maintenance means the cluster no longer cares about that
node.

OS on my nodes is SLES 11 SP4.

Thanks.


Bernd

-- 
Bernd Lentes 

Systemadministration 
institute of developmental genetics 
Gebäude 35.34 - Raum 208 
HelmholtzZentrum München 
bernd.len...@helmholtz-muenchen.de 
phone: +49 (0)89 3187 1241 
fax: +49 (0)89 3187 2294 

no backup - no mercy
 
