Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On Oct 16, 2017, at 10:57 PM, kgaillot kgail...@redhat.com wrote:

>> from the Changelog:
>>
>> Changes since Pacemaker-1.1.15
>> ...
>> + pengine: do not fence a node in maintenance mode if it shuts down
>> cleanly
>> ...
>>
>> just saying ... may or may not be what you are seeing.
>>
>> Short term "workaround" may be to do things differently.
>> Maybe just set the cluster wide maintenance mode, not per node?
>
> Sounds right.
>
> Another thing to keep in mind is that even if pacemaker doesn't fence
> the node, if you use DLM, DLM might fence the node (it doesn't know
> about or respect any pacemaker maintenance/unmanaged settings).
>
> I'd stop pacemaker before stopping corosync, in any case. In
> maintenance mode, that should be fine. I don't think a running
> pacemaker would be able to reconnect to corosync after corosync comes
> back.

As Ulrich already mentioned, the SUSE openais init script is responsible for both pacemaker and corosync.

I have DLM in combination with cLVM; maybe that's the culprit. I will test stopping the DLM and cLVM resources before doing maintenance and stopping corosync; maybe then the node is not fenced.

I'm thinking of no longer using DLM in conjunction with cLVM and a SAN. I read an article (http://www.admin-magazine.com/Articles/Live-Migration , see chapter "The Weakest Link") saying that DLM is tricky and not completely stable. It mentioned that Bastian Blank, who seems to be a maintainer in the Debian team, deactivated cLVM in the Debian kernel. But the article is from 2013, so I'm not quite sure. Maybe DRBD and no SAN, and therefore no DLM, would be the better solution.

Bernd

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons Enhsen
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On Oct 16, 2017, at 9:27 PM, Digimer li...@alteeve.ca wrote:

> I understood what you meant about it getting fenced after stopping
> corosync. What I am not clear on is if you are stopping corosync on the
> normal node, or the node that is in maintenance mode.
>
> In either case, as I understand it, maintenance mode doesn't stop
> pacemaker, so it can still react to the sudden loss of membership.
>
> I wonder; Why are you stopping corosync? If you want to stop the node,
> why not stop pacemaker entirely first?

I ran "/etc/init.d/openais stop" on the node that I had put in maintenance via "crm node maintenance ".

I think on my SLES 11 SP4 the openais init script is responsible for both the cluster (pacemaker) and the communication layer (openais/corosync). I didn't find a dedicated init script for pacemaker.

Bernd
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On Mon, 2017-10-16 at 21:49 +0200, Lars Ellenberg wrote:
> On Mon, Oct 16, 2017 at 09:20:52PM +0200, Lentes, Bernd wrote:
> > On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> > > On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> > > > I have the following behavior: I put a node in maintenance
> > > > mode and afterwards stop corosync on that node with
> > > > /etc/init.d/openais stop.
> > > > This node is immediately fenced. Is that expected behavior? I
> > > > thought putting a node into maintenance means the cluster does
> > > > not care about that node anymore.
> >
> > OS is SLES 11 SP4. That's not the most recent one.
> > Pacemaker is 1.1.12.
> > I didn't plan to remove the node, but to do some maintenance on it.
> >
> > If I put the node in standby, then I can invoke
> > "/etc/init.d/openais stop" without that node getting fenced.
> > But then all resources on that node are stopped/migrated. If I
> > don't want that, I thought maintenance is the right way.
> > Am I wrong?
> >
> > Ah, I just saw that I wasn't completely clear. The node is fenced
> > after stopping openais, not after putting it into maintenance.
> > I did that via "crm node maintenance "
>
> from the Changelog:
>
> Changes since Pacemaker-1.1.15
> ...
> + pengine: do not fence a node in maintenance mode if it shuts down
> cleanly
> ...
>
> just saying ... may or may not be what you are seeing.
>
> Short term "workaround" may be to do things differently.
> Maybe just set the cluster wide maintenance mode, not per node?

Sounds right.

Another thing to keep in mind is that even if pacemaker doesn't fence the node, if you use DLM, DLM might fence the node (it doesn't know about or respect any pacemaker maintenance/unmanaged settings).

I'd stop pacemaker before stopping corosync, in any case. In maintenance mode, that should be fine. I don't think a running pacemaker would be able to reconnect to corosync after corosync comes back.

> What are you really trying to do,
> what is the reason you need it in maintenance-mode
> and stop pacemaker/corosync/openais/the clusterstack,
> but do not want to stop/migrate off the resources,
> as would be done with "standby"?

--
Ken Gaillot
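A minimal dry-run sketch of the workaround discussed above: set cluster-wide maintenance mode, then stop the stack. It assumes crmsh is installed and uses the `maintenance-mode` cluster property. The `run` helper and the `RUN` variable are hypothetical names used only for illustration, and nothing is executed unless `RUN=1` is set. Whether this avoids the fence on Pacemaker 1.1.12 has not been verified here, and as noted above DLM may still fence the node regardless of pacemaker's settings.

```shell
#!/bin/sh
# Dry-run sketch: prints each command unless RUN=1 is set in the environment.
# "run" and "RUN" are illustrative names, not part of any cluster tool.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Cluster-wide maintenance instead of per-node maintenance, then stop
# the stack with the init script named in the thread (on SLES 11 it
# handles both pacemaker and corosync).
maintenance_stop_plan() {
    run crm configure property maintenance-mode=true
    run /etc/init.d/openais stop
}

# After the work is done: start the stack again, then leave maintenance mode.
maintenance_return_plan() {
    run /etc/init.d/openais start
    run crm configure property maintenance-mode=false
}

maintenance_stop_plan
maintenance_return_plan
```

On systems where pacemaker and corosync are separate services, the single init script call would be replaced by stopping pacemaker first and corosync second, as recommended above.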
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On 2017-10-16 03:20 PM, Lentes, Bernd wrote:
> On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
>
>> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> I have the following behavior: I put a node in maintenance mode and
>>> afterwards stop corosync on that node with /etc/init.d/openais stop.
>>> This node is immediately fenced. Is that expected behavior? I thought
>>> putting a node into maintenance means the cluster does not care
>>> about that node anymore.
>>>
>>> OS on my nodes is SLES 11 SP4.
>>>
>>> Thanks.
>>>
>>> Bernd
>>
>> Well, if you stop corosync, it would appear to leave gracefully from
>> corosync's perspective so the other node should know that it didn't
>> fail. However, and I am not a pacemaker expert, I would guess that
>> pacemaker just saw the membership change that it wasn't expecting and
>> invoked a fence.
>>
>> If you plan to remove a node, it is probably best to stop pacemaker,
>> then stop corosync.
>>
>> Also, 'openais' is old. Is this an old cluster? Corosync came out of
>> the openais project.
>
> Well, OS is SLES 11 SP4. That's not the most recent one.
> Pacemaker is 1.1.12. I didn't plan to remove the node, but to do some
> maintenance on it.
>
> If I put the node in standby, then I can invoke "/etc/init.d/openais stop"
> without that node getting fenced.
> But then all resources on that node are stopped/migrated. If I don't want
> that, I thought maintenance is the right way.
> Am I wrong?
>
> Ah, I just saw that I wasn't completely clear. The node is fenced after
> stopping openais, not after putting it into maintenance.
> I did that via "crm node maintenance "
>
> Bernd

I understood what you meant about it getting fenced after stopping corosync. What I am not clear on is if you are stopping corosync on the normal node, or the node that is in maintenance mode.

In either case, as I understand it, maintenance mode doesn't stop pacemaker, so it can still react to the sudden loss of membership.

I wonder; Why are you stopping corosync? If you want to stop the node, why not stop pacemaker entirely first?

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On Oct 16, 2017, at 7:37 PM, emmanuel segura emi2f...@gmail.com wrote:

> I put a node in maintenance mode?
> do you mean you put the cluster in maintenance mode

I did "crm node maintenance ". From my understanding that means that I put the node in maintenance mode.

Bernd
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:

> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>> Hi,
>>
>> I have the following behavior: I put a node in maintenance mode and
>> afterwards stop corosync on that node with /etc/init.d/openais stop.
>> This node is immediately fenced. Is that expected behavior? I thought
>> putting a node into maintenance means the cluster does not care
>> about that node anymore.
>>
>> OS on my nodes is SLES 11 SP4.
>>
>> Thanks.
>>
>> Bernd
>
> Well, if you stop corosync, it would appear to leave gracefully from
> corosync's perspective so the other node should know that it didn't
> fail. However, and I am not a pacemaker expert, I would guess that
> pacemaker just saw the membership change that it wasn't expecting and
> invoked a fence.
>
> If you plan to remove a node, it is probably best to stop pacemaker,
> then stop corosync.
>
> Also, 'openais' is old. Is this an old cluster? Corosync came out of
> the openais project.

Well, OS is SLES 11 SP4. That's not the most recent one. Pacemaker is 1.1.12. I didn't plan to remove the node, but to do some maintenance on it.

If I put the node in standby, then I can invoke "/etc/init.d/openais stop" without that node getting fenced. But then all resources on that node are stopped/migrated. If I don't want that, I thought maintenance is the right way. Am I wrong?

Ah, I just saw that I wasn't completely clear. The node is fenced after stopping openais, not after putting it into maintenance. I did that via "crm node maintenance "

Bernd
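For comparison, the standby route described above (which avoided the fence but stops or migrates the node's resources) could be sketched as follows. This is a dry-run illustration only: `node1`, the `NODE` variable, and the `run` helper are hypothetical, and `crm node standby` / `crm node online` are assumed to be available from crmsh. Nothing is executed unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry-run sketch: prints each command unless RUN=1 is set in the environment.
# NODE defaults to a hypothetical node name; override it in the environment.
NODE="${NODE:-node1}"

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Put the node in standby (its resources are stopped or migrated),
# then stop the cluster stack on it.
standby_stop_plan() {
    run crm node standby "$1"
    run /etc/init.d/openais stop
}

# Bring the stack back up, then allow the node to host resources again.
standby_return_plan() {
    run /etc/init.d/openais start
    run crm node online "$1"
}

standby_stop_plan "$NODE"
standby_return_plan "$NODE"
```

The trade-off is the one raised in the message: standby is safe to stop under, but it does not leave the resources running on the node.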
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
I put a node in maintenance mode?
Do you mean you put the cluster in maintenance mode?

2017-10-16 19:24 GMT+02:00 Lentes, Bernd:

> Hi,
>
> I have the following behavior: I put a node in maintenance mode and
> afterwards stop corosync on that node with /etc/init.d/openais stop.
> This node is immediately fenced. Is that expected behavior? I thought
> putting a node into maintenance means the cluster does not care
> about that node anymore.
>
> OS on my nodes is SLES 11 SP4.
>
> Thanks.
>
> Bernd

--
  .~.
  /V\
 //  \\
/(    )\
^`~'^
Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> Hi,
>
> I have the following behavior: I put a node in maintenance mode and
> afterwards stop corosync on that node with /etc/init.d/openais stop.
> This node is immediately fenced. Is that expected behavior? I thought
> putting a node into maintenance means the cluster does not care
> about that node anymore.
>
> OS on my nodes is SLES 11 SP4.
>
> Thanks.
>
> Bernd

Well, if you stop corosync, it would appear to leave gracefully from corosync's perspective so the other node should know that it didn't fail. However, and I am not a pacemaker expert, I would guess that pacemaker just saw the membership change that it wasn't expecting and invoked a fence.

If you plan to remove a node, it is probably best to stop pacemaker, then stop corosync.

Also, 'openais' is old. Is this an old cluster? Corosync came out of the openais project.

--
Digimer
[ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?
Hi,

I have the following behavior: I put a node in maintenance mode and afterwards stop corosync on that node with /etc/init.d/openais stop. This node is immediately fenced. Is that expected behavior? I thought putting a node into maintenance means the cluster does not care about that node anymore.

OS on my nodes is SLES 11 SP4.

Thanks.

Bernd

--
Bernd Lentes

Systemadministration
institute of developmental genetics
Gebäude 35.34 - Raum 208
HelmholtzZentrum München
bernd.len...@helmholtz-muenchen.de
phone: +49 (0)89 3187 1241
fax: +49 (0)89 3187 2294

no backup - no mercy