Re: [ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
On 8/12/19 9:24 PM, Klaus Wenninger wrote:
[...]
> If you shutdown solely pacemaker one-by-one on all nodes
> and these shutdowns are considered graceful then you are
> not gonna experience any reboots (e.g. 3 node cluster).

Revisiting what you said, I ran `systemctl stop pacemaker` one node at a time. At this point corosync is still running on all nodes.
(NOTE: make sure there is no "StopWhenUnneeded=yes" in corosync.service, otherwise corosync gets taken down along with pacemaker.)

> Afterwards you can shutdown corosync one-by-one as well
> without experiencing reboots as without the cib-connection
> sbd isn't gonna check for quorum anymore (all resources
> down so no need to reboot in case of quorum-loss - extra
> care has to be taken care of with unmanaged resources but
> that isn't particular with sbd).

Then `systemctl stop corosync`, one node at a time. Nice! Disk-less sbd does do the trick as described above. All nodes stay up!

Thanks,
Roger

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
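[Editor's note] A minimal sketch of the shutdown order described above, assuming three hypothetical node names (node1..node3) and passwordless SSH; verify that corosync.service does not set StopWhenUnneeded=yes before relying on it.

```sh
#!/bin/sh
# Sketch only: graceful one-by-one teardown with disk-less sbd.
# Node names are placeholders; adjust to your cluster.
NODES="node1 node2 node3"

# Confirm corosync will not be stopped implicitly when pacemaker goes down.
for n in $NODES; do
    ssh "$n" 'systemctl cat corosync.service | grep -i StopWhenUnneeded || true'
done

# 1) Stop pacemaker everywhere, one node at a time (corosync keeps running).
for n in $NODES; do
    ssh "$n" 'systemctl stop pacemaker'
done

# 2) Only then stop corosync (and with it sbd), again one node at a time.
for n in $NODES; do
    ssh "$n" 'systemctl stop corosync'
done
```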
Re: [ClusterLabs] Restoring network connection breaks cluster services
Momcilo,

> On Wed, Aug 7, 2019 at 1:00 PM Klaus Wenninger wrote:
>
>> On 8/7/19 12:26 PM, Momcilo Medic wrote:
>>
>>> We have a three-node cluster that is set up to stop resources on lost
>>> quorum. Failure (network going down) handling is done properly, but
>>> recovery doesn't seem to work.
>>
>> What do you mean by 'network going down'? Loss of link? Does the IP
>> persist on the interface in that case?
>
> Yes, we simulate a faulty cable by turning switch ports down and up.
> In such a case, the IP does not persist on the interface.

What corosync version do you have? Corosync was really bad at handling
ifdown (removal of the IP) properly until version 3 with knet, which solved
the problem completely, and 2.4.5, where it is so-so for udpu (udp is still
affected). The solution is either to upgrade corosync or to configure the
system to keep the IP intact.

Honza

>> That there are issues reconnecting the CPG API sounds strange to me.
>> Already the fact that something has to be reconnected. I got it that
>> your nodes were persistently up during the network disconnection,
>> although I would have expected fencing to kick in at least on those
>> which are part of the non-quorate cluster partition. Maybe a few words
>> more on your scenario (fencing setup, e.g.) would help to understand
>> what is going on.
>
> We don't use any fencing mechanisms; we rely on quorum to run the
> services. In more detail, we run a three-node Linbit LINSTOR storage
> setup that is hyperconverged, meaning we run the clustered storage on
> the virtualization hypervisors.
>
> We use pcs in order to have the linstor-controller service in
> high-availability mode. The policy for no quorum is to stop the resources.
>
> In such a hyperconverged setup, we can't fence a node without impact.
> It may happen that network instability causes the primary node to no
> longer be primary. In that case, we don't want running VMs to go down
> with the ship, as there was no impact on them. However, we would like
> to have high availability of that service upon network restoration,
> without manual actions.
>
>> Klaus
>
> What happens is, services crash when we re-enable the network
> connection. From the journal:
>
> ```
> Jul 12 00:27:32 itaftestkvmls02.dc.itaf.eu corosync[9069]: corosync: totemsrp.c:1328: memb_consensus_agreed: Assertion `token_memb_entries >= 1' failed.
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu attrd[9104]:      error: Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu stonith-ng[9100]: error: Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: corosync.service: Main process exited, code=dumped, status=6/ABRT
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu cib[9098]:       error: Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: corosync.service: Failed with result 'core-dump'.
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu pacemakerd[9087]: error: Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: pacemaker.service: Main process exited, code=exited, status=107/n/a
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: pacemaker.service: Failed with result 'exit-code'.
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
> Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu lrmd[9102]:    warning: new_event_notification (9102-9107-7): Bad file descriptor (9)
> ```
>
> Pacemaker's log shows no relevant info.
> This is from corosync's log:
>
> ```
> Jul 12 00:27:33 [9107] itaftestkvmls02.dc.itaf.eu       crmd:     info: qb_ipcs_us_withdraw:    withdrawing server sockets
> Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu      attrd:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu        cib:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 [9087] itaftestkvmls02.dc.itaf.eu pacemakerd:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
> Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu      attrd:     info: qb_ipcs_us_withdraw:    withdrawing server sockets
> Jul 12 00:27:33 [9087] itaftestkvmls02.dc.itaf.eu pacemakerd:     info: crm_xml_cleanup:        Cleaning up memory from libxml2
> Jul 12 00:27:33 [9107] itaftestkvmls02.dc.itaf.eu       crmd:     info: crm_xml_cleanup:        Cleaning up memory from libxml2
> Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng:     info: qb_ipcs_us_withdraw:    withdrawing server sockets
> Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu      attrd:     info: crm_xml_cleanup:        Cleaning up memory from libxml2
> Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu        cib:     info: qb_ipcs_us_withdraw:    withdrawing server sockets
> Jul 12 00:27:33 [9100] itaftestkvmls02.dc
> ```
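[Editor's note] Honza's advice amounts to checking which corosync is running and, if it predates the knet transport, either upgrading or keeping the IP on the interface across link flaps. A rough sketch of the check; the commented totem stanza is only an illustration of a corosync 3.x setup, not a drop-in config.

```sh
# Check the running corosync version (2.4.5+ improves udpu; 3.x with knet
# fixes ifdown handling completely, per Honza's comment above).
corosync -v

# Which transport is configured right now?
grep -E 'transport|version' /etc/corosync/corosync.conf

# Illustrative totem stanza after an upgrade to corosync 3.x:
# totem {
#     version: 2
#     transport: knet
#     ...
# }
```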
Re: [ClusterLabs] Querying failed resource operations from the CIB
On Mon, 2019-08-12 at 11:15 +0200, Ulrich Windl wrote:
> Hi!
>
> Back in December 2011 I had written a script to retrieve all failed
> resource operations by using "cibadmin -Q -o lrm_resources" as data
> base. I was querying lrm_rsc_op for op-status != 0.
> In a newer release this does not seem to work anymore.
>
> I see resource IDs ending with "_last_0", "_monitor_6", and
> "_last_failure_0", but even in the "_last_failure_0" the op-status is
> "0" (rc-code="7").
> Is this some bug, or is it a feature? That is: When will op-status be
> != 0?

rc-code is the result of the action itself (i.e. the resource agent),
whereas op-status is the result of pacemaker's attempt to execute the
agent. If pacemaker was able to successfully initiate the resource agent
and get a reply back, then op-status will be 0, regardless of the rc-code
reported by the agent.

op-status will be nonzero when it couldn't get a result from the agent --
the agent is not installed on the node, the agent timed out, the
connection to the local executor or Pacemaker Remote was lost, the action
was requested while the node was shutting down, etc.

There's also a special op-status (193) that indicates an action is pending
(i.e. it has been initiated and we're waiting for it to complete). This is
only seen when record-pending is true.

> crm_mon still reports a resource failure like this:
> Failed Resource Actions:
> * prm_nfs_server_monitor_6 on h11 'not running' (7): call=738,
> status=complete, exitreason='',
> last-rc-change='Mon Aug 12 04:52:23 2019', queued=0ms, exec=0ms
>
> (it seems the nfs server monitor does this under load in SLES12 SP4,
> and I wonder where to look for the reason)
> BTW: "lrm_resources" is not documented, and the structure seems to
> change. Can I restrict the output to LRM data?

One possibility is to run crm_mon with --as-xml and parse the failed
actions from that output. The schema is distributed as crm_mon.rng.

> Regards,
> Ulrich
--
Ken Gaillot

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
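[Editor's note] A minimal sketch of the approach Ken suggests. The <failures>/<failure> element names are assumed from crm_mon.rng; check the schema shipped with your pacemaker version before relying on them.

```sh
# Sketch: list failed resource actions by parsing crm_mon's XML output.
crm_mon --one-shot --as-xml > /tmp/crm_mon.xml

# Print every recorded failure (op key, node, exit code, operation status).
# Element/attribute names should be verified against the installed crm_mon.rng.
xmllint --xpath '//failures/failure' /tmp/crm_mon.xml 2>/dev/null \
    || echo "no failures recorded (or element names differ in this release)"
```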
Re: [ClusterLabs] why is node fenced ?
On Mon, 2019-08-12 at 18:09 +0200, Lentes, Bernd wrote:
> Hi,
>
> last Friday (9th of August) i had to install patches on my two-node
> cluster.
> I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2),
> patched it, rebooted, started the cluster (systemctl start pacemaker)
> again, put the node again online, everything fine.
>
> Then i wanted to do the same procedure with the other node (ha-idg-1).
> I put it in standby, patched it, rebooted, started pacemaker again.
> But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
> I know that nodes which are unclean need to be shutdown, that's
> logical.
>
> But i don't know from where the conclusion comes that the node is
> unclean respectively why it is unclean,
> i searched in the logs and didn't find any hint.

The key messages are:

Aug 09 17:43:27 [6326] ha-idg-1 crmd:     info: crm_timer_popped:  Election Trigger (I_DC_TIMEOUT) just popped (2ms)
Aug 09 17:43:27 [6326] ha-idg-1 crmd:  warning: do_log:  Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped

That indicates the newly rebooted node didn't hear from the other node
within 20s, and so assumed it was dead.

The new node had quorum, but never saw the other node's corosync, so I'm
guessing you have two_node and/or wait_for_all disabled in corosync.conf,
and/or you have no-quorum-policy=ignore in pacemaker.

I'd recommend two_node: 1 in corosync.conf, with no explicit wait_for_all
or no-quorum-policy setting. That would ensure a rebooted/restarted node
doesn't get initial quorum until it has seen the other node.

> I put the syslog and the pacemaker log on a seafile share, i'd be
> very thankful if you'll have a look.
> https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/
>
> Here the cli history of the commands:
>
> 17:03:04  crm node standby ha-idg-2
> 17:07:15  zypper up (install Updates on ha-idg-2)
> 17:17:30  systemctl reboot
> 17:25:21  systemctl start pacemaker.service
> 17:25:47  crm node online ha-idg-2
> 17:26:35  crm node standby ha-idg1-
> 17:30:21  zypper up (install Updates on ha-idg-1)
> 17:37:32  systemctl reboot
> 17:43:04  systemctl start pacemaker.service
> 17:44:00  ha-idg-1 is fenced
>
> Thanks.
>
> Bernd
>
> OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
--
Ken Gaillot

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
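[Editor's note] Ken's recommendation translates roughly to the votequorum settings below; this is a sketch only, assuming corosync 2.x and that you merge it into the existing quorum section rather than pasting it wholesale. With two_node: 1, wait_for_all defaults to 1, so a freshly restarted node waits until it has seen its peer before claiming quorum.

```sh
# Show the current quorum settings first.
grep -A5 '^quorum' /etc/corosync/corosync.conf

# Recommended quorum section for a two-node cluster (illustrative):
# quorum {
#     provider: corosync_votequorum
#     two_node: 1
#     # no explicit wait_for_all: it defaults to 1 when two_node is set
# }
# Edit /etc/corosync/corosync.conf on both nodes, then restart
# corosync/pacemaker for the change to take effect.
```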
Re: [ClusterLabs] Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup
On Mon, 2019-08-12 at 17:46 +0200, Ulrich Windl wrote:
> Hi!
>
> I just noticed that a "crm resource cleanup " caused some
> unexpected behavior and the syslog message:
> crmd[7281]:  warning: new_event_notification (7281-97955-15): Broken
> pipe (32)
>
> It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker-
> 1.1.19+20180928.0d2680780-1.8.x86_64).
>
> The cleanup was due to a failed monitor. As an unexpected consequence
> of this cleanup, CRM seemed to restart the complete resource (and
> dependencies), even though it was running.

I assume the monitor failure was old, and recovery had already
completed? If not, recovery might have been initiated before the
clean-up was recorded.

> I noticed that a manual "crm_resource -C -r -N " command
> has the same effect (multiple resources are "Cleaned up", resources
> are restarted seemingly before the "probe" is done.).

Can you verify whether the probes were done? The DC should log a message
when each _monitor_0 result comes in.

> Actually the manual says when cleaning up a single primitive, the
> whole group is cleaned up, unless using --force. Well, I don't like
> this default, as I expect any status change from probe would
> propagate to the group anyway...

In 1.1, clean-up always wipes the history of the affected resources,
regardless of whether the history is for success or failure. That means
all the cleaned resources will be reprobed.

In 2.0, clean-up by default wipes the history only if there's a failed
action (--refresh/-R is required to get the 1.1 behavior). That lessens
the impact of the "default to whole group" behavior.

I think the original idea was that a group indicates that the resources
are closely related, so changing the status of one member might affect
what status the others report.

> Regards,
> Ulrich
--
Ken Gaillot

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
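[Editor's note] As a concrete illustration of the group-wide default Ulrich mentions, a sketch using the resource/node names from the other thread (prm_nfs_server, h11) purely as placeholders; behaviour differs between pacemaker 1.1 and 2.0 as Ken describes.

```sh
# Clean up one failed member of a group: by default the history of the
# whole group is wiped and every member is reprobed (1.1 semantics).
crm_resource --cleanup --resource prm_nfs_server --node h11

# Restrict the clean-up to that single primitive only:
crm_resource --cleanup --resource prm_nfs_server --node h11 --force

# On pacemaker 2.0, --cleanup only wipes failed actions; use --refresh
# to wipe the full history the way 1.1 did.
```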
Re: [ClusterLabs] Master/slave failover does not work as expected
On Mon, 2019-08-12 at 23:09 +0300, Andrei Borzenkov wrote:
> On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <
> michael.pow...@harmonicinc.com> wrote:
> > At 07:44:49, the ss agent discovers that the master instance has
> > failed on node mgraid…-0 as a result of a failed ssadm request in
> > response to an ss_monitor() operation. It issues a crm_master -Q
> > -D command with the intent of demoting the master and promoting the
> > slave, on the other node, to master. The ss_demote() function
> > finds that the application is no longer running and returns
> > OCF_NOT_RUNNING (7). In the older product, this was sufficient to
> > promote the other instance to master, but in the current product,
> > that does not happen. Currently, the failed application is
> > restarted, as expected, and is promoted to master, but this takes
> > 10's of seconds.
>
> Did you try to disable resource stickiness for this ms?

Stickiness shouldn't affect where the master role is placed, just whether
the resource instances should stay on their current nodes (independently
of whether their role is staying the same or changing).

Are there any constraints that apply to the master role?

Another possibility is that you are mixing crm_master with and without
--lifetime=reboot (which controls whether the master attribute is
transient or permanent). Transient should really be the default but isn't
for historical reasons. It's a good idea to always use --lifetime=reboot.
You could double-check with "cibadmin -Q | grep master-" and see if there
is more than one entry per node.
--
Ken Gaillot

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
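[Editor's note] A short sketch of the consistent crm_master usage and the check Ken recommends; the score value is a placeholder, and crm_master is normally invoked from within the resource agent on the node whose attribute is being set.

```sh
# Inside the agent (illustrative): always make the promotion score transient.
crm_master --lifetime=reboot -Q -v 100     # set/raise the master score
crm_master --lifetime=reboot -Q -D         # delete it on failure/demote

# Verify there is only one master-* attribute per node; a permanent and a
# transient entry for the same node indicates mixed crm_master usage.
cibadmin -Q | grep 'master-'
```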
Re: [ClusterLabs] Restoring network connection breaks cluster services
On 07/08/19 16:06 +0200, Momcilo Medic wrote:
> On Wed, Aug 7, 2019 at 1:00 PM Klaus Wenninger wrote:
>
>> On 8/7/19 12:26 PM, Momcilo Medic wrote:
>>
>>> We have three node cluster that is setup to stop resources on lost
>>> quorum. Failure (network going down) handling is done properly,
>>> but recovery doesn't seem to work.
>>
>> What do you mean by 'network going down'?
>> Loss of link? Does the IP persist on the interface
>> in that case?
>
> Yes, we simulate faulty cable by turning switch ports down and up.
> In such a case, the IP does not persist on the interface.
>
>> That there are issue reconnecting the CPG-API sounds strange to me.
>> Already the fact that something has to be reconnected. I got it
>> that your nodes were persistently up during the
>> network-disconnection. Although I would have expected fencing to
>> kick in at least on those which are part of the non-quorate
>> cluster-partition. Maybe a few words more on your scenario
>> (fencing-setup e.g.) would help to understand what is going on.
>
> We don't use any fencing mechanisms, we rely on quorum to run the
> services. In more detail, we run three node Linbit LINSTOR storage
> that is hyperconverged. Meaning, we run clustered storage on the
> virtualization hypervisors.
>
> We use pcs in order to have linstor-controller service in high
> availability mode. Policy for no quorum is to stop the resources.
>
> In such hyperconverged setup, we can't fence a node without impact.
> It may happen that network instability causes primary node to no
> longer be primary. In that case, we don't want running VMs to go
> down with the ship, as there was no impact for them.
>
> However, we would like to have high-availability of that service
> upon network restoration, without manual actions.

This spurred a train of thought that is admittedly not immediately
helpful in this case:

* * *

1. The word "converged" is a fitting word for how we'd like the cluster
stack to appear (from the outside), but what we have is that some
circumstances are not clearly articulated across the components, meaning
that there's no way for users to express their preferences in simple
terms and in non-conflicting, unambiguous ways when 2+ components' realms
combine together -- high-level tools like pcs may attempt to rectify that
to some extent, but they fall short when there are no surfaces to glue
(at least unambiguously; see also the parallel thread about shutting the
cluster down in the presence of sbd).

It seems to me that the very circumstance that was hit here is exactly
where the corosync authors decided that it's rare, and obnoxious enough
to indicate up the chain for detached destiny reasoning (which pacemaker
normally performs), that they would rather stop right there (and, in a
well-behaved cluster configuration, hence ask to be fenced).

All is actually sound, until one starts to make compromises like was done
here, ditching the fencing (think: sanity assurance) layer and relying
fully on no-quorum-policy=stop, naively thinking that one is 100% covered.
With a purely pacemaker hat on, we -- the pacemaker dev & maint -- can't
really give you such a guarantee, because we have no visibility into said
"bail out" shortcuts that corosync makes for such rare circumstances --
you shall refer to the corosync documentation, but it's not covered there
(man pages) AFAIK. (If it was _all_ indicated to pacemaker, just the
standard response on quorum loss could be carried out, without resorting
to anything more drastic like here.)

2. Based on said missing explicit and clear inter-component signalling
(1.) and the logs provided, it's fair to bring an argument that pacemaker
had an opportunity to see, barring said explicit API signalling, that
corosync died. But then, the major assumed cases are:

- corosync crashed or was explicitly killed (perhaps to test the claimed
  HA resiliency towards the outer world)
- broken pacemaker-corosync communication consistency (did some messages
  fall through the cracks?)

i.e., cluster-endangering scenarios, not something to keep alive at all
costs; better to try to stabilize the environment first, not to speak of
the chances of a "miracles awaiting" strategy.

3. Despite 2., there was a decision with systemd-enabled systems to
actually pursue said "at all costs" (although implicitly mitigated when
the restart cycles would be happening at a rapid pace)

- it's all then in the hands of slightly non-deterministic timing (token
  loss timeout window hit/miss, although perhaps not in this very case if
  the state within the protocol would be a clear indicator for other
  corosync peers)
- I'd actually assume the pacemaker would be restarted in said scenario
  (unless one fiddled with the pacemaker service file, that is), and just
  prior to that, corosync would be forcibly started anew as well
- is th
Re: [ClusterLabs] Master/slave failover does not work as expected
On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <
michael.pow...@harmonicinc.com> wrote:

> At 07:44:49, the ss agent discovers that the master instance has failed on
> node mgraid…-0 as a result of a failed ssadm request in response to
> an ss_monitor() operation. It issues a crm_master -Q -D command with
> the intent of demoting the master and promoting the slave, on the other
> node, to master. The ss_demote() function finds that the application
> is no longer running and returns OCF_NOT_RUNNING (7). In the older
> product, this was sufficient to promote the other instance to master, but
> in the current product, that does not happen. Currently, the failed
> application is restarted, as expected, and is promoted to master, but this
> takes 10's of seconds.

Did you try to disable resource stickiness for this ms?

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] why is node fenced ?
When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2, for example,

Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd:     info: pcmk_quorum_notification:  Quorum retained | membership=1320 members=1

after ~20s (dc-deadtime parameter), ha-idg-2 is marked 'unclean' and STONITHed as part of startup fencing.

There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw ha-idg-1 either, so it appears that there was no communication at all between the two nodes.

I'm not sure exactly why the nodes did not see one another, but there are indications of network issues around this time

2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now running without any active interface!

so perhaps that's related.

HTH,
Chris

On 8/12/19, 12:09 PM, "Users on behalf of Lentes, Bernd" wrote:

    Hi,

    last Friday (9th of August) i had to install patches on my two-node cluster.
    I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2), patched it, rebooted,
    started the cluster (systemctl start pacemaker) again, put the node again online, everything fine.

    Then i wanted to do the same procedure with the other node (ha-idg-1).
    I put it in standby, patched it, rebooted, started pacemaker again.
    But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
    I know that nodes which are unclean need to be shutdown, that's logical.

    But i don't know from where the conclusion comes that the node is unclean respectively why it is unclean,
    i searched in the logs and didn't find any hint.

    I put the syslog and the pacemaker log on a seafile share, i'd be very thankful if you'll have a look.
    https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/

    Here the cli history of the commands:

    17:03:04  crm node standby ha-idg-2
    17:07:15  zypper up (install Updates on ha-idg-2)
    17:17:30  systemctl reboot
    17:25:21  systemctl start pacemaker.service
    17:25:47  crm node online ha-idg-2
    17:26:35  crm node standby ha-idg1-
    17:30:21  zypper up (install Updates on ha-idg-1)
    17:37:32  systemctl reboot
    17:43:04  systemctl start pacemaker.service
    17:44:00  ha-idg-1 is fenced

    Thanks.

    Bernd

    OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1

    --
    Bernd Lentes
    Systemadministration
    Institut für Entwicklungsgenetik
    Gebäude 35.34 - Raum 208
    HelmholtzZentrum münchen
    bernd.len...@helmholtz-muenchen.de
    phone: +49 89 3187 1241
    phone: +49 89 3187 3827
    fax: +49 89 3187 2294
    http://www.helmholtz-muenchen.de/idg

    Perfekt ist wer keine Fehler macht
    Also sind Tote perfekt

    Helmholtz Zentrum Muenchen
    Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
    Ingolstaedter Landstr. 1
    85764 Neuherberg
    www.helmholtz-muenchen.de
    Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
    Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
    Registergericht: Amtsgericht Muenchen HRB 6466
    USt-IdNr: DE 129521671

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
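[Editor's note] Chris points at dc-deadtime as the window a freshly started node waits for its peer before startup fencing kicks in. For reference only, a sketch of inspecting and (if desired) raising that property; the 60s value is just an example, and Ken's two_node recommendation elsewhere in this digest is the more fundamental fix.

```sh
# Show the currently configured dc-deadtime (the default is 20s).
crm_attribute --type crm_config --name dc-deadtime --query

# Example only: give a rebooting peer more time before it is declared unclean.
crm_attribute --type crm_config --name dc-deadtime --update 60s
```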
[ClusterLabs] Postgres HA - pacemaker RA do not support auto failback
Hello,

Postgres version: 9.6
OS: RHEL 7.6

We are working on an HA setup for a postgres cluster of two nodes in
active-passive mode.

Installed:
Pacemaker 1.1.19
Corosync 2.4.3

The pacemaker agent with this installation doesn't support automatic
failback. What I mean by that is explained below:

1. Cluster is set up like A - B with A as master.
2. Kill services on A, node B will come up as master.
3. When node A is ready to rejoin the cluster, we have to delete the lock
   file it creates on one of the nodes and execute the cleanup command to
   get the node back as standby.

Step 3 is manual, so HA is not achieved in a real sense.

Please help to check:

1. Is there any version of the resource agent which supports automatic
   failback, to avoid generation of the lock file and deleting it?
2. If there is no such support and we need such functionality, do we have
   to modify the existing code?

How can this be achieved? Please suggest.

Thanks.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] why is node fenced ?
Hi,

last Friday (9th of August) i had to install patches on my two-node cluster.
I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2), patched it, rebooted,
started the cluster (systemctl start pacemaker) again, put the node online again, everything fine.

Then i wanted to do the same procedure with the other node (ha-idg-1).
I put it in standby, patched it, rebooted, started pacemaker again.
But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
I know that nodes which are unclean need to be shut down, that's logical.

But i don't know where the conclusion that the node is unclean comes from, respectively why it is unclean;
i searched in the logs and didn't find any hint.

I put the syslog and the pacemaker log on a seafile share, i'd be very thankful if you'll have a look.
https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/

Here the cli history of the commands:

17:03:04  crm node standby ha-idg-2
17:07:15  zypper up (install Updates on ha-idg-2)
17:17:30  systemctl reboot
17:25:21  systemctl start pacemaker.service
17:25:47  crm node online ha-idg-2
17:26:35  crm node standby ha-idg1-
17:30:21  zypper up (install Updates on ha-idg-1)
17:37:32  systemctl reboot
17:43:04  systemctl start pacemaker.service
17:44:00  ha-idg-1 is fenced

Thanks.

Bernd

OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1

--
Bernd Lentes
Systemadministration
Institut für Entwicklungsgenetik
Gebäude 35.34 - Raum 208
HelmholtzZentrum münchen
bernd.len...@helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg

Perfekt ist wer keine Fehler macht
Also sind Tote perfekt

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup
Hi!

I just noticed that a "crm resource cleanup " caused some unexpected behavior and the syslog message:
crmd[7281]:  warning: new_event_notification (7281-97955-15): Broken pipe (32)

It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker-1.1.19+20180928.0d2680780-1.8.x86_64).

The cleanup was due to a failed monitor. As an unexpected consequence of this cleanup, CRM seemed to restart the complete resource (and dependencies), even though it was running.

I noticed that a manual "crm_resource -C -r -N " command has the same effect (multiple resources are "Cleaned up", resources are restarted seemingly before the "probe" is done).

Actually the manual says when cleaning up a single primitive, the whole group is cleaned up, unless using --force. Well, I don't like this default, as I expect any status change from the probe would propagate to the group anyway...

Regards,
Ulrich

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
On 8/12/19 3:24 PM, Klaus Wenninger wrote: > On 8/12/19 2:30 PM, Yan Gao wrote: >> Hi Klaus, >> >> On 8/12/19 1:39 PM, Klaus Wenninger wrote: >>> On 8/9/19 9:06 PM, Yan Gao wrote: On 8/9/19 6:40 PM, Andrei Borzenkov wrote: > 09.08.2019 16:34, Yan Gao пишет: >> Hi, >> >> With disk-less sbd, it's fine to stop cluster service from the cluster >> nodes all at the same time. >> >> But if to stop the nodes one by one, for example with a 3-node cluster, >> after stopping the 2nd node, the only remaining node resets itself with: >> > That is sort of documented in SBD manual page: > > --><-- > However, while the cluster is in such a degraded state, it can > neither successfully fence nor be shutdown cleanly (as taking the > cluster below the quorum threshold will immediately cause all remaining > nodes to self-fence). > --><-- > > SBD in shared-nothing mode is basically always in such degraded state > and cannot tolerate loss of quorum. Well, the context here is it loses quorum *expectedly* since the other nodes gracefully shut down. >> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: >> notify_parent: Not notifying parent: state transient (2) >> Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: >> notify_parent: Notifying parent: healthy >> Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: >> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: >> 0) >> >> I can think of the way to manipulate quorum with last_man_standing and >> potentially also auto_tie_breaker, not to mention >> last_man_standing_window would also be a factor... But is there a better >> solution? >> > Lack of cluster wide shutdown mode was mentioned more than once on this > list. I guess the only workaround is to use higher level tools which > basically simply try to stop cluster on all nodes at once. It is still > susceptible to race condition. Gracefully stopping nodes one by one on purpose is still a reasonable need though ... >>> If you do the teardown as e.g. pcs is doing it - first tear down >>> pacemaker-instances and then corosync/sbd - it is at >>> least possible to tear down the pacemaker-instances one-by one >>> without risking a reboot due to quorum-loss. >>> With kind of current sbd having in >>> - >>> https://github.com/ClusterLabs/sbd/commit/824fe834c67fb7bae7feb87607381f9fa8fa2945 >>> - >>> https://github.com/ClusterLabs/sbd/commit/79b778debfee5b4ab2d099b2bfc7385f45597f70 >>> - >>> https://github.com/ClusterLabs/sbd/commit/a716a8ddd3df615009bcff3bd96dd9ae64cb5f68 >>> this should be pretty robust although we are still thinking >>> (probably together with some heartbeat to pacemakerd >>> that assures pacemakerd is checking liveness of sub-daemons >>> properly) of having a cleaner way to detect graceful >>> pacemaker-shutdown. >> These are all good improvements, thanks! >> >> But in this case the remaining node is not shutting down yet, or it's >> intentionally not being shut down :-) Loss of quorum is as expected, so >> is following no-quorum-policy, but self-reset is probably too much? > Hmm ... not sure if I can follow ... > If you shutdown solely pacemaker one-by-one on all nodes > and these shutdowns are considered graceful then you are > not gonna experience any reboots (e.g. 3 node cluster). 
> Afterwards you can shutdown corosync one-by-one as well
> without experiencing reboots as without the cib-connection
> sbd isn't gonna check for quorum anymore (all resources
> down so no need to reboot in case of quorum-loss - extra
> care has to be taken care of with unmanaged resources but
> that isn't particular with sbd).

I meant that if users would like to shut down only 2 out of 3 nodes in the
cluster and keep the last one online and alive, it's simply not possible
for now, although the loss of quorum is expected.

Regards,
   Yan

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
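[Editor's note] The original post in this thread mentioned manipulating quorum with last_man_standing and auto_tie_breaker as a possible workaround for keeping one node alive. For reference, a sketch of those votequorum options only; the values are examples, and the semantics (particularly last_man_standing_window and its interaction with fencing and sbd) should be checked in votequorum(5) before use.

```sh
# Illustrative quorum stanza for /etc/corosync/corosync.conf; not a
# recommendation, just the options named in this thread.
# quorum {
#     provider: corosync_votequorum
#     last_man_standing: 1
#     last_man_standing_window: 10000   # ms before quorum is recalculated
#     auto_tie_breaker: 1
# }
grep -A6 '^quorum' /etc/corosync/corosync.conf   # inspect the current settings
```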
Re: [ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
On 8/12/19 2:30 PM, Yan Gao wrote: > Hi Klaus, > > On 8/12/19 1:39 PM, Klaus Wenninger wrote: >> On 8/9/19 9:06 PM, Yan Gao wrote: >>> On 8/9/19 6:40 PM, Andrei Borzenkov wrote: 09.08.2019 16:34, Yan Gao пишет: > Hi, > > With disk-less sbd, it's fine to stop cluster service from the cluster > nodes all at the same time. > > But if to stop the nodes one by one, for example with a 3-node cluster, > after stopping the 2nd node, the only remaining node resets itself with: > That is sort of documented in SBD manual page: --><-- However, while the cluster is in such a degraded state, it can neither successfully fence nor be shutdown cleanly (as taking the cluster below the quorum threshold will immediately cause all remaining nodes to self-fence). --><-- SBD in shared-nothing mode is basically always in such degraded state and cannot tolerate loss of quorum. >>> Well, the context here is it loses quorum *expectedly* since the other >>> nodes gracefully shut down. >>> > Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: > notify_parent: Not notifying parent: state transient (2) > Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: > notify_parent: Notifying parent: healthy > Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: > Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: > 0) > > I can think of the way to manipulate quorum with last_man_standing and > potentially also auto_tie_breaker, not to mention > last_man_standing_window would also be a factor... But is there a better > solution? > Lack of cluster wide shutdown mode was mentioned more than once on this list. I guess the only workaround is to use higher level tools which basically simply try to stop cluster on all nodes at once. It is still susceptible to race condition. >>> Gracefully stopping nodes one by one on purpose is still a reasonable >>> need though ... >> If you do the teardown as e.g. pcs is doing it - first tear down >> pacemaker-instances and then corosync/sbd - it is at >> least possible to tear down the pacemaker-instances one-by one >> without risking a reboot due to quorum-loss. >> With kind of current sbd having in >> - >> https://github.com/ClusterLabs/sbd/commit/824fe834c67fb7bae7feb87607381f9fa8fa2945 >> - >> https://github.com/ClusterLabs/sbd/commit/79b778debfee5b4ab2d099b2bfc7385f45597f70 >> - >> https://github.com/ClusterLabs/sbd/commit/a716a8ddd3df615009bcff3bd96dd9ae64cb5f68 >> this should be pretty robust although we are still thinking >> (probably together with some heartbeat to pacemakerd >> that assures pacemakerd is checking liveness of sub-daemons >> properly) of having a cleaner way to detect graceful >> pacemaker-shutdown. > These are all good improvements, thanks! > > But in this case the remaining node is not shutting down yet, or it's > intentionally not being shut down :-) Loss of quorum is as expected, so > is following no-quorum-policy, but self-reset is probably too much? Hmm ... not sure if I can follow ... If you shutdown solely pacemaker one-by-one on all nodes and these shutdowns are considered graceful then you are not gonna experience any reboots (e.g. 3 node cluster). Afterwards you can shutdown corosync one-by-one as well without experiencing reboots as without the cib-connection sbd isn't gonna check for quorum anymore (all resources down so no need to reboot in case of quorum-loss - extra care has to be taken care of with unmanaged resources but that isn't particular with sbd). 
Klaus > > Regards, >Yan ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: Gracefully stop nodes one by one with disk-less sbd
On 8/12/19 8:42 AM, Ulrich Windl wrote: > Hi! > > One motivation to stop all nodes at the same time is to avoid needless moving > of resources, like the following: > You stop node A, then resources are stopped on A and started elsewhere > You stop node B, and resources are stopped and moved to remaining nodes > ...until the last node stops, or quorum prevents cluster operation (effect > depends on further settings) This could potentially be achieved by first putting all the nodes into standby mode with an atomic request. crmsh doesn't support this from the "crm node standby" interface so far, but "crm configure edit" definitely can do this. Regards, Yan > > Unfortunately (AFAIK) there's no command to "stop the cluster" yet. > A "stop cluster" command would stop all resources on all nodes, then stop the > nodes (and lower layers) in a way that there is no "quorum lost" or fencing > going on. > > Regards, > Ulrich > Yan Gao wrote on 09.08.2019 at 15:34 in message > : >> Hi, >> >> With disk-less sbd, it's fine to stop cluster service from the cluster >> nodes all at the same time. >> >> But if to stop the nodes one by one, for example with a 3-node cluster, >> after stopping the 2nd node, the only remaining node resets itself with: >> >> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: >> notify_parent: Not notifying parent: state transient (2) >> Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: >> notify_parent: Notifying parent: healthy >> Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: >> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0) >> >> I can think of the way to manipulate quorum with last_man_standing and >> potentially also auto_tie_breaker, not to mention >> last_man_standing_window would also be a factor... But is there a better >> solution? >> >> Thanks, >> Yan >> ___ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
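[Editor's sketch] For illustration, a rough sketch of the "crm configure edit" approach mentioned above, assuming a three-node cluster with placeholder node names. Adding the standby attribute to every node and saving the edit should push the change as a single CIB update:
```
# inside `crm configure edit`, extend each node entry (node names are placeholders):
node node1 \
        attributes standby=on
node node2 \
        attributes standby=on
node node3 \
        attributes standby=on
# saving the edit commits all three standby attributes together
```
Setting standby=off the same way would bring the nodes back once the cluster is started again.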
Re: [ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
Hi Klaus, On 8/12/19 1:39 PM, Klaus Wenninger wrote: > On 8/9/19 9:06 PM, Yan Gao wrote: >> On 8/9/19 6:40 PM, Andrei Borzenkov wrote: >>> 09.08.2019 16:34, Yan Gao пишет: Hi, With disk-less sbd, it's fine to stop cluster service from the cluster nodes all at the same time. But if to stop the nodes one by one, for example with a 3-node cluster, after stopping the 2nd node, the only remaining node resets itself with: >>> That is sort of documented in SBD manual page: >>> >>> --><-- >>> However, while the cluster is in such a degraded state, it can >>> neither successfully fence nor be shutdown cleanly (as taking the >>> cluster below the quorum threshold will immediately cause all remaining >>> nodes to self-fence). >>> --><-- >>> >>> SBD in shared-nothing mode is basically always in such degraded state >>> and cannot tolerate loss of quorum. >> Well, the context here is it loses quorum *expectedly* since the other >> nodes gracefully shut down. >> >>> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: notify_parent: Not notifying parent: state transient (2) Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: notify_parent: Notifying parent: healthy Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0) I can think of the way to manipulate quorum with last_man_standing and potentially also auto_tie_breaker, not to mention last_man_standing_window would also be a factor... But is there a better solution? >>> Lack of cluster wide shutdown mode was mentioned more than once on this >>> list. I guess the only workaround is to use higher level tools which >>> basically simply try to stop cluster on all nodes at once. It is still >>> susceptible to race condition. >> Gracefully stopping nodes one by one on purpose is still a reasonable >> need though ... > If you do the teardown as e.g. pcs is doing it - first tear down > pacemaker-instances and then corosync/sbd - it is at > least possible to tear down the pacemaker-instances one-by one > without risking a reboot due to quorum-loss. > With kind of current sbd having in > - > https://github.com/ClusterLabs/sbd/commit/824fe834c67fb7bae7feb87607381f9fa8fa2945 > - > https://github.com/ClusterLabs/sbd/commit/79b778debfee5b4ab2d099b2bfc7385f45597f70 > - > https://github.com/ClusterLabs/sbd/commit/a716a8ddd3df615009bcff3bd96dd9ae64cb5f68 > this should be pretty robust although we are still thinking > (probably together with some heartbeat to pacemakerd > that assures pacemakerd is checking liveness of sub-daemons > properly) of having a cleaner way to detect graceful > pacemaker-shutdown. These are all good improvements, thanks! But in this case the remaining node is not shutting down yet, or it's intentionally not being shut down :-) Loss of quorum is as expected, so is following no-quorum-policy, but self-reset is probably too much? Regards, Yan ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
On 8/9/19 9:06 PM, Yan Gao wrote: > On 8/9/19 6:40 PM, Andrei Borzenkov wrote: >> 09.08.2019 16:34, Yan Gao пишет: >>> Hi, >>> >>> With disk-less sbd, it's fine to stop cluster service from the cluster >>> nodes all at the same time. >>> >>> But if to stop the nodes one by one, for example with a 3-node cluster, >>> after stopping the 2nd node, the only remaining node resets itself with: >>> >> That is sort of documented in SBD manual page: >> >> --><-- >> However, while the cluster is in such a degraded state, it can >> neither successfully fence nor be shutdown cleanly (as taking the >> cluster below the quorum threshold will immediately cause all remaining >> nodes to self-fence). >> --><-- >> >> SBD in shared-nothing mode is basically always in such degraded state >> and cannot tolerate loss of quorum. > Well, the context here is it loses quorum *expectedly* since the other > nodes gracefully shut down. > >> >> >>> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: >>> notify_parent: Not notifying parent: state transient (2) >>> Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: >>> notify_parent: Notifying parent: healthy >>> Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: >>> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0) >>> >>> I can think of the way to manipulate quorum with last_man_standing and >>> potentially also auto_tie_breaker, not to mention >>> last_man_standing_window would also be a factor... But is there a better >>> solution? >>> >> Lack of cluster wide shutdown mode was mentioned more than once on this >> list. I guess the only workaround is to use higher level tools which >> basically simply try to stop cluster on all nodes at once. It is still >> susceptible to race condition. > Gracefully stopping nodes one by one on purpose is still a reasonable > need though ... If you do the teardown as e.g. pcs is doing it - first tear down pacemaker-instances and then corosync/sbd - it is at least possible to tear down the pacemaker-instances one-by one without risking a reboot due to quorum-loss. With kind of current sbd having in - https://github.com/ClusterLabs/sbd/commit/824fe834c67fb7bae7feb87607381f9fa8fa2945 - https://github.com/ClusterLabs/sbd/commit/79b778debfee5b4ab2d099b2bfc7385f45597f70 - https://github.com/ClusterLabs/sbd/commit/a716a8ddd3df615009bcff3bd96dd9ae64cb5f68 this should be pretty robust although we are still thinking (probably together with some heartbeat to pacemakerd that assures pacemakerd is checking liveness of sub-daemons properly) of having a cleaner way to detect graceful pacemaker-shutdown. Klaus > > Regards, >Yan > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
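[Editor's sketch] As a point of reference for the teardown order described above (pacemaker first, then corosync/sbd), pcs exposes it directly; a minimal sketch using only documented commands:
```
# stops pacemaker first, then corosync on the local node
pcs cluster stop
# attempts the same on every node at once (the "higher level tool" workaround)
pcs cluster stop --all
```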
[ClusterLabs] Antw: Re: Antw: Re: Gracefully stop nodes one by one with disk-less sbd
>>> Roger Zhou wrote on 12.08.2019 at 10:55 in message <7249e013-1256-675a-3cea-3572f4615...@suse.com>: > On 8/12/19 2:48 PM, Ulrich Windl wrote: > Andrei Borzenkov wrote on 09.08.2019 at 18:40 in >> message <217d10d8-022c-eaf6-28ae-a4f58b2f9...@gmail.com>: >>> On 09.08.2019 16:34, Yan Gao wrote: > > [...] > >>> >>> Lack of cluster wide shutdown mode was mentioned more than once on this >>> list. I guess the only workaround is to use higher level tools which >>> basically simply try to stop cluster on all nodes at once. > > I can think of using ssh/pssh to the involved nodes to stop the diskless SBD > daemons. However, SBD is not able to be torn down on its own. It is > deeply tied up with pacemaker and corosync and has to be stopped all > together. Or, to hack the SBD dependency otherwise. > >>> It is still >>> susceptible to race condition. >> >> Are there any concrete plans to implement a clean solution? >> > > I can think of Yet Another Feature to disable diskless SBD on purpose, > e.g. to let SBD understand "stonith-enabled=false" cluster-wide. Hi! I imagine that some new mechanism would be needed to have non-persistent or self-resetting attribute changes in the CIB: For example if you do a "resource restart" and the node where the command runs is fenced during the "stop" phase, the resource remains stopped until started manually. This is because the "restart" is implemented as a sequential non-atomic "stop, then start". Similarly for a "cluster stop": There is an attribute "stop-all-resources" (AFAIR). A "cluster stop" could temporarily set this to get all resources on all nodes stopped. Then the pacemakers and corosyncs and sbds should stop. On restart each node should start up normally... BTW: HP-UX ServiceGuard had not only a command to stop the cluster, but also one to start the cluster: I imagine that it could play nice with pacemaker as well: The command would first start all the SBDs, corosyncs, and pacemakers, and once the DC is selected, resources would start without needless shuffling (migration) of resources between nodes joining the cluster. Regards, Ulrich > > > Cheers, > Roger ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
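[Editor's sketch] The stop-all-resources property referred to above is a real cluster option; here is a hedged sketch of how the temporary "stop everything" step could be scripted today (the surrounding workflow is the speculative part, not the property):
```
# tell pacemaker to stop every resource cluster-wide
crm_attribute --type crm_config --name stop-all-resources --update true
# ...wait until crm_mon shows all resources stopped, then stop
# pacemaker/corosync/sbd on the nodes; on the next cluster start, clear the flag:
crm_attribute --type crm_config --name stop-all-resources --delete
```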
[ClusterLabs] Querying failed resource operations from the CIB
Hi! Back in December 2011 I had written a script to retrieve all failed resource operations by using "cibadmin -Q -o lrm_resources" as the data source. I was querying lrm_rsc_op for op-status != 0. In a newer release this does not seem to work anymore. I see resource IDs ending with "_last_0", "_monitor_6", and "_last_failure_0", but even in the "_last_failure_0" the op-status is "0" (rc-code="7"). Is this some bug, or is it a feature? That is: When will op-status be != 0? crm_mon still reports a resource failure like this: Failed Resource Actions: * prm_nfs_server_monitor_6 on h11 'not running' (7): call=738, status=complete, exitreason='', last-rc-change='Mon Aug 12 04:52:23 2019', queued=0ms, exec=0ms (it seems the nfs server monitor does this under load in SLES12 SP4, and I wonder where to look for the reason) BTW: "lrm_resources" is not documented, and the structure seems to change. Can I restrict the output to LRM data? Regards, Ulrich ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
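[Editor's sketch] One way to narrow the query, sketched here under the assumption that the installed cibadmin supports --scope/--xpath (recent Pacemaker releases do); the recorded failures carry the "_last_failure_" suffix mentioned above:
```
# dump only the status section, which holds the lrm_resources/lrm_rsc_op data
cibadmin --query --scope status
# or pull just the recorded failure entries via XPath
cibadmin --query --xpath "//lrm_rsc_op[contains(@id,'_last_failure_')]"
```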
Re: [ClusterLabs] Antw: Re: Gracefully stop nodes one by one with disk-less sbd
On 8/12/19 2:48 PM, Ulrich Windl wrote: Andrei Borzenkov wrote on 09.08.2019 at 18:40 in > message <217d10d8-022c-eaf6-28ae-a4f58b2f9...@gmail.com>: >> On 09.08.2019 16:34, Yan Gao wrote: [...] >> >> Lack of cluster wide shutdown mode was mentioned more than once on this >> list. I guess the only workaround is to use higher level tools which >> basically simply try to stop cluster on all nodes at once. I can think of using ssh/pssh to the involved nodes to stop the diskless SBD daemons. However, SBD is not able to be torn down on its own. It is deeply tied up with pacemaker and corosync and has to be stopped all together. Or, to hack the SBD dependency otherwise. >> It is still >> susceptible to race condition. > > Are there any concrete plans to implement a clean solution? > I can think of Yet Another Feature to disable diskless SBD on purpose, e.g. to let SBD understand "stonith-enabled=false" cluster-wide. Cheers, Roger ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
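[Editor's sketch] A minimal sketch of the ssh/pssh idea, assuming passwordless ssh and a hypothetical nodes.txt listing all cluster nodes; since sbd cannot be stopped alone, the whole stack is stopped in order on every node:
```
# run the teardown in parallel on all nodes (nodes.txt is a placeholder)
pssh -h nodes.txt -i "systemctl stop pacemaker"
pssh -h nodes.txt -i "systemctl stop corosync"   # on typical setups sbd stops together with corosync
```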
Re: [ClusterLabs] Strange lost quorum with qdevice
Andrei Borzenkov wrote: Sent from iPhone On 12 Aug 2019, at 8:46, Jan Friesse wrote: Олег Самойлов wrote: On 9 Aug 2019, at 9:25, Jan Friesse wrote: Please do not set dpd_interval that high. dpd_interval on the qnetd side is not about how often the ping is sent. Could you please retry your test with dpd_interval=1000? I'm pretty sure it will work then. Honza Yep. As far as I understand, dpd_interval of qnetd and timeout and sync_timeout of qdevice are somehow linked. By default they are dpd_interval=10, timeout=10, sync_timeout=30. And you advised to change them proportionally. Yes, timeout and sync_timeout should be changed proportionally. dpd_interval is a different story. https://github.com/ClusterLabs/sbd/pull/76#issuecomment-486952369 But the mechanics of how they depend on each other are mysterious and not documented. Let me try to bring some light in there: - dpd_interval is a qnetd variable controlling how often qnetd walks through the list of all clients (qdevices) and checks the timestamp of the last sent message. If the difference between the current timestamp and the last-sent-message timestamp is larger than 2 * the timeout sent by the client, then the client is considered dead. - interval - affects how often qdevice sends a heartbeat to corosync about its liveness (half of the interval) and also how often it sends a heartbeat to qnetd (0.8 * interval). On the corosync side this is used as a timeout after which the qdevice daemon is considered dead and its votes are no longer valid. - sync_timeout - Not used by qdevice/qnetd. Used by corosync during the sync phase. If corosync doesn't get a reply from qdevice within this timeout, it considers the qdevice daemon dead and continues the sync process. Looking at the logs at the beginning of this thread as well as the logs in the linked github issue, it appears that corosync does not do anything during sync_timeout, in particular does *not* ask qdevice, and qdevice does not ask qnetd. corosync is waiting for qdevice to call the votequorum_qdevice_poll function. qdevice asks qnetd (how else could it get the vote?). I rechecked the test with the 20-60 combination. I get the same problem on the 16th failure simulation. qnetd returned the vote in exactly the second qdevice expected it, just slightly too late. So the node lost quorum, got the vote slightly later, but didn't regain quorum, maybe due to the 'wait for all' option. That matches the above observation. As soon as corosync is unfrozen, it asks qnetd which returns its vote. Actually, if you take a look at the log, it is evident that qdevice asked qnetd for a vote, but the result was to wait for the reply, because qnetd couldn't give a proper answer until it got complete information from all nodes. So I still do not understand what is supposed to happen during sync_timeout and whether the observed behavior is intentional. So far it looks just like an artificial delay. Not at all. It's just a bug in the timeouts. I have a proper fix in mind, but it's not just about lowering limits. Honza I retried the default 10-30 combination. I got the same problem on the first failure simulation. qnetd sent the vote 1 second later than expected. The combination 1-3 (dpd_interval=1, timeout=1, sync_timeout=3): the same problem on the 11th failure simulation. qnetd returned the vote in exactly the second qdevice expected it, just slightly too late. So the node lost quorum, got the vote slightly later, but didn't regain quorum, maybe due to the 'wait for all' option. And the node is watchdogged later due to lack of quorum. It was probably not evident from my reply, but what I meant was to change just dpd_interval. 
Could you please recheck with dpd_interval=1, timeout=20, sync_timeout=60? Honza So, my conclusions: 1. IMHO maybe this bug depends not on the absolute value of dpd_interval but on the proportion between the dpd_interval of qnetd and the timeout and sync_timeout of qdevice. Because of how these options interact, I cannot predict how to change them to work around this behaviour. 2. IMHO "wait for all" is also buggy. According to the documentation it should fire only at cluster start, but it looks like it fires every time quorum (or all votes) is lost. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
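[Editor's sketch] For orientation, a hedged sketch of where these knobs live: timeout and sync_timeout are per-node settings under quorum.device in corosync.conf, while dpd_interval is an advanced setting of corosync-qnetd on the arbitrator host (all values below are purely illustrative, the host name is a placeholder):
```
# /etc/corosync/corosync.conf on the cluster nodes (illustrative values)
quorum {
    provider: corosync_votequorum
    device {
        model: net
        timeout: 20
        sync_timeout: 60
        net {
            host: qnetd.example.com
            algorithm: ffsplit
        }
    }
}
# on the qnetd host, dpd_interval is passed to corosync-qnetd as an advanced
# setting, e.g. (assumption -- check corosync-qnetd(8) for the exact syntax):
#   corosync-qnetd -S dpd_interval=1000
```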
Re: [ClusterLabs] Increasing fence timeout
You should be able to increase this timeout by running: pcs stonith update <stonith id> shell_timeout=10 Oyvind On 08/08/19 12:13 -0600, Casey & Gina wrote: Hi, I'm currently running into periodic premature killing of nodes due to the fence monitor timeout being set to 5 seconds. Here is an example message from the logs: fence_vmware_rest[22334] stderr: [ Exception: Operation timed out after 5001 milliseconds with 0 bytes received ] How can I increase this timeout using PCS? Thank you, -- Casey ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
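[Editor's sketch] For example, with a hypothetical fence device named vmware_fence (substitute the stonith id shown by `pcs stonith`), the update and a quick verification might look like:
```
# device name is a placeholder -- use the stonith id from your configuration
pcs stonith update vmware_fence shell_timeout=10
pcs stonith show vmware_fence
```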
Re: [ClusterLabs] Antw: Re: Gracefully stop nodes one by one with disk-less sbd
Sent from iPhone On 12 Aug 2019, at 9:48, Ulrich Windl wrote: Andrei Borzenkov wrote on 09.08.2019 at 18:40 in > message <217d10d8-022c-eaf6-28ae-a4f58b2f9...@gmail.com>: >> On 09.08.2019 16:34, Yan Gao wrote: >>> Hi, >>> >>> With disk-less sbd, it's fine to stop cluster service from the cluster >>> nodes all at the same time. >>> >>> But if to stop the nodes one by one, for example with a 3-node cluster, >>> after stopping the 2nd node, the only remaining node resets itself with: >>> >> >> That is sort of documented in SBD manual page: >> >> --><-- >> However, while the cluster is in such a degraded state, it can >> neither successfully fence nor be shutdown cleanly (as taking the >> cluster below the quorum threshold will immediately cause all remaining >> nodes to self-fence). >> --><-- >> >> SBD in shared-nothing mode is basically always in such degraded state >> and cannot tolerate loss of quorum. > > So with a shared device it's different? Yes, as long as the shared device is accessible. > I was wondering whether > "no-quorum-policy=freeze" would still work with the recent sbd... > It will with a shared device. >> >> >> >>> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:debug: >>> notify_parent: Not notifying parent: state transient (2) >>> Aug 09 14:30:20 opensuse150-1 sbd[1080]:cluster:debug: >>> notify_parent: Notifying parent: healthy >>> Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child: >>> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: > 0) >>> >>> I can think of the way to manipulate quorum with last_man_standing and >>> potentially also auto_tie_breaker, not to mention >>> last_man_standing_window would also be a factor... But is there a better >>> solution? >>> >> >> Lack of cluster wide shutdown mode was mentioned more than once on this >> list. I guess the only workaround is to use higher level tools which >> basically simply try to stop cluster on all nodes at once. It is still >> susceptible to race condition. > > Are there any concrete plans to implement a clean solution? > >> ___ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
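[Editor's sketch] For completeness, a minimal sketch of how the no-quorum-policy discussed above is set, with either management tool; the property itself is standard, while the appropriate value depends on the fencing setup:
```
# crmsh variant
crm configure property no-quorum-policy=freeze
# pcs variant
pcs property set no-quorum-policy=freeze
```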