Classified as: {OPEN}
[~]$ systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Thu 2024-04-18 14:58:42 UTC; 53min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 2027251 ExecStop=/usr/sbin/corosync-cfgtool -H --force (code=exited, status=0/SUCCESS)
  Process: 1324906 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=killed, signal=KILL)
 Main PID: 1324906 (code=killed, signal=KILL)

Apr 18 13:16:04 - corosync[1324906]: [QUORUM] Sync joined[1]: 1
Apr 18 13:16:04 - corosync[1324906]: [TOTEM ] A new membership (1.1c8) was formed. Members joined: 1
Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Apr 18 13:16:04 - corosync[1324906]: [QUORUM] Members[1]: 1
Apr 18 13:16:04 - corosync[1324906]: [MAIN ] Completed service synchronization, ready to provide service.
Apr 18 13:16:04 - systemd[1]: Started Corosync Cluster Engine.
Apr 18 14:58:42 - systemd[1]: corosync.service: Main process exited, code=killed, status=9/KILL
Apr 18 14:58:42 - systemd[1]: corosync.service: Failed with result 'signal'.
[~]$

From: Klaus Wenninger <kwenn...@redhat.com>
Sent: Thursday, 18 April 2024 17:43
To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
Cc: Ken Gaillot <kgail...@redhat.com>; NOLIBOS Christophe <christophe.noli...@thalesgroup.com>
Subject: Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

On Thu, Apr 18, 2024 at 5:07 PM NOLIBOS Christophe via Users <users@clusterlabs.org> wrote:

Classified as: {OPEN}

I'm using Red Hat 8.8 (4.18.0-477.21.1.el8_8.x86_64).
When I kill Corosync, no new corosync process is created and Pacemaker is left in a failed state.
The only solution is to restart the pacemaker service.

[~]$ pcs status
Error: unable to get cib
[~]$
[~]$systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2024-04-18 13:16:04 UTC; 1h 43min ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/
 Main PID: 1324923 (pacemakerd)
    Tasks: 91
   Memory: 132.1M
   CGroup: /system.slice/pacemaker.service
           ...
Apr 18 14:59:02 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:03 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:04 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:05 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:06 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:07 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:08 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:09 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:10 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
Apr 18 14:59:11 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY
[~]$

Well, if corosync isn't there, then this is to be expected, and pacemaker won't recover corosync.
Can you check what systemd thinks about corosync (status/journal)?

Klaus

{OPEN}

-----Original Message-----
From: Ken Gaillot <kgail...@redhat.com>
Sent: Thursday, 18 April 2024 16:40
To: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
Cc: NOLIBOS Christophe <christophe.noli...@thalesgroup.com>
Subject: Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

What OS are you using? Does it use systemd?

What happens when you kill Corosync?

On Thu, 2024-04-18 at 13:13 +0000, NOLIBOS Christophe via Users wrote:
> Classified as: {OPEN}
>
> Dear All,
>
> I have a question about the "pacemakerd: recover properly from
> Corosync crash" fix implemented in version 2.1.2.
> I observed the issue when testing pacemaker version 2.0.5, just
> by killing the ‘corosync’ process: Corosync was not recovered.
>
> I am now using pacemaker version 2.1.5-8.
> Doing the same test, I get the same result: Corosync is still not
> recovered.
>
> Please confirm that the "pacemakerd: recover properly from Corosync crash"
> fix implemented in version 2.1.2 covers this scenario.
> If it does, did I miss something in the configuration of my cluster?
>
> Best regards,
>
> Christophe.
>
> {OPEN}
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
--
Ken Gaillot <kgail...@redhat.com>

{OPEN}
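Klaus's request to check what systemd thinks about corosync can also be scripted. The following is a minimal Python sketch, not something posted in the thread: the helper names (`parse_show_output`, `unit_properties`, `killed_by_signal`) are hypothetical, and it assumes `systemctl show` prints one `KEY=VALUE` pair per line for the requested properties. It only reports systemd's recorded state; it does not restart anything.

```python
import subprocess

# Standard systemd unit/service properties to query.
PROPERTIES = ("ActiveState", "SubState", "Result", "ExecMainStatus")


def parse_show_output(text):
    """Parse KEY=VALUE lines (the format printed by `systemctl show`) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank lines or lines without '='
            props[key.strip()] = value.strip()
    return props


def unit_properties(unit):
    """Ask systemd for the selected properties of a unit (needs systemctl on PATH)."""
    cmd = ["systemctl", "show", unit, "--property=" + ",".join(PROPERTIES)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_show_output(result.stdout)


def killed_by_signal(props):
    """True when systemd recorded the unit as failed with result 'signal'."""
    return props.get("ActiveState") == "failed" and props.get("Result") == "signal"
```

On the affected node, `killed_by_signal(unit_properties("corosync"))` would be expected to return True for a unit in the state reported above (`Active: failed (Result: signal)`), which indicates that systemd saw the kill and did not start corosync again on its own.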