[ClusterLabs] Pacemaker 2.1.7 final release now available

2023-12-19 Thread Ken Gaillot
Hi all,
Source code for Pacemaker version 2.1.7 is available at:
https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7
This is primarily a bug fix release. See the ChangeLog or the link above for details.
Many thanks to all contributors of source code to this release, including

Re: [ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Ken Gaillot
Correct. You want to enable pcsd to start at boot. Also, after starting pcsd the first time on a node, authorize it from the first node with "pcs host auth -u hacluster".
On Tue, 2023-12-19 at 22:42 +0200, Tiaan Wessels wrote:
> So I run the pcs add command for every new node on the first

Re: [ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Tiaan Wessels
So I run the pcs add command for every new node on the first original node, not on the node being added? Only corosync, pacemaker and pcsd need to run on the node to be added, and the commands being run on the original node will speak to these on the new node?
On Tue, 19 Dec 2023, 21:39 Ken

Re: [ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Ken Gaillot
On Tue, 2023-12-19 at 17:03 +0200, Tiaan Wessels wrote:
> Hi,
> Is it possible to build a corosync pacemaker cluster on redhat9 one
> node at a time? In other words, when I'm finished with the first node
> and reboot it, all services are started on it. Then I build a second
> node to integrate

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Andrei Borzenkov
On 19.12.2023 21:42, Artem wrote:
> Andrei and Klaus thanks for prompt reply and clarification!
> As I understand, design and behavior of Pacemaker is tightly coupled with the stonith concept. But isn't it too rigid?
If you insist on shooting yourself in the foot, pacemaker gives you the gun. It

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Vladislav Bogdanov
What if a node (especially a VM) freezes for several minutes and then continues to write to a shared disk where other nodes have already put their data? In my opinion, fencing, preferably two-level, is mandatory for Lustre. Trust me, I've developed a whole HA stack for both Exascaler and PangeaFS. We've

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Artem
Andrei and Klaus, thanks for the prompt reply and clarification! As I understand it, the design and behavior of Pacemaker are tightly coupled with the stonith concept. But isn't it too rigid? Is there a way to leverage self-monitoring or pingd rules to trigger an isolated node to umount its FS? Like vSphere High

[ClusterLabs] colocate Redis - weird

2023-12-19 Thread lejeczek via Users
hi guys,
Is this below not the weirdest thing?
-> $ pcs constraint ref PGSQL-PAF-5435
Resource: PGSQL-PAF-5435
  colocation-HA-10-1-1-84-PGSQL-PAF-5435-clone-INFINITY
  colocation-REDIS-6385-clone-PGSQL-PAF-5435-clone-INFINITY
  order-PGSQL-PAF-5435-clone-HA-10-1-1-84-Mandatory

[ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Tiaan Wessels
Hi, Is it possible to build a corosync pacemaker cluster on redhat9 one node at a time? In other words, when I'm finished with the first node and reboot it, all services are started on it. Then I build a second node to integrate into the cluster and, once done, pcs status shows two nodes online?

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Klaus Wenninger
On Tue, Dec 19, 2023 at 10:00 AM Andrei Borzenkov wrote:
> On Tue, Dec 19, 2023 at 10:41 AM Artem wrote:
> ...
> > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] (update_resource_action_runnable) warning: OST4_stop_0 on lustre4 is unrunnable (node is offline)
> > Dec

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Andrei Borzenkov
On Tue, Dec 19, 2023 at 10:41 AM Artem wrote:
...
> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] (update_resource_action_runnable) warning: OST4_stop_0 on lustre4 is unrunnable (node is offline)
> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107]