[ClusterLabs] Multiple processes appending to the same log file questions (Was: Pacemaker detail log directory permissions)

2019-04-30 Thread Jan Pokorný
[let's move this to developers@cl.o, please drop users on response unless you are only subscribed there, I tend to only respond to the lists] On 30/04/19 13:55 +0200, Jan Pokorný wrote: > On 30/04/19 07:55 +0200, Ulrich Windl wrote: > Jan Pokorný wrote on 29.04.2019 at 17:22 > in

Re: [ClusterLabs] Timeout stopping corosync-qdevice service

2019-04-30 Thread Andrei Borzenkov
30.04.2019 9:51, Jan Friesse wrote: > >> Now, corosync-qdevice gets SIGTERM as "signal to terminate", but it >> installs SIGTERM handler that does not exit and only closes some socket. >> Maybe this should trigger termination of main loop, but somehow it does >> not. > > Yep, this is exactly

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Олег Самойлов
> On 30 Apr 2019, at 19:38, Andrei Borzenkov wrote: > > 30.04.2019 19:34, Олег Самойлов wrote: >> >>> No. I simply want reliable way to shutdown the whole cluster (for >>> maintenance). >> >> Official way is `pcs cluster stop --all`. > > pcs is just one of multiple high level tools. I

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Andrei Borzenkov
30.04.2019 19:34, Олег Самойлов wrote: > >> No. I simply want reliable way to shutdown the whole cluster (for >> maintenance). > > Official way is `pcs cluster stop --all`. pcs is just one of multiple high level tools. I am interested in plumbing, not porcelain. > But it’s not always worked as

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Олег Самойлов
> No. I simply want reliable way to shutdown the whole cluster (for > maintenance). Official way is `pcs cluster stop --all`. But it’s not always worked as expected for me.

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Andrei Borzenkov
30.04.2019 13:43, Олег Самойлов wrote: > Maybe you will be interested in `allow_downscale: 1` option > > https://www.systutorials.com/docs/linux/man/5-votequorum/ > Apart from "THIS FEATURE IS INCOMPLETE AND CURRENTLY UNSUPPORTED"? :) Not really, the question is not about dynamic cluster

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Andrei Borzenkov
30.04.2019 9:53, Digimer wrote: > On 2019-04-30 12:07 a.m., Andrei Borzenkov wrote: >> As soon as majority of nodes are stopped, the remaining nodes are out of >> quorum and watchdog reboot kicks in. >> >> What is the correct procedure to ensure nodes are stopped in clean way? >> Short of

[ClusterLabs] corosync.conf: A fatal syntax error that's not detected

2019-04-30 Thread Ulrich Windl
Hi! Trying to upgrade one corosync 1 cluster (SLES11 SP4) to corosync 2 (SLES12 SP4) resulted in a two-node cluster that happily fences each node, and little else. A first investigation indicated that I simply placed the "transport: udpu" line within each "interface" instead of globally. It

Re: [ClusterLabs] Corosync unable to reach consensus for membership

2019-04-30 Thread Jan Friesse
Prasad, Hello : I have a 3 node corosync and pacemaker cluster and the nodes are: Online: [ SG-azfw2-189 SG-azfw2-190 SG-azfw2-191 ] Full list of resources: Master/Slave Set: ms_mysql [p_mysql] Masters: [ SG-azfw2-189 ] Slaves: [ SG-azfw2-190 SG-azfw2-191 ] For my network

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-30 Thread Jan Pokorný
On 30/04/19 07:55 +0200, Ulrich Windl wrote: Jan Pokorný wrote on 29.04.2019 at 17:22 in message <20190429152200.ga19...@redhat.com>: >> On 29/04/19 14:58 +0200, Jan Pokorný wrote: >>> On 29/04/19 08:20 +0200, Ulrich Windl wrote: >>> Jan Pokorný wrote on 25.04.2019 at 18:49

[ClusterLabs] Corosync unable to reach consensus for membership

2019-04-30 Thread Prasad Nagaraj
Hello : I have a 3 node corosync and pacemaker cluster and the nodes are: Online: [ SG-azfw2-189 SG-azfw2-190 SG-azfw2-191 ] Full list of resources: Master/Slave Set: ms_mysql [p_mysql] Masters: [ SG-azfw2-189 ] Slaves: [ SG-azfw2-190 SG-azfw2-191 ] For my network partition test, I

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Олег Самойлов
Maybe you will be interested in `allow_downscale: 1` option https://www.systutorials.com/docs/linux/man/5-votequorum/ > On 30 Apr 2019, at 7:07, Andrei Borzenkov wrote: > > As soon as majority of nodes are stopped, the remaining nodes are out of > quorum and watchdog reboot kicks in. >

Re: [ClusterLabs] How to correctly stop cluster with active stonith watchdog?

2019-04-30 Thread Digimer
On 2019-04-30 12:07 a.m., Andrei Borzenkov wrote: > As soon as majority of nodes are stopped, the remaining nodes are out of > quorum and watchdog reboot kicks in. > > What is the correct procedure to ensure nodes are stopped in clean way? > Short of disabling stonith-watchdog-timeout before

Re: [ClusterLabs] Timeout stopping corosync-qdevice service

2019-04-30 Thread Jan Friesse
Andrei, 29.04.2019 14:32, Jan Friesse wrote: Andrei, I setup qdevice in openSUSE Tumbleweed and while it works as expected I Is it corosync-qdevice or corosync-qnetd daemon? corosync-qdevice cannot stop it - it always results in timeout and service finally gets killed by systemd. Is