Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 21:49 +0200, Lars Ellenberg wrote:
> On Mon, Oct 16, 2017 at 09:20:52PM +0200, Lentes, Bernd wrote:
> > - On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> > > On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> > > > I have the following behavior: I put a node in

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Digimer
On 2017-10-16 03:20 PM, Lentes, Bernd wrote:
>
> - On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
>
>> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> I have the following behavior: I put a node in maintenance mode, afterwards stop
>>> corosync on that node

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd
- On Oct 16, 2017, at 7:37 PM, emmanuel segura emi2f...@gmail.com wrote:
> I put a node in maintenance mode?
> do you mean you put the cluster in maintenance mode

I did "crm node maintenance ". From my understanding, that means I put the node in maintenance mode. Bernd
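
For reference, a minimal sketch of the two commands being contrasted here (crmsh syntax assumed; "node1" is a placeholder node name):

    # per-node maintenance, as Bernd describes:
    crm node maintenance node1

    # cluster-wide maintenance, as emmanuel seems to mean, is a property:
    crm configure property maintenance-mode=true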

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd
- On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>> Hi,
>>
>> I have the following behavior: I put a node in maintenance mode, afterwards stop
>> corosync on that node with /etc/init.d/openais stop.
>> This node is

Re: [ClusterLabs] Regression in Filesystem RA

2017-10-16 Thread Lars Ellenberg
On Mon, Oct 16, 2017 at 08:09:21PM +0200, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Oct 12, 2017 at 03:30:30PM +0900, Christian Balzer wrote:
> >
> > Hello,
> >
> > 2nd post in 10 years, let's see if this one gets an answer unlike the first
> > one...

Do you want to make me check for the old

Re: [ClusterLabs] Regression in Filesystem RA

2017-10-16 Thread Dejan Muhamedagic
Hi,

On Thu, Oct 12, 2017 at 03:30:30PM +0900, Christian Balzer wrote:
>
> Hello,
>
> 2nd post in 10 years, let's see if this one gets an answer unlike the first
> one...
>
> One of the main use cases for pacemaker here is DRBD replicated
> active/active mailbox servers (dovecot/exim) on Debian
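
The setup described would typically pair a DRBD-backed device with the Filesystem RA under discussion. A minimal sketch of such a primitive (crmsh syntax assumed; device, mount point, and fstype are placeholders):

    crm configure primitive p_fs_mail ocf:heartbeat:Filesystem \
        params device=/dev/drbd0 directory=/srv/mail fstype=ext4 \
        op monitor interval=20s timeout=40s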

Re: [ClusterLabs] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Lars Ellenberg
On Tue, Sep 26, 2017 at 07:17:15AM +0000, Eric Robinson wrote:
> > I don't know the tool, but isn't the expectation a bit high that the tool
> > will trim the correct blocks through drbd->LVM/mdadm->device? Why not use
> > the tool on the affected devices directly?
>
> I did, and the
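
The thread does not show which trim tool was involved; as a stand-in, the standard way to check discard support down a stacked device and to trim through the mounted filesystem is (device and mount point are placeholders):

    # show discard granularity/support for each layer of the stack
    lsblk -D /dev/drbd0

    # trim free space through the mounted filesystem
    fstrim -v /srv/data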

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread emmanuel segura
I put a node in maintenance mode? Do you mean you put the cluster in maintenance mode?

2017-10-16 19:24 GMT+02:00 Lentes, Bernd:
> Hi,
>
> I have the following behavior: I put a node in maintenance mode,
> afterwards stop corosync on that node with

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Digimer
On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> Hi,
>
> I have the following behavior: I put a node in maintenance mode, afterwards
> stop corosync on that node with /etc/init.d/openais stop.
> This node is immediately fenced. Is that expected behavior? I thought
> putting a node into

[ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Lentes, Bernd
Hi,

I have the following behavior: I put a node in maintenance mode, afterwards stop corosync on that node with /etc/init.d/openais stop. This node is immediately fenced. Is that expected behavior? I thought putting a node into maintenance means the cluster does not care anymore about that
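
The sequence in question, as a sketch (run on the node to be serviced; "node1" is a placeholder, and the init script path is as given above):

    crm node maintenance node1      # resources on node1 become unmanaged
    /etc/init.d/openais stop        # node1 drops out of the membership

    # Plausible reading of the observed fencing: maintenance mode only stops
    # resource management; a node that leaves the membership without a clean
    # Pacemaker shutdown still looks like a failed node and gets fenced.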

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote:
> Hi,
>
> I have a cluster with two ocf:heartbeat:anything resources, each one
> running as a clone on all nodes of the cluster. For some reason, when
> one of them fails to start, the other one stops. There is no
> constraint configured or

[ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Gerard Garcia
Hi, I have a cluster with two ocf:heartbeat:anything resources, each one running as a clone on all nodes of the cluster. For some reason, when one of them fails to start, the other one stops. There is no constraint configured or any kind of relation between them. Is it possible that there is
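
A sketch of the kind of configuration described (crmsh syntax assumed; resource names and binfile paths are placeholders):

    crm configure primitive p_svc1 ocf:heartbeat:anything \
        params binfile=/usr/local/bin/svc1 op monitor interval=10s
    crm configure primitive p_svc2 ocf:heartbeat:anything \
        params binfile=/usr/local/bin/svc2 op monitor interval=10s
    crm configure clone cl_svc1 p_svc1
    crm configure clone cl_svc2 p_svc2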

Re: [ClusterLabs] corosync race condition when node leaves immediately after joining

2017-10-16 Thread Jonathan Davies
On 16/10/17 15:58, Jan Friesse wrote:
> Jonathan,
>
> On 13/10/17 17:24, Jan Friesse wrote:
>> I've done a bit of digging and am getting closer to the root cause of
>> the race. We rely on having votequorum_sync_init called twice -- once
>> when node 1 joins (with member_list_entries=2) and once when

Re: [ClusterLabs] corosync race condition when node leaves immediately after joining

2017-10-16 Thread Jan Friesse
Jonathan,

On 13/10/17 17:24, Jan Friesse wrote:
> I've done a bit of digging and am getting closer to the root cause of the
> race. We rely on having votequorum_sync_init called twice -- once when
> node 1 joins (with member_list_entries=2) and once when node 1 leaves (with

[ClusterLabs] Pacemaker 1.1.18 Release Candidate 2

2017-10-16 Thread Ken Gaillot
The second release candidate for Pacemaker version 1.1.18 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.18-rc2

This release fixes a few minor regressions introduced in rc1, plus a few long-standing minor bugs. For details, see the ChangeLog. Any
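
To build the release candidate from the tag above (a sketch; build dependencies and configure options omitted):

    git clone https://github.com/ClusterLabs/pacemaker.git
    cd pacemaker
    git checkout Pacemaker-1.1.18-rc2
    ./autogen.sh && ./configure && make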

Re: [ClusterLabs] corosync race condition when node leaves immediately after joining

2017-10-16 Thread Jonathan Davies
On 13/10/17 17:24, Jan Friesse wrote:
> I've done a bit of digging and am getting closer to the root cause of the
> race. We rely on having votequorum_sync_init called twice -- once when node 1
> joins (with member_list_entries=2) and once when node 1 leaves (with
> member_list_entries=1). This is
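
The race under discussion is triggered when a node joins the membership and leaves again almost immediately. A sketch of how one might exercise it (service management varies by distribution; quorum state can be watched from a surviving node):

    # on the test node: join and leave in quick succession
    systemctl start corosync && systemctl stop corosync

    # on another node: observe membership and quorum transitions
    corosync-quorumtool -s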

Re: [ClusterLabs] Mysql upgrade in DRBD setup

2017-10-16 Thread Attila Megyeri
Hi Ken,

My problem with the scenario you described is the following: on the central side, if I use M-S replication, the master binlog information will be different on the master and the slave. Therefore, if a failover occurs, remote sites will have difficulties with the "change master"
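
For context, the "change master" step a remote site would have to run after a failover looks roughly like this (host, credentials, and binlog coordinates are placeholders; with classic binlog-position replication those coordinates differ between old and new master, which is exactly the difficulty described):

    mysql -e "STOP SLAVE;
        CHANGE MASTER TO MASTER_HOST='new-master.example.com',
            MASTER_USER='repl', MASTER_PASSWORD='secret',
            MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4;
        START SLAVE;"

GTID-based replication (CHANGE MASTER TO ... MASTER_AUTO_POSITION=1) avoids per-server coordinates and is one common way around this.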