Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-03 Thread Andrei Borzenkov
On 01.03.2021 16:45, Jan Friesse wrote: > Andrei, > >> On 01.03.2021 15:45, Jan Friesse wrote: >>> Andrei, >>> On 01.03.2021 12:26, Jan Friesse wrote: >> > > Thanks for digging into logs. I believe Eric is hitting > https://github.com/corosync/corosync-qdevice/issues/10

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-03 Thread Strahil Nikolov
When you change the token, you might consider adjusting the consensus timeout (see man corosync.conf). Best Regards, Strahil Nikolov
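As a rough illustration of the token/consensus relationship mentioned above: corosync.conf(5) describes the consensus timeout as defaulting to 1.2 times the token timeout when it is not set explicitly, and gives that as the minimum. The sketch below only mirrors that arithmetic. The function names are hypothetical helpers, not part of corosync, and the 1.2 factor should be checked against the man page of the installed version.

```python
def default_consensus_ms(token_ms: int) -> int:
    """Consensus timeout derived from the token timeout (assumed 1.2 * token)."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (token_ms * 12) // 10


def consensus_is_valid(token_ms: int, consensus_ms: int) -> bool:
    """True if an explicitly set consensus is at least 1.2 * token."""
    return consensus_ms >= default_consensus_ms(token_ms)


if __name__ == "__main__":
    # Hypothetical values: token raised to 5000 ms, consensus left at an old 3600 ms.
    token = 5000
    print(default_consensus_ms(token))
    print(consensus_is_valid(token, 3600))
```

If consensus was set by hand in the totem section before the token was raised, a check like this shows whether it now falls below the derived minimum. Leaving consensus unset lets corosync compute it.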

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-03 Thread Jan Friesse
Eric, -Original Message- From: Users On Behalf Of Jan Friesse Sent: Monday, March 1, 2021 3:27 AM To: Cluster Labs - All topics related to open-source clustering welcomed ... ha1 lost connection to qnetd so it gives up all hope immediately. ha2 retains connection to qnetd so it

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-02 Thread Eric Robinson
> -Original Message- > From: Users On Behalf Of Jan Friesse > Sent: Monday, March 1, 2021 3:27 AM > To: Cluster Labs - All topics related to open-source clustering welcomed > ; Andrei Borzenkov > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse
Andrei, On 01.03.2021 15:45, Jan Friesse wrote: Andrei, On 01.03.2021 12:26, Jan Friesse wrote: Thanks for digging into logs. I believe Eric is hitting https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, but may take some time to get into distributions) - it also

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Andrei Borzenkov
On 01.03.2021 15:45, Jan Friesse wrote: > Andrei, > >> On 01.03.2021 12:26, Jan Friesse wrote: >>> >>> Thanks for digging into logs. I believe Eric is hitting >>> https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, >>> but may take some time to get into distributions) - it

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse
Andrei, On 01.03.2021 12:26, Jan Friesse wrote: Thanks for digging into logs. I believe Eric is hitting https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, but may take some time to get into distributions) - it also contains workaround. I tested corosync-qnetd at

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Andrei Borzenkov
On 01.03.2021 12:26, Jan Friesse wrote: >> > > Thanks for digging into logs. I believe Eric is hitting > https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, > but may take some time to get into distributions) - it also contains > workaround. > I tested corosync-qnetd at

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse
On 27.02.2021 22:12, Andrei Borzenkov wrote: On 27.02.2021 17:08, Eric Robinson wrote: I agree, one node is expected to go out of quorum. Still the question is, why didn't 001db01b take over the services? I just remembered that 001db01b has services running on it, and those services did

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-28 Thread Valentin Vidić
On Sun, Feb 28, 2021 at 07:45:27AM +, Strahil Nikolov wrote: > As this is in Azure and they support shared disks, I think that a simple SBD > could solve the stonith case. Also fence_azure_arm: Azure Resource Manager :) -- Valentin

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-28 Thread Eric Robinson
> -Original Message- > From: Users On Behalf Of Valentin Vidic > Sent: Sunday, February 28, 2021 4:37 AM > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went > Down Anyway? > > On Sun, Feb 28, 2021 at 07:45:27A

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-28 Thread Strahil Nikolov
As this is in Azure and they support shared disks, I think that a simple SBD could solve the stonith case. Best Regards, Strahil Nikolov

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-28 Thread Andrei Borzenkov
On 27.02.2021 22:12, Andrei Borzenkov wrote: > On 27.02.2021 17:08, Eric Robinson wrote: >> >> I agree, one node is expected to go out of quorum. Still the question is, >> why didn't 001db01b take over the services? I just remembered that 001db01b >> has services running on it, and those

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-27 Thread Andrei Borzenkov
On 27.02.2021 17:08, Eric Robinson wrote: > > I agree, one node is expected to go out of quorum. Still the question is, why > didn't 001db01b take over the services? I just remembered that 001db01b has > services running on it, and those services did not stop, so it seems that > 001db01b did

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-27 Thread Eric Robinson
> -Original Message- > From: Users On Behalf Of Andrei > Borzenkov > Sent: Saturday, February 27, 2021 12:55 AM > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went > Down Anyway? > > On 27.02.2021

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Andrei Borzenkov
On 27.02.2021 09:05, Eric Robinson wrote: >> -Original Message- >> From: Users On Behalf Of Andrei >> Borzenkov >> Sent: Friday, February 26, 2021 1:25 PM >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] Our 2-Node Cluster with a

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Eric Robinson
> -Original Message- > From: Users On Behalf Of Andrei > Borzenkov > Sent: Friday, February 26, 2021 1:25 PM > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went > Down Anyway? > > On 26.02.2021

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Andrei Borzenkov
On 26.02.2021 21:58, Eric Robinson wrote: >> -Original Message- >> From: Users On Behalf Of Andrei >> Borzenkov >> Sent: Friday, February 26, 2021 11:27 AM >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] Our 2-Node Cluster with a

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Eric Robinson
> -Original Message- > From: Users On Behalf Of Andrei > Borzenkov > Sent: Friday, February 26, 2021 11:27 AM > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went > Down Anyway? > > 26.02.2021 19:19, Eric Rob

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Andrei Borzenkov
26.02.2021 20:23, Eric Robinson wrote: >> -Original Message- >> From: Digimer >> Sent: Friday, February 26, 2021 10:35 AM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson >> Subject: Re: [ClusterLabs

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Digimer
On 2021-02-26 12:23 p.m., Eric Robinson wrote: >> -Original Message- >> From: Digimer >> Sent: Friday, February 26, 2021 10:35 AM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson >> Subject:

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Andrei Borzenkov
26.02.2021 19:19, Eric Robinson wrote: > At 5:16 am Pacific time Monday, one of our cluster nodes failed and its mysql > services went down. The cluster did not automatically recover. > > We're trying to figure out: > > > 1. Why did it fail? Pacemaker only registered loss of connection

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Eric Robinson
> -Original Message- > From: Digimer > Sent: Friday, February 26, 2021 10:35 AM > To: Cluster Labs - All topics related to open-source clustering welcomed > ; Eric Robinson > Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went > Down Anyway?

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Digimer
On 2021-02-26 11:19 a.m., Eric Robinson wrote: > At 5:16 am Pacific time Monday, one of our cluster nodes failed and its > mysql services went down. The cluster did not automatically recover. > > We’re trying to figure out: > > 1. Why did it fail? > 2. Why did it not automatically recover? >