Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Harvey Shepherd
I initially thought it was only in one direction, but it actually isn't. It's just that occasionally, if the timing is just right, the failover manages to succeed. Besides, I don't think that has any bearing on why Pacemaker is trying to restart the failed resource instance before promoting

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Andrei Borzenkov
02.07.2019 2:30, Harvey Shepherd writes: >> The "transition summary" is just a resource-by-resource list, not the order things will be done. The "executing cluster transition" section is the order things are being done. > Thanks Ken. I think that's where the problem is originating. If you

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Harvey Shepherd
> The "transition summary" is just a resource-by-resource list, not the > order things will be done. The "executing cluster transition" section > is the order things are being done. Thanks Ken. I think that's where the problem is originating. If you look at the "executing cluster transition"

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
On Sun, 2019-06-30 at 11:13 +, Harvey Shepherd wrote: >> There is an ordering constraint - everything must be started after the king resource. But even if this constraint didn't exist I don't see that it should logically make any difference due to all the non-clone resources being
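
As a point of comparison, a mandatory "everything after the king resource" ordering of the kind described would usually be expressed like this; the resource names (king-clone, servant-clone) are placeholders, not the poster's actual configuration:

    # pcs syntax: only start the dependent clone once the king resource
    # has been promoted on some node (names are hypothetical)
    pcs constraint order promote king-clone then start servant-clone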

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
On Sat, 2019-06-29 at 03:01 +, Harvey Shepherd wrote: > Thank you so much Ken, your explanation of the crm_simulate output is really helpful. Regarding your suggestion of setting a migration-threshold of 1 for the king resource, I did in fact have that in place as a workaround. But
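
For readers who want to try the workaround mentioned here, setting a migration-threshold of 1 on a single resource is a one-liner; "king" below is a placeholder for the real resource ID:

    # Move the resource away after a single failure rather than
    # retrying it on the same node
    pcs resource meta king migration-threshold=1

    # crmsh equivalent, as a cluster-wide default instead
    crm configure rsc_defaults migration-threshold=1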

[ClusterLabs] Antw: Re: Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Ulrich Windl
>>> Jan Pokorný wrote on 01.07.2019 at 14:42 in message <20190701124215.gn31...@redhat.com>: > On 01/07/19 13:26 +0200, Ulrich Windl wrote: > Jan Pokorný wrote on 27.06.2019 at 12:02 in message <20190627100209.gf31...@redhat.com>: >>> On 25/06/19 12:20 -0500, Ken Gaillot wrote:

Re: [ClusterLabs] Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Jan Pokorný
On 01/07/19 13:26 +0200, Ulrich Windl wrote: Jan Pokorný wrote on 27.06.2019 at 12:02 in message <20190627100209.gf31...@redhat.com>: >> On 25/06/19 12:20 -0500, Ken Gaillot wrote: >>> On Tue, 2019-06-25 at 11:06 +, Somanath Jeeva wrote: >>> Addressing the root cause, I'd first

[ClusterLabs] Antw: Re: PCSD - High Memory Usage

2019-07-01 Thread Ulrich Windl
Would running pcsd under valgrind be an option? In addition to checking for leaks, it can also provide some memory usage statistics (who is using how much)... >>> Tomas Jelinek wrote on 27.06.2019 at 15:30 in message <363f827e-d05d-309f-7ab6-c43e268df...@redhat.com>: > Hi, > We (pcs
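
A rough sketch of what that could look like, assuming the pcsd daemon can be started in the foreground on your build (the daemon path below is an assumption and varies by distribution):

    # Stop the service, then run the daemon under valgrind's heap profiler
    systemctl stop pcsd
    valgrind --tool=massif --massif-out-file=/tmp/pcsd.massif /usr/lib/pcsd/pcsd

    # Afterwards, summarise which call sites allocated how much
    ms_print /tmp/pcsd.massif | less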

[ClusterLabs] Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Ulrich Windl
>>> Jan Pokorný wrote on 27.06.2019 at 12:02 in message <20190627100209.gf31...@redhat.com>: > On 25/06/19 12:20 -0500, Ken Gaillot wrote: >> On Tue, 2019-06-25 at 11:06 +, Somanath Jeeva wrote: >> Addressing the root cause, I'd first make sure corosync is running at real-time priority
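
For anyone wanting to verify the scheduling point raised here, the priority of a running corosync can be checked from userspace; this is a generic sketch, not specific to the poster's system:

    # Scheduling class and real-time priority of corosync; a class of
    # "RR" or "FF" with a non-zero rtprio means it runs as a real-time task
    ps -o pid,cls,rtprio,ni,comm -p $(pidof corosync)

    # chrt reports the same per PID
    chrt -p $(pidof corosync)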

[ClusterLabs] Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Ulrich Windl
>>> Somanath Jeeva wrote on 25.06.2019 at 13:06 in message > I have not configured fencing in our setup. However, I would like to know if the split brain can be avoided when high CPU occurs. It seems you like to ride a bicycle with crossed arms while trying to avoid falling ;-) >
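
Since the thread keeps coming back to the missing fencing, a minimal power-fencing setup in pcs looks roughly like the following; the agent choice, addresses and credentials are purely illustrative, and older fence_ipmilan builds use ipaddr/login/passwd instead of ip/username/password:

    # One IPMI fence device per node (all values are placeholders)
    pcs stonith create fence-node1 fence_ipmilan \
        pcmk_host_list=node1 ip=10.0.0.101 username=admin password=secret lanplus=1
    pcs stonith create fence-node2 fence_ipmilan \
        pcmk_host_list=node2 ip=10.0.0.102 username=admin password=secret lanplus=1
    pcs property set stonith-enabled=true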

[ClusterLabs] Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Ulrich Windl
>>> Ken Gaillot wrote on 24.06.2019 at 16:57 in message <95f51b52283d05bbd948e4508c406d7ccb64.ca...@redhat.com>: > On Mon, 2019-06-24 at 08:52 +0200, Jan Friesse wrote: >> Somanath, >> > Hi All, >> > I have a two node cluster with multicast (udp) transport. The >> > multicast

[ClusterLabs] Antw: Re: Two node cluster goes into split brain scenario during CPU intensive tasks

2019-07-01 Thread Ulrich Windl
>>> Jan Friesse wrote on 24.06.2019 at 08:52 in message : > Somanath, >> Hi All, >> I have a two node cluster with multicast (udp) transport. The multicast IP used is 224.1.1.1. > Would you mind giving UDPU (unicast) a try? For a two node cluster there is going to be no
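
For completeness, switching a two-node corosync 2.x cluster from multicast to unicast is mostly a corosync.conf change along these lines; node addresses are placeholders and the exact layout depends on the corosync version in use:

    totem {
        version: 2
        transport: udpu
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0
        }
    }
    nodelist {
        node { ring0_addr: 192.168.1.11 }
        node { ring0_addr: 192.168.1.12 }
    }
    quorum {
        provider: corosync_votequorum
        two_node: 1
    }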

[ClusterLabs] Antw: Re: two virtual domains start and stop every 15 minutes

2019-07-01 Thread Ulrich Windl
To me it looks like a broken migration configuration. >>> "Lentes, Bernd" wrote on 19.06.2019 at 18:46 in message <1654529492.1465807.1560962767193.javamail.zim...@helmholtz-muenchen.de>: > - On Jun 15, 2019, at 4:30 PM, Bernd Lentes bernd.lentes@helmholtz-muenchen.de > wrote: >>
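
For comparison, a VirtualDomain resource that is actually allowed to live-migrate is usually defined along these lines; crmsh syntax, with names and paths that are hypothetical rather than taken from the thread:

    crm configure primitive vm_test ocf:heartbeat:VirtualDomain \
        params config=/etc/libvirt/qemu/test.xml hypervisor=qemu:///system \
               migration_transport=ssh \
        meta allow-migrate=true \
        op monitor interval=30s timeout=60s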

[ClusterLabs] Antw: Re: PostgreSQL PAF failover issue

2019-07-01 Thread Ulrich Windl
>>> Tiemen Ruiten wrote on 14.06.2019 at 16:43 in message : > Right, so I may have been too fast to give up. I set maintenance mode back on and promoted ph-sql-04 manually. Unfortunately I don't have the logs of ph-sql-03 anymore because I reinitialized it. > You mention that demote
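
For readers following along, the "maintenance mode plus manual promotion" step described here is roughly the following sequence; the PostgreSQL data directory path is an assumption:

    # Take Pacemaker's hands off all resources, then promote the
    # standby with PostgreSQL's own tooling
    pcs property set maintenance-mode=true
    sudo -u postgres pg_ctl promote -D /var/lib/pgsql/data

    # Once the instance is primary again, hand control back to Pacemaker
    pcs property set maintenance-mode=false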