Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-23 Thread Jan Friesse
Prasad, > Hi - My systems are single core cpu VMs running on azure platform. I am > running Ok, now it makes sense. I don't think you get many guarantees in a cloud environment, so quite a large scheduling pause can simply happen. Also, a single-core CPU is kind of "unsupported" today.

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-22 Thread Prasad Nagaraj
Hi - My systems are single-core CPU VMs running on the Azure platform. I am running MySQL on the nodes, which does generate high I/O load. And my bad, I meant to say 'High CPU load detected' logged by crmd and not corosync. Corosync logs messages like 'Corosync main process was not scheduled for.'

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-22 Thread Ferenc Wágner
Jan Friesse writes: > Is that system VM or physical machine? Because "Corosync main process > was not scheduled for..." is usually happening on VMs where hosts are > highly overloaded. Or when physical hosts use BMC watchdogs. But Prasad didn't encounter such logs in the setup at hand, as far

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-22 Thread Jan Friesse
Prasad, > Thanks Ken and Ulrich. There is definitely high IO on the system, with IOWAITs of sometimes up to 90%. > I have come across some previous posts saying that IOWAIT is also considered CPU load by Corosync. Is this true? > Can high IO lead corosync to complain as in "Corosync main process

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-21 Thread Prasad Nagaraj
Thanks Ken and Ulrich. There is definitely high IO on the system, with IOWAITs of sometimes up to 90%. I have come across some previous posts saying that IOWAIT is also considered CPU load by Corosync. Is this true? Can high IO lead corosync to complain as in "Corosync main process was not

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-21 Thread Ken Gaillot
On Tue, 2018-08-21 at 15:29 +0200, Ulrich Windl wrote: > > > > Prasad Nagaraj wrote on > > > > 21.08.2018 at 11:42 in > > message > : > > Hi Ken - Thanks for your response. > > > > We have seen messages in other cases like > > corosync [MAIN  ] Corosync main process was not scheduled for

[ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-21 Thread Ulrich Windl
>>> Prasad Nagaraj wrote on 21.08.2018 at 11:42 >>> in message : > Hi Ken - Thanks for your response. > > We have seen messages in other cases like > corosync [MAIN ] Corosync main process was not scheduled for 17314.4746 ms > (threshold is 8000. ms). Consider token timeout increase.
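To see how far a logged pause overshoots the threshold, the warning quoted above can be pulled apart with a small script. This is only a sketch: the helper name `parse_pause` is hypothetical, and the pattern assumes the message looks exactly like the line quoted in this thread (other corosync versions may word it differently).

```python
import re

# Pattern modelled only on the warning quoted in this thread (an assumption,
# not an official description of the corosync log format).
PAUSE_RE = re.compile(
    r"Corosync main process was not scheduled for\s+(?P<pause>\d+(?:\.\d*)?)\s*ms"
    r"\s*\(threshold is\s+(?P<threshold>\d+(?:\.\d*)?)\s*ms\)"
)


def parse_pause(line):
    """Return (pause_ms, threshold_ms) for a matching log line, else None."""
    match = PAUSE_RE.search(line)
    if match is None:
        return None
    return float(match.group("pause")), float(match.group("threshold"))


# The sample is the log line as it appears in the quoted message.
sample = (
    "corosync [MAIN  ] Corosync main process was not scheduled for "
    "17314.4746 ms (threshold is 8000. ms). Consider token timeout increase."
)

pause_ms, threshold_ms = parse_pause(sample)
# Ratio of the observed scheduling pause to the logged threshold.
print(f"pause={pause_ms} ms threshold={threshold_ms} ms "
      f"ratio={pause_ms / threshold_ms:.2f}")
```

Run over a whole log file, the same function would show whether the pauses are occasional outliers or routinely exceed the threshold, which helps judge whether raising the token timeout alone is likely to be enough.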