Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-12 Thread Strahil Nikolov
Don't forget to increase the consensus! Best Regards, Strahil Nikolov On 11 June 2020 at 22:11:09 GMT+03:00, Howard wrote: > This is interesting. So it seems that 13,000 ms or 13 seconds is how long the VM was frozen during the snapshot backup and 0.8 seconds is the threshold. We will be

Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-12 Thread Strahil Nikolov
And I forgot to ask: are you using a memory-based snapshot? It shouldn't take so long. Best Regards, Strahil Nikolov On 12 June 2020 at 7:10:38 GMT+03:00, Strahil Nikolov wrote: > Don't forget to increase the consensus! > Best Regards, > Strahil Nikolov > On 11 June 2020 at 22:11:09

Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-11 Thread Howard
This is interesting. So it seems that 13,000 ms or 13 seconds is how long the VM was frozen during the snapshot backup and 0.8 seconds is the threshold. We will be disabling the snapshot backups and may increase the token timeout a bit since these systems are not so critical. Thanks Honza for
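A rough sketch of the comparison described above, in Python. It assumes the warning threshold is 80% of the token timeout, which is consistent with a 0.8 second threshold only if the token is 1000 ms; the token value is not stated in this snippet, and the function name is hypothetical.

```python
def freeze_exceeds_threshold(pause_ms: float, token_ms: float, fraction: float = 0.8) -> bool:
    """Return True if a scheduling pause is longer than the assumed warning threshold."""
    # Assumption: threshold = fraction * token timeout (0.8 * 1000 ms = 800 ms).
    threshold_ms = token_ms * fraction
    return pause_ms > threshold_ms


# The 13,000 ms freeze from the message, against an assumed 1000 ms token.
print(freeze_exceeds_threshold(13_000, 1_000))
```

Under the same assumption, the token would have to exceed 16,250 ms before a 13 second freeze stayed under the threshold, which is why disabling the snapshot backups is the more direct fix.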

Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-11 Thread Jan Friesse
Howard, Good morning. Thanks for reading. We have a requirement to provide high availability for PostgreSQL 10. I have built a two node cluster with a quorum device as the third vote, all running on RHEL 8. Here are the versions installed: [postgres@srv2 cluster]$ rpm -qa|grep

Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-10 Thread Strahil Nikolov
What are your corosync.conf timeouts (especially token & consensus)? Last time I did a live migration of a RHEL 7 node with the default values, the cluster fenced it, so I set the token to 10s and also raised the consensus (check 'man corosync.conf') above the default. Also, start your
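As a point of reference for the token/consensus relationship mentioned above: corosync.conf(5) states that consensus must be at least 1.2 * token, and that it is calculated as 1.2 * token when not set explicitly. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the 10 s token is the value from the message, not a recommendation.

```python
def min_consensus_ms(token_ms: int) -> int:
    """Smallest consensus timeout (ms) permitted for a given token timeout (ms)."""
    # 1.2 * token == token * 6 / 5; integer math avoids float rounding,
    # and the negated floor division rounds up to the next whole millisecond.
    return -(-token_ms * 6 // 5)


token_ms = 10_000  # the 10 s token from the message above
print(min_consensus_ms(token_ms))
```

Any consensus value set by hand in the totem section should be at or above this figure for the chosen token.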

Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-10 Thread Howard
Hi everyone. As a follow-up, I found that the VMs were undergoing snapshot backups at the time of the disconnects, which I think freezes I/O. We'll be addressing that. Is there anything else in the log that can be improved? Thanks, Howard On Wed, Jun 10, 2020 at 10:06 AM Howard wrote: > Good

[ClusterLabs] New user needs some help stabilizing the cluster

2020-06-10 Thread Howard
Good morning. Thanks for reading. We have a requirement to provide high availability for PostgreSQL 10. I have built a two-node cluster with a quorum device as the third vote, all running on RHEL 8. Here are the versions installed: [postgres@srv2 cluster]$ rpm -qa|grep