Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Ken Gaillot
On 11/04/2015 12:55 PM, Digimer wrote:
> On 04/11/15 01:50 PM, Radoslaw Garbacz wrote:
>> Hi,
>>
>> I have a cluster of 32 nodes, and after some tuning was able to have it
>> started and running,
>
> This is not supported by RH for a reason; it's hard to get the timing
> right. SUSE supports up

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Radoslaw Garbacz
Thank you Ken and Digimer for all your suggestions.

On Wed, Nov 4, 2015 at 2:32 PM, Ken Gaillot wrote:
> On 11/04/2015 12:55 PM, Digimer wrote:
> > On 04/11/15 01:50 PM, Radoslaw Garbacz wrote:
> >> Hi,
> >>
> >> I have a cluster of 32 nodes, and after some tuning was able

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Digimer
On 04/11/15 01:50 PM, Radoslaw Garbacz wrote:
> Hi,
>
> I have a cluster of 32 nodes, and after some tuning was able to have it
> started and running,

This is not supported by RH for a reason; it's hard to get the timing right. SUSE supports up to 32 nodes, but they must be doing some serious

[ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Radoslaw Garbacz
Hi,

I have a cluster of 32 nodes, and after some tuning was able to have it started and running, but it does not recover from a node disconnect-connect failure. It regains quorum, but the CIB does not recover to a synchronized state and "cibadmin -Q" times out. Is there anything with corosync or