On Fri, 2018-10-12 at 15:51 +0100, lejeczek wrote:
> hi guys,
> I have a 3-node cluster (CentOS 7.5); 2 nodes seem fine but the
> third (or probably something else in between) is not right.
> I see this:
>
> $ pcs status --all
> Cluster name: CC
> Stack: corosync
> Current DC: whale.private (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
> Last updated: Fri Oct 12 15:40:39 2018
> Last change: Fri Oct 12 15:14:57 2018 by root via crm_resource on whale.private
>
> 3 nodes configured
> 8 resources configured (1 DISABLED)
>
> Online: [ rental.private whale.private ]
> OFFLINE: [ rider.private ]
>
> and that third node logs:
>
> [TOTEM ] FAILED TO RECEIVE
> [TOTEM ] A new membership (10.5.6.100:2504344) was formed. Members left: 2 4
> [TOTEM ] Failed to receive the leave message. failed: 2 4
> [QUORUM] Members[1]: 1
> [MAIN ] Completed service synchronization, ready to provide service.
> [TOTEM ] A new membership (10.5.6.49:2504348) was formed. Members joined: 2 4
> [TOTEM ] FAILED TO RECEIVE
>
> and it just keeps going like that.
> Sometimes a reboot (or stopping the services, waiting, and starting
> them again) of that third node helps.
> But I get this situation almost every time a node gets
> (orderly) shut down or rebooted.
> Network-wise, connectivity seems okay. Where to start?
>
> many thanks, L

Odd. I'd recommend turning on debug logging in corosync.conf and posting
the log here. Hopefully one of the corosync developers can chime in at
that point.
--
Ken Gaillot <kgail...@redhat.com>
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
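
For reference, a rough sketch of what turning on debug logging might look like. In corosync 2.x this is controlled by the logging section of corosync.conf. The fragment below is illustrative only, not the poster's actual configuration: the log file path and the to_logfile/to_syslog/timestamp settings are assumptions (typical values on CentOS 7), and the only change being suggested is the debug line.

```
logging {
    # Assumed existing settings; keep whatever the cluster already uses.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    timestamp: on

    # The suggested change: log debug output for all services (default is off).
    debug: on
}
```

The file usually lives at /etc/corosync/corosync.conf and would need the same change on each node, followed by a corosync restart or configuration reload, before the extra output appears. Debug output is verbose, so it is probably worth setting debug back to off once the log has been captured.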