On 8/9/19 6:40 PM, Andrei Borzenkov wrote:
> 09.08.2019 16:34, Yan Gao wrote:
>> Hi,
>>
>> With disk-less sbd, it's fine to stop the cluster service on all of
>> the cluster nodes at the same time.
>>
>> But if the nodes are stopped one by one, for example with a 3-node
>> cluster, then after stopping the 2nd node, the only remaining node
>> resets itself with:
>>
>
> That is sort of documented in the SBD manual page:
>
> --><--
> However, while the cluster is in such a degraded state, it can
> neither successfully fence nor be shutdown cleanly (as taking the
> cluster below the quorum threshold will immediately cause all remaining
> nodes to self-fence).
> --><--
>
> SBD in shared-nothing mode is basically always in such a degraded state
> and cannot tolerate loss of quorum.

Well, the context here is that it loses quorum *expectedly*, since the
other nodes were gracefully shut down.
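To spell out the arithmetic (assuming default votequorum settings, i.e.
one vote per node and no special quorum options):

    expected_votes = 3
    quorum         = expected_votes / 2 + 1 = 2   (integer division)

    3 nodes up: 3 votes >= 2  -> quorate
    2 nodes up: 2 votes >= 2  -> quorate
    1 node up:  1 vote  <  2  -> inquorate

So as soon as the second node shuts down cleanly, the survivor loses
quorum, the pacemaker servant stops reporting healthy, and disk-less
sbd self-fences, which is exactly what the log quoted below shows
("healthy servants: 0"). Running corosync-quorumtool -s on the
surviving node should show the same numbers.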
>
>
>> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk: debug:
>> notify_parent: Not notifying parent: state transient (2)
>> Aug 09 14:30:20 opensuse150-1 sbd[1080]: cluster: debug:
>> notify_parent: Notifying parent: healthy
>> Aug 09 14:30:20 opensuse150-1 sbd[1078]: warning: inquisitor_child:
>> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>>
>> I can think of manipulating quorum with last_man_standing, and
>> potentially also auto_tie_breaker, not to mention that
>> last_man_standing_window would also be a factor... But is there a
>> better solution?
>>
>
> The lack of a cluster-wide shutdown mode has been mentioned more than
> once on this list. I guess the only workaround is to use higher-level
> tools which basically just try to stop the cluster on all nodes at
> once. That is still susceptible to race conditions.

Gracefully stopping the nodes one by one on purpose is still a
reasonable need though ... (see the corosync.conf sketch below for the
quorum knobs mentioned above).

Regards,
Yan
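P.S. For anyone trying the quorum-knob route, here is a minimal
corosync.conf sketch of the votequorum options mentioned above (the
values are illustrative, not a recommendation; see votequorum(5) for
the caveats):

    quorum {
        provider: corosync_votequorum
        expected_votes: 3
        # Recalculate expected votes/quorum downwards after nodes
        # leave, once last_man_standing_window (in ms) has passed:
        last_man_standing: 1
        last_man_standing_window: 10000
        # Needed for last_man_standing to step down from 2 nodes to 1;
        # ties go to the partition containing the lowest node ID:
        auto_tie_breaker: 1
    }

The "stop everything at once" route Andrei mentions would be something
like "pcs cluster stop --all" (or the equivalent in crmsh or your
distribution's tooling), which is still racy for the reason he gives.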