On Tue, 26 Jan 2021 16:15:55 +0100 Tomas Jelinek <tojel...@redhat.com> wrote:
> On 25. 01. 21 at 17:01, Ken Gaillot wrote:
> > On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote:
> >> Hi Digimer,
> >>
> >> On Sun, 24 Jan 2021 15:31:22 -0500
> >> Digimer <li...@alteeve.ca> wrote:
> >> [...]
> >>> I had a test server (srv01-test) running on node 1 (el8-a01n01),
> >>> and on node 2 (el8-a01n02) I ran 'pcs cluster stop --all'.
> >>>
> >>> It appears like pacemaker asked the VM to migrate to node 2
> >>> instead of stopping it. Once the server was on node 2, I couldn't
> >>> use 'pcs resource disable <vm>' as it returned that the resource
> >>> was unmanaged, and the cluster shutdown was hung. When I directly
> >>> stopped the VM and then did a 'pcs resource cleanup', the cluster
> >>> shutdown completed.
> >>
> >> As actions during a cluster shutdown cannot be handled in the same
> >> transition for each node, I usually add a step to disable all
> >> resources using the property "stop-all-resources" before shutting
> >> down the cluster:
> >>
> >>   pcs property set stop-all-resources=true
> >>   pcs cluster stop --all
> >>
> >> But it seems there's a very new cluster property to handle that
> >> (IIRC, one or two releases ago). Look at the "shutdown-lock" doc:
> >>
> >> [...]
> >> some users prefer to make resources highly available only for
> >> failures, with no recovery for clean shutdowns. If this option is
> >> true, resources active on a node when it is cleanly shut down are
> >> kept "locked" to that node (not allowed to run elsewhere) until
> >> they start again on that node after it rejoins (or for at most
> >> shutdown-lock-limit, if set).
> >> [...]
> >>
> >> [...]
> >>> So as best as I can tell, pacemaker really did ask for a
> >>> migration. Is this the case?
> >>
> >> AFAIK, yes, because each cluster shutdown request is handled
> >> independently at node level. There's a large door open for all
> >> kinds of race conditions if requests are handled with some random
> >> lag on each node.
> >
> > I'm going to guess that's what happened.
> >
> > The basic issue is that there is no "cluster shutdown" in
> > Pacemaker, only "node shutdown". I'm guessing "pcs cluster stop
> > --all" sends shutdown requests for each node in sequence (probably
> > via systemd), and if the nodes are quick enough, one could start
> > migrating off resources before all the others get their shutdown
> > request.
>
> Pcs is doing its best to stop nodes in parallel. The first
> implementation of this was done back in 2015:
> https://bugzilla.redhat.com/show_bug.cgi?id=1180506
> Since then, we moved to using curl for network communication, which
> also handles parallel cluster stop. Obviously, this doesn't ensure
> the stop command arrives at and is processed on all nodes at exactly
> the same time.
>
> Basically, pcs sends a 'stop pacemaker' request to all nodes in
> parallel and waits for it to finish on all nodes. Then it sends a
> 'stop corosync' request to all nodes in parallel.

How about adding a step to set/remove "stop-all-resources" on cluster
shutdown/start? This step could either be optional with a new CLI
argument, or added when --all is given for these commands. A rough
sketch of the sequence is below. Thoughts?
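To make the idea concrete, here is a minimal, untested sketch of the
sequence pcs could run, written as a plain shell wrapper around the
existing commands from this thread. The wrapper itself and its
stop/start dispatch are purely illustrative, not an existing pcs
interface:

    #!/bin/sh
    # Illustrative wrapper only: the real pcs CLI has no such mode yet.
    # Assumes pcs is in PATH and the local node may update cluster
    # properties.
    set -e

    case "$1" in
    stop)
        # Disable all resources in a single transition first, so no
        # node starts migrating resources away while its peers are
        # still waiting for their shutdown request.
        pcs property set stop-all-resources=true
        pcs cluster stop --all
        ;;
    start)
        # --wait blocks until the nodes have started.
        pcs cluster start --all --wait
        # Restore the property to its default so resources may run.
        pcs property set stop-all-resources=false
        ;;
    *)
        echo "usage: $0 {stop|start}" >&2
        exit 1
        ;;
    esac

Setting the property back to false (its default) on start avoids
leaving the cluster stuck with all resources disabled. Whether pcs
should instead remove the property entirely, and how this would
interact with shutdown-lock, is open for discussion.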