On February 27, 2020 7:00:36 PM GMT+02:00, Ken Gaillot <kgail...@redhat.com> wrote:
>On Thu, 2020-02-27 at 17:28 +0100, Jehan-Guillaume de Rorthais wrote:
>> On Thu, 27 Feb 2020 09:48:23 -0600
>> Ken Gaillot <kgail...@redhat.com> wrote:
>>
>> > On Thu, 2020-02-27 at 15:01 +0100, Jehan-Guillaume de Rorthais
>> > wrote:
>> > > On Thu, 27 Feb 2020 12:24:46 +0100
>> > > "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de> wrote:
>> > >
>> > > > >>> Jehan-Guillaume de Rorthais <j...@dalibo.com> wrote on
>> > > > 27.02.2020 at 11:05 in message
>> > > > <20200227110502.3624cb87@firost>:
>> > > >
>> > > > [...]
>> > > > > What about something like "lock-location=bool" and
>> > > >
>> > > > For "lock-location" I would assume the value is a "location".
>> > > > I guess you wanted a "use-lock-location" Boolean value.
>> > >
>> > > Mh, maybe "lock-current-location" would better reflect what I
>> > > meant.
>> > >
>> > > The point is to lock the resource on the node currently running
>> > > it.
>> >
>> > Though it only applies for a clean node shutdown, so that has to
>> > be in the name somewhere. The resource isn't locked during normal
>> > cluster operation (it can move for resource or node failures,
>> > load rebalancing, etc.).
>>
>> Well, I was trying to make the new feature a bit wider than just
>> the narrow shutdown feature.
>>
>> Speaking about shutdown, what is the status of a clean shutdown of
>> the cluster handled by Pacemaker? Currently, I advise stopping
>> resources gracefully (e.g. using pcs resource disable [...]) before
>> shutting down each node, either by hand or using some higher-level
>> tool (e.g. pcs cluster stop --all).
>
>I'm not sure why that would be necessary. It should be perfectly fine
>to stop pacemaker in any order without disabling resources.
>
>Start-up is actually more of an issue ... if you start corosync and
>pacemaker on nodes one by one, and you're not quick enough, then once
>quorum is reached, the cluster will fence all the nodes that haven't
>yet come up. So on start-up, it makes sense to start corosync on all
>nodes, which will establish membership and quorum, then start
>pacemaker on all nodes. Obviously that can't be done within
>pacemaker, so it has to be done manually or by a higher-level tool.
>
>> Shouldn't this feature be discussed in this context as well?
>>
>> [...]
>> > > > > it would lock the resource location (unique or clones)
>> > > > > until the operator unlocks it or the
>> > > > > "lock-location-timeout" expires. No matter what happens to
>> > > > > the resource, maintenance mode or not.
>> > > > >
>> > > > > At first glance, it seems to pair nicely with
>> > > > > maintenance-mode and to avoid resource migration after a
>> > > > > node reboot.
>> >
>> > Maintenance mode is useful if you're updating the cluster stack
>> > itself -- put in maintenance mode, stop the cluster services
>> > (leaving the managed services still running), update the cluster
>> > services, start the cluster services again, take out of
>> > maintenance mode.
>> >
>> > This is useful if you're rebooting the node for a kernel update
>> > (for example). Apply the update, reboot the node. The cluster
>> > takes care of everything else for you (stop the services before
>> > shutting down and do not recover them until the node comes back).
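For concreteness, the stack-update-under-maintenance procedure
described above might look roughly like this with pcs and systemd
(an untested sketch; exact service names depend on the distribution):

    pcs property set maintenance-mode=true
    systemctl stop pacemaker        # managed services keep running
    systemctl stop corosync
    # ...update the cluster stack packages...
    systemctl start corosync
    systemctl start pacemaker
    pcs property set maintenance-mode=false

In the kernel-update case, by contrast, the idea is that the operator
only applies the update and reboots, with no cluster commands at all.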
>> I'm a bit lost. If a resource doesn't move during maintenance mode,
>> could you detail a scenario where we should ban it explicitly from
>> the other nodes to secure its current location when getting out of
>> maintenance? Isn't it an
>
>Sorry, I was unclear -- I was contrasting maintenance mode with
>shutdown locks.
>
>You wouldn't need a ban with maintenance mode. However, maintenance
>mode leaves any active resources running. That means the node
>shouldn't be rebooted in maintenance mode, because those resources
>will not be cleanly stopped.
>
>With shutdown locks, the active resources are cleanly stopped. That
>does require a ban of some sort, because otherwise the resources will
>be recovered on another node.
>
>> excessive precaution? Is it just to avoid it moving somewhere else
>> when exiting maintenance-mode? If the resource has a preferred
>> node, I suppose the location constraint should take care of this,
>> shouldn't it?
>
>Having a preferred node doesn't prevent the resource from starting
>elsewhere if the preferred node is down (or in standby, or otherwise
>ineligible to run the resource). Even a +INFINITY constraint allows
>recovery elsewhere if the node is not available. To keep a resource
>from being recovered, you have to put a ban (-INFINITY location
>constraint) on any nodes that could otherwise run it.
>
>> > > > I wonder: Where is it different from a time-limited "ban"
>> > > > (wording also exists already)? If you ban all resources from
>> > > > running on a specific node, resources would move away, and
>> > > > when booting the node, resources won't come back.
>> >
>> > It actually is equivalent to this process:
>> >
>> > 1. Determine what resources are active on the node about to be
>> > shut down.
>> > 2. For each of those resources, configure a ban (location
>> > constraint with -INFINITY score) using a rule where node name is
>> > not the node being shut down.
>> > 3. Apply the updates and reboot the node. The cluster will stop
>> > the resources (due to shutdown) and not start them anywhere else
>> > (due to the bans).
>>
>> In maintenance mode, this would not move either.
>
>The problem with maintenance mode for this scenario is that the reboot
>would uncleanly terminate any active resources.
>
>> > 4. Wait for the node to rejoin and the resources to start on it
>> > again, then remove all the bans. [A pcs sketch of these four
>> > steps appears below.]
>> >
>> > The advantage is automation, and in particular the sysadmin
>> > applying the updates doesn't need to even know that the host is
>> > part of a cluster.
>>
>> Could you elaborate? I suppose the operator still needs to issue a
>> command to set the shutdown-lock before the reboot, doesn't it?
>
>Ah, no -- this is intended as a permanent cluster configuration
>setting, always in effect.
>
>> Moreover, if shutdown-lock is just a matter of setting ±INFINITY
>> constraints on nodes, maybe a higher-level tool can take care of
>> this?
>
>In this case, the operator applying the reboot may not even know what
>pacemaker is, much less what command to run. The goal is to fully
>automate the process so a cluster-aware administrator does not need
>to be present.
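For reference, the four-step manual procedure above might look
something like this with pcs ("my-rsc", "node1", and the constraint
id are placeholders, and the commands are an untested sketch):

    # 1. See which resources are currently active on node1
    crm_mon -1

    # 2. Ban each of them from every node *except* node1
    pcs constraint location my-rsc rule score=-INFINITY '#uname' ne node1

    # 3. Apply the updates and reboot node1; the resources stop
    #    cleanly and stay stopped elsewhere because of the bans.

    # 4. Once node1 has rejoined and the resources are running on it
    #    again, remove the bans (ids from "pcs constraint --full")
    pcs constraint remove <constraint-id>

Note that a resource could move between steps 1 and 2 -- the
time-of-check/time-of-use problem mentioned just below.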
>I did consider a number of alternative approaches, but they all had
>problematic corner cases. For a higher-level tool or anything
>external to pacemaker, one such corner case is a
>"time-of-check/time-of-use" problem -- determining the list of active
>resources has to be done separately from configuring the bans, and
>it's possible the list could change in the meantime.
>
>> > > This is the standby mode.
>> >
>> > Standby mode will stop all resources on a node, but it doesn't
>> > prevent recovery elsewhere.
>>
>> Yes, I was just commenting on Ulrich's description (history context
>> crop'ed here).
>--
>Ken Gaillot <kgail...@redhat.com>
Hi Ken,

Can you tell me the logic of that feature? So far it looks like:

1. Resources/groups that will be affected by the feature are marked.
2. The resources/groups are stopped (target-role=Stopped).
3. The node exits the cluster cleanly once no resources are running
any more.
4. The node rejoins the cluster after the reboot.
5. A positive constraint (on the rebooted node) and negative
constraints (bans on the rest of the nodes) are created for the
resources marked in step 1.
6. target-role is set back to Started and the resources are up and
running again.
7. When each resource group (or standalone resource) is back online,
the mark from step 1 is removed, along with any location constraints
(cli-ban & cli-prefer) for the resource/group.

Still, if that feature attracts more end users (or even enterprises),
I think it will be positive for the stack.
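From the operator's side, I imagine usage would be as simple as
something like this (a speculative sketch -- the property names were
still being discussed in this thread):

    pcs property set shutdown-lock=true
    # possibly with a timeout after which the lock expires, e.g. the
    # "lock-location-timeout" mentioned earlier (name not final)
    # ...then just reboot the node; its resources wait for it to return

Best Regards,
Strahil Nikolov

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/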