On Fri, 2022-02-11 at 08:07 +0100, Ulrich Windl wrote:
> >>> Jehan-Guillaume de Rorthais <j...@dalibo.com> wrote on 10.02.2022
> at 16:40 in message <20220210164000.2e395a37@karst>:
> > On Thu, 10 Feb 2022 22:15:07 +0800
> > Roger Zhou via Users <users@clusterlabs.org> wrote:
> >
> > > On 2/9/22 17:46, Lentes, Bernd wrote:
> > > >
> > > > ----- On Feb 7, 2022, at 4:13 PM, Jehan-Guillaume de Rorthais
> > > > j...@dalibo.com wrote:
> > > >
> > > > > On Mon, 7 Feb 2022 14:24:44 +0100 (CET)
> > > > > "Lentes, Bernd" <bernd.len...@helmholtz-muenchen.de> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > i'm currently changing a bit in my cluster because i
> > > > > > realized that my configuration for a power outage didn't
> > > > > > work as i expected. My idea is currently:
> > > > > > - first stop about 20 VirtualDomains, which are my
> > > > > > services. This will surely take some minutes. I'm
> > > > > > thinking of stopping each with a time difference of
> > > > > > about 20 seconds so as not to get too much IO load.
> > > > > > and then ...
> > >
> > > This part is tricky. On one hand, it is good thinking to
> > > throttle IO load.
> > >
> > > On the other hand, as Jehan and Ulrich mentioned,
> > > `crm resource stop <rsc>` introduces "target-role=Stopped" for
> > > each VirtualDomain, and you have to run
> > > `crm resource start <rsc>` to change it back to
> > > "target-role=Started" to start them after the power outage.
> >
> > I wonder if, after the cluster shutdown completes, the
> > target-role=Stopped could be removed/edited offline with e.g.
> > crmadmin? That would make the VirtualDomain startable on boot.
>
> It has also been discussed before: "restart" is implemented as
> "first change role to stopped, then change role to started".
> If the performing node is fenced due to a stop failure, the resource
> is never started.
> So what's needed is a transient (i.e. not saved in the CIB) "restart"
> operation that reverts to the previous state (started, most likely)
> if the node performing it dies.
> Now transfer this to "stop-all-resources": the role attribute in the
> CIB would never be changed, but maybe just all the LRMs would stop
> their resources, eventually shutting down, and when the node comes up
> again, the previous state will be re-established.
Setting node standby as a transient attribute works very much like
that. When the node reboots, transient attributes are wiped, so it's
out of standby when it rejoins.

> > I suppose this would not be that simple, as it would require
> > updating it on all nodes, taking care of the CIB version, hash,
> > etc... But maybe some tooling could take care of this?
> >
> > Last, if Bernd needs to stop the VirtualDomains gracefully, paying
> > attention to the I/O load, maybe he doesn't want them to start
> > automatically on boot for the exact same reason anyway?
>
> But you can limit the number of concurrent invocations and
> migrations, right?
> Unfortunately I cannot remember the parameter.

batch-limit is the number of actions that can be initiated
simultaneously across the whole cluster, and migration-limit is the
number of live migration actions that can be initiated simultaneously
on one node (regardless of whether it's the "from" or "to" node).

> If not, that could be an interesting enhancement:
> Like the utilization counting "static" resource consumption, one
> could have a dynamic resource consumption (counting semaphore-like)
> that is consumed while an operation on an instance naming that
> resource is being performed.
> So when you name your resource "concurrent_vm_ops" and assign that
> to every VM configuration, eventually initializing the resource to
> something like 2 or 3, then you could limit the concurrent VM
> invocations. Likewise, for less heavy instances you could use more
> relaxed settings or no restrictions at all...
>
> Regards,
> Ulrich

You can accomplish something similar with an ordering constraint with
kind=Serialize. In the case of "start vm1 then start vm2" with
kind=Serialize, it means that vm1 and vm2 will not be started
simultaneously, but neither actually requires the other or has to be
done in a specific order.
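
To make the Serialize idea a bit more concrete, here is a rough sketch
(an illustration only, not something taken from this thread) that
generates one rsc_order element with kind="Serialize" for every pair
of VM resources, so that no two of them are started at the same time.
The resource names vm1..vm3, the constraint ID scheme and the helper
name serialize_constraints are all made up for the example. It assumes
that pairwise constraints are an acceptable way to cover more than two
resources; the number of constraints grows quadratically with the
number of VMs.

```python
import itertools
import xml.etree.ElementTree as ET


def serialize_constraints(resources, action="start"):
    """Return a <constraints> element holding one rsc_order with
    kind="Serialize" for every pair of the given resource IDs."""
    constraints = ET.Element("constraints")
    for first, then in itertools.combinations(resources, 2):
        ET.SubElement(
            constraints,
            "rsc_order",
            {
                # Hypothetical ID scheme, chosen only to be unique.
                "id": f"serialize-{action}-{first}-{then}",
                "first": first,
                "first-action": action,
                "then": then,
                "then-action": action,
                # Serialize: never run both actions concurrently,
                # but impose no dependency between the resources.
                "kind": "Serialize",
            },
        )
    return constraints


if __name__ == "__main__":
    # Placeholder resource names; substitute the real VirtualDomain IDs.
    tree = serialize_constraints(["vm1", "vm2", "vm3"])
    ET.indent(tree)  # pretty-printing helper, needs Python 3.9 or later
    print(ET.tostring(tree, encoding="unicode"))
```

The output is only the XML fragment; it would still have to be
reviewed and loaded into the CIB with whatever tooling you normally
use.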
-- 
Ken Gaillot <kgail...@redhat.com>