On 2012-12-06T20:04:20, Andrew Beekhof <and...@beekhof.net> wrote:

> >> Does that make sense though?
> >> You've not achieved anything a restart wouldn't have done.
> >> The choice to move the VM should be up to the VM.
> > If the fail-count of a nagios resource reaches its own
> > migration-threshold, the colocated VM should migrate with it anyway,
> > shouldn't it?
>
> But moving a nagios resource makes no sense.

Exactly; we would want to move the container/parent.

> Because its running inside the guest, which would have already moved
> if it was the right thing to do.

No, that's not a given. The VM might be "healthy" (as in, the kernel is
running), but a service being monitored within it may lack sufficient
resources (CPU, I/O, network) or even have connectivity problems on a
given host, to the point where trying to restart it on another
hypervisor makes sense.

But migration-threshold on the nagios primitive combined with a
mandatory colocation constraint will take care of that already, if an
admin wants to configure it that way.

I agree that, for the most part, people will not do that but keep
restarting VMs.

> > I like the concept of "failure-delegate". If we introduce it, it sounds
> > more like a resource's meta/op attribute to me, rather than into order
> > constraint or group. What do you think?
> Yes. It would be a resource meta attribute.

Hmmm. OK, I think I see where this is going.

We already have on-fail settings. How would these play together? Would
it even make sense to have on-fail="restart-container"? (Or a nicer
wording.)

Hmmm. That might work. We allow a "container" to be specified as a meta
attribute. If set, on-fail would default to "restart-container" for most
actions. But admins could actually modify it - say, they might want to
set monitor on-fail="ignore" to just get notified.

And when we move forward to whiteboxes, we could have
start/monitor/promote/demote on-fail="restart" (like now) and stop
on-fail="restart-container".

That appears reasonably neat?


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes."
-- Oscar Wilde
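
For readers following the thread, the on-fail defaulting floated above can be modelled in a few lines. This is only a sketch of the proposal as worded in this message, not Pacemaker code: the function name `effective_on_fail`, the `whitebox` flag, and the dictionary layout are hypothetical, and "restart-container" is the tentative value named above. The fallback for resources without a container is simplified to "restart" for every action, which is an assumption made for brevity.

```python
# Hypothetical model of the on-fail defaulting proposed in this thread.
# Nothing here is Pacemaker's real API; all names are illustrative.

def effective_on_fail(meta, op_overrides, action, whitebox=False):
    """Return the on-fail policy for one action of a resource.

    meta         -- resource meta attributes, e.g. {"container": "vm1"}
    op_overrides -- per-operation on-fail values set by the admin
    action       -- "start", "stop", "monitor", "promote" or "demote"
    whitebox     -- assumed flag for the later "whitebox" behaviour
    """
    # An explicit per-operation setting from the admin always wins.
    if action in op_overrides:
        return op_overrides[action]

    if meta.get("container"):
        # Whitebox variant: only a failed stop escalates to the container.
        if whitebox and action != "stop":
            return "restart"
        # Proposed default: delegate the failure to the container.
        return "restart-container"

    # No container set. Simplified to "restart" for every action here
    # (an assumption of this sketch, including for stop).
    return "restart"


if __name__ == "__main__":
    nagios_meta = {"container": "vm1"}
    # Admin only wants to be notified about monitor failures.
    overrides = {"monitor": "ignore"}
    for act in ("start", "monitor", "stop"):
        print(act, effective_on_fail(nagios_meta, overrides, act))
```

The lookup order (explicit operation setting first, container default second) mirrors the point that admins could still override the default, for example with monitor on-fail="ignore".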