On 09/20/2016 07:51 PM, Andrew Beekhof wrote:
> On Wed, Sep 21, 2016 at 6:25 AM, Ken Gaillot <kgail...@redhat.com> wrote:
>
>> Hi everybody,
>>
>> Currently, Pacemaker's on-fail property allows you to configure how the
>> cluster reacts to operation failures. The default "restart" means try to
>> restart on the same node, optionally moving to another node once
>> migration-threshold is reached. Other possibilities are "ignore",
>> "block", "stop", "fence", and "standby".
>>
>> Occasionally, we get requests to have something like migration-threshold
>> for values besides restart. For example, try restarting the resource on
>> the same node 3 times, then fence.
>>
>> I'd like to get your feedback on two alternative approaches we're
>> considering.
>>
>> ###
>>
>> Our first proposed approach would add a new hard-fail-threshold
>> operation property. If specified, the cluster would first try restarting
>> the resource on the same node,
>
> Well, just as now, it would be _allowed_ to start on the same node, but
> this is not guaranteed.
>
>> before doing the on-fail handling.
>>
>> For example, you could configure a promote operation with
>> hard-fail-threshold=3 and on-fail=fence, to fence the node after 3
>> failures.
>>
>> One point that's not settled is whether failures of *any* operation
>> would count toward the 3 failures (which is how migration-threshold
>> works now), or only failures of the specified operation.
>
> I think if hard-fail-threshold is per-op, then only failures of that
> operation should count.

Currently, if a start fails (but is retried successfully), then a promote
fails (but is retried successfully), then a monitor fails, the resource
will move to another node if migration-threshold=3. We could keep that
behavior with hard-fail-threshold, or count only monitor failures toward
monitor's hard-fail-threshold. Each alternative has advantages and
disadvantages.
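As a configuration sketch, the first approach might look something like
this in the CIB. Note that hard-fail-threshold is only proposal syntax at
this point, not a real op attribute; the other attributes are existing
Pacemaker syntax:

```xml
<primitive id="rsc1" class="ocf" provider="heartbeat" type="Stateful">
  <operations>
    <!-- Proposed: after 3 promote failures, do the on-fail handling
         (here, fence the node). "hard-fail-threshold" is hypothetical
         syntax from this proposal. -->
    <op id="rsc1-promote" name="promote" interval="0" timeout="60s"
        on-fail="fence" hard-fail-threshold="3"/>
    <op id="rsc1-monitor" name="monitor" interval="10s" timeout="20s"/>
  </operations>
</primitive>
```

Because the threshold is an op attribute, different operations on the same
resource could get different thresholds and different on-fail handling.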
>> ###
>>
>> The second proposed approach would add a new on-restart-fail resource
>> property.
>>
>> Same as now, on-fail set to anything but restart would be done
>> immediately after the first failure. A new value, "ban", would
>> immediately move the resource to another node. (on-fail=ban would behave
>> like on-fail=restart with migration-threshold=1.)
>>
>> When on-fail=restart, and restarting on the same node doesn't work, the
>> cluster would do the on-restart-fail handling. on-restart-fail would
>> allow the same values as on-fail (minus "restart"), and would default to
>> "ban".
>
> I do wish you well tracking "is this a restart" across demote -> stop ->
> start -> promote in 4 different transitions :-)
>
>> So, if you want to fence immediately after any promote failure, you
>> would still configure on-fail=fence; if you want to try restarting a few
>> times first, you would configure on-fail=restart and
>> on-restart-fail=fence.
>>
>> This approach keeps the current threshold behavior -- failures of any
>> operation count toward the threshold. We'd rename migration-threshold to
>> something like hard-fail-threshold, since it would apply to more than
>> just migration, but unlike the first approach, it would stay a resource
>> property.
>>
>> ###
>>
>> Comparing the two approaches, the first is more flexible, but also more
>> complex and potentially confusing.
>
> More complex to implement or more complex to configure?
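A sketch of the second approach, under the same caveat: on-restart-fail
and the renamed (resource-level) hard-fail-threshold are hypothetical
proposal syntax, while on-fail=restart is existing syntax:

```xml
<primitive id="rsc1" class="ocf" provider="heartbeat" type="Stateful">
  <meta_attributes id="rsc1-meta">
    <!-- Proposed: both remain resource-level properties. After the
         threshold is reached across all operations, fence instead of
         the default "ban". -->
    <nvpair id="rsc1-hft" name="hard-fail-threshold" value="3"/>
    <nvpair id="rsc1-orf" name="on-restart-fail" value="fence"/>
  </meta_attributes>
  <operations>
    <!-- Existing syntax: on-fail=restart retries on the same node -->
    <op id="rsc1-promote" name="promote" interval="0" timeout="60s"
        on-fail="restart"/>
  </operations>
</primitive>
```

Here the per-resource failure counting works as it does today; only the
action taken once restarting stops working becomes configurable.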
I was thinking more complex in behavior, so perhaps harder to follow and
predict. For example: "After two start failures, fence this node; after
three promote failures, put the node in standby; but if a monitor failure
is the third operation failure of any type, then move the resource to
another node." Granted, someone would have to inflict that on themselves
:) but another sysadmin / support tech / etc. who had to deal with the
config later might have trouble following it.

To keep the current default behavior, the default would be complicated,
too: "1 for start and stop operations, and 0 for other operations", where
"0 is equivalent to 1 except when on-fail=restart, in which case
migration-threshold will be used instead".

And then add to that tracking fail-count per node+resource+operation
combination, with the associated status output and cleanup options.
"crm_mon -f" currently shows failures like:

* Node node1: rsc1: migration-threshold=3 fail-count=1
  last-failure='Wed Sep 21 15:12:59 2016'

What should that look like with per-op thresholds and fail-counts?

I'm not saying it's a bad idea, just that it's more complicated than it
first sounds, so it's worth thinking through the implications.

>> With either approach, we would deprecate the start-failure-is-fatal
>> cluster property. start-failure-is-fatal=true would be equivalent to
>> hard-fail-threshold=1 with the first approach, and on-fail=ban with the
>> second approach. This would be both simpler and more useful -- it allows
>> the value to be set differently per resource.
>>
>> --
>> Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org