On Mon, 16 May 2016 13:15:11 +1000, Andrew Beekhof <[email protected]> wrote:
>
> > On 28 Apr 2016, at 7:26 PM, Jehan-Guillaume de Rorthais <[email protected]>
> > wrote:
> >
> > Hello all,
> >
> > According to the developers guide, when calling demote on a stopped
> > resource, the RA should return a soft error:
> >
> > http://www.linux-ha.org/doc/dev-guides/_literal_demote_literal_action.html
> >
> > «
> > foobar_monitor
> > rc=$?
> > case "$rc" in
> >     [...]
> >     "$OCF_NOT_RUNNING")
> >         # Currently not running. Getting a demote action
> >         # in this state is unexpected. Exit with an error
> >         # and let the cluster manager recover.
> >         ocf_log err "Resource is currently not running"
> >         exit $OCF_ERR_GENERIC
> >         ;;
> >     [...]
> > »
> >
> > But to recover a master resource that is found not running, PEngine
> > produces a transition with the following actions:
> > demote -> stop -> start -> promote.
> >
> > If we follow the dev guide, the recover action is not possible on a
> > stopped master, as the first action of the transition will always fail,
> > leading to a migration and a -inf score on the old master node.
> >
> > My first thought was «why do a demote -> stop that breaks everything
> > when it knows the resource is already stopped?!»
> >
> > If I understand correctly, I guess PEngine **must** produce such a
> > transition so the notify actions are triggered, should other living
> > clones need to process them. Is that right?
>
> Yes, also because in theory there could be some cleanup that needs to happen.
>
> > If this is right, then maybe we should relax a bit what is
> > written in the ocf dev guide?
>
> I would change that block to use
>
>     exit $OCF_NOT_RUNNING
>
> Because we don’t know for sure that the stop will happen

I suppose returning OCF_NOT_RUNNING from the demote action would break the
current transition, as the CRM is expecting an OCF_SUCCESS, wouldn't it? Or
does the CRM conclude that it does not need to run the next stop action?
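
For reference, here is a minimal sketch of the demote logic with the
proposed change applied. It only illustrates which code each state would
return; it says nothing about how the CRM reacts to it, which is the open
question above. The foobar_* functions and the FOOBAR_STATE variable are
hypothetical stand-ins, the numeric values are the standard OCF return
codes, and the function uses `return` and `echo` where a real agent would
use `exit` and `ocf_log`, so that it can be run on its own.

```shell
#!/bin/sh
# Standard OCF return codes, defined here so the sketch is self-contained.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8

# Hypothetical monitor stub: reports the state stored in FOOBAR_STATE.
# A real agent would probe the service instead.
foobar_monitor() {
    return "${FOOBAR_STATE:-$OCF_NOT_RUNNING}"
}

# Demote action with the proposed change: a stopped resource is reported
# as OCF_NOT_RUNNING instead of OCF_ERR_GENERIC.
foobar_demote() {
    foobar_monitor
    rc=$?
    case "$rc" in
        "$OCF_RUNNING_MASTER")
            # Running as master: a real agent would demote here.
            FOOBAR_STATE=$OCF_SUCCESS
            return "$OCF_SUCCESS"
            ;;
        "$OCF_SUCCESS")
            # Already running as a slave: nothing to do.
            return "$OCF_SUCCESS"
            ;;
        "$OCF_NOT_RUNNING")
            # Not running: report the actual state to the cluster manager.
            echo "Resource is currently not running" >&2
            return "$OCF_NOT_RUNNING"
            ;;
        *)
            # Any other monitor result is treated as a generic failure.
            return "$OCF_ERR_GENERIC"
            ;;
    esac
}
```

With FOOBAR_STATE set to $OCF_NOT_RUNNING, foobar_demote returns 7 where
the dev guide version would return 1.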

I am worried about breaking a transition, as we rely on the notify vars to
detect the recovery of a slave, the recovery of a master, or a master move.
For a master or a slave recovery, we need to run some cleanup actions on the
PostgreSQL side. If we break the original transition, the new transition
**might** (if it is actually different) look like a normal master
start -> promote.

Regards,
_______________________________________________
Developers mailing list
[email protected]
http://clusterlabs.org/mailman/listinfo/developers
