Re: [ClusterLabs] Antw: Re: crmsh configure delete for constraints

Ferenc Wágner Wed, 10 Feb 2016 02:59:58 -0800

Vladislav Bogdanov <bub...@hoster-ok.com> writes:

> If pacemaker has got an error on start, it will run stop with the same
> set of parameters anyways. And will get error again if that one was
> from validation and RA does not differentiate validation for start and
> stop. And then circular fencing over the whole cluster is triggered
> for no reason.
>
> Of course, for safety, RA could save its state if start was successful
> and skip validation on stop only if that state is not found. Otherwise
> removed binary or config file would result in resource running on
> several nodes.


What would happen if we made the start operation return OCF_NOT_RUNNING
if validation fails?  Or, more broadly, whenever the start operation knows
that the resource is not running, so a stop operation would do no good?
From Pacemaker Explained B.4: "The cluster will not attempt to stop a
resource that returns this for any action."  The probes could still
return OCF_ERR_CONFIGURED, putting real info into the logs, and a stop
failure could still lead to fencing, protecting data integrity, but
circular fencing would not happen.  I hope.
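
To make the idea concrete, here is a minimal sketch of the return-code
logic, written in Python only for brevity (real agents are usually shell
scripts).  The numeric values are the standard OCF return codes; the
"config" parameter, validate() and run_action() are hypothetical names
invented for illustration, and whether Pacemaker reacts to such a start
result the way I hope is exactly the open question.

```python
# Standard OCF return codes (values from the OCF resource agent API).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_CONFIGURED = 6
OCF_NOT_RUNNING = 7


def validate(params):
    """Hypothetical validation: require a non-empty 'config' parameter."""
    return OCF_SUCCESS if params.get("config") else OCF_ERR_CONFIGURED


def run_action(action, params, running=False):
    """Return the OCF code a hypothetical agent would give for an action.

    'running' stands in for whatever real check tells the agent whether
    the resource is active on this node.
    """
    rc = validate(params)

    if action == "start":
        if rc != OCF_SUCCESS:
            # Validation failed before anything was launched, so the
            # agent knows the resource is not running and says so.
            return OCF_NOT_RUNNING
        # The actual launch is omitted in this sketch.
        return OCF_SUCCESS

    if action == "monitor":
        if rc != OCF_SUCCESS:
            # Probes keep reporting the real problem for the logs.
            return rc
        return OCF_SUCCESS if running else OCF_NOT_RUNNING

    if action == "stop":
        if rc != OCF_SUCCESS:
            # A stop that cannot validate still fails, so fencing can
            # protect data integrity.
            return rc
        return OCF_SUCCESS

    return OCF_ERR_GENERIC


if __name__ == "__main__":
    for action in ("start", "monitor", "stop"):
        print(action, run_action(action, {}))
```

With an empty parameter set, only start hides the validation error behind
OCF_NOT_RUNNING; monitor and stop still report OCF_ERR_CONFIGURED.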

By the way, what are the reasons to run stop after a failed start?  To
clean up halfway-started resources?  Besides OCF_ERR_GENERIC, the other
error codes pretty much guarantee that the resource cannot be active.
-- 
Regards,
Feri.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
