> Today's Topics:
>
>    1. Re: Pacemaker auto restarts disabled groups (Ian Underhill)
>    2. Re: Pacemaker auto restarts disabled groups (Ken Gaillot)
>
> ------
>
> Message: 1
> Date: Thu, 8 Nov 2018 12:14:33
It seems this issue has been raised before, but the thread went quiet with
no solution:
https://lists.clusterlabs.org/pipermail/users/2017-October/006544.html

I know my resource agents successfully return the correct status to the
start/stop/monitor requests.
On Thu, Nov 8, 2018 at 11:40 AM Ian Underhill wrote:
Sometimes I'm seeing that a resource group that is in the process of being
disabled is auto-restarted by Pacemaker.

When issuing pcs disable commands for different resource groups at the
same time (on different nodes, at the group level), the result is that
sometimes a resource is stopped and then automatically restarted.
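For reference, a minimal sketch of the kind of concurrent disable that
triggers this; the group and node names (groupA/groupB, node1/node2) are
assumptions, not the actual configuration:

    # On node1 (hypothetical name), disable the first group
    pcs resource disable groupA

    # At roughly the same time on node2, disable the second group
    pcs resource disable groupB

    # Afterwards, check whether anything restarted instead of staying stopped
    pcs status resources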
I'm trying to design a resource layout that has different "dislike"
colocation scores between the various resources within the cluster.

1) When I start to have multiple colocation dependencies from a single
resource, strange behaviour starts to happen in scenarios where resources
have to bunch together on one node. I'm guessing this is just a "feature",
but it is something that will probably stop me using groups.
Scenario 1 (working):
1) Two nodes (1,2) within a cluster (default-stickiness = INFINITY)
2) Two resources (A,B) in the cluster, running on different nodes
3) A colocation constraint between resources A->B
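A sketch of how Scenario 1 could be built with pcs; the negative score is
an assumption based on the "dislike" wording above, and the Dummy agent is
just a stand-in:

    # Cluster-wide stickiness, matching "default-stickiness = INFINITY"
    pcs resource defaults resource-stickiness=INFINITY

    # Two placeholder resources standing in for A and B
    pcs resource create A ocf:pacemaker:Dummy
    pcs resource create B ocf:pacemaker:Dummy

    # "Dislike" colocation A->B; the -100 score is an assumption
    # (a score of -INFINITY would make the separation mandatory)
    pcs constraint colocation add A with B score=-100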
When a resource fails on a node, I would like to mark the node unhealthy so
that other resources don't start up on it. I believe I can achieve this,
ignoring the concept of fencing for the moment.

I have tried to set my cluster's node-health-strategy to only_green.
However trying to manually
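A minimal sketch of that setup; the attribute name "#health-check" and the
node name are assumptions (node health attributes just need the "#health"
prefix):

    # Only nodes whose health attributes are all green may run resources
    pcs property set node-health-strategy=only_green

    # Manually mark a node unhealthy via a "#health-*" node attribute
    # (the name "#health-check" is hypothetical)
    crm_attribute --node node1 --name "#health-check" --update red

    # Setting it back to green makes the node eligible again
    crm_attribute --node node1 --name "#health-check" --update green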
I'm trying to understand the behaviour of Pacemaker when a resource monitor
returns OCF_NOT_RUNNING instead of OCF_ERR_GENERIC, and whether Pacemaker
really cares.

The documentation states that a return code of OCF_NOT_RUNNING from a
monitor will not result in a stop being called on that resource, as it is
already considered cleanly stopped.
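To make the distinction concrete, a sketch of a monitor action in an OCF
agent; the agent name and pidfile path are hypothetical:

    #!/bin/sh
    # ocf-shellfuncs defines the OCF_* return codes used below
    : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
    . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

    PIDFILE=/var/run/myapp.pid    # hypothetical path

    myapp_monitor() {
        # No pidfile: report "cleanly stopped" (OCF_NOT_RUNNING, rc 7);
        # Pacemaker then has no reason to call stop before recovery
        [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING

        # Pidfile present and process alive: running (OCF_SUCCESS, rc 0)
        kill -0 "$(cat "$PIDFILE")" 2>/dev/null && return $OCF_SUCCESS

        # Pidfile left behind but process gone: hard failure
        # (OCF_ERR_GENERIC, rc 1), which does trigger stop-then-start
        return $OCF_ERR_GENERIC
    }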
Requirement:
When a resource fails, perform an action: run a script on all nodes within
the cluster before the resource is relocated, i.e. information gathering on
why the resource failed.
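One mechanism in this area is Pacemaker's alert agents (available since
1.1.15), which the cluster invokes on resource operation events; the script
path and contents below are assumptions, and on which node(s) the agent
actually runs for a given event is worth verifying, so cluster-wide
gathering may still need a fan-out step such as the SSH approach mentioned
below:

    #!/bin/sh
    # /usr/local/bin/gather_info.sh (hypothetical path), registered with:
    #   pcs alert create path=/usr/local/bin/gather_info.sh
    # Pacemaker exports CRM_alert_* variables describing each event
    if [ "$CRM_alert_kind" = "resource" ] && [ "$CRM_alert_rc" != "0" ]; then
        logger "op $CRM_alert_task on $CRM_alert_rsc failed (rc=$CRM_alert_rc) on $CRM_alert_node"
        # Gather diagnostics here; fanning out to other nodes would need
        # something like ssh (an assumption about the environment)
    fi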
What I have looked into:
1) Using the monitor call within the resource agent to SSH to all nodes;
again, SSH config