Hi All,

I'm running Pacemaker 2.0.2 on a two-node cluster. It runs one master/slave 
resource (I'll refer to it as the king resource) and about 20 other resources 
which are a mixture of:


- resources that only run on the king resource master node (colocation 
constraint with a score of INFINITY)

- clone resources that run on both nodes

- two other master/slave resources whose masters run on the same node as 
the king resource master (colocation constraint with a score of INFINITY)


I'll refer to the above set of resources as servant resources.


All servant resources have a resource-stickiness of zero and the king resource 
has a resource-stickiness of 100. There is an ordering constraint that the king 
resource must start before all servant resources. The king resource is 
controlled by an OCF script that uses crm_master to set the preferred master 
for the king resource (current master has value 100, current slave is 5, 
unassigned role or resource failure is 1) - I've verified that these values are 
being set as expected upon promotion/demotion/failure etc, via the logs. That's 
pretty much all of the configuration - there is no configuration around node 
preferences and migration-threshold is zero for everything.
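
As an illustration of the promotion-score scheme described above, here is a
minimal Python sketch. It is not the actual OCF script. The role labels and
function names are hypothetical, and the "-l reboot" lifetime is an assumption
based on common crm_master usage in resource agents.

```python
# Promotion scores as described in this post. The role names used as keys
# are labels chosen for this sketch, not identifiers from the real agent.
PROMOTION_SCORES = {"master": 100, "slave": 5}
FALLBACK_SCORE = 1  # unassigned role or resource failure


def promotion_score(role):
    """Return the promotion score for a role, per the scheme above."""
    return PROMOTION_SCORES.get(role, FALLBACK_SCORE)


def crm_master_command(role):
    """Build the crm_master argument list an agent might run for a role.

    Assumption: the score is set with a 'reboot' lifetime; the real
    script may use different options.
    """
    return ["crm_master", "-l", "reboot", "-v", str(promotion_score(role))]


if __name__ == "__main__":
    for role in ("master", "slave", "failed"):
        print(role, " ".join(crm_master_command(role)))
```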


What I'm trying to achieve is fairly simple:


1. If any servant resource fails on either node, it is simply restarted. 
Because they are colocated with the king resource, these resources should 
never fail over to the other node, and they should not contribute in any way 
to deciding whether the king resource should fail over (which is why they have 
a resource-stickiness of zero).

2. If the slave instance of the king resource fails, it should simply be 
restarted and again no failover should occur.

3. If the master instance of the king resource fails, then its slave instance 
should immediately be promoted, and the failed instance should be restarted. 
Failover of all servant resources should then occur due to the colocation 
dependency.


It's number 3 above that I'm having trouble with. If I kill the master instance 
of the king resource, it behaves as I expect: everything fails over and the 
king resource is restarted on the new slave. However, if I then kill the new 
master instance, it does not fail back over to its original node. Instead, it 
restarts and is promoted back to master on the same node. This is not what I 
want.


The relevant output from crm_simulate for the two tests is shown below. Can 
anyone suggest what might be going wrong? I really like the concept of 
crm_simulate, but I can't find a good description of how to interpret its 
output. I don't understand the difference between clone_color and 
native_color, or between "promotion scores" and the various "allocation 
scores", and the output doesn't really tell me what contributes to each score. 
For example, where does the -INFINITY allocation score come from?


Thanks,

Harvey



FIRST KING RESOURCE MASTER FAILURE (CORRECT BEHAVIOUR - MASTER NODE FAILOVER OCCURS)


 Clone Set: ms_king_resource [king_resource] (promotable)
     king_resource      (ocf::aviat:king-resource-ocf):    FAILED Master secondary
clone_color: ms_king_resource allocation score on primary: 0
clone_color: ms_king_resource allocation score on secondary: 0
clone_color: king_resource:0 allocation score on primary: 0
clone_color: king_resource:0 allocation score on secondary: 101
clone_color: king_resource:1 allocation score on primary: 200
clone_color: king_resource:1 allocation score on secondary: 0
native_color: king_resource:1 allocation score on primary: 200
native_color: king_resource:1 allocation score on secondary: 0
native_color: king_resource:0 allocation score on primary: -INFINITY
native_color: king_resource:0 allocation score on secondary: 101
king_resource:1 promotion score on primary: 100
king_resource:0 promotion score on secondary: 1
 * Recover    king_resource:0      ( Master -> Slave secondary )
 * Promote    king_resource:1      (   Slave -> Master primary )
 * Resource action: king_resource   cancel=10000 on secondary
 * Resource action: king_resource   cancel=11000 on primary
 * Pseudo action:   ms_king_resource_pre_notify_demote_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_demote_0
 * Pseudo action:   ms_king_resource_demote_0
 * Resource action: king_resource   demote on secondary
 * Pseudo action:   ms_king_resource_demoted_0
 * Pseudo action:   ms_king_resource_post_notify_demoted_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_demoted_0
 * Pseudo action:   ms_king_resource_pre_notify_stop_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
 * Pseudo action:   ms_king_resource_stop_0
 * Resource action: king_resource   stop on secondary
 * Pseudo action:   ms_king_resource_stopped_0
 * Pseudo action:   ms_king_resource_post_notify_stopped_0
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_stopped_0
 * Pseudo action:   ms_king_resource_pre_notify_start_0
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
 * Pseudo action:   ms_king_resource_start_0
 * Resource action: king_resource   start on secondary
 * Pseudo action:   ms_king_resource_running_0
 * Pseudo action:   ms_king_resource_post_notify_running_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
 * Pseudo action:   ms_king_resource_pre_notify_promote_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
 * Pseudo action:   ms_king_resource_promote_0
 * Resource action: king_resource   promote on primary
 * Pseudo action:   ms_king_resource_promoted_0
 * Pseudo action:   ms_king_resource_post_notify_promoted_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
 * Resource action: king_resource   monitor=11000 on secondary
 * Resource action: king_resource   monitor=10000 on primary
 Clone Set: ms_king_resource [king_resource] (promotable)
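
To compare the allocation score lines in output like the above side by side,
they could be pulled into a lookup table. This is a minimal sketch that assumes
only the line format shown in this post. The function and variable names are
made up for illustration, and scores are kept as text because values such as
-INFINITY are not plain integers.

```python
import re

# Matches lines such as:
#   native_color: king_resource:0 allocation score on primary: -INFINITY
ALLOCATION_RE = re.compile(
    r"^(?P<stage>\w+): (?P<resource>\S+) allocation score on "
    r"(?P<node>\S+): (?P<score>\S+)$"
)


def parse_allocation_scores(text):
    """Return {(stage, resource, node): score} for every allocation line."""
    scores = {}
    for line in text.splitlines():
        match = ALLOCATION_RE.match(line.strip())
        if match:
            key = (match["stage"], match["resource"], match["node"])
            scores[key] = match["score"]  # kept as text: may be -INFINITY
    return scores


# A few lines copied from the first test's output.
SAMPLE = """\
clone_color: king_resource:0 allocation score on secondary: 101
native_color: king_resource:0 allocation score on primary: -INFINITY
native_color: king_resource:0 allocation score on secondary: 101
king_resource:1 promotion score on primary: 100
"""

if __name__ == "__main__":
    for key, score in sorted(parse_allocation_scores(SAMPLE).items()):
        print(*key, score)
```

Promotion score lines have a different shape and are deliberately ignored by
this pattern.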



SECOND KING RESOURCE MASTER FAILURE (INCORRECT BEHAVIOUR - SAME NODE IS PROMOTED TO MASTER)


 Clone Set: ms_king_resource [king_resource] (promotable)
     king_resource      (ocf::aviat:king-resource-ocf):    FAILED Master primary
clone_color: ms_king_resource allocation score on primary: 0
clone_color: ms_king_resource allocation score on secondary: 0
clone_color: king_resource:0 allocation score on primary: 0
clone_color: king_resource:0 allocation score on secondary: 200
clone_color: king_resource:1 allocation score on primary: 101
clone_color: king_resource:1 allocation score on secondary: 0
native_color: king_resource:0 allocation score on primary: 0
native_color: king_resource:0 allocation score on secondary: 200
native_color: king_resource:1 allocation score on primary: 101
native_color: king_resource:1 allocation score on secondary: -INFINITY
king_resource:1 promotion score on primary: 1
king_resource:0 promotion score on secondary: 1
 * Recover    king_resource:1     ( Master primary )
 * Pseudo action:   ms_king_resource_pre_notify_demote_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_demote_0
 * Pseudo action:   ms_king_resource_demote_0
 * Resource action: king_resource   demote on primary
 * Pseudo action:   ms_king_resource_demoted_0
 * Pseudo action:   ms_king_resource_post_notify_demoted_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_demoted_0
 * Pseudo action:   ms_king_resource_pre_notify_stop_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
 * Pseudo action:   ms_king_resource_stop_0
 * Resource action: king_resource   stop on primary
 * Pseudo action:   ms_king_resource_stopped_0
 * Pseudo action:   ms_king_resource_post_notify_stopped_0
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_stopped_0
 * Pseudo action:   ms_king_resource_pre_notify_start_0
 * Resource action: king_resource   notify on secondary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
 * Pseudo action:   ms_king_resource_start_0
 * Resource action: king_resource   start on primary
 * Pseudo action:   ms_king_resource_running_0
 * Pseudo action:   ms_king_resource_post_notify_running_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
 * Pseudo action:   ms_king_resource_pre_notify_promote_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
 * Pseudo action:   ms_king_resource_promote_0
 * Resource action: king_resource   promote on primary
 * Pseudo action:   ms_king_resource_promoted_0
 * Pseudo action:   ms_king_resource_post_notify_promoted_0
 * Resource action: king_resource   notify on secondary
 * Resource action: king_resource   notify on primary
 * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
 * Resource action: king_resource   monitor=10000 on primary
 Clone Set: ms_king_resource [king_resource] (promotable)
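
One difference visible between the two outputs is in the promotion scores:
100 versus 1 in the first test, but 1 versus 1 in the second. Whether that tie
explains the behaviour is not established here. The sketch below only extracts
the promotion score lines and flags a tie. It assumes the line format shown in
this post and integer scores, and its function names are hypothetical.

```python
import re

# Matches lines such as: king_resource:1 promotion score on primary: 1
PROMOTION_RE = re.compile(
    r"^(?P<resource>\S+) promotion score on (?P<node>\S+): (?P<score>-?\d+)$"
)


def promotion_scores(text):
    """Return {node: score} from the promotion score lines in text."""
    scores = {}
    for line in text.splitlines():
        match = PROMOTION_RE.match(line.strip())
        if match:
            scores[match["node"]] = int(match["score"])
    return scores


def is_tied(scores):
    """True when more than one node is listed and all scores are equal."""
    return len(scores) > 1 and len(set(scores.values())) == 1


# Promotion score lines copied from the two tests in this post.
FIRST_FAILURE = """\
king_resource:1 promotion score on primary: 100
king_resource:0 promotion score on secondary: 1
"""
SECOND_FAILURE = """\
king_resource:1 promotion score on primary: 1
king_resource:0 promotion score on secondary: 1
"""

if __name__ == "__main__":
    for label, text in (("first", FIRST_FAILURE), ("second", SECOND_FAILURE)):
        scores = promotion_scores(text)
        print(label, scores, "tied" if is_tied(scores) else "not tied")
```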

