I've done some further analysis on why Pacemaker seems to refuse to fail over
the king master resource in the configuration that I described earlier. Based
on analysis of the stored transition files (pe-input.bz2 files), Pacemaker
makes about 8 attempts to perform the desired failover, but on each occasion
the initiated transitions are aborted and the following log is produced:


Jun 28 14:13:35 ctr_qemu pacemaker-controld  [1224] (abort_transition_graph)    
notice: Transition 15 aborted by status-2-master-king_resource doing create 
master-king_resource=5: Transient attribute change | cib=0.4.253 
source=abort_unless_down:331 
path=/cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']
 complete=false
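

I assume the transient attribute named in that log is the promotion score that
our OCF agent sets via crm_master when the killed instance comes back up as a
slave (the value 5 matches the "current slave" score I described earlier). In
the agent that is essentially a call like the following (illustrative only; the
real agent wraps it in more logic):

    # Set this node's promotion score for the instance to the "current slave"
    # value; crm_master derives the resource name from OCF_RESOURCE_INSTANCE.
    crm_master -l reboot -v 5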


Eventually the original killed master manages to restart, and Pacemaker then
just decides to re-promote it to master rather than failing over. Could these
transitions be getting aborted because they take too long to complete? If so,
is there a configuration option I can set to increase the timeout?
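
For reference, this is roughly how I've been replaying the stored transition
files to get the scores and simulated actions (file name and path are just an
example from our system):

    # Replay one stored transition, showing allocation scores and the
    # actions the scheduler would have executed for it.
    crm_simulate --simulate --show-scores \
        --xml-file=/var/lib/pacemaker/pengine/pe-input-123.bz2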


Thanks,

Harvey


________________________________
From: Users <users-boun...@clusterlabs.org> on behalf of Harvey Shepherd 
<harvey.sheph...@aviatnet.com>
Sent: Friday, 28 June 2019 7:36 p.m.
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: EXTERNAL: Re: [ClusterLabs] Problems with master/slave failovers

Thanks for your reply, Andrei. Whilst I understand what you say about the
difficulty of diagnosing issues without all of the information, it's a
compromise between a mailing list posting being so verbose that nobody wants
to read it, and containing enough relevant information for someone to be able
to help. With 20+ resources involved during a failover there are literally
thousands of log lines generated, and it would be pointless to post them all.

I've tried to focus on the king resource only, to keep things simple, as that
is the only resource that can initiate a failover. I provided the real master
scores and transition decisions made by Pacemaker at the times that I killed
the king master resource by showing the crm_simulate output from both tests,
and the CIB config is as described. As I mentioned, migration-threshold is set
to zero for all resources, so it shouldn't prevent a second failover.
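
For example, I've been checking it with something like this (I believe these
crm_resource options are standard, but the exact spelling is from memory):

    # Query the migration-threshold meta attribute on the king resource clone
    crm_resource --resource ms_king_resource --meta --get-parameter migration-threshold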

Regarding the resource agent return codes, the failure is detected by the 10s
monitor operation on the king resource's master instance, which returns
OCF_ERR_GENERIC because the resource is expected to be running and isn't (the
OCF resource agent developer's guide states that monitor should only return
OCF_NOT_RUNNING if there is no error condition that caused the resource to
stop).
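
To illustrate, the master monitor logic is essentially the following sketch
(simplified; the daemon name and the role-check helper are placeholders, not
the real agent code):

    king_resource_monitor() {
        # Placeholder liveness check - the real agent's health check is more involved.
        if ! pgrep -f king-resource-daemon >/dev/null 2>&1; then
            # The resource is expected to be running and isn't, so report a
            # genuine error rather than OCF_NOT_RUNNING, as per the OCF
            # resource agent developer's guide.
            ocf_log err "king resource process not found"
            return $OCF_ERR_GENERIC
        fi
        # Placeholder role check to distinguish master from slave.
        if king_resource_is_master; then
            return $OCF_RUNNING_MASTER
        fi
        return $OCF_SUCCESS
    }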

What would be really helpful is if you or someone else could help me decipher
the crm_simulate output:

1. What is the difference between clone_color and native_color?
2. What is the difference between "promotion scores" and "allocation scores" 
and why does the output show several instances of each?
3. How does Pacemaker use those scores to decide whether to fail over?
4. Why is there a -INFINITY score on one node?

Thanks again for your help.



On 28 Jun 2019 6:46 pm, Andrei Borzenkov <arvidj...@gmail.com> wrote:
On Fri, Jun 28, 2019 at 7:24 AM Harvey Shepherd
<harvey.sheph...@aviatnet.com> wrote:
>
> Hi All,
>
>
> I'm running Pacemaker 2.0.2 on a two node cluster. It runs one master/slave 
> resource (I'll refer to it as the king resource) and about 20 other resources 
> which are a mixture of:
>
>
> - resources that only run on the king resource master node (colocation 
> constraint with a score of INFINITY)
>
> - clone resources that run on both nodes
>
> - two other master/slave resources where the masters run on the same node as 
> the king resource master (colocation constraint with a score of INFINITY)
>
>
> I'll refer to the above set of resources as servant resources.
>
>
> All servant resources have a resource-stickiness of zero and the king 
> resource has a resource-stickiness of 100. There is an ordering constraint 
> that the king resource must start before all servant resources. The king 
> resource is controlled by an OCF script that uses crm_master to set the 
> preferred master for the king resource (current master has value 100, current 
> slave is 5, unassigned role or resource failure is 1) - I've verified that 
> these values are being set as expected upon promotion/demotion/failure etc, 
> via the logs. That's pretty much all of the configuration - there is no 
> configuration around node preferences and migration-threshold is zero for 
> everything.
>
>
> What I'm trying to achieve is fairly simple:
>
>
> 1. If any servant resource fails on either node, it is simply restarted. 
> These resources should never fail over onto the other node because of 
> colocation with the king resource, and they should not contribute in any way 
> to deciding whether the king resource should fail over (which is why they have 
> a resource-stickiness of zero).
>
> 2. If the slave instance of the king resource fails, it should simply be 
> restarted and again no failover should occur.
>
> 3. If the master instance of the king resource fails, then its slave instance 
> should immediately be promoted, and the failed instance should be restarted. 
> Failover of all servant resources should then occur due to the colocation 
> dependency.
>
>
> It's number 3 above that I'm having trouble with. If I kill the master king 
> resource instance it behaves as I expect - everything fails over and the king 
> resource is restarted on the new slave. If I then kill the master instance of 
> the king resource again however, instead of failing back over to its original 
> node, it restarts and promotes back to master on the same node. This is not 
> what I want.
>

migration-threshold is the first thing that comes to mind. Another
possibility is a hard error returned by the resource agent that forces
the resource off the node.

But please realize that without the actual configuration and logs at the
time the undesired behavior happens, it just becomes a game of riddles.

>
> The relevant output from crm_simulate for the two tests is shown below. Can 
> anyone suggest what might be going wrong? Whilst I really like the concept of 
> crm_simulate, I can't find a good description of how to interpret the output 
> and I don't understand the difference between clone_color and native_color, 
> or the difference between "promotion scores" and the various instances of 
> "allocation scores", nor does it really tell me what is contributing to the 
> scores. Where does the -INFINITY allocation score come from for example?
>
>
> Thanks,
>
> Harvey
>
>
>
> FIRST KING RESOURCE MASTER FAILURE (CORRECT BEHAVIOUR - MASTER NODE FAILOVER 
> OCCURS)
>
>
>  Clone Set: ms_king_resource [king_resource] (promotable)
>      king_resource      (ocf::aviat:king-resource-ocf):    FAILED Master 
> secondary
> clone_color: ms_king_resource allocation score on primary: 0
> clone_color: ms_king_resource allocation score on secondary: 0
> clone_color: king_resource:0 allocation score on primary: 0
> clone_color: king_resource:0 allocation score on secondary: 101
> clone_color: king_resource:1 allocation score on primary: 200
> clone_color: king_resource:1 allocation score on secondary: 0
> native_color: king_resource:1 allocation score on primary: 200
> native_color: king_resource:1 allocation score on secondary: 0
> native_color: king_resource:0 allocation score on primary: -INFINITY
> native_color: king_resource:0 allocation score on secondary: 101
> king_resource:1 promotion score on primary: 100
> king_resource:0 promotion score on secondary: 1
>  * Recover    king_resource:0      ( Master -> Slave secondary )
>  * Promote    king_resource:1      (   Slave -> Master primary )
>  * Resource action: king_resource   cancel=10000 on secondary
>  * Resource action: king_resource   cancel=11000 on primary
>  * Pseudo action:   ms_king_resource_pre_notify_demote_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_demote_0
>  * Pseudo action:   ms_king_resource_demote_0
>  * Resource action: king_resource   demote on secondary
>  * Pseudo action:   ms_king_resource_demoted_0
>  * Pseudo action:   ms_king_resource_post_notify_demoted_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_demoted_0
>  * Pseudo action:   ms_king_resource_pre_notify_stop_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
>  * Pseudo action:   ms_king_resource_stop_0
>  * Resource action: king_resource   stop on secondary
>  * Pseudo action:   ms_king_resource_stopped_0
>  * Pseudo action:   ms_king_resource_post_notify_stopped_0
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_stopped_0
>  * Pseudo action:   ms_king_resource_pre_notify_start_0
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
>  * Pseudo action:   ms_king_resource_start_0
>  * Resource action: king_resource   start on secondary
>  * Pseudo action:   ms_king_resource_running_0
>  * Pseudo action:   ms_king_resource_post_notify_running_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
>  * Pseudo action:   ms_king_resource_pre_notify_promote_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
>  * Pseudo action:   ms_king_resource_promote_0
>  * Resource action: king_resource   promote on primary
>  * Pseudo action:   ms_king_resource_promoted_0
>  * Pseudo action:   ms_king_resource_post_notify_promoted_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
>  * Resource action: king_resource   monitor=11000 on secondary
>  * Resource action: king_resource   monitor=10000 on primary
>  Clone Set: ms_king_resource [king_resource] (promotable)
>
>
> SECOND KING RESOURCE MASTER FAILURE (INCORRECT BEHAVIOUR - SAME NODE IS 
> PROMOTED TO MASTER)
>
>
>  Clone Set: ms_king_resource [king_resource] (promotable)
>      king_resource      (ocf::aviat:king-resource-ocf):    FAILED Master 
> primary
> clone_color: ms_king_resource allocation score on primary: 0
> clone_color: ms_king_resource allocation score on secondary: 0
> clone_color: king_resource:0 allocation score on primary: 0
> clone_color: king_resource:0 allocation score on secondary: 200
> clone_color: king_resource:1 allocation score on primary: 101
> clone_color: king_resource:1 allocation score on secondary: 0
> native_color: king_resource:0 allocation score on primary: 0
> native_color: king_resource:0 allocation score on secondary: 200
> native_color: king_resource:1 allocation score on primary: 101
> native_color: king_resource:1 allocation score on secondary: -INFINITY
> king_resource:1 promotion score on primary: 1
> king_resource:0 promotion score on secondary: 1
>  * Recover    king_resource:1     ( Master primary )
>  * Pseudo action:   ms_king_resource_pre_notify_demote_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_demote_0
>  * Pseudo action:   ms_king_resource_demote_0
>  * Resource action: king_resource   demote on primary
>  * Pseudo action:   ms_king_resource_demoted_0
>  * Pseudo action:   ms_king_resource_post_notify_demoted_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_demoted_0
>  * Pseudo action:   ms_king_resource_pre_notify_stop_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
>  * Pseudo action:   ms_king_resource_stop_0
>  * Resource action: king_resource   stop on primary
>  * Pseudo action:   ms_king_resource_stopped_0
>  * Pseudo action:   ms_king_resource_post_notify_stopped_0
>  * Resource action: king_resource   notify on secondary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_stopped_0
>  * Pseudo action:   ms_king_resource_pre_notify_start_0
>  * Resource action: king_resource   notify on secondary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_start_0
>  * Pseudo action:   ms_king_resource_start_0
>  * Resource action: king_resource   start on primary
>  * Pseudo action:   ms_king_resource_running_0
>  * Pseudo action:   ms_king_resource_post_notify_running_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
>  * Pseudo action:   ms_king_resource_pre_notify_promote_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
>  * Pseudo action:   ms_king_resource_promote_0
>  * Resource action: king_resource   promote on primary
>  * Pseudo action:   ms_king_resource_promoted_0
>  * Pseudo action:   ms_king_resource_post_notify_promoted_0
>  * Resource action: king_resource   notify on secondary
>  * Resource action: king_resource   notify on primary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
>  * Resource action: king_resource   monitor=10000 on primary
>  Clone Set: ms_king_resource [king_resource] (promotable)
>
>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
