On 13/08/19 09:44 +0200, Ulrich Windl wrote:
> >>> Harvey Shepherd <harvey.sheph...@aviatnet.com> wrote on 12.08.2019 at
> >>> 23:38 in message
> >>> <ec767e3d-0cde-42c2-a8de-72ffce859...@email.android.com>:
> > I've been experiencing exactly the same issue. Pacemaker prioritises
> > restarting the failed resource over maintaining a master instance. In
> > my case I used crm_simulate to analyse the actions planned and taken
> > by pacemaker during resource recovery. It showed that the system did
> > plan to fail over the master instance, but that was near the bottom of
> > the action list. Higher priority was given to restarting the failed
> > instance; consequently, once that had happened, it was easier just to
> > promote the same instance rather than failing over.
>
> That's interesting: maybe it is usually actually faster to restart a
> failed (master) process than to promote a slave to master, possibly
> demote the old master to slave, etc.
>
> But most obviously, while there is a (possible) resource utilization
> for resources, there is none for operations (AFAIK): if one could
> configure "operation costs" (maybe as rules), the cluster could prefer
> the transition with the least total cost. Unfortunately that would make
> things more complicated.
>
> I could even imagine that if you set the cost for "stop" to infinity,
> the cluster would not even try to stop the resource, but would fence
> the node instead...
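Ulrich's least-cost idea can be sketched abstractly. To be clear, this is not Pacemaker code and Pacemaker has no "operation cost" setting today; the operation names and cost values below are made up purely to illustrate how an infinite "stop" cost would tip the scheduler toward fencing:

```python
# Illustrative sketch only: pick the recovery transition with the least
# total operation cost. All op names and costs here are hypothetical.
INF = float("inf")

# Hypothetical per-operation costs (imagine these being set via rules).
op_costs = {"start": 1, "stop": 1, "promote": 2, "demote": 2, "fence": 10}

def transition_cost(ops):
    """Total cost of a transition, i.e. a sequence of operations."""
    return sum(op_costs[op] for op in ops)

def cheapest(transitions):
    """Choose the candidate transition with the least total cost."""
    return min(transitions, key=transition_cost)

recover_by_restart = ["stop", "start"]   # stop the failed instance, restart it
recover_by_fencing = ["fence", "start"]  # fence the node, start elsewhere

# With ordinary costs, restarting in place wins (2 vs. 11):
assert cheapest([recover_by_restart, recover_by_fencing]) == recover_by_restart

# Set the "stop" cost to infinity and fencing becomes the cheaper transition,
# matching the thought experiment at the end of the mail:
op_costs["stop"] = INF
assert cheapest([recover_by_restart, recover_by_fencing]) == recover_by_fencing
```

The sketch deliberately ignores everything that makes the real problem hard (ordering constraints, per-resource rules, cluster-wide state), which is exactly the complexity the mail warns about.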
Very courageous, and highly nontrivial once you think about the
scalability impact. (While at it: not that this couldn't be mitigated to
some extent, e.g. by switching the single-brain/DC model to a segmented
multi-leader approach combined with hierarchical scheduling -- as the
total count goes up, there are usually some clusters [pun intended] of
resources, rather than each one interacting with all the others.)

Anyway, thanks for sharing the ideas, Ulrich, not just now :-)

-- 
Jan (Poki)
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/