Re: [ClusterLabs] Establishing Timeouts

Klaus Wenninger Mon, 10 Oct 2016 11:08:09 -0700

On 10/10/2016 06:58 PM, Eric Robinson wrote:
> Thanks for the clarification. So what's the easiest way to ensure that the 
> cluster waits a desired timeout before deciding that a re-convergence is 
> necessary?


By raising the token (lost) timeout I would say.

Please correct me (Chrissie), but I see the
token (lost) timeout as resilience against
static delays plus jitter on top, and
token_retransmits_before_loss_const as resilience
against packet loss.
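
As a rough sketch only (the numbers here are my own illustration, not a
recommendation for your network), raising the token timeout from 5 to
10 seconds would look something like this in the totem section of
corosync.conf. As far as I know, consensus has to be at least
1.2 * token, so the 7500 from your current config would need to go up
as well:

```
totem {
        version:        2
        # token (lost) timeout in ms; illustrative value, was 5000
        token:          10000
        # unchanged; per Chrissie's explanation the token would then be
        # sent roughly every 10000/10 = 1000 ms
        token_retransmits_before_loss_const: 10
        # assumed minimum of 1.2 * token; was 7500
        consensus:      12000
}
```

The remaining options from your file would stay as they are. Whether
10 seconds is enough depends on how long your bond actually takes to
fail over, which I can't tell from here.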

>
> --
> Eric Robinson
>    
>
> -----Original Message-----
> From: Christine Caulfield [mailto:ccaul...@redhat.com] 
> Sent: Monday, October 10, 2016 4:34 AM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Establishing Timeouts
>
> On 10/10/16 05:51, Eric Robinson wrote:
>> I have about a dozen corosync+pacemaker clusters and I am just now getting 
>> around to understanding timeouts.
>>
>> Most of my corosync.conf files look something like this:
>>
>>         version:        2
>>         token:          5000
>>         token_retransmits_before_loss_const: 10
>>         join:           1000
>>         consensus:      7500
>>         vsftype:        none
>>         max_messages:   20
>>         secauth:        off
>>         threads:        0
>>         clear_node_high_bit: yes
>>         rrp_mode: active
>>
>> If I understand this correctly, this means the node will wait 50 seconds 
>> (5000ms x 10) before deciding that a cluster reconfig is necessary (perhaps 
>> after a link failure). Is that correct?
>>
> No, that's not correct. The token timeout is 5 seconds in your example, 
> because token is 5000 ms. The token timeout is always whatever the value of 
> totem.token is.
>
> token_retransmits_before_loss_const affects the token hold timeout - which is 
> how long the token is held on a node that has no messages to send before 
> being forwarded on. So increasing token_retransmits_before_loss_const changes 
> the number of times per 'token' timeout that the token is actually sent.
>
> In the example above you will see that the token is sent approximately
> every 5000/10 = 500 ms. That's approximate: the value is scaled slightly to 
> make actual timeouts less likely, and it is also affected by messages that 
> may need to be sent.
>
> Chrissie
>
>> I'm trying to understand how this works together with my bonded NIC's 
>> arp_interval settings. I normally set arp_interval=1000. My question is, how 
>> many arp losses are required before the bonding driver decides to failover 
>> to the other link? If arp_interval=1000, how many times does the driver send 
>> an arp and fail to receive a reply before it decides that the link is dead?
>>
>> I think I need to know this so I can set my corosync.conf settings correctly 
>> to avoid "false positive" cluster failovers. In other words, if there is a 
>> link or switch failure, I want to make sure that the cluster allows plenty 
>> of time for link communication to recover before deciding that a node has 
>> actually died. 
>>
>> --
>> Eric Robinson
>>
>>
>> _______________________________________________
>> Users mailing list: Users@clusterlabs.org 
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org Getting started: 
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>



