On 10/10/16 05:51, Eric Robinson wrote:
> I have about a dozen corosync+pacemaker clusters and I am just now getting
> around to understanding timeouts.
>
> Most of my corosync.conf files look something like this:
>
>     version: 2
>     token: 5000
>     token_retransmits_before_loss_const: 10
>     join: 1000
>     consensus: 7500
>     vsftype: none
>     max_messages: 20
>     secauth: off
>     threads: 0
>     clear_node_high_bit: yes
>     rrp_mode: active
>
> If I understand this correctly, this means the node will wait 50 seconds
> (5000ms x 10) before deciding that a cluster reconfig is necessary (perhaps
> after a link failure). Is that correct?
>

No, that's not correct. The token timeout is 5 seconds in your example,
because token is 5000 ms. The token timeout is always whatever the value of
totem.token is.

token_retransmits_before_loss_const affects the token hold timeout, which is
how long the token is held on a node that has no messages to send before
being forwarded on. So increasing token_retransmits_before_loss_const changes
the number of times per 'token' timeout that the token is actually sent.

In the example above, the token is sent approximately every 5000/10 = 500 ms.
That's approximate: the value is scaled slightly to make actual timeouts less
likely, and it is also affected by messages that may need to be sent.

Chrissie

> I'm trying to understand how this works together with my bonded NIC's
> arp_interval settings. I normally set arp_interval=1000. My question is, how
> many arp losses are required before the bonding driver decides to failover to
> the other link? If arp_interval=1000, how many times does the driver send an
> arp and fail to receive a reply before it decides that the link is dead?
>
> I think I need to know this so I can set my corosync.conf settings correctly
> to avoid "false positive" cluster failovers. In other words, if there is a
> link or switch failure, I want to make sure that the cluster allows plenty of
> time for link communication to recover before deciding that a node has
> actually died.
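
To put numbers on the token arithmetic above, here is a rough sketch in
Python. The function name is made up for illustration and is not part of
corosync; it only performs the simple division described in the reply.
Corosync scales the real value slightly, so treat the result as an
approximation, not as the exact figure corosync will use.

```python
def approx_token_send_interval_ms(token_ms: int, retransmits_before_loss: int) -> float:
    """Rough interval between token sends on an idle ring.

    Hypothetical helper: this is just token / token_retransmits_before_loss_const.
    Corosync scales the actual value slightly, so this is an estimate only.
    """
    if token_ms <= 0 or retransmits_before_loss <= 0:
        raise ValueError("both values must be positive")
    return token_ms / retransmits_before_loss


# Values taken from the corosync.conf quoted in the question.
token_ms = 5000      # totem.token: this alone is the token timeout
retransmits = 10     # totem.token_retransmits_before_loss_const

interval = approx_token_send_interval_ms(token_ms, retransmits)
print(f"token timeout: {token_ms} ms")
print(f"approx. token send interval: {interval:.0f} ms")
```

The point of the sketch is that the two settings are divided, not multiplied:
the token timeout stays at 5000 ms regardless of the retransmit count.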
>
> --
> Eric Robinson
>
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>