Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

Jan Friesse Thu, 09 Apr 2020 03:32:22 -0700

Andrei Borzenkov wrote:

08.04.2020 10:12, Jan Friesse wrote:

Sherrard,

i could not determine which of these sub-threads to include this in,
so i am going to (reluctantly) top-post it.

i switched the transport to udp, and in limited testing i seem to not
be hitting the race condition. of course i have no idea whether this
will behave consistently, or which part of the knet vs udp setup makes
the most difference.

ie, is it the overhead of the crypto handshakes/setup? is there some
other knet layer that imparts additional delay in establishing
connection to other nodes? is the delay on the rebooted node, the
standing node, or both?


Very high level, this is what happens in corosync when using udpu:
- Corosync starts in the gather state and sends a "multicast" message
(emulated by unicast to all expected members) saying "I'm here and this
is my view of live nodes".
- In this state, corosync waits for answers.
- When a node receives this message, it "multicasts" the same message
with its updated view of live nodes.
- After all nodes agree, they move to the next state (commit/recovery
and finally operational).

With udp, this happens almost instantly, so most of the time corosync
doesn't even create a single-node membership. That membership would be
created only if no other nodes existed and/or replies weren't delivered
in time.
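
The exchange described above can be sketched roughly as follows. This is only a toy model, not how totemsrp is actually implemented: the function name `gather`, the `can_deliver` callback and the set-based "view" are assumptions made for illustration, and the sketch re-sends a view only when it has changed.

```python
from collections import deque

def gather(node_ids, can_deliver):
    """Simulate the view exchange; return each node's final view of live nodes.

    can_deliver(src, dst) -> bool models whether a join message gets through.
    """
    views = {n: {n} for n in node_ids}  # every node starts knowing only itself
    # Each node initially "multicasts" its own view.
    queue = deque((n, frozenset(views[n])) for n in node_ids)
    while queue:
        src, view = queue.popleft()
        for dst in node_ids:
            if dst == src or not can_deliver(src, dst):
                continue
            merged = views[dst] | view
            if merged != views[dst]:
                views[dst] = merged
                # The receiver re-sends its updated view to everyone.
                queue.append((dst, frozenset(merged)))
    return views

# All messages delivered: every node ends up with the full membership.
print(gather([1, 2, 3], lambda src, dst: True))

# Node 3 cannot talk to anyone yet: it is left with a single-node view.
print(gather([1, 2, 3], lambda src, dst: 3 not in (src, dst)))
```

In the second call nodes 1 and 2 agree on each other, while node 3 only ever sees itself, which corresponds to the single-node membership mentioned above.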


Is it possible to delay "creating single node membership" until some
reasonable initial timeout after corosync starts, to ensure the node's
view of the cluster is up to date? It is clear that there will always be
some corner cases, but at least this would make an "obviously correct"
configuration behave as expected.

The thing is, totemsrp begins by creating a single-node membership. It
has to start somewhere. Of course the question is whether it would make
sense to slow down a bit on startup to create a "better" membership. I
would say so, and it is something I'm considering as a TODO.

Corosync must already have a timeout for declaring peers unreachable - it
sounds like the most logical one to use in this case.

It does, the join timeout, but enlarging it will generally slow failure
detection/recovery.
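
To make the trade-off concrete, here is a small sketch of what a node would see when its join timer fires. The function name and all timings are hypothetical; 50 ms is used because I believe it matches the documented default of the totem `join` option, but please check corosync.conf(5) rather than trusting this.

```python
def membership_at_join_timeout(self_id, reply_arrival_ms, join_timeout_ms):
    """Return the sorted membership a node would form when its join timer fires.

    reply_arrival_ms maps peer id -> time (ms after startup) its reply arrived.
    """
    members = {self_id}
    for peer, arrival in reply_arrival_ms.items():
        if arrival <= join_timeout_ms:
            members.add(peer)
    return sorted(members)

# Hypothetical timings: with plain UDP the replies arrive almost immediately,
# with knet they arrive only after the link has been marked active.
udp_replies = {2: 1, 3: 2}
knet_replies = {2: 800, 3: 820}

print(membership_at_join_timeout(1, udp_replies, 50))     # [1, 2, 3]
print(membership_at_join_timeout(1, knet_replies, 50))    # [1]
print(membership_at_join_timeout(1, knet_replies, 1000))  # [1, 2, 3]
```

The last call gets the full membership on the first try, but the same larger timeout would also apply whenever a peer really is gone, which is the slower failure detection/recovery mentioned above.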


Knet adds a layer that monitors the links between each pair of nodes,
and it marks a link active only after it has received the configured
number of "pong" packets. The idea is to have evidence of a reasonably
stable link. As long as the link is not active, no data packets go
through (corosync traffic is just "data"). This basically means that the
initial corosync "multicast" is not delivered to the other nodes, so
corosync creates a single-node membership. After the link becomes
active, the "multicast" is delivered to the other nodes and they move to
the gather state.
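
A minimal sketch of that gating behaviour, assuming a simple counter per link. The class `Link` and its methods are made up for illustration and are not the libknet API. The `pong_count` parameter loosely mirrors what I understand the `knet_pong_count` setting in corosync.conf to control, and the value 2 is just an example.

```python
class Link:
    """Toy model of a monitored link that passes data only once it is active."""

    def __init__(self, pong_count=2):
        self.pong_count = pong_count  # pongs required before the link is active
        self.pongs = 0
        self.delivered = []
        self.dropped = 0

    @property
    def active(self):
        return self.pongs >= self.pong_count

    def on_pong(self):
        self.pongs += 1

    def on_ping_timeout(self):
        # Assumption: a missed pong puts the link back into the inactive state.
        self.pongs = 0

    def send_data(self, packet):
        """Deliver the packet if the link is active, otherwise drop it."""
        if self.active:
            self.delivered.append(packet)
            return True
        self.dropped += 1
        return False

link = Link(pong_count=2)
print(link.send_data("join #1"))  # False: link not active yet, packet is lost
link.on_pong()
link.on_pong()
print(link.send_data("join #2"))  # True: link is active, packet goes through
```

The first join message never reaches the peer, so from the sender's point of view nobody answered, and it ends up alone until a later message gets through.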


I would expect a "reasonable timeout" to also take the knet delay into
account.

So to answer your question: the "delay" is on both nodes' side, because
the link is not established between the nodes.


knet was expected to improve things, wasn't it? :)

And I believe it does :) Actually, it now behaves more "correctly" (read
as "as the specification says") than before. Anyway, I got the point;
it's in the TODO (https://github.com/corosync/corosync/issues/549).


Honza

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
