Each cluster member typically transfers leadership to the same other member, which is the first in its list of servers. This may result in two servers in a 3-node cluster transferring leadership to each other and never to the third one.

Randomize the selection to distribute the load more evenly.  This also
makes cluster failure tests cover more scenarios, as servers will
transfer leadership to servers they didn't before.  This is especially
important for cluster joining tests.

Ideally, we would transfer to a random server with the highest apply
index, but this patch does not try to implement that for now.

Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
---
 ovsdb/raft.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index f463afcb3..25f462431 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -1261,8 +1261,12 @@ raft_transfer_leadership(struct raft *raft, const char *reason)
         return;
     }
 
+    size_t n = hmap_count(&raft->servers) * 3;
     struct raft_server *s;
-    HMAP_FOR_EACH (s, hmap_node, &raft->servers) {
+
+    while (n--) {
+        s = CONTAINER_OF(hmap_random_node(&raft->servers),
+                         struct raft_server, hmap_node);
         if (!uuid_equals(&raft->sid, &s->sid)
             && s->phase == RAFT_PHASE_STABLE) {
             struct raft_conn *conn = raft_find_conn_by_sid(raft, &s->sid);
-- 
2.43.0

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
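
The bounded random selection in the patch can be sketched outside of OVS as
follows.  This is only an illustration under stated assumptions: 'struct
server', its 'stable' flag, and pick_transfer_target() are hypothetical
stand-ins for 'struct raft_server', the RAFT_PHASE_STABLE check, and the loop
in raft_transfer_leadership(), and a plain array with rand() replaces the
hmap and hmap_random_node().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct raft_server'; it carries only the
 * fields that the selection logic needs. */
struct server {
    int id;
    bool stable;    /* Stand-in for s->phase == RAFT_PHASE_STABLE. */
};

/* Makes up to 3 * n random picks and returns the index of the first pick
 * that is neither 'self_id' nor unstable, or -1 if no such pick was made.
 * Picks are made with replacement, so an eligible server can be missed. */
static int
pick_transfer_target(const struct server *servers, size_t n, int self_id)
{
    assert(servers != NULL || n == 0);

    size_t attempts = n * 3;

    while (attempts--) {
        size_t i = (size_t) rand() % n;

        if (servers[i].id != self_id && servers[i].stable) {
            return (int) i;
        }
    }
    return -1;
}
```

Because the picks are made with replacement, the loop may give up even though
an eligible server exists.  In this sketch, assuming uniform picks, a 3-node
cluster with a single eligible server misses it on all nine attempts with
probability (2/3)^9, roughly 2.6%.  A caller would then skip the transfer,
just as when no eligible server exists at all.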