On Thu, Mar 27, 2025 at 6:11 PM Ilya Maximets <i.maxim...@ovn.org> wrote: > > When the cluster falls apart, there is no leader and so there are no > heartbeats that need to be sent or received on time. In this case > the RAFT_TIMER_THRESHOLD is too small, as the process will only wake > up for a full election timer. When this happens, cooperative > multitasking code will emit warnings and traces thinking there is a > timer overrun. > > Fix that by setting the multitasking interval according to the full > election timer, if there is no known leader. > > Fixes: d4a15647b917 ("ovsdb: raft: Enable cooperative multitasking.") > Signed-off-by: Ilya Maximets <i.maxim...@ovn.org> > --- > > Cc: Frode Nordahl <fnord...@ubuntu.com> > > ovsdb/raft.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/ovsdb/raft.c b/ovsdb/raft.c > index 9c3c351b5..a9da28780 100644 > --- a/ovsdb/raft.c > +++ b/ovsdb/raft.c > @@ -2170,9 +2170,14 @@ raft_run(struct raft *raft) > raft_reset_ping_timer(raft); > } > > - uint64_t interval = raft->joining > - ? RAFT_JOIN_TIMEOUT_MS > - : RAFT_TIMER_THRESHOLD(raft->election_timer); > + uint64_t interval = RAFT_TIMER_THRESHOLD(raft->election_timer);
Why not set interval in an else clause? -M > + > + if (raft->joining) { > + interval = RAFT_JOIN_TIMEOUT_MS; > + } else if (uuid_is_zero(&raft->leader_sid)) { > + /* There are no heartbeats to handle when there is no leader. */ > + interval = raft->election_timer; > + } > cooperative_multitasking_set( > &raft_run_cb, (void *) raft, time_msec(), > interval + interval / 10, "raft_run"); > -- > 2.47.0 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev