On 1/5/26 8:51 PM, Han Zhou wrote: > > > On Wed, Dec 17, 2025 at 7:14 AM Ilya Maximets <[email protected] > <mailto:[email protected]>> wrote: >> >> Section 4.2.3 "Disruptive Servers" says: >> >> ...if a server receives a RequestVote request within the minimum >> election timeout of hearing from a current leader, it does not >> update its term or grant its vote. It can either drop the request, >> reply with a vote denial, or delay the request; the result is >> essentially the same... >> >> However, while logic in the code does make it skip the term update, it >> still proceeds to reply with the vote, because when the suppression >> check returns true, we'll not enter the 'if' block and will proceed to >> the RPC 'switch' instead. >> >> It will reply with a vote for the candidate it already voted for on >> this term for an actual vote and it will vote for the requester on >> the pre-vote. This effectively bypasses the pre-vote scheme as the >> disruptive server will win the pre-vote and proceed to the regular >> election. > > Thanks Ilya for fixing this. > > This was surprising to me. I checked in more detail and now I see there > is another defect that actually lets the pre-vote succeed. In > raft_handle_vote_request__(), it returns true early because > "!uuid_is_zero(&raft->vote)" before actually comparing which log end is > more up-to-date (usually the disruptive server's log is behind and should > fail the prevote). So I think if rq->is_prevote we should compare the logs > first. Of course this should be a separate patch.
Yeah, that's an interesting one. Though, in practice, I'm not sure it matters much. As we'll get to the raft_handle_vote_request__() only if the raft_should_suppress_disruptive_server() passed. That means that it's a candidate or a follower that didn't receive a heartbeat for the whole election interval. So, that cluster must be already not healthy for the server with a sorter log to win a pre-vote. It's probably still a good thing to address this, I agree. > >> And while the disruptor will not win the actual vote, it >> has a term +1 and will reply with a "stale term" to the next append >> request, forcing the current leader to step down. >> >> Fix that by actually not responding to the disruptive vote requests, >> i.e. by taking the "drop the request" route. >> >> The test is updated with a new failure model where vote requests are >> actually getting sent out. There is a value in testing the full RPC >> stop as well, so keeping the current checks as they are. >> >> The FT_FORCE_ELECTION enum entry is not actually needed, but the code >> feels awkward without it. > > This is not very important but I think it may be better to avoid this > enum because when the failure_test stays FT_FORCE_ELECTION, some would > assume the behavior would keep forcing elections until it is set back > to FT_NO_TEST, but it is actually a one-time test. Tests such as > FT_TRANSFER_LEADERSHIP automatically sets failure_test back to FT_NO_TEST, > but this one doesn't. It would be consistent if the code sets it back to > FT_NO_TEST but then again the FT_FORCE_ELECTION is useless. So, probably > we should just set failure_test to FT_NO_TEST directly, and it is also > clear from the code that the test is carried out directly. But I don't > have a strong opinion if you want to keep it as is. Makes sense, I dropped the enum. In the grand scheme this is not really important. And yes, any user of this functionality is expected to know the code anyway. > > Acked-by: Han Zhou <[email protected]> Thanks! With the enum dropped, I applied the change and backported down to 3.3. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
