On 1/5/26 8:51 PM, Han Zhou wrote:
> 
> 
> On Wed, Dec 17, 2025 at 7:14 AM Ilya Maximets <[email protected] 
> <mailto:[email protected]>> wrote:
>>
>>  Section 4.2.3 "Disruptive Servers" says:
>>
>>     ...if a server receives a RequestVote request within the minimum
>>     election timeout of hearing from a current leader, it does not
>>     update its term or grant its vote.  It can either drop the request,
>>     reply with a vote denial, or delay the request; the result is
>>     essentially the same...
>>
>> However, while logic in the code does make it skip the term update, it
>> still proceeds to reply with the vote, because when the suppression
>> check returns true, we'll not enter the 'if' block and will proceed to
>> the RPC 'switch' instead.
>>
>> It will reply with a vote for the candidate it already voted for on
>> this term for an actual vote and it will vote for the requester on
>> the pre-vote.  This effectively bypasses the pre-vote scheme as the
>> disruptive server will win the pre-vote and proceed to the regular
>> election.  
> 
> Thanks Ilya for fixing this.
> 
> This was surprising to me. I checked in more detail and now I see there
> is another defect that actually lets the pre-vote succeed. In
> raft_handle_vote_request__(), it returns true early because
> "!uuid_is_zero(&raft->vote)" before actually comparing which log end is
> more up-to-date (usually the disruptive server's log is behind and should
> fail the prevote). So I think if rq->is_prevote we should compare the logs
> first. Of course this should be a separate patch.

Yeah, that's an interesting one.  Though, in practice, I'm not sure it
matters much.  As we'll get to the raft_handle_vote_request__() only if
the raft_should_suppress_disruptive_server() passed.  That means that
it's a candidate or a follower that didn't receive a heartbeat for the
whole election interval.  So, that cluster must be already not healthy
for the server with a sorter log to win a pre-vote.  It's probably still
a good thing to address this, I agree.

> 
>> And while the disruptor will not win the actual vote, it
>> has a term +1 and will reply with a "stale term" to the next append
>> request, forcing the current leader to step down.
>>
>> Fix that by actually not responding to the disruptive vote requests,
>> i.e. by taking the "drop the request" route.
>>
>> The test is updated with a new failure model where vote requests are
>> actually getting sent out.  There is a value in testing the full RPC
>> stop as well, so keeping the current checks as they are.
>>
>> The FT_FORCE_ELECTION enum entry is not actually needed, but the code
>> feels awkward without it.
> 
> This is not very important but I think it may be better to avoid this
> enum because when the failure_test stays FT_FORCE_ELECTION, some would
> assume the behavior would keep forcing elections until it is set back
> to FT_NO_TEST, but it is actually a one-time test. Tests such as
> FT_TRANSFER_LEADERSHIP automatically sets failure_test back to FT_NO_TEST,
> but this one doesn't. It would be consistent if the code sets it back to
> FT_NO_TEST but then again the FT_FORCE_ELECTION is useless. So, probably
> we should just set failure_test to FT_NO_TEST directly, and it is also
> clear from the code that the test is carried out directly. But I don't
> have a strong opinion if you want to keep it as is.

Makes sense, I dropped the enum.  In the grand scheme this is not really
important.  And yes, any user of this functionality is expected to know
the code anyway.

> 
> Acked-by: Han Zhou <[email protected]>

Thanks!  With the enum dropped, I applied the change and backported down
to 3.3.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to