Hey folks, thanks for the reviews! Addressing them one by one.

From Luke - Some comments:

> 1. Follower transitions to: Prospective: After expiration of the election
> timeout
> -> Is this the fetch timeout, not election timeout?
>
Yes, thanks for this catch!

> 2. I also agree we don't bump the epoch in prospective state.
> A candidate will now send a VoteRequest with the PreVote field set to true
> and CandidateEpoch set to its [epoch + 1] when its election timeout
> expires.
> -> What is "CandidateEpoch"? And I thought you've agreed to not set [epoch
> + 1] ?
>
Forgot to update this section. It now reads:

A follower will now transition to Prospective when its fetch timeout expires. The Prospective server will send a VoteRequest with the PreVote field set to true and ReplicaEpoch set to its current, unbumped epoch. If [majority - 1] of the VoteResponses grant the vote, the server will transition to Candidate, then bump its epoch and send a VoteRequest with PreVote set to false (which is the original behavior).

On Wed, Nov 29, 2023 at 4:53 PM José Armando García Sancio <jsan...@confluent.io.invalid> wrote:

> Hi Alyssa,
>
> 1. In the schema for VoteRequest and VoteResponse, you are using
> "boolean" as the type keyword. The correct keyword should be "bool"
> instead.
>
> 2. In the states and state transaction table you have the following entry:
> > * Candidate transitions to:
> > * ...
> > * Prospective: After expiration of the election timeout
>
> Can you explain the reason a candidate would transition back to
> prospective? If a voter transitions to the candidate state it is
> because the voters don't support KIP-996 or the replica was able to
> win the majority of the votes at some point in the past. Are we
> concerned that the network partition might have occurred after the
> replica has become a candidate? If so, I think we should state this
> explicitly in the KIP.
>
> 3. In the proposed section and state transition section, I think it
> would be helpful to explicitly state that we have an invariant that
> only the prospective state can transition to the candidate state.
> This transition to the candidate state from the prospective state can only
> happen because the replica won the majority of the votes or there is
> at least one remote voter that doesn't support pre-vote.
>
> 4. I am a bit confused by this paragraph
> > A candidate will now send a VoteRequest with the PreVote field set to
> > true and CandidateEpoch set to its [epoch + 1] when its election timeout
> > expires. If [majority - 1] of VoteResponse grant the vote, the candidate
> > will then bump its epoch up and send a VoteRequest with PreVote set to
> > false which is our standard vote that will cause state changes for servers
> > receiving the request.
>
> I am assuming that "candidate" refers to the states enumerated on the
> table above this quote. If so, I think you mean "prospective" for the
> first candidate.
>
> CandidateEpoch should be ReplicaEpoch.
>
> [epoch + 1] should just be epoch. I thought we agreed that replicas
> will always send their current epoch to the remote replicas.
>
> 5. I am a bit confused by this bullet section
> > true if the server receives less than [majority] VoteResponse with
> > VoteGranted set to false within [election.timeout.ms + a little
> > randomness] and the first bullet point does not apply
> > Explanation for why we don't send a standard vote at this point
> > is explained in rejected alternatives.
>
> Can we explain this case in plain english? I assume that this case is
> trying to cover the scenario where the election timer expired but the
> prospective candidate hasn't received enough votes (granted or
> rejected) to make a decision if it could win an election.
>
> 6.
> > Yes. If a leader is unable to receive fetch responses from a majority of
> > servers, it can impede followers that are able to communicate with it from
> > voting in an eligible leader that can communicate with a majority of the
> > cluster.
>
> In general, leaders don't receive fetch responses. They receive FETCH
> requests.
> Did you mean "if a leader is able to send FETCH responses to
> the majority - 1 of the voters, it can impede fetching voters
> (followers) from granting their vote to prospective candidates. This
> should stop prospective candidates from getting enough votes to
> transition to the candidate state and increase their epoch".
>
> 7.
> > Check Quorum ensures a leader steps down if it is unable to receive
> > fetch responses from a majority of servers.
>
> I think you mean "... if it is unable to receive FETCH requests from
> the majority - 1 of the voters".
>
> 8. At the end of the Proposed changes section you have the following:
> > The logic now looks like the following for servers receiving
> > VoteRequests with PreVote set to true:
> >
> > When servers receive VoteRequests with the PreVote field set to true,
> > they will respond with VoteGranted set to
> >
> > * true if they are not a Follower and the epoch and offsets in the
> > Pre-Vote request satisfy the same requirements as a standard vote
> > * false if they are a Follower or the epoch and end offsets in the
> > Pre-Vote request do not satisfy the requirements
>
> This seems to duplicate the same algorithm that was stated earlier in
> the section.
>
> 9. I don't understand this rejected idea: Sending Standard Votes after
> failure to win Pre-Vote
>
> In your example in the "Disruptive server scenarios" voters 4 and 5
> are partitioned from the majority of the voters. We don't want voters
> 4 and 5 increasing their epoch and transitioning to the candidate
> state else they would disrupt the quorum established by voters 1, 2
> and 3.
>
>
> Thanks,
> --
> -José
>
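
For reference, the revised Prospective flow (fetch timeout, Pre-Vote with the unbumped epoch, then a standard vote with the bumped epoch) could be sketched roughly as below. This is only an illustration, not the KIP's or Kafka's actual code: PreVoteSketch, Replica, onFetchTimeout and onPreVoteGrants are hypothetical names, and the reading of [majority - 1] as "grants from other voters, with the replica's own vote making up the majority" is an assumption.

```java
import java.util.Optional;

// Hypothetical sketch only: these names are illustrative, not Kafka's raft classes.
final class PreVoteSketch {
    enum State { FOLLOWER, PROSPECTIVE, CANDIDATE }

    // Stand-in for a VoteRequest, reduced to the two fields discussed in this thread.
    static final class VoteRequest {
        final int replicaEpoch;
        final boolean preVote;

        VoteRequest(int replicaEpoch, boolean preVote) {
            this.replicaEpoch = replicaEpoch;
            this.preVote = preVote;
        }
    }

    static final class Replica {
        State state = State.FOLLOWER;
        int epoch;
        final int voterCount;

        Replica(int epoch, int voterCount) {
            this.epoch = epoch;
            this.voterCount = voterCount;
        }

        // Fetch timeout expired on a follower: become Prospective and send a
        // Pre-Vote that carries the current, unbumped epoch.
        VoteRequest onFetchTimeout() {
            if (state != State.FOLLOWER) {
                throw new IllegalStateException("only a follower handles the fetch timeout here");
            }
            state = State.PROSPECTIVE;
            return new VoteRequest(epoch, true);
        }

        // Called with the number of VoteResponses that granted the Pre-Vote.
        // Assumption: the replica counts its own vote, so [majority - 1]
        // grants from other voters are enough to proceed.
        Optional<VoteRequest> onPreVoteGrants(int grantedByOthers) {
            int majority = voterCount / 2 + 1;
            if (state != State.PROSPECTIVE || grantedByOthers < majority - 1) {
                return Optional.empty(); // stay put, epoch untouched
            }
            state = State.CANDIDATE;
            epoch += 1; // the epoch is bumped only after the Pre-Vote succeeds
            return Optional.of(new VoteRequest(epoch, false));
        }
    }
}
```

With 5 voters, a replica in this sketch needs 2 granted Pre-Vote responses before it becomes a Candidate and bumps its epoch; with fewer it stays Prospective at its old epoch, so a partitioned replica does not disturb the rest of the quorum.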