kevin-wu24 commented on PR #18987:
URL: https://github.com/apache/kafka/pull/18987#issuecomment-2682522812
@ahuang98 thanks for the quick review. I took another look into that
invariant and I'll address the last two comments here since the situation is a
bit complicated. @jsancio too since this may be missing in the current KIP-853
implementation. Let me know what y'all think:
I agree, we should only be checking this invariant using the leader's
`lastVoterSet()`. When I remove the extra case checking the committed voter set
(voter set at HWM - 1), we fail this invariant check in the following addVoter
scenario:
- We're adding a voter that will increase the number of nodes that are
considered a majority (i.e. old voter set = {0}, new voter set = {0, 1})
- The events involved are: `SequentialAppendAction`, an event which
repeatedly gets executed, and my implementation of `AddVoterAction`.
`SequentialAppendAction` just increments a counter and writes these records to
the log using `client.prepareAppend` and `client.schedulePreparedAppend`.
`AddVoterAction` creates two RaftMessages and adds them to the leader's message
queue, the `AddVoterRequest` and `API_VERSIONS_RESPONSE`. The problem is that I
use `CompleteableFuture#whenComplete` to set the `correlationId` of the
`API_VERSIONS_RESPONSE`, which is asynchronous and exit `run()` after these two
messages are added to the leader's queue (i.e. the `AddVoterRequest#run()` can
finish execution without completing the future, and perhaps more importantly,
before the new VoterSet is added to the log, allowing subsequent
`SequentialAddActions` to execute even though they shouldn't be allowed to yet).
- This section of the KIP says we should not commit records while
AddVoterRequest is being handled:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217391519#KIP853:KRaftControllerMembershipChanges-Handling.3
- I believe the easiest way to solve this is to not allow
`AddVoterAction#run()` to return until we get back an `AddVoterResponse` from
the leader.
- I want to make sure that this isn't possible in the current
implementation. From looking at the raft code and the fact that this is
possible in the simulation tests it looks like there is no guard against this
in the raft layer (i.e. my proposed fix for the simulation test I think is
considered a higher layer fix, since I'm just not allowing the handling of
`AddVoterRequest` and `prepareAppend + schedulePrepareAppend` to happen
concurrently, but I'm not seeing anything that prevents this in the source
right now).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]