kevin-wu24 commented on code in PR #18987:
URL: https://github.com/apache/kafka/pull/18987#discussion_r1970056520
##########
raft/src/test/java/org/apache/kafka/raft/RaftEventSimulationTest.java:
##########
@@ -1127,14 +1331,75 @@ private MajorityReachedHighWatermark(Cluster cluster) {
@Override
public void verify() {
- cluster.leaderHighWatermark().ifPresent(highWatermark -> {
- long numReachedHighWatermark =
cluster.nodes.entrySet().stream()
- .filter(entry ->
cluster.voters.containsKey(entry.getKey()))
- .filter(entry -> entry.getValue().log.endOffset().offset()
>= highWatermark)
- .count();
- assertTrue(
- numReachedHighWatermark >= cluster.majoritySize(),
- "Insufficient nodes have reached current high watermark");
+ if (cluster.withKip853) {
+ /*
+ * For clusters running in KIP-853 mode, we check that a
majority of at least one of:
+ * 1. the leader's voter set at the HWM
+ * 2. the leader's lastVoterSet()
+ * has reached the HWM. We need to perform a more elaborate
check here because in clusters where
+ * an Add/RemoveVoter request increases/decreases the majority
of voters value by 1, the leader
+ * could have used either majority value to update its HWM
value. This is because depending on
+ * whether the leader read the most recent VotersRecord prior
to updating its HWM value, the number
+ * of nodes (the majority) used to calculate that HWM value is
different. This matters for invariant
+ * checking because we perform this verification on every
message delivery.
+ * */
+ cluster.leaderWithMaxEpoch().ifPresent(leaderNode -> {
+ leaderNode.client.highWatermark().ifPresent(highWatermark
-> {
+ VoterSet voterSet =
leaderNode.client.partitionState().lastVoterSet();
+ long numReachedHighWatermark =
numReachedHighWatermark(highWatermark, voterSet.voterIds());
+ if (numReachedHighWatermark <
cluster.majoritySize(voterSet.size())) {
+
leaderNode.client.partitionState().voterSetAtOffset(highWatermark -
1).ifPresent(otherVoterSet -> {
Review Comment:
An "event" from the perspective of the simulation tests is the execution of
one `Runnable`. This detail is important because `SequentialAppendAction` will
directly call `client.prepareAppend` and `client.schedulePreparedAppend` which
can wakeup the message queue (this is where my implementation of
`AddVoterAction` adds the voter-to-be-added's `AddVoterRequest` and
`API_VERSIONS_RESPONSE`).
~~This check is incorrect, and we should rely on only the last voter set.
More explained below~~
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]