philipnee commented on code in PR #13550:
URL: https://github.com/apache/kafka/pull/13550#discussion_r1169135808


##########
clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java:
##########
@@ -835,6 +835,7 @@ public void handle(SyncGroupResponse syncResponse,
                 } else if (error == Errors.REBALANCE_IN_PROGRESS) {
                     log.info("SyncGroup failed: The group began another rebalance. Need to re-join the group. " +
                                  "Sent generation was {}", sentGeneration);
+                    resetStateAndGeneration("member missed the rebalance", true);

Review Comment:
   Hey @dajac, thanks for the review. I think that when a member misses a 
generation, we want to make sure its owned partitions are revoked (as a result 
of the generation reset). Regardless, it should still rejoin with its current 
partitions, and it should continue to hold on to them if it is only 1 
generation behind. If it is more than 1 generation behind, then, circling back 
to the beginning of my response, we want to make sure they are revoked because 
the partitions might have already been reassigned.
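   
   To make the two cases above concrete, here is a rough sketch. It is not 
the actual `AbstractCoordinator` logic: `GenerationLagSketch`, 
`generationsBehind` and `shouldRevokeOwnedPartitions` are hypothetical names 
used only for illustration, and it assumes that "only 1 generation behind" 
means the member may keep its partitions while anything further behind means 
they must be revoked.

```java
/**
 * Hypothetical illustration only; none of these names exist in Kafka.
 * Encodes the rule discussed in the comment: a member that is at most one
 * generation behind may keep its owned partitions when it rejoins, while a
 * member that is further behind must have them revoked, because they may
 * already have been reassigned to another member.
 */
final class GenerationLagSketch {
    private GenerationLagSketch() {
    }

    /** Number of generations the member trails the group by (never negative). */
    static int generationsBehind(int memberGeneration, int groupGeneration) {
        return Math.max(0, groupGeneration - memberGeneration);
    }

    /** Assumption: revoke only when the member missed more than one generation. */
    static boolean shouldRevokeOwnedPartitions(int memberGeneration, int groupGeneration) {
        return generationsBehind(memberGeneration, groupGeneration) > 1;
    }
}
```

   With this sketch, a member at generation 5 in a group at generation 6 
keeps its partitions, while the same member in a group at generation 7 has 
them revoked.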
   
   "This makes me think that this will happen regularly in medium to large 
groups." -> I think this might not be as uncommon as we think, especially 
with a large consumer group deployed across multiple pods: a pod can stall 
before sending out its SyncGroup while another consumer in a different pod 
tries to join the group.
   
   I hope I'm answering your questions here. Apologies if I misunderstood 
anything.


