philipnee commented on code in PR #13550:
URL: https://github.com/apache/kafka/pull/13550#discussion_r1169135808
##########
clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java:
##########
@@ -835,6 +835,7 @@ public void handle(SyncGroupResponse syncResponse,
                 } else if (error == Errors.REBALANCE_IN_PROGRESS) {
                     log.info("SyncGroup failed: The group began another rebalance. Need to re-join the group. "
                                  + "Sent generation was {}", sentGeneration);
+                    resetStateAndGeneration("member missed the rebalance", true);

Review Comment:
Hey @dajac, thanks for the review. I think that when a member misses a generation, we want to make sure its owned partitions are revoked (as a result of the generation reset). Regardless, it should still rejoin with its current partitions, and it should continue to hold on to them if it is only one generation behind. If it is more than one generation behind, then, circling back to the beginning of my response, we want to make sure they are revoked, because the partitions might already have been reassigned.

"This makes me think that this will happen regularly in medium to large groups." -> I think this might not be as uncommon as we think, especially with a large consumer group deployed across multiple pods: a pod can be stalled before sending out its syncGroup while another consumer in a different pod tries to join the group.

I hope I'm answering your questions; I apologize if I misunderstood anything.
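
To make the reasoning in the comment above concrete, here is a minimal sketch of the rule as I read it: a member that is at most one generation behind keeps its partitions, and a member that is further behind has them revoked. This is not the code in `AbstractCoordinator`; the class `MissedRebalanceSketch`, the enum `Outcome`, and the method `onRejoin` are hypothetical names invented for illustration, and reading "1+ generations behind" as "more than one" is an assumption.

```java
// Hypothetical model of the reasoning in the review comment.
// It does not reproduce Kafka's AbstractCoordinator logic.
final class MissedRebalanceSketch {

    /** Hypothetical outcome for a member that rejoins after missing a rebalance. */
    enum Outcome { KEEP_PARTITIONS, REVOKE_PARTITIONS }

    /**
     * Decides what happens to a rejoining member's owned partitions.
     *
     * @param memberGeneration the generation the member last saw
     * @param groupGeneration  the generation the group is currently at
     */
    static Outcome onRejoin(int memberGeneration, int groupGeneration) {
        if (memberGeneration > groupGeneration) {
            throw new IllegalArgumentException("member cannot be ahead of the group");
        }
        int generationsBehind = groupGeneration - memberGeneration;
        // Assumption: at most one generation behind means the partitions
        // have not yet been handed to another member.
        if (generationsBehind <= 1) {
            return Outcome.KEEP_PARTITIONS;
        }
        // Further behind: the partitions may already have been reassigned,
        // so the member must give them up.
        return Outcome.REVOKE_PARTITIONS;
    }
}
```

For example, `MissedRebalanceSketch.onRejoin(4, 5)` yields `KEEP_PARTITIONS`, while `MissedRebalanceSketch.onRejoin(4, 6)` yields `REVOKE_PARTITIONS`.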