ableegoldman opened a new pull request #10985: URL: https://github.com/apache/kafka/pull/10985
The primary goal of this PR is to address the problem we've seen in the wild in which the ConsumerCoordinator fails to update its SubscriptionState and ultimately feeds invalid `ownedPartitions` data as input to the assignor. Previously the assignor would detect that something was wrong and just throw an exception, now we make several efforts to detect this earlier in the assignment process and then fix it if possible, and work around it if not. Specifically, this PR does a few things: 1) Bring the `generation` field back to the CooperativeStickyAssignor so we don't need to rely so heavily on the ConsumerCoordinator properly updating its SubscriptionState after eg falling out of the group. The plain StickyAssignor always used the generation since it had to, so we just make sure the CooperativeStickyAssignor has this tool as well 2) In case of unforeseen problems or further bugs that slip past the `generation` field safety net, the assignor will now explicitly look out for partitions that are being claimed by multiple consumers as owned in the same generation. Such a case should never occur, but if it does, we have to invalidate this partition from the `ownedPartitions` of both consumers, since we can't tell who, if anyone, has the valid claim to this partition. 3) Fix a subtle bug that I discovered while writing tests for the above two fixes: in the constrained algorithm, we compute the exact number of partitions each consumer should end up with, and keep track of the "unfilled" members who must -- or _might_ -- require more partitions to hit their quota. The problem was that members at the `minQuota` were being considered as "unfilled" even after we had already hit the maximum number of consumers allowed to go up to the `maxQuota`, meaning those `minQuota` members could/should not accept any more partitions beyond that. I believe this was introduced in [#10509](https://github.com/apache/kafka/pull/10509), so it shouldn't be in any released versions and does not need to be backported. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org