Sophie Blee-Goldman created KAFKA-8767:
------------------------------------------
Summary: Optimize StickyAssignor for Cooperative mode
Key: KAFKA-8767
URL: https://issues.apache.org/jira/browse/KAFKA-8767
Project: Kafka
Issue Type: Sub-task
Reporter: Sophie Blee-Goldman
In some rare cases, the StickyAssignor will fail to balance an assignment
without violating stickiness, even though a balanced and sticky assignment is
possible. For cooperative rebalancing, this means that an unnecessary
additional rebalance will be triggered.
This was seen to happen, for example, when each consumer is subscribed to a
random subset of all topics and every subscription then changes to a different
random subset, as occurs in
AbstractStickyAssignorTest#testReassignmentWithRandomSubscriptionsAndChanges.
The initial assignment after the random subscription change obviously involved
migrating partitions, so following the cooperative protocol those partitions
are removed from the balanced first assignment, and a second rebalance is
triggered. In some cases, during the second rebalance the assignor was unable
to reach a balanced assignment without migrating a few partitions, even though
one must have been possible (since the first assignment was balanced). A third
rebalance was needed to reach a stable balanced state.
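The cooperative step described above (partitions that must migrate are left out
of the first assignment and only handed to their new owner in a follow-up
rebalance) can be sketched roughly as follows. This is a minimal illustration,
not Kafka's actual implementation: the class CooperativeAdjustSketch, the
method withholdMigrating, and the use of plain strings for consumers and
partitions are all hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Kafka code base.
final class CooperativeAdjustSketch {

    // Given what each consumer currently owns and the balanced target the
    // assignor computed, return the assignment that can be handed out now:
    // any partition still owned by a different consumer is withheld, so the
    // old owner can revoke it first and a later rebalance can reassign it.
    public static Map<String, Set<String>> withholdMigrating(
            Map<String, Set<String>> owned,
            Map<String, Set<String>> target) {
        // Index each partition by its current owner.
        Map<String, String> currentOwner = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : owned.entrySet()) {
            for (String partition : e.getValue()) {
                currentOwner.put(partition, e.getKey());
            }
        }

        Map<String, Set<String>> adjusted = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : target.entrySet()) {
            Set<String> keep = new TreeSet<>();
            for (String partition : e.getValue()) {
                String previous = currentOwner.get(partition);
                // Keep partitions that are unowned or already with this consumer.
                if (previous == null || previous.equals(e.getKey())) {
                    keep.add(partition);
                }
            }
            adjusted.put(e.getKey(), keep);
        }
        return adjusted;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> owned = new HashMap<>();
        owned.put("c1", new HashSet<>(Arrays.asList("t0-0", "t0-1", "t0-2")));
        owned.put("c2", new HashSet<>(Arrays.asList("t0-3")));

        // Balanced target: t0-2 should move from c1 to c2.
        Map<String, Set<String>> target = new HashMap<>();
        target.put("c1", new HashSet<>(Arrays.asList("t0-0", "t0-1")));
        target.put("c2", new HashSet<>(Arrays.asList("t0-2", "t0-3")));

        System.out.println(withholdMigrating(owned, target));
    }
}
```

In this example t0-2 is withheld from c2 in the first round, so the second
rebalance starts from an assignment that is no longer balanced. The issue
described here is that the assignor does not always find its way back to the
balanced target from that state without moving further partitions.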
Under the conditions in the previously mentioned test (20-40 consumers and
10-20 topics, each with 0-20 partitions), this third rebalance was required
roughly 30% of the time. Some initial improvements to the sticky assignment
logic reduced this to under 15%, but we should consider closing this gap and
optimizing the cooperative sticky assignment.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)