Sophie Blee-Goldman created KAFKA-8951:
------------------------------------------
Summary: Avoid unnecessary rebalances and downtime for "safe"
partitions
Key: KAFKA-8951
URL: https://issues.apache.org/jira/browse/KAFKA-8951
Project: Kafka
Issue Type: Improvement
Components: clients, streams
Reporter: Sophie Blee-Goldman

With cooperative rebalancing, any partition that is encoded in one consumer's
Subscription cannot be reassigned to a different consumer during that
rebalance. The partition must first be removed from the assignment and revoked
by its old owner, which then triggers a second rebalance during which it can be
assigned. This enforces a synchronization barrier so that no two consumers can
ever own the same partition at the same time.
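
To make the barrier concrete, here is a minimal sketch of the rule described
above. It is not the actual consumer or assignor code: the class name, the
method `adjust`, and the use of plain strings for consumers and partitions are
all assumptions made for illustration.

```java
import java.util.*;

// Hypothetical illustration of the cooperative barrier; not Kafka's implementation.
class CooperativeBarrierSketch {

    // owned:  consumer -> partitions it claimed in its Subscription
    // target: consumer -> partitions the assignor would like it to end up with
    // Returns what may actually be handed out in THIS rebalance: a partition
    // claimed by a different consumer is withheld until that owner revokes it.
    static Map<String, Set<String>> adjust(Map<String, Set<String>> owned,
                                           Map<String, Set<String>> target) {
        Map<String, String> ownerOf = new HashMap<>();
        owned.forEach((consumer, parts) ->
                parts.forEach(p -> ownerOf.put(p, consumer)));

        Map<String, Set<String>> result = new TreeMap<>();
        target.forEach((consumer, parts) -> {
            Set<String> allowed = new TreeSet<>();
            for (String p : parts) {
                String previousOwner = ownerOf.get(p);
                // Unowned, or already owned by this same consumer: safe to assign now.
                if (previousOwner == null || previousOwner.equals(consumer)) {
                    allowed.add(p);
                }
            }
            result.put(consumer, allowed);
        });
        return result;
    }
}
```

If consumer A owns t-0 and t-1 and the target moves t-1 to consumer B, the
first call gives B nothing. Only once A has revoked t-1 and rejoined without it
does a second call hand t-1 to B, which is the extra rebalance this ticket wants
to avoid where possible.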

This leads to downtime for that partition plus a second rebalance, which may
not always be necessary. In Streams, for example, the consumer pauses all
partitions of an active task until the task is running (i.e. has been
initialized and restored). It should be safe to give these partitions away,
provided they are not resumed between sending the joinGroup request and
receiving the syncGroup response.
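
The safety condition could be tracked roughly as follows. This is only a
sketch under assumptions: `SafePartitionTracker` and all of its methods are
hypothetical names, partitions are plain strings, and deferring the resume is
just one possible way to keep the condition true.

```java
import java.util.*;

// Hypothetical helper; nothing like this exists in the consumer or Streams today.
class SafePartitionTracker {
    private final Set<String> paused = new HashSet<>();
    // Partitions declared safe to give away in the in-flight rebalance.
    private final Set<String> offered = new HashSet<>();
    private boolean rebalanceInFlight = false;

    void pause(String partition) {
        paused.add(partition);
    }

    // Returns false when the resume must be deferred: the partition was offered
    // as "safe" and the syncGroup response has not arrived yet.
    boolean resume(String partition) {
        if (rebalanceInFlight && offered.contains(partition)) {
            return false;
        }
        paused.remove(partition);
        return true;
    }

    // Called when the joinGroup request is sent: everything paused at this
    // moment is treated as safe to give away.
    Set<String> onJoinGroupSent() {
        rebalanceInFlight = true;
        offered.clear();
        offered.addAll(paused);
        return new HashSet<>(offered);
    }

    // Called when the syncGroup response is received: the window closes.
    void onSyncGroupReceived() {
        rebalanceInFlight = false;
        offered.clear();
    }
}
```

A caller that gets `false` from `resume` would retry after
`onSyncGroupReceived`, and only if the partition is still part of its
assignment.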

One proposal would be to modify two methods in the ConsumerPartitionAssignor
interface:
1) ConsumerPartitionAssignor#subscriptionUserData would be passed the set of
   `ownedPartitions` that will be included in the subscription, allowing it to
   remove any that it knows are safe to give away.
2) ConsumerPartitionAssignor#onAssignment would be passed the set of revoked
   partitions, allowing it to remove any that it knows were already reassigned
   and should not trigger another rebalance.
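
A rough sketch of what the two modified hooks might look like from an
assignor's point of view. The interface below is a simplified, hypothetical
stand-in and deliberately does not reproduce the real ConsumerPartitionAssignor
signatures (no ByteBuffer user data, no Assignment or ConsumerGroupMetadata
types); `ProposedAssignorHooks` and `PausedTaskAssignor` are made-up names.

```java
import java.util.*;

// Hypothetical stand-in for the two proposed hooks; NOT the real Kafka interface.
interface ProposedAssignorHooks {
    // May remove entries from ownedPartitions that are safe to give away.
    void subscriptionUserData(Set<String> topics, Set<String> ownedPartitions);

    // May remove entries from revokedPartitions that need no follow-up rebalance.
    void onAssignment(Set<String> assignedPartitions, Set<String> revokedPartitions);
}

// Example assignor that treats partitions of still-restoring tasks as safe.
class PausedTaskAssignor implements ProposedAssignorHooks {
    private final Set<String> paused;
    private final Set<String> givenAway = new HashSet<>();

    PausedTaskAssignor(Set<String> pausedPartitions) {
        this.paused = new HashSet<>(pausedPartitions);
    }

    @Override
    public void subscriptionUserData(Set<String> topics, Set<String> ownedPartitions) {
        // Drop paused partitions from the subscription so they can be
        // reassigned to another consumer within this same rebalance.
        for (Iterator<String> it = ownedPartitions.iterator(); it.hasNext(); ) {
            String partition = it.next();
            if (paused.contains(partition)) {
                it.remove();
                givenAway.add(partition);
            }
        }
    }

    @Override
    public void onAssignment(Set<String> assignedPartitions, Set<String> revokedPartitions) {
        // These were never claimed in the subscription, so they may already be
        // owned elsewhere and should not trigger a second rebalance.
        revokedPartitions.removeAll(givenAway);
        givenAway.clear();
    }
}
```

Both hooks mutate the sets they are given, mirroring the "allowing it to remove"
wording of the proposal; returning filtered copies would work equally well.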