[ https://issues.apache.org/jira/browse/KAFKA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733633#comment-16733633 ]
Boyang Chen commented on KAFKA-7728:
------------------------------------

[~guozhang] Thanks for the confirmation! [~mgharat] Do you want to start a KIP
to reach a wider discussion? I think this is a very valuable change that could
greatly improve global resource rebalancing.

> Add JoinReason to the join group request for better rebalance handling
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-7728
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7728
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Boyang Chen
>            Assignee: Mayuresh Gharat
>            Priority: Major
>              Labels: consumer, mirror-maker, needs-kip
>
> Recently [~mgharat] and I discussed the current rebalance logic for handling
> the leader's join group request. So far we blindly trigger a rebalance when
> the leader rejoins. The caveat is that KIP-345 does not cover this effort,
> and if a consumer group is not using sticky assignment but another strategy
> such as round robin, the redundant rebalance could still shuffle the topic
> partitions around consumers (for example, in a mirror maker application).
> I checked the broker side, and here is what we currently do:
>
> {code:java}
> if (group.isLeader(memberId) || !member.matches(protocols))
> // force a rebalance if a member has changed metadata or if the leader sends JoinGroup.
> // The latter allows the leader to trigger rebalances for changes affecting assignment
> // which do not affect the member metadata (such as topic metadata changes for the consumer) {code}
> Based on the broker logic, we only need to trigger a rebalance on leader
> rejoin when a topic metadata change has happened. I also looked up the
> ConsumerCoordinator code on the client side, and found the metadata
> monitoring logic here:
> {code:java}
> public boolean rejoinNeededOrPending() {
> ...
> // we need to rejoin if we performed the assignment and metadata has changed
> if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))
> return true;
> }{code}
> Instead of just returning true, we could introduce a new enum field called
> JoinReason in the join group request to indicate the purpose of the rejoin,
> so that the broker knows whether the leader is rejoining due to a topic
> metadata change. If yes, we trigger a rebalance; if no (for example, the
> leader is just going through a rolling bounce), we shouldn't trigger a
> full rebalance.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
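
The JoinReason idea in the quoted description could be sketched roughly as follows. This is only an illustration under stated assumptions, not Kafka code: the class `JoinReasonSketch`, the enum values, and the helpers `reasonFor` and `shouldRebalance` are all hypothetical names, and treating an `UNKNOWN` reason as "rebalance" (to stay conservative for clients that do not send a reason) is an assumption that the ticket does not discuss.

```java
// Hypothetical sketch of the proposed JoinReason field; none of these names
// exist in Kafka. It only illustrates the decision described in the ticket.
final class JoinReasonSketch {

    /** Assumed enum values; the ticket only says "an enum field called JoinReason". */
    enum JoinReason {
        UNKNOWN,                // assumption: reason not supplied by the client
        TOPIC_METADATA_CHANGED, // leader saw topic metadata differ from its assignment snapshot
        MEMBER_RESTART          // e.g. the leader rejoins during a rolling bounce
    }

    /**
     * Client side: mirrors the snapshot comparison quoted from
     * rejoinNeededOrPending(), but reports why the rejoin happens.
     */
    static JoinReason reasonFor(Object assignmentSnapshot, Object metadataSnapshot) {
        if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot)) {
            return JoinReason.TOPIC_METADATA_CHANGED;
        }
        return JoinReason.MEMBER_RESTART;
    }

    /**
     * Broker side: a member whose protocols changed always forces a rebalance;
     * a rejoining leader forces one only if it is not a plain restart.
     */
    static boolean shouldRebalance(boolean isLeader, boolean protocolsMatch, JoinReason reason) {
        if (!protocolsMatch) {
            return true;
        }
        if (!isLeader) {
            return false;
        }
        // Assumption: UNKNOWN keeps today's behaviour (leader rejoin => rebalance).
        return reason != JoinReason.MEMBER_RESTART;
    }

    public static void main(String[] args) {
        JoinReason bounce = reasonFor("snapshot-a", "snapshot-a");
        JoinReason changed = reasonFor("snapshot-a", "snapshot-b");
        System.out.println(bounce + " -> rebalance? " + shouldRebalance(true, true, bounce));
        System.out.println(changed + " -> rebalance? " + shouldRebalance(true, true, changed));
    }
}
```

With a field like this, the broker check quoted above would stop treating every leader JoinGroup as a trigger and would consult the reason instead; how the field is versioned in the request is left to the KIP.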