[ https://issues.apache.org/jira/browse/KAFKA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730594#comment-16730594 ]
Boyang Chen edited comment on KAFKA-7728 at 12/31/18 10:50 PM:
---------------------------------------------------------------

[~enether] Thanks for the thoughts! I think compatibility should be considered, since the JoinGroupRequest version will be bumped. I have already come up with some common join reasons for a potential enum type:
{code:java}
public enum JoinGroupReason {
    BLIND("blind"),                          // Join request from a start-up consumer
    SELF_META_CHANGE("self_meta_change"),    // The consumer metadata has changed
    TOPIC_META_CHANGE("topic_meta_change");  // The topic metadata changed (must be from the leader)

    private final String name;

    JoinGroupReason(String name) {
        this.name = name;
    }
}
{code}
The self metadata change could be used by the user to indicate status updates (for example, a Streams task being replay ready) so that the broker can judge whether a rebalance should be triggered. It would be better to discuss more scenarios before we finalize anything. Let's brainstorm more join reasons that would help the broker make that decision.

was (Author: bchen225242):
[~enether] Thanks for the thoughts! I think compatibility should be considered, since the JoinGroupRequest version will be bumped. I have already come up with some common join reasons for a potential enum type:
{code:java}
public enum JoinGroupReason {
    BLIND("blind"),                          // Join request from a start-up consumer
    SELF_META_CHANGE("self_meta_change"),    // The consumer metadata has changed
    TOPIC_META_CHANGE("topic_meta_change");  // The topic metadata changed (must be from the leader)
}
{code}
The self metadata change might be trivial to realize now, but I think it would be better to discuss more scenarios before we finalize anything. Let's brainstorm more join reasons that would help the broker make that decision.
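To make the discussion concrete, here is a rough sketch of how a broker-side check could consume such a reason. Everything in it is an assumption for illustration only: the {{wireName}} field, the {{fromWireName}} lookup, and the {{RebalancePolicy}} class are hypothetical names, not existing Kafka code. It also assumes that a request without a reason (an older client) keeps today's behaviour, and that only a topic metadata change justifies a rebalance on leader rejoin. How SELF_META_CHANGE from the leader should be treated is still open; the sketch simply does not rebalance for it.

```java
import java.util.Optional;

/** Hypothetical join reasons, mirroring the enum proposed in this comment. */
enum JoinGroupReason {
    BLIND("blind"),                          // join request from a start-up consumer
    SELF_META_CHANGE("self_meta_change"),    // the consumer metadata has changed
    TOPIC_META_CHANGE("topic_meta_change");  // the topic metadata changed (leader only)

    private final String wireName;

    JoinGroupReason(String wireName) {
        this.wireName = wireName;
    }

    String wireName() {
        return wireName;
    }

    /** Maps a string from the request back to a reason; unknown strings yield empty. */
    static Optional<JoinGroupReason> fromWireName(String name) {
        for (JoinGroupReason reason : values()) {
            if (reason.wireName.equals(name)) {
                return Optional.of(reason);
            }
        }
        return Optional.empty();
    }
}

/** Hypothetical stand-in for the broker condition quoted in the issue description. */
final class RebalancePolicy {
    private RebalancePolicy() {}

    static boolean shouldRebalance(boolean isLeader,
                                   boolean protocolsMatch,
                                   Optional<JoinGroupReason> reason) {
        if (!protocolsMatch) {
            return true;   // member metadata changed: same as the current behaviour
        }
        if (!isLeader) {
            return false;  // follower rejoin with unchanged metadata
        }
        if (!reason.isPresent()) {
            return true;   // assumed: old client sent no reason, keep the blind rebalance
        }
        // Assumed rule: the leader only forces a rebalance for a topic metadata change.
        return reason.get() == JoinGroupReason.TOPIC_META_CHANGE;
    }
}
```

With this shape, a leader that rejoins during a rolling bounce and reports BLIND would get its current assignment back, while an older client that sends no reason is handled exactly as it is today.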

> Add JoinReason to the join group request for better rebalance handling
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-7728
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7728
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Boyang Chen
>            Assignee: Mayuresh Gharat
>            Priority: Major
>              Labels: consumer, mirror-maker, needs-kip
>
> Recently [~mgharat] and I discussed the current rebalance logic for handling
> the leader's join group request. So far we blindly trigger a rebalance whenever
> the leader rejoins. The caveat is that KIP-345 does not cover this case: if a
> consumer group is not using sticky assignment but another strategy such as
> round robin (for example, a mirror maker application), the redundant rebalance
> could still shuffle the topic partitions around among consumers.
> I checked the broker side, and here is what we currently do:
>
> {code:java}
> if (group.isLeader(memberId) || !member.matches(protocols))
> // force a rebalance if a member has changed metadata or if the leader sends JoinGroup.
> // The latter allows the leader to trigger rebalances for changes affecting assignment
> // which do not affect the member metadata (such as topic metadata changes for the consumer)
> {code}
> Based on the broker logic, we only need to trigger a rebalance on leader
> rejoin when a topic metadata change has happened. I also looked at the
> ConsumerCoordinator code on the client side and found the metadata
> monitoring logic here:
> {code:java}
> public boolean rejoinNeededOrPending() {
>     ...
>     // we need to rejoin if we performed the assignment and metadata has changed
>     if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))
>         return true;
> }
> {code}
> Instead of just returning true, we could introduce a new enum field called
> JoinReason that indicates the purpose of the rejoin. Then we would not need
> to do a full rebalance when the leader is merely going through a rolling bounce.
> We could make use of this information: add another enum field called
> JoinReason to the join group request, so that the broker knows whether the
> leader is rejoining because of a topic metadata change. If yes, we trigger a
> rebalance; if no, we should not trigger a rebalance.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)