[ 
https://issues.apache.org/jira/browse/KAFKA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725512#comment-16725512
 ] 

Mayuresh Gharat commented on KAFKA-7728:
----------------------------------------

To shed more light on this:
Currently, there are two types of metadata changes that trigger a 
rebalance:
1) An increase in the number of partitions for any of the currently 
subscribed topics.
2) Newly added topics.

In case 1), the leader is responsible for triggering the rebalance (by 
sending a JoinGroupRequest). The other consumers in the group will not 
trigger a rebalance for 1).

In case 2), any member of the consumer group can trigger a rebalance when it 
detects (through its metadata refresh) that a new topic has been created.
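
The two cases above can be sketched as a small client-side decision. This is 
only an illustration of the rule as described here: MetadataChange and 
RejoinTrigger are hypothetical names, not existing Kafka classes, and the 
real check lives in ConsumerCoordinator.

```java
// Hypothetical: the kinds of metadata change described in 1) and 2).
enum MetadataChange { NONE, PARTITION_COUNT_INCREASED, NEW_TOPIC_ADDED }

// Hypothetical helper, not part of Kafka.
final class RejoinTrigger {
    private RejoinTrigger() {}

    // Returns true if this member should send a JoinGroupRequest.
    static boolean shouldSendJoinGroup(boolean isLeader, MetadataChange change) {
        switch (change) {
            case PARTITION_COUNT_INCREASED:
                // Case 1): only the leader triggers the rebalance.
                return isLeader;
            case NEW_TOPIC_ADDED:
                // Case 2): any member that detects the new topic may trigger it.
                return true;
            default:
                return false;
        }
    }
}
```

For example, RejoinTrigger.shouldSendJoinGroup(false, 
MetadataChange.PARTITION_COUNT_INCREASED) is false, since a follower leaves 
case 1) to the leader.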

Currently we trigger a rebalance whenever the leader rejoins, because we 
don't know why the leader is sending the JoinGroupRequest. 
With the proposal in this jira and static membership from KIP-345 
(identifying the leader by "group.instance.id" instead of "member.id"), we 
can check the reason for the leader's rejoin. If no reason is specified, we 
can assume that the JoinGroupRequest was caused by a leader restart. In that 
case, the GroupCoordinator can send the leader its current assignment along 
with the groupSubscription (all the topics subscribed to by all the consumers 
in the group). This prevents the unnecessary rebalance caused by a leader 
bounce.
In this scenario, we will have to change the logic in ConsumerCoordinator so 
that the leader does not perform assignments and just accepts the assignment 
given by the GroupCoordinator.
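
A minimal sketch of the coordinator-side decision described above, assuming 
the reason is carried as an optional field. JoinReason has no defined values 
yet, so TOPIC_METADATA_CHANGE, JoinOutcome and LeaderRejoinSketch are all 
hypothetical names, and the actual GroupCoordinator code is structured 
differently.

```java
import java.util.Optional;

// Hypothetical: the issue proposes this field but does not define its values.
enum JoinReason { TOPIC_METADATA_CHANGE }

// Hypothetical outcome of a JoinGroupRequest from a known member of a stable group.
enum JoinOutcome { REBALANCE, RETURN_CURRENT_ASSIGNMENT }

// Hypothetical helper, not part of Kafka.
final class LeaderRejoinSketch {
    private LeaderRejoinSketch() {}

    static JoinOutcome handleJoin(boolean isLeader,
                                  boolean protocolsMatch,
                                  Optional<JoinReason> reason) {
        // A member whose metadata changed still forces a rebalance, as today.
        if (!protocolsMatch) {
            return JoinOutcome.REBALANCE;
        }
        // The leader rejoins with an explicit reason: rebalance.
        if (isLeader && reason.isPresent()) {
            return JoinOutcome.REBALANCE;
        }
        // No reason given: assume a restart and hand back the current assignment.
        return JoinOutcome.RETURN_CURRENT_ASSIGNMENT;
    }
}
```

With this shape, a bounced leader that sends no reason gets 
RETURN_CURRENT_ASSIGNMENT, while today's check (group.isLeader(memberId) || 
!member.matches(protocols)) would rebalance.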

[~bchen225242], [~guozhang], [~hachikuji] thoughts?







> Add JoinReason to the join group request for better rebalance handling
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-7728
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7728
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Boyang Chen
>            Assignee: Mayuresh Gharat
>            Priority: Major
>              Labels: consumer, mirror-maker, needs-kip
>
> Recently [~mgharat] and I discussed the current rebalance logic for 
> handling the leader's join group request. So far we blindly trigger a 
> rebalance when the leader rejoins. The caveat is that KIP-345 does not cover 
> this case, and if a consumer group is not using sticky assignment but 
> another strategy such as round robin, the redundant rebalance could still 
> shuffle topic partitions among consumers (for example, in a mirror maker 
> application).
> I checked the broker side, and here is what we currently do:
>  
> {code:java}
> if (group.isLeader(memberId) || !member.matches(protocols))
> // force a rebalance if a member has changed metadata or if the leader
> // sends JoinGroup. The latter allows the leader to trigger rebalances for
> // changes affecting assignment which do not affect the member metadata
> // (such as topic metadata changes for the consumer)
> {code}
> Based on the broker logic, we only need to trigger a rebalance on leader 
> rejoin when a topic metadata change has happened. I also looked at the 
> ConsumerCoordinator code on the client side and found the metadata 
> monitoring logic here:
> {code:java}
> public boolean rejoinNeededOrPending() {
> ...
> // we need to rejoin if we performed the assignment and metadata has changed
> if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))
>   return true;
> }{code}
> Instead of just returning true, we could add a new enum field called 
> JoinReason to the join group request to indicate the purpose of the rejoin, 
> so that the broker knows whether the leader is rejoining due to a topic 
> metadata change. If so, we trigger a rebalance; if not, we should not. This 
> way we avoid a full rebalance when the leader is just going through a 
> rolling bounce.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
