[ 
https://issues.apache.org/jira/browse/KAFKA-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17274181#comment-17274181
 ] 

Guozhang Wang commented on KAFKA-12169:
---------------------------------------

[~zoushengfu] by "restart with unknown member id", do you mean you bounced the 
client leader at the same time? Also, which broker version are you running?

If you did bounce the leader at the same time and the brokers are on older 
versions (i.e. versions where the broker would not trigger a rebalance on 
non-leader joins with different metadata), there might indeed be a race 
condition here, such as:

T0: The topic's partition count changes from 1000 to 2000, but the change has 
not yet been propagated to the consumer group leader.
T1: The leader is bounced and then rejoins the group with a known instance.id. 
At that time its metadata is already at 2000 partitions, but the group 
coordinator still gives it the old assignment, which only contains 1000 
partitions.
T2: From then on the leader does not try to resend the join-group, since its 
"join-group metadata snapshot" is the same as the refreshed metadata.
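
To make the T0-T2 sequence concrete, here is a toy simulation of the suspected 
race. It is only a sketch and not Kafka client or broker code: `ToyCoordinator`, 
`ToyLeader`, `join_static_member`, `needs_rejoin` and the id "leader-1" are 
hypothetical names, and partition counts stand in for real assignments and 
metadata. It assumes the older-broker behaviour described above, where a rejoin 
with a known instance.id gets the cached assignment back without a rebalance.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for a group coordinator on an older broker."""
    # instance.id -> number of partitions in the cached assignment
    cached_assignment: dict = field(default_factory=dict)

    def join_static_member(self, instance_id):
        # Assumption: a known instance.id gets its old assignment back
        # and no rebalance is triggered.
        return self.cached_assignment[instance_id]


@dataclass
class ToyLeader:
    """Hypothetical stand-in for the consumer group leader."""
    instance_id: str
    snapshot_partitions: int = 0  # the "join-group metadata snapshot"
    assigned_partitions: int = 0

    def join(self, coordinator, current_partitions):
        # The snapshot is taken from the metadata seen at join time.
        self.snapshot_partitions = current_partitions
        self.assigned_partitions = coordinator.join_static_member(self.instance_id)

    def needs_rejoin(self, refreshed_partitions):
        # A new join-group is only sent if metadata differs from the snapshot.
        return refreshed_partitions != self.snapshot_partitions


def simulate():
    # The coordinator still holds the assignment computed for 1000 partitions.
    coordinator = ToyCoordinator({"leader-1": 1000})

    # T0: the topic grows to 2000 partitions before the old leader notices.
    topic_partitions = 2000

    # T1: the leader is bounced and rejoins with the same instance.id;
    # its fresh metadata already shows 2000 partitions.
    leader = ToyLeader("leader-1")
    leader.join(coordinator, topic_partitions)

    # T2: a metadata refresh still shows 2000, which equals the snapshot,
    # so no rejoin is requested and the old assignment sticks.
    return leader.assigned_partitions, leader.needs_rejoin(topic_partitions)


if __name__ == "__main__":
    print(simulate())
```

In this model the leader ends up holding the 1000-partition assignment while 
never seeing a reason to rejoin, which matches the symptom in the report.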

> Consumer can not know paritions change when client leader restart with static 
> membership protocol
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12169
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12169
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 2.5.1, 2.6.1
>            Reporter: zou shengfu
>            Priority: Major
>
> Background: 
>  Kafka consumer services run with static membership and the cooperative 
> rebalance protocol on Kubernetes, and the services often restart for 
> operational reasons. When we increased the topic's partitions from 1000 to 
> 2000, the client leader restarted with an unknown member id at the same time; 
> we found that the consumers did not trigger a rebalance and still consumed 
> only 1000 partitions.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
