[ 
https://issues.apache.org/jira/browse/KAFKA-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308575#comment-17308575
 ] 

hudeqi commented on KAFKA-12478:
--------------------------------

Thank you very much for your reply! I think you have understood the scenario 
I described.

My idea for solving this scenario is very similar to your suggestion. The 
biggest difference is that I implement it entirely on the server side. Our 
company's business uses many types and a large number of Kafka clients, so 
fixing every client type and then rolling out the upgrades would be a lot of 
trouble. Also, I don't quite understand what you said above: "if the new 
partitions are added around the same time when consumers are started". My idea 
is to find all the groups subscribed to this topic before the admin starts to 
add partitions, and then commit an initial offset of 0 for each of these groups 
on the partitions to be added (also using adminClient). Only then is the actual 
partition expansion carried out. This way the scenario above is completely 
solved, and the fix is transparent to the clients. May I submit a KIP and a 
patch for this problem? 
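
To make the ordering concrete, here is a toy model in Python. It is not Kafka 
code and does not call any Kafka API: `Partition`, `Group`, and 
`expand_and_consume` are made-up names for illustration, and the only 
assumption carried over from Kafka is that a committed offset takes precedence 
over the "latest" reset policy. It is a sketch of the idea, not the patch 
itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Partition:
    """Toy stand-in for a partition log: just a list of records."""
    records: List[str] = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        return len(self.records)


@dataclass
class Group:
    """Toy consumer group: committed offsets keyed by partition id."""
    reset_policy: str = "latest"
    committed: Dict[int, int] = field(default_factory=dict)

    def start_position(self, pid: int, partition: Partition) -> int:
        # A committed offset always wins over the reset policy.
        if pid in self.committed:
            return self.committed[pid]
        return partition.log_end_offset if self.reset_policy == "latest" else 0


def expand_and_consume(precommit: bool) -> List[str]:
    """Return what the group reads from the newly added partition."""
    topic = {0: Partition(["old-0", "old-1"])}
    group = Group(reset_policy="latest", committed={0: 2})

    new_pid = 1
    if precommit:
        # Proposed step: commit offset 0 before the partition is added.
        group.committed[new_pid] = 0

    # The partition is added and the producer notices it first.
    topic[new_pid] = Partition()
    topic[new_pid].records.extend(["new-0", "new-1"])

    # Only now does the group rebalance and pick its starting position.
    start = group.start_position(new_pid, topic[new_pid])
    return topic[new_pid].records[start:]


if __name__ == "__main__":
    print(expand_and_consume(precommit=False))  # []
    print(expand_and_consume(precommit=True))   # ['new-0', 'new-1']
```

Without the pre-committed offset the group starts at the log end and skips the 
two records written before the rebalance. With it, the group reads both.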

Looking forward to your reply!

> Consumer group may lose data for newly expanded partitions when add 
> partitions for topic if the group is set to consume from the latest
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12478
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12478
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 2.7.0
>            Reporter: hudeqi
>            Priority: Blocker
>              Labels: patch
>   Original Estimate: 1,158h
>  Remaining Estimate: 1,158h
>
>   This problem was exposed in our production environment: a topic is used to 
> produce monitoring data. *After expanding partitions, the consumer side of 
> the business reported that data was lost.*
>   After preliminary investigation, the lost data is all concentrated in the 
> newly expanded partitions. The reason is: when the server expands the topic, 
> the producer perceives the expansion first, and some data is written to the 
> newly expanded partitions. The consumer group perceives the expansion later. 
> After the rebalance completes, the newly expanded partitions are consumed 
> from the latest offset if the group is set to consume from the latest. The 
> data written to the newly expanded partitions during this window is 
> therefore skipped and lost by the consumer.
>   Setting the group to consume from the earliest is not a good option for a 
> topic with a huge data flow: when the group starts up, it would consume 
> historical data from the brokers at full speed, which affects broker 
> performance to a certain extent. Therefore, *it is necessary to consume only 
> these newly expanded partitions from the earliest.*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
