[ 
https://issues.apache.org/jira/browse/KAFKA-15086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hudeqi updated KAFKA-15086:
---------------------------
    Labels: kip-943  (was: )

> The unreasonable segment size setting of the internal topics in MM2 may cause 
> the worker startup time to be too long
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-15086
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15086
>             Project: Kafka
>          Issue Type: Improvement
>          Components: mirrormaker
>    Affects Versions: 3.4.1
>            Reporter: hudeqi
>            Assignee: hudeqi
>            Priority: Major
>              Labels: kip-943
>         Attachments: WechatIMG364.jpeg, WechatIMG365.jpeg, WechatIMG366.jpeg
>
>
> If 'segment.bytes' for the MM2 internal topics (offset.storage.topic, 
> config.storage.topic, status.storage.topic) follows the broker default or is 
> set larger, then when the MM2 cluster runs many complex tasks, the log 
> volume of these topics, especially 'offset.storage.topic', becomes very 
> large and slows down the restart of the MM2 workers.
> After investigation, the reason is that a consumer has to be started at 
> startup to read the data of 'offset.storage.topic'. Although this topic is 
> compacted, the active segment cannot be cleaned. If the segment size is 
> large, such as the default of 1G, the topic may hold tens of gigabytes of 
> data that cannot be compacted and must be read from the earliest offset. 
> This takes a long time: in our online environment the topic stored 13G of 
> data and took nearly half an hour to consume, so the worker could not start 
> and execute tasks for a long time.
> Of course, the number of consumer threads could also be adjusted, but I 
> think it is easier to reduce the segment size, for example to 100MB, the 
> default for __consumer_offsets.
>  
> The first picture in the attachment shows the log size stored in the 
> internal topic, the second shows the time when reading of 
> 'offset.storage.topic' started, and the third shows the time when the 
> reading finished. It took about 23 minutes in total.
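
The arithmetic behind the report can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a measurement: it assumes replay time scales linearly with log size, takes the throughput from the reported 13G read in about 23 minutes, and assumes 25 partitions for 'offset.storage.topic' (the Kafka Connect default for offset.storage.partitions; the reporter's actual partition count is not stated). The function names are hypothetical, and only the active segments are counted, so real logs may be larger.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def worst_case_active_bytes(segment_bytes, partitions):
    """Upper bound on data held in active segments, which the cleaner skips.

    Each partition has one active segment of at most `segment_bytes`.
    """
    return segment_bytes * partitions


def estimate_replay_seconds(log_bytes, throughput_bytes_per_sec):
    """Time to read the log from the earliest offset, assuming linear scaling."""
    return log_bytes / throughput_bytes_per_sec


# Throughput implied by the report: 13G consumed in about 23 minutes.
observed_throughput = 13 * GIB / (23 * 60)

# Assumption: 25 partitions (Connect default for offset.storage.partitions).
PARTITIONS = 25

for label, segment_bytes in (("1G", GIB), ("100MB", 100 * MIB)):
    uncleanable = worst_case_active_bytes(segment_bytes, PARTITIONS)
    minutes = estimate_replay_seconds(uncleanable, observed_throughput) / 60
    print(f"segment.bytes={label}: up to {uncleanable / GIB:.1f} GiB "
          f"uncleanable, about {minutes:.1f} min to replay")
```

Under these assumptions, shrinking the segment size shrinks the uncleanable portion of the log, and with it the startup read, by the same factor.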



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
