[GitHub] [kafka] hudeqi commented on pull request #13852: KAFKA-15086:Set a reasonable segment size upper limit for MM2 internal topics

via GitHub Thu, 22 Jun 2023 03:36:15 -0700


hudeqi commented on PR #13852:
URL: https://github.com/apache/kafka/pull/13852#issuecomment-1602409354


   Thank you for your reply, @C0urante! You raised many points that are worth 
thinking through. Below are my actual results and thoughts on each question.
   1. I agree. If we do not forcibly override the user-defined value, then we 
should at least log a warning in this case.
   2. I ran into this because the MM cluster replicates a very large number of 
topics and partitions (this is common and may be unavoidable), and the frequent 
updates to stored offset information make the internal offset topic grow too 
large. The internal config and status topics are unlikely to grow this much, 
and I have not seen it happen. So, following the "principle of least change", 
we may not need to adjust those two internal topics.
   3. This is fine as long as we follow the approach from point 1.
   4. Calling it "MM2" was probably inaccurate on my part. What I actually do 
is run topic replication with MirrorSourceConnector through DistributedHerder. 
You are right that this may benefit Connect as a whole. I have no experience 
with connectors other than topic replication, so I will leave that decision to 
you. But for anyone who uses a Connect cluster for topic replication as I do, 
I think this problem still needs to be solved.
   5. I handle it the way you describe: if the Connect cluster already exists, 
I directly change the log segment size of the internal offset topic with 
“kafka-topics.sh”, which is a bit hacky. Although I proposed 100MB in the PR, 
in practice I think that may still be a bit large. I set it to 50MB, and the 
startup time dropped to 30s (the topic had been compacted down to only about 
700MB in total). That may not be the "worst case" you described (every 
partition is full and the prior segment is also full). Maybe we could also 
increase the number of consumer threads that read offsets. What do you think?
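
   To put rough numbers on the "worst case" in point 5, here is a minimal 
Python sketch. It assumes the offset topic has 25 partitions (the Connect 
default for offset.storage.partitions, not a value stated in this thread) and 
that each partition holds two full segments. The function name is hypothetical, 
and real startup time depends on more than the byte count.

```python
MB = 1024 * 1024


def worst_case_bytes(partitions: int, segment_bytes: int,
                     full_segments_per_partition: int = 2) -> int:
    """Rough upper bound on the bytes a worker may need to read back
    from the offset topic at startup, assuming every partition holds
    `full_segments_per_partition` completely full segments."""
    return partitions * full_segments_per_partition * segment_bytes


# ASSUMPTION: 25 partitions (Connect's default offset.storage.partitions).
PARTITIONS = 25

# 50MB and 100MB are the values discussed in the comment;
# 1024MB is the broker default for segment.bytes (1 GiB).
for size_mb in (50, 100, 1024):
    total = worst_case_bytes(PARTITIONS, size_mb * MB)
    print(f"segment.bytes={size_mb}MB -> up to {total // MB}MB to read")
```

   Under these assumptions the bound scales linearly with the segment size, 
which is why lowering it from the default shrinks the worst case so much.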


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
