hudeqi commented on PR #13852: URL: https://github.com/apache/kafka/pull/13852#issuecomment-1602409354
Thank you for your reply, @C0urante! You raised a lot of points that are well worth thinking through. Below are my actual results and thoughts on each question.

1. I agree with you. If overwriting the user-defined value is not mandatory, then we should at least log a warning for this case.
2. I ran into this because the MM cluster replicates too many topics with too many partitions (very common, and probably unavoidable), and the frequent updating and storing of offset information makes the internal offset topic grow too large. As for the internal config topic and the internal status topic, I think they are unlikely to grow that much, and I have not seen it happen yet. So if we follow the "principle of least change", we may not need to adjust these two internal topics.
3. This is fine as long as we go with my answer to the first point.
4. Calling it "MM2" was probably inaccurate on my part. What I actually do is run topic replication as MirrorSourceConnector through DistributedHerder. You are right that this may benefit Connect as a whole. I have no experience with connectors other than topic replication, so I will leave that decision to you. But for anyone who uses a Connect cluster for topic replication as I do, I think this problem still needs to be solved.
5. I did what you described: since the Connect cluster already existed, I adjusted the log segment size of the internal offset topic directly through "kafka-topics.sh", which is a bit tricky. Although I proposed 100MB in the PR, in practice I think that may still be a bit large. I set it to 50MB, and the startup time dropped to 30s (the topic had been compacted to only about 700MB in total). That may not have reached the "worst case" you described (every partition is full and the prior segment is also full).
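To put rough numbers on the "worst case" in point 5, here is a back-of-the-envelope sketch. It assumes the worst case means two full segments per partition (the active one plus one prior segment that has not been compacted yet) and that the offset topic uses 25 partitions, the default for `offset.storage.partitions`. The function name is made up for illustration, and the result is only an estimate of how much data a worker might have to read on startup, not a measurement.

```python
MIB = 1024 * 1024


def worst_case_offset_topic_bytes(partitions: int, segment_bytes: int) -> int:
    """Estimate the data a worker may have to read from the offset topic.

    Assumption (not taken from Kafka itself): every partition holds one full
    active segment plus one full prior segment that is not yet compacted.
    """
    if partitions <= 0 or segment_bytes <= 0:
        raise ValueError("partitions and segment_bytes must be positive")
    return partitions * 2 * segment_bytes


if __name__ == "__main__":
    partitions = 25  # default offset.storage.partitions; adjust for your cluster
    candidates = {
        "50 MiB (what I used)": 50 * MIB,
        "100 MiB (proposed in the PR)": 100 * MIB,
        "1 GiB (broker default segment.bytes)": 1024 * MIB,
    }
    for label, segment_bytes in candidates.items():
        total = worst_case_offset_topic_bytes(partitions, segment_bytes)
        print(f"{label}: up to {total // MIB} MiB to read")
```

Under these assumptions, a 50MB segment size bounds the topic at roughly 2.5GB, so the ~700MB I observed is well below that bound.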
Maybe we can also increase the number of consumer threads that read offsets. What do you think?