[GitHub] [kafka] C0urante commented on pull request #13852: KAFKA-15086:Set a reasonable segment size upper limit for MM2 internal topics

via GitHub Mon, 26 Jun 2023 10:30:29 -0700


C0urante commented on PR #13852:
URL: https://github.com/apache/kafka/pull/13852#issuecomment-1607920082


   @hudeqi sorry, this is a tricky issue and I'm trying to take time to think 
things through :)
   
   I hate to say it, but I don't think we can make this change or anything like 
it without a KIP. This is for two reasons:
   
   1. We're effectively changing the default value for the 
`offset.storage.topic.segment.bytes` property (even if we don't implement this 
change with that exact logic), which counts as a change to public API for the 
project
   2. By explicitly setting a value for the offset topic's `segment.bytes` 
property, we cause any broker-side value for the [log.segment.bytes 
property](https://kafka.apache.org/documentation.html#brokerconfigs_log.segment.bytes)
 to be ignored. If the broker uses a lower value for this property than our 
default, then we may make things worse instead of better
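
   To make point 2 concrete, here is a rough sketch of the precedence rule. It 
is a simplified model, not Kafka's actual code: `effective_segment_bytes` is a 
hypothetical helper, and the 50 MiB and 100 MiB values are made up for 
illustration. Only the `segment.bytes` and `log.segment.bytes` names and the 
1 GiB broker default come from Kafka itself.

```python
# Simplified model of how a topic's effective segment size is resolved.
# Not Kafka code: effective_segment_bytes is a hypothetical helper.

BROKER_DEFAULT_SEGMENT_BYTES = 1024 * 1024 * 1024  # log.segment.bytes default (1 GiB)
MIB = 1024 * 1024


def effective_segment_bytes(broker_config, topic_config):
    """A topic-level segment.bytes wins; otherwise fall back to the broker."""
    if "segment.bytes" in topic_config:
        return topic_config["segment.bytes"]
    return broker_config.get("log.segment.bytes", BROKER_DEFAULT_SEGMENT_BYTES)


# Assumed scenario: the operator has already tuned the broker down to 50 MiB.
broker = {"log.segment.bytes": 50 * MIB}

# Topic created without an override: the broker value applies.
before = effective_segment_bytes(broker, {})

# Topic created with an explicit (hypothetical) 100 MiB default: broker value is ignored.
after = effective_segment_bytes(broker, {"segment.bytes": 100 * MIB})

print(before // MIB, after // MIB)
```

   In this made-up scenario the explicit topic-level value doubles the segment 
size compared to what the broker would have applied on its own.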
   
   I still think it's likely that decreasing the segment size for the offsets 
topic would help, but it'd be nice if we could get the kind of review that a 
KIP requires before making that kind of change.
   
   As far as increasing the number of consumer threads goes, I think it's 
really a question of what the performance bottleneck is when reading to the end 
of the topic. If CPU is the issue, then sure, it'd probably help to scale up 
the number of consumers. However, if network transfer between the worker and 
the Kafka cluster is the limiting factor, then adding consumers won't have any 
impact. The nice thing about decreasing the segment size is that, as long as it 
leads to a reduction in the total size of the offsets topic, it would help in 
either case: you'd have less data to consume from Kafka, and also less data to 
process on your Connect worker.
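
   A back-of-the-envelope model of that reasoning, under loud assumptions: 
fetching and processing are fully pipelined, so read-to-end time is whichever 
of the two is slower, and CPU throughput scales linearly with the number of 
consumers. `read_to_end_seconds` is a hypothetical function and every number 
below is invented; none of this is a measurement.

```python
# Toy model of read-to-end time for the offsets topic. All figures are invented.

def read_to_end_seconds(topic_mb, network_mb_per_s, cpu_mb_per_s_per_consumer, consumers):
    """Time is bounded by the slower of network transfer and CPU processing."""
    network_time = topic_mb / network_mb_per_s
    cpu_time = topic_mb / (cpu_mb_per_s_per_consumer * consumers)
    return max(network_time, cpu_time)


TOPIC_MB = 10_000      # assumed size of the offsets topic
NETWORK = 100          # assumed worker <-> cluster bandwidth, MB/s
CPU_PER_CONSUMER = 50  # assumed processing rate per consumer, MB/s

one_consumer = read_to_end_seconds(TOPIC_MB, NETWORK, CPU_PER_CONSUMER, 1)
four_consumers = read_to_end_seconds(TOPIC_MB, NETWORK, CPU_PER_CONSUMER, 4)
eight_consumers = read_to_end_seconds(TOPIC_MB, NETWORK, CPU_PER_CONSUMER, 8)

# Same single consumer, but the topic is ten times smaller.
smaller_topic = read_to_end_seconds(TOPIC_MB / 10, NETWORK, CPU_PER_CONSUMER, 1)

print(one_consumer, four_consumers, eight_consumers, smaller_topic)
```

   With these numbers, going from one consumer to four helps because the read 
starts out CPU-bound, but going from four to eight changes nothing because the 
network has become the limit. Shrinking the topic reduces both terms at once.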
   
   This almost certainly varies depending on the environment Kafka Connect and 
Kafka are run in, but my hunch is that your fix here would be more effective 
than scaling up the number of consumers. I'd be curious to see if we could get 
benchmark numbers on that front, though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
