C0urante commented on PR #13852: URL: https://github.com/apache/kafka/pull/13852#issuecomment-1607920082
@hudeqi sorry, this is a tricky issue and I'm trying to take time to think things through :)

I hate to say it, but I don't think we can make this change or anything like it without a KIP. This is for two reasons:

1. We're effectively changing the default value for the `offset.storage.topic.segment.bytes` property (even if we don't implement this change with that exact logic), which counts as a change to public API for the project.
2. By explicitly setting a value for the offset topic's `segment.bytes` property, we cause any broker-side value for the [log.segment.bytes property](https://kafka.apache.org/documentation.html#brokerconfigs_log.segment.bytes) to be ignored. If the broker uses a lower value for this property than our default, then we may make things worse instead of better.

I still think it's likely that decreasing the segment size for the offsets topic would help, but it'd be nice if we could get the kind of review that a KIP requires before making that kind of change.

As far as increasing the number of consumer threads goes, I think it's really a question of what the performance bottleneck is when reading to the end of the topic. If CPU is the issue, then sure, it'd probably help to scale up the number of consumers. However, if network transfer between the worker and the Kafka cluster is the limiting factor, then it won't have any impact. The nice thing about decreasing the segment size is that (as long as it leads to a reduction in the total size of the offsets topic), it would help in either case: you'd have less data to consume from Kafka, and also less data to process on your Connect worker.

This almost certainly varies depending on the environment Kafka Connect and Kafka are run in, but my hunch is that your fix here would be more effective than scaling up the number of consumers. I'd be curious to see if we could get benchmark numbers on that front, though.

-- 
This is an automated message from the Apache Git Service.
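
The second reason in the comment (an explicit topic-level `segment.bytes` causing the broker's `log.segment.bytes` to be ignored) can be illustrated with a small sketch. It only models the precedence rule and does not talk to a real cluster. The helper `effective_segment_bytes` and the 100 MiB and 50 MiB figures are hypothetical, chosen for illustration; they are not values proposed in the PR. The 1 GiB figure is the documented broker default for `log.segment.bytes`.

```python
from typing import Optional

# Documented broker default for log.segment.bytes (1 GiB).
BROKER_DEFAULT_SEGMENT_BYTES = 1024 * 1024 * 1024


def effective_segment_bytes(
    topic_override: Optional[int],
    broker_log_segment_bytes: int = BROKER_DEFAULT_SEGMENT_BYTES,
) -> int:
    """Hypothetical helper modelling precedence only.

    If the topic sets segment.bytes explicitly, that value is used and the
    broker-side log.segment.bytes is ignored; otherwise the broker value applies.
    """
    if topic_override is None:
        return broker_log_segment_bytes
    return topic_override


# Hypothetical new default that Connect would set on the offsets topic.
connect_default = 100 * 1024 * 1024

# Case 1: broker left at its default. The override shrinks segments.
with_default_broker = effective_segment_bytes(connect_default)

# Case 2: the operator already tuned the broker to a smaller value (hypothetical
# 50 MiB). The override now yields larger segments than before the change.
tuned_broker = 50 * 1024 * 1024
with_tuned_broker = effective_segment_bytes(connect_default, tuned_broker)

print("default broker:", with_default_broker < BROKER_DEFAULT_SEGMENT_BYTES)
print("tuned broker made worse:", with_tuned_broker > tuned_broker)
```

In the first case the explicit value helps; in the second it overrides a smaller broker setting, which is the "worse instead of better" scenario described above.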