Hey hudeqi,

Thanks for the KIP! I did not know about the existing segment.bytes default value, and it does seem rather high in the context of the Connect internal topics. If I think of segment.bytes as a "minimum per-partition data transfer on startup", 1GB is certainly not appropriate for even the single-partition config topic.
1. I have a concern about changing the topic configuration on startup. In the existing codebase, the *.storage.* worker configurations appear to take effect only for newly created topics. If the topics were created manually before the Connect cluster started, or were created by a previous Connect instance, then the worker configuration could contain arbitrary values that have no effect. Updating topic configurations after creation would be a new capability. Consider someone noticing this segment.bytes problem: a natural response would be to reconfigure the topic directly, leaving the topic configuration and the worker configuration divergent. If the worker can change the topic configuration after creation, it has the potential to roll back topic configurations that are managed externally. Do you think that changing the default for new Connect clusters, and emitting a startup warning for excessive segment.bytes, is reasonable? We already have startup assertions that fail a worker based on partition and compaction requirements; this would be similar in that it alerts the user to reconfigure the internal topics, but with lesser severity.

2. I'm also interested in what a reasonable value for this configuration would be. I did find that the __consumer_offsets topic uses 104857600 (100 MiB), as defined in OffsetConfig.scala, so there is precedent for a smaller segment size for internal topics.

3. I believe there's a potential bug where compaction can happen before a worker reads a tombstone, causing the KafkaBasedLog to produce inconsistent in-memory states across multiple workers. Since the segment size is so large, I suspect compaction has been wholly ineffective so far, which has prevented this bug from manifesting. By lowering the segment size, we increase the likelihood of this failure, so it may finally need to be addressed.
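For reference, here is a sketch of what the per-topic override could look like in a worker properties file, using the KIP-605 "<prefix>.storage.<topic config>" naming pattern. The 104857600 value simply mirrors the __consumer_offsets precedent above and is illustrative, not a recommendation:

```properties
# Hypothetical worker config sketch (KIP-605 mechanism).
# Values are illustrative; they are applied only when the worker
# creates the internal topics.
offset.storage.segment.bytes=104857600
config.storage.segment.bytes=104857600
status.storage.segment.bytes=104857600
```

Note that if an operator instead alters segment.bytes directly on an existing topic (e.g. via kafka-configs.sh), these worker settings and the actual topic configuration can diverge, which is exactly the rollback concern in point 1.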
Thanks,
Greg

On Thu, Jul 6, 2023 at 5:39 AM Yash Mayya <yash.ma...@gmail.com> wrote:
>
> Also, just adding to the above point - we don't necessarily need to
> explicitly add a new worker configuration right? Instead, we could
> potentially just use the new proposed default value internally which can be
> overridden by users through setting a value for
> "offset.storage.segment.bytes" (via the existing KIP-605 based mechanism).
>
> On Thu, Jul 6, 2023 at 6:04 PM Yash Mayya <yash.ma...@gmail.com> wrote:
> >
> > Hi hudeqi,
> >
> > Thanks for the KIP! Just to clarify - since KIP-605 (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-605%3A+Expand+Connect+Worker+Internal+Topic+Settings)
> > already allows configuring "segment.bytes" for the Connect cluster's
> > offsets topic via a worker configuration ("offset.storage.segment.bytes",
> > same as what is being proposed in this KIP), the primary motivation of this
> > KIP is essentially to override the default value for that topic
> > configuration to a smaller value and decouple it from the backing Kafka
> > cluster's "log.segment.bytes" configuration? Also, I'm curious about how
> > the new default value of 50 MB was chosen (if there were any experiments
> > that were run etc.)?
> >
> > Thanks,
> > Yash
> >
> > On Mon, Jul 3, 2023 at 6:08 PM hudeqi <16120...@bjtu.edu.cn> wrote:
> >
> >> Is anyone following this KIP? Bump this thread.