Hey hudeqi,

Thanks for the KIP! I did not know about the existing segment.bytes
default value, and it does seem rather high in the context of the
Connect internal topics.
If I think about segment.bytes as a "minimum per-partition data
transfer on startup" (the active segment is never eligible for
compaction, so up to segment.bytes of records per partition must be
re-read every time a worker starts), 1 GiB is certainly not
appropriate for even the single-partition config topic.

1. I have a concern about changing the topic configuration on startup.

In the existing codebase, the *.storage.* worker configurations appear
to only have an effect for newly created topics. If the topics were
manually created before a Connect cluster starts, or were created by a
previous Connect instance, then the Connect worker configuration could
have arbitrary contents that have no effect. Updating the topic
configurations after creation would be a new capability.
Consider someone who notices this segment.bytes problem today: the
natural response is to alter the topic configuration directly, so that
it diverges from the worker configuration. If the worker can change
the topic configuration after creation, it has the potential to roll
back topic configurations that are managed externally.
Do you think that changing the default for new Connect clusters, and
emitting a startup warning for excessive segment.bytes, is reasonable?
We already have startup assertions that fail a worker based on
partition-count and compaction requirements; this would be similar in
that it alerts the user to reconfigure the internal topics, but with a
lesser severity.
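
To make that concrete, here is the kind of check I'm imagining. This
is a rough sketch only; the threshold constant, method name, and hook
point are made up for illustration and are not existing Connect code:

    // Hypothetical check run while verifying the internal topics at
    // startup. Uses org.apache.kafka.clients.admin.{Config, ConfigEntry}
    // and org.apache.kafka.common.config.TopicConfig; "log" is assumed
    // to be an slf4j Logger field on the enclosing class.
    private static final long SEGMENT_BYTES_WARN_THRESHOLD = 100L * 1024 * 1024;

    void warnOnExcessiveSegmentBytes(String topic, Config topicConfig) {
        ConfigEntry segmentBytes = topicConfig.get(TopicConfig.SEGMENT_BYTES_CONFIG);
        if (segmentBytes != null
                && Long.parseLong(segmentBytes.value()) > SEGMENT_BYTES_WARN_THRESHOLD) {
            log.warn("Internal topic {} has segment.bytes={}, which delays compaction"
                    + " and inflates startup reads; consider lowering it.",
                    topic, segmentBytes.value());
        }
    }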

2. I'm also interested to know what a reasonable value for this
configuration would be. I did find that the __consumer_offsets topic
uses 104857600 (100 MiB) as defined in OffsetConfig.scala, so there is
precedent for giving internal topics a smaller segment.bytes.
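
For comparison, here is what pinning a smaller value through the
existing KIP-605 pass-through looks like in the worker config. The
values below are purely illustrative, not a recommendation, and they
only take effect when Connect itself creates the internal topics:

    # Illustrative overrides via the KIP-605 topic-setting prefixes.
    # Applied only at topic creation time, not retroactively.
    offset.storage.segment.bytes=104857600
    config.storage.segment.bytes=10485760
    status.storage.segment.bytes=10485760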

3. I believe there's a potential bug where compaction can remove a
tombstone before a lagging worker reads it, leading the KafkaBasedLog
to produce inconsistent in-memory states across multiple workers.
Since segment.bytes is currently so large, compaction has likely been
wholly ineffective so far, which has prevented this bug from
manifesting. By lowering segment.bytes, we're increasing the
likelihood of this failure, so it may need to finally be addressed.
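
To sketch the race in point 3 (a simplification, not the actual
KafkaBasedLog code):

    // Each worker materializes the compacted topic into a map.
    Map<String, byte[]> state = new HashMap<>();

    void onRecord(ConsumerRecord<String, byte[]> record) {
        if (record.value() == null) {
            state.remove(record.key());      // tombstone: delete the key
        } else {
            state.put(record.key(), record.value());
        }
    }

    // Race: worker A reads (K, V) and then stalls. The cleaner compacts
    // away V and, after delete.retention.ms, the tombstone for K as well.
    // When A resumes it never observes the delete and keeps K=V, while a
    // freshly started worker B replays the compacted log and has no entry
    // for K at all. The two workers now disagree.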

Thanks,
Greg

On Thu, Jul 6, 2023 at 5:39 AM Yash Mayya <yash.ma...@gmail.com> wrote:
>
> Also, just adding to the above point - we don't necessarily need to
> explicitly add a new worker configuration right? Instead, we could
> potentially just use the new proposed default value internally which can be
> overridden by users through setting a value for
> "offset.storage.segment.bytes" (via the existing KIP-605 based mechanism).
>
> On Thu, Jul 6, 2023 at 6:04 PM Yash Mayya <yash.ma...@gmail.com> wrote:
>
> > Hi hudeqi,
> >
> > Thanks for the KIP! Just to clarify - since KIP-605 (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-605%3A+Expand+Connect+Worker+Internal+Topic+Settings)
> > already allows configuring "segment.bytes" for the Connect cluster's
> > offsets topic via a worker configuration ("offset.storage.segment.bytes",
> > same as what is being proposed in this KIP), the primary motivation of this
> > KIP is essentially to override the default value for that topic
> > configuration to a smaller value and decouple it from the backing Kafka
> > cluster's "log.segment.bytes" configuration? Also, I'm curious about how
> > the new default value of 50 MB was chosen (if there were any experiments
> > that were run etc.)?
> >
> > Thanks,
> > Yash
> >
> > On Mon, Jul 3, 2023 at 6:08 PM hudeqi <16120...@bjtu.edu.cn> wrote:
> >
> >> Is anyone following this KIP? Bump this thread.
> >
> >
