Sure! good to know that is tracked.

Thanks, Luke!

On Thu, 21 Mar 2024 at 07:52, Luke Chen <show...@gmail.com> wrote:

> Hi Jorge,
>
> You should check the JIRA:
> https://issues.apache.org/jira/browse/KAFKA-16385
> where we had some discussion.
> Welcome to provide your thoughts there.
>
> Thanks.
> Luke
>
> On Thu, Mar 21, 2024 at 3:33 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Hi dev community,
> >
> > I'd like to share some findings on how rotation of active segment differ
> > depending on whether topic retention is time- or size-based.
> >
> > I was (wrongly) under the assumption that active segments are only
> rotated
> > when segment configs (segment.bytes (1GiB) or segment.ms (7d)) or global
> > log configs (log.roll.ms) force it  -- regardless of the retention
> > configuration.
> > This seems to be different depending on how retention is defined:
> >
> > - If a topic has a retention based on time[1], the condition to rotate
> the
> > active segment is based on the latest timestamp. If the difference with
> > current time is largest than retention time, then segment (including
> > active) should be deleted. Active segment is rotated, and in next round
> is
> > deleted.
> >
> > - If a topic has retention based on size[2] though, the condition not
> only
> > depends on the size of the segment itself but first on the total log
> size,
> > forcing to always have at least a single (active) segment: first
> difference
> > between total log size and retention is calculated, let's say a single
> > segment of 5MB and retention is 1MB; then total difference is 4MB, then
> the
> > condition to delete validates if the difference of the current segment
> and
> > the total difference is higher than zero, then delete. As the segment
> size
> > will always be higher than the total difference when there is a single
> > segment, then there will always be at least 1 segment. In this case the
> > only case where active segment is rotated it is when a new message
> arrives.
> >
> > Added steps to reproduce[3].
> >
> > Maybe I missing something obvious, but this seems inconsistent to me.
> > Either both retention configs should rotate active segments, or none of
> > them should and active segment should be only governed by segment
> bytes|ms
> > configs or log.roll config.
> >
> > I believe it's a useful feature to "force" active segment rotation
> without
> > changing segment of global log rotation given that features like
> Compaction
> > and Tiered Storage can benefit from this; but would like to clarify this
> > behavior and make it consistent for both retention options, and/or call
> it
> > out explicitly in the documentation.
> >
> > Looking forward to your feedback!
> >
> > Jorge.
> >
> > [1]:
> >
> >
> https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1566
> > [2]:
> >
> >
> https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1575-L1583
> >
> > [3]: https://gist.github.com/jeqo/d32cf07493ee61f3da58ac5e77b192b2
> >
>

Reply via email to