Sure! Good to know that this is tracked. Thanks, Luke!
On Thu, 21 Mar 2024 at 07:52, Luke Chen <show...@gmail.com> wrote:

> Hi Jorge,
>
> You should check the JIRA:
> https://issues.apache.org/jira/browse/KAFKA-16385
> where we had some discussion.
> You are welcome to share your thoughts there.
>
> Thanks.
> Luke
>
> On Thu, Mar 21, 2024 at 3:33 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Hi dev community,
> >
> > I'd like to share some findings on how rotation of the active segment
> > differs depending on whether topic retention is time- or size-based.
> >
> > I was (wrongly) under the assumption that active segments are only
> > rotated when segment configs (segment.bytes (1GiB) or segment.ms (7d))
> > or global log configs (log.roll.ms) force it -- regardless of the
> > retention configuration.
> > This seems to be different depending on how retention is defined:
> >
> > - If a topic has retention based on time[1], the condition to rotate
> > the active segment is based on the segment's latest timestamp. If the
> > difference between that timestamp and the current time is larger than
> > the retention time, then the segment (including the active one) should
> > be deleted. The active segment is rotated, and in the next round it is
> > deleted.
> >
> > - If a topic has retention based on size[2], though, the condition
> > depends not only on the size of the segment itself but first on the
> > total log size, which forces the log to always keep at least a single
> > (active) segment. First, the difference between the total log size and
> > the retention size is calculated: say there is a single segment of 5MB
> > and retention is 1MB; then the total difference is 4MB. Then the
> > condition to delete checks whether the total difference minus the size
> > of the current segment is higher than zero, and deletes the segment if
> > so. As the segment size will always be higher than the total
> > difference when there is a single segment, there will always be at
> > least 1 segment. In this case the only time the active segment is
> > rotated is when a new message arrives.
> >
> > Added steps to reproduce[3].
> >
> > Maybe I'm missing something obvious, but this seems inconsistent to me.
> > Either both retention configs should rotate active segments, or
> > neither of them should, and the active segment should be governed only
> > by the segment bytes|ms configs or the log.roll config.
> >
> > I believe it's a useful feature to "force" active segment rotation
> > without changing segment or global log rotation configs, given that
> > features like Compaction and Tiered Storage can benefit from this; but
> > I would like to clarify this behavior and make it consistent for both
> > retention options, and/or call it out explicitly in the documentation.
> >
> > Looking forward to your feedback!
> >
> > Jorge.
> >
> > [1]:
> > https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1566
> > [2]:
> > https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1575-L1583
> > [3]: https://gist.github.com/jeqo/d32cf07493ee61f3da58ac5e77b192b2
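
For readers following the thread, the two retention checks described in the quoted message can be modelled roughly as below. This is a simplified Python sketch of the described behavior, not the Kafka code itself: the `Segment` class and the function names `time_breached` and `size_breached` are hypothetical, segments are assumed to be ordered oldest first, and treating the boundary case (difference exactly equal to the segment size) as deletable is an assumption.

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class)."""
    size_bytes: int
    largest_timestamp_ms: int


def time_breached(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Segment]:
    """Oldest-first run of segments whose latest record is older than retention_ms."""
    deletable = []
    for seg in segments:
        if now_ms - seg.largest_timestamp_ms <= retention_ms:
            break  # stop at the first segment still within retention
        deletable.append(seg)
    return deletable


def size_breached(segments: List[Segment], retention_bytes: int) -> List[Segment]:
    """Oldest-first run of segments that can go without dropping below retention_bytes."""
    total = sum(seg.size_bytes for seg in segments)
    if retention_bytes < 0 or total < retention_bytes:
        return []
    diff = total - retention_bytes  # bytes over the retention limit
    deletable = []
    for seg in segments:
        # Assumption: a segment goes only if removing it does not overshoot diff.
        if diff - seg.size_bytes < 0:
            break
        diff -= seg.size_bytes
        deletable.append(seg)
    return deletable


# The example from the message: one 5MB segment, 1MB size retention.
only_segment = [Segment(size_bytes=5 * MB, largest_timestamp_ms=0)]
print(len(size_breached(only_segment, retention_bytes=1 * MB)))           # 0
print(len(time_breached(only_segment, now_ms=10_000, retention_ms=1_000)))  # 1
```

With a single segment and a positive retention size, the difference (4MB here) is always smaller than the segment (5MB), so the size check never selects it, whereas the time check selects it as soon as its latest timestamp falls outside the retention window. That is the asymmetry the message points out.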