[
https://issues.apache.org/jira/browse/KAFKA-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guang Zhao reassigned KAFKA-17212:
----------------------------------
Assignee: Guang Zhao
> Segments containing a single message can be incorrectly marked as local only
> ----------------------------------------------------------------------------
>
> Key: KAFKA-17212
> URL: https://issues.apache.org/jira/browse/KAFKA-17212
> Project: Kafka
> Issue Type: Bug
> Components: Tiered-Storage
> Affects Versions: 3.8.0, 3.7.1, 3.9.0
> Reporter: Guillaume Mallet
> Assignee: Guang Zhao
> Priority: Trivial
>
> There is an edge case where a segment containing a single message can be
> incorrectly considered local only, which skews the deletion process towards
> deleting more data than intended.
>
> *This is very unlikely to happen in a real scenario but can happen in tests
> when segments are rolled manually.*
> *It could possibly happen when segments are rolled based on time, but even
> then the skew would be minimal.*
> h2. What happens
> In order to delete the right amount of data against the byte retention
> policy, the
> [buildRetentionSizeData|https://github.com/apache/kafka/blob/09be14bb09dc336f941a7859232094bfb3cb3b96/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1335]
> function first counts all the bytes that breach {{retention.bytes}}. To do
> this, the size of each segment is added to the size of the segments present
> only on disk, {{onlyLocalLogSegmentsSize}}.
> The segments present only on disk are listed via the function
> [onlyLocalLogSegmentSize|https://github.com/apache/kafka/blob/a0f6e6f816c6ac3fbbc4e0dc503dc43bfacfe6c7/core/src/main/scala/kafka/log/UnifiedLog.scala#L1618-L1619],
> which adds the size of every segment whose _baseOffset_ is greater than or
> equal to {{highestOffsetInRemoteStorage}}.
> {{highestOffsetInRemoteStorage}} is the highest offset that has been
> successfully sent to the remote store.
> The _baseOffset_ of a segment is “a [lower bound (*inclusive*) of the
> offset in the
> segment”|https://github.com/apache/kafka/blob/a0f6e6f816c6ac3fbbc4e0dc503dc43bfacfe6c7/storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java#L115].
>
> In the case of a segment with a single message, the _baseOffset_ can be
> equal to {{highestOffsetInRemoteStorage}}, which means that despite the
> offset having been offloaded to the remote storage, we count that segment as
> local only.
> This has consequences when counting the bytes to delete, as the size of this
> segment is counted twice in
> [buildRetentionSizeData|https://github.com/apache/kafka/blob/09be14bb09dc336f941a7859232094bfb3cb3b96/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1155]:
> once as a segment offloaded to the remote storage, and once as a local
> segment when
> [onlyLocalSegmentSize|https://github.com/apache/kafka/blob/a0f6e6f816c6ac3fbbc4e0dc503dc43bfacfe6c7/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1361-L1363]
> is added.
> The result is that {{remainingBreachedSize}} will be higher than expected,
> which can lead to more bytes being deleted than initially expected, up to
> the size of the double-counted segment.
> The issue is that we use a greater-than-or-equal comparison rather than a
> strictly-greater one: a segment present only locally will have a
> {{baseOffset}} strictly greater than {{highestOffsetInRemoteStorage}}.
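> The boundary condition can be sketched as follows (a minimal standalone
> illustration with hypothetical names, not the actual Kafka code; the real
> logic lives in {{UnifiedLog.onlyLocalLogSegmentsSize}}):

```java
import java.util.List;

public class OnlyLocalSizeSketch {
    // Simplified stand-in for a log segment: base offset plus size on disk.
    record Segment(long baseOffset, long sizeInBytes) {}

    // Buggy variant: ">=" also counts a segment whose baseOffset equals
    // highestOffsetInRemote, even though that offset was already offloaded.
    static long onlyLocalSizeBuggy(List<Segment> segments, long highestOffsetInRemote) {
        return segments.stream()
                       .filter(s -> s.baseOffset() >= highestOffsetInRemote)
                       .mapToLong(Segment::sizeInBytes)
                       .sum();
    }

    // Fixed variant: strictly greater, so a single-message segment sitting
    // exactly at the boundary is not double counted.
    static long onlyLocalSizeFixed(List<Segment> segments, long highestOffsetInRemote) {
        return segments.stream()
                       .filter(s -> s.baseOffset() > highestOffsetInRemote)
                       .mapToLong(Segment::sizeInBytes)
                       .sum();
    }

    public static void main(String[] args) {
        // A single-message segment at offset 100 was offloaded, so
        // highestOffsetInRemoteStorage == 100; only the segment at 101 is
        // genuinely local only.
        List<Segment> segments = List.of(new Segment(100, 512), new Segment(101, 512));
        System.out.println(onlyLocalSizeBuggy(segments, 100)); // counts both: 1024
        System.out.println(onlyLocalSizeFixed(segments, 100)); // counts one: 512
    }
}
```

> With ">=", the boundary segment's 512 bytes are counted both as remote and
> as local, inflating {{remainingBreachedSize}} by exactly that amount.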
> h2. Reproducing the issue
> The problem is highlighted in the 2 tests added in this
> [commit|https://github.com/apache/kafka/commit/97af351db517d69a2b37c92861e463a6d0c5cb8f]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)