[ https://issues.apache.org/jira/browse/KAFKA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782860#comment-17782860 ]
Arpit Goyal commented on KAFKA-15388:
-------------------------------------

[~divijvaidya] [~satish.duggana] [~christo_lolov] [~showuon] can anyone help me with the above query? This would help me proceed further.

To summarize: in a given segment we are trying to find the right record batch for the requested offset, but it is possible that the end offset of a given batch has already been compacted away. For example, let's say we are looking to fetch data for offset 50. A segment contains record batches with start and end offsets in the following format, but offset 50 was compacted away at some point in the past:

RB1[33-38] RB2[42-49] RB3[51-56]

Now if we try to fetch data for offset 50, it would return null. Is this expected, or is it a bug? If it is, how would we identify which record batch needs to be returned?

> Handle topics that were having compaction as retention earlier are changed to delete only retention policy and onboarded to tiered storage.
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-15388
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15388
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Satish Duggana
>            Assignee: Arpit Goyal
>            Priority: Blocker
>             Fix For: 3.7.0
>
>
> Context: [https://github.com/apache/kafka/pull/13561#discussion_r1300055517]
>
> There are 3 paths I looked at:
> * When data is moved to remote storage (1)
> * When data is read from remote storage (2)
> * When data is deleted from remote storage (3)
>
> (1) Does not have a problem with compacted topics. Compacted segments are
> uploaded and their metadata claims they contain offsets from the baseOffset of
> the segment until the next segment's baseOffset. There are no gaps in offsets.
>
> (2) Does not have a problem if a customer is querying offsets which do not
> exist within a segment, but there are offsets after the queried offset within
> the same segment. *However, it does have a problem when the next available
> offset is in a subsequent segment.*
>
> (3) For data deleted via DeleteRecords there is no problem. For data deleted
> via retention there is no problem.
>
> *I believe the proper solution to (2) is to make tiered storage continue
> looking for the next greater offset in subsequent segments.*
>
> Steps to reproduce the issue:
> {code:java}
> // TODO (christo)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
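
To make the question concrete, here is a minimal sketch of the lookup that the issue description proposes for (2): return the first batch whose last offset is at or after the requested offset, and continue into later segments if the current one has none. This is not Kafka's actual code. `Batch`, `Segment`, `findInSegment` and `findAcrossSegments` are hypothetical names used only for illustration, and the second segment (batch 60-65) is an assumed example that does not appear in the comment above.

```java
import java.util.List;
import java.util.Optional;

public class OffsetLookupSketch {

    /** Hypothetical stand-in for a record batch: only its first and last offsets. */
    public static final class Batch {
        public final long baseOffset;
        public final long lastOffset;

        public Batch(long baseOffset, long lastOffset) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** Hypothetical stand-in for a remote log segment: batches in offset order. */
    public static final class Segment {
        public final List<Batch> batches;

        public Segment(List<Batch> batches) {
            this.batches = batches;
        }
    }

    /** Returns the first batch in the segment whose last offset is >= targetOffset, if any. */
    public static Optional<Batch> findInSegment(Segment segment, long targetOffset) {
        for (Batch batch : segment.batches) {
            if (batch.lastOffset >= targetOffset) {
                // The target may fall in a gap before this batch (compacted away);
                // this batch then holds the next greater offset.
                return Optional.of(batch);
            }
        }
        return Optional.empty();
    }

    /** Walks segments in offset order and moves on when a segment has no batch at or after the target. */
    public static Optional<Batch> findAcrossSegments(List<Segment> segments, long targetOffset) {
        for (Segment segment : segments) {
            Optional<Batch> found = findInSegment(segment, targetOffset);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Batches from the example: RB1[33-38] RB2[42-49] RB3[51-56]
        Segment first = new Segment(List.of(new Batch(33, 38), new Batch(42, 49), new Batch(51, 56)));
        // Assumed follow-up segment, added only to show the cross-segment case.
        Segment second = new Segment(List.of(new Batch(60, 65)));

        for (long target : new long[] {50, 57}) {
            Optional<Batch> hit = findAcrossSegments(List.of(first, second), target);
            System.out.println(target + " -> "
                    + hit.map(b -> "[" + b.baseOffset + "-" + b.lastOffset + "]").orElse("none"));
        }
    }
}
```

Under these assumptions, a fetch for offset 50 resolves to RB3[51-56] instead of null, and a fetch for offset 57 moves on to the assumed next segment. Whether that is the intended behavior is exactly what the question above asks.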