[ https://issues.apache.org/jira/browse/KAFKA-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kamal Chandraprakash updated KAFKA-17637:
-----------------------------------------
    Issue Type: Improvement  (was: Task)

> Invert the search for LIST_OFFSETS request for remote storage topic
> -------------------------------------------------------------------
>
>                 Key: KAFKA-17637
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17637
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Kamal Chandraprakash
>            Priority: Major
>
> Record timestamps are not monotonic, so for a LIST_OFFSETS request we
> search from the earliest to the latest offset.
> When tiered storage is enabled for a topic, we begin the search in remote
> storage and then move to local storage. This can cause a concurrency issue:
> while the search moves from remote to local storage, some local-log segments
> might be uploaded to remote storage and deleted from local storage in the
> meantime. This can lead to a loss of precision in the offset returned for the
> given timestamp. If this happens, we might silently search for the timestamp
> in the next available local-log segment.
> This scenario is rare but worth fixing, since we traverse metadata events
> returned by
> [RemoteLogMetadataManager#listRemoteLogSegments|https://sourcegraph.com/github.com/apache/kafka@trunk/-/blob/storage/api/src/main/java/org/apache/kafka/server/log/remote/storage/RemoteLogMetadataManager.java?L183],
> which is a plugin whose implementation might differ.
> One way to fix this issue is to search the local log first, then move to the
> remote log, and compare the results. A similar approach for MAX_TIMESTAMP is
> explained in:
> [https://github.com/apache/kafka/pull/16602#discussion_r1759757001]
> 1. Search in the local log and find the result.
> 2. Search in the remote log and find the result.
> 3. Compare both results to pick the correct offset.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
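
The three steps proposed in the description (search local, search remote, compare) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Kafka implementation: `ListOffsetsSketch`, `Entry`, `search` and `lookup` are hypothetical names, each log is modelled as an in-memory list sorted by offset, and "the correct offset" is assumed to be the smaller of the two matches, on the basis that a timestamp lookup wants the earliest offset whose timestamp is at or after the target.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model; none of these types are Kafka classes.
final class ListOffsetsSketch {

    /** One record: its offset and its (possibly non-monotonic) timestamp. */
    record Entry(long timestamp, long offset) {}

    /** Returns the first entry, in offset order, whose timestamp >= target. */
    static Optional<Entry> search(List<Entry> log, long targetTimestamp) {
        for (Entry e : log) {                 // log is assumed sorted by offset
            if (e.timestamp() >= targetTimestamp) {
                return Optional.of(e);
            }
        }
        return Optional.empty();
    }

    /**
     * Step 1: search the local log. Step 2: search the remote log.
     * Step 3: compare and keep the match with the smaller offset (assumption).
     */
    static Optional<Entry> lookup(List<Entry> localLog,
                                  List<Entry> remoteLog,
                                  long targetTimestamp) {
        Optional<Entry> local = search(localLog, targetTimestamp);
        Optional<Entry> remote = search(remoteLog, targetTimestamp);
        if (local.isEmpty()) {
            return remote;
        }
        if (remote.isEmpty()) {
            return local;
        }
        return remote.get().offset() <= local.get().offset() ? remote : local;
    }

    public static void main(String[] args) {
        List<Entry> remote = List.of(
            new Entry(100L, 0L),
            new Entry(300L, 1L),
            new Entry(200L, 2L));
        List<Entry> local = List.of(
            new Entry(200L, 2L),   // segment present in both local and remote
            new Entry(400L, 3L));
        System.out.println(lookup(local, remote, 150L));
    }
}
```

The point of searching local first in this sketch is that a segment uploaded and then deleted locally between the two searches is still visible to at least one of them, so the comparison step can recover the lower offset instead of silently skipping ahead.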