[ https://issues.apache.org/jira/browse/KAFKA-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henry Cai updated KAFKA-15620:
------------------------------
Description:

Using the newly released 3.6.0 with the tiered storage feature turned on: {*}remote.log.storage.system.enable{*}=true

1. Set up topic tier5 with remote storage enabled. Added some data to the topic, and the data was copied to remote storage. A few days later, when the log segment was removed from remote storage due to log retention expiration, I noticed the following warnings in the server log:
```
[2023-10-16 22:19:10,971] DEBUG Updating remote log segment metadata: [RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA}, customMetadata=Optional.empty, state=DELETE_SEGMENT_STARTED, eventTimestampMs=1697005926358, brokerId=1043}] (org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache)
[2023-10-16 22:19:10,971] WARN Error occurred while updating the remote log segment. (org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore)
org.apache.kafka.server.log.remote.storage.RemoteResourceNotFoundException: No remote log segment metadata found for :RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA}
	at org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache.updateRemoteLogSegmentMetadata(RemoteLogMetadataCache.java:161)
	at org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore.handleRemoteLogSegmentMetadataUpdate(RemotePartitionMetadataStore.java:89)
	at org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataEventHandler.handleRemoteLogMetadata(RemotePartitionMetadataEventHandler.java:33)
	at org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask.processConsumerRecord(ConsumerTask.java:157)
	at org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask.run(ConsumerTask.java:133)
	at java.base/java.lang.Thread.run(Thread.java:829)
```
2. After some debugging, I realized the problem is that *there are 2 sets of DELETE_SEGMENT_STARTED/FINISHED pairs* in the remote metadata topic for this segment. The DELETE_SEGMENT_FINISHED in the first set removes the segment from the metadata cache, which causes the above exception when the DELETE_SEGMENT_STARTED from the second set is processed.

3. Tracing the log back to where log retention kicked in, I saw *two delete log segment events generated* at that time:
```
[2023-10-10 23:32:05,929] INFO [RemoteLogManager=1043 partition=QelVeVmER5CkjrzIiF07PQ:tier5-0] About to delete remote log segment RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA} due to retention time 604800000ms breach based on the largest record timestamp in the segment (kafka.log.remote.RemoteLogManager$RLMTask)
[2023-10-10 23:32:05,929] INFO [RemoteLogManager=1043 partition=QelVeVmER5CkjrzIiF07PQ:tier5-0] About to delete remote log segment RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA} due to retention time 604800000ms breach based on the largest record timestamp in the segment (kafka.log.remote.RemoteLogManager$RLMTask)
```
4. I also dumped the content of the original COPY_SEGMENT_STARTED metadata for this segment (which triggers the generation of the later DELETE_SEGMENT metadata):
```
[2023-10-16 22:19:10,894] DEBUG Adding to in-progress state: [RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA}, startOffset=6387830, endOffset=9578660, brokerId=1043, maxTimestampMs=1696401123036, eventTimestampMs=1696401127062, segmentLeaderEpochs={5=6387830, 6=6721329}, segmentSizeInBytes=511987531, customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}] (org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache)
```
You can see there are 2 leader epochs in this segment: *segmentLeaderEpochs=\{5=6387830, 6=6721329}*

5. The remote log retention code ([https://github.com/apache/kafka/blob/3.6.0/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L987]) buckets segments into leader epochs first and then loops through the epochs. I am not sure whether it should generate one or two DELETE_SEGMENT events for this COPY_SEGMENT_STARTED segment. If it does need to generate 2 DELETE_SEGMENT metadata records, the consumer task needs to handle the duplicate metadata (rather than throwing exceptions in the log).
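To make the suspected mechanism in step 5 concrete, here is a simplified, hypothetical sketch (not the actual RemoteLogManager code; names and structure are my own) of how bucketing retention candidates per leader epoch and then looping over the epochs can emit one delete per epoch for a segment whose segmentLeaderEpochs map spans two epochs:

```java
import java.util.*;

// Hypothetical sketch of the per-epoch retention loop, NOT the real
// RemoteLogManager implementation. A segment covering leader epochs 5 and 6
// (as in the report) lands in two epoch buckets, so a naive loop over the
// buckets schedules its deletion twice.
class DuplicateDeleteSketch {
    public static void main(String[] args) {
        String segmentId = "YFNCaWjPQFSKCngQ1QcKpA";
        Map<Integer, Long> segmentLeaderEpochs = new LinkedHashMap<>();
        segmentLeaderEpochs.put(5, 6387830L);
        segmentLeaderEpochs.put(6, 6721329L);

        // Bucket the segment under every leader epoch it contains ...
        Map<Integer, List<String>> segmentsPerEpoch = new TreeMap<>();
        for (Integer epoch : segmentLeaderEpochs.keySet()) {
            segmentsPerEpoch.computeIfAbsent(epoch, e -> new ArrayList<>()).add(segmentId);
        }

        // ... then loop through the epochs: without deduplication the same
        // segment produces one DELETE_SEGMENT event per epoch it spans.
        List<String> deletions = new ArrayList<>();
        for (List<String> segments : segmentsPerEpoch.values()) {
            deletions.addAll(segments);
        }
        System.out.println(deletions.size()); // prints 2 for this one segment

        // A set of already-visited segment ids would collapse the duplicates.
        Set<String> visited = new HashSet<>();
        long deduped = deletions.stream().filter(visited::add).count();
        System.out.println(deduped); // prints 1
    }
}
```

Either fix direction in step 5 maps onto this sketch: dedupe before emitting the events, or let duplicates through and make the consumer side tolerant of them.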
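The failure in step 2 can also be reproduced with a toy model of the metadata cache (hypothetical; the real logic lives in RemoteLogMetadataCache.updateRemoteLogSegmentMetadata, which I have not copied here): once DELETE_SEGMENT_FINISHED drops the cache entry, a second DELETE_SEGMENT_STARTED for the same id has nothing left to update.

```java
import java.util.*;

// Toy model of the metadata cache state handling, NOT the real
// RemoteLogMetadataCache. It only shows why a duplicate delete pair fails:
// the first FINISHED event removes the entry, so the second STARTED event
// cannot find any metadata to update.
class MetadataCacheSketch {
    enum State { COPY_SEGMENT_FINISHED, DELETE_SEGMENT_STARTED, DELETE_SEGMENT_FINISHED }

    static final Map<String, State> cache = new HashMap<>();

    static void update(String id, State state) {
        if (state == State.DELETE_SEGMENT_FINISHED) {
            cache.remove(id); // terminal state: the entry is dropped
        } else if (!cache.containsKey(id)) {
            // Mirrors the RemoteResourceNotFoundException in the report.
            throw new NoSuchElementException("No remote log segment metadata found for " + id);
        } else {
            cache.put(id, state);
        }
    }

    public static void main(String[] args) {
        String id = "YFNCaWjPQFSKCngQ1QcKpA";
        cache.put(id, State.COPY_SEGMENT_FINISHED);
        update(id, State.DELETE_SEGMENT_STARTED);  // first pair: fine
        update(id, State.DELETE_SEGMENT_FINISHED); // entry removed from cache
        try {
            update(id, State.DELETE_SEGMENT_STARTED); // duplicate pair: fails
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If duplicates are considered legitimate, an idempotent variant of update (e.g. silently ignoring a DELETE_SEGMENT_STARTED for an already-removed id) would avoid the WARN spam without changing the end state.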
> Duplicate remote log DELETE_SEGMENT metadata is generated when there are
> multiple leader epochs in the segment
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-15620
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15620
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 3.6.0
>            Reporter: Henry Cai
>            Priority: Major
>             Fix For: 3.6.1
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)