leiminghany created KAFKA-13218:
-----------------------------------

             Summary: kafka deleted unexpired message unexpectedly
                 Key: KAFKA-13218
                 URL: https://issues.apache.org/jira/browse/KAFKA-13218
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.7.0
         Environment: docker file :
from openjdk:11-jre-slim-buster
RUN apt-get update
RUN apt-get -y install net-tools iputils-ping curl procps
RUN curl -OL https://mirrors.bfsu.edu.cn/apache/kafka/2.7.0/kafka_2.13-2.7.0.tgz && tar -xzf kafka_2.13-2.7.0.tgz && rm -f kafka_2.13-2.7.0.tgz
ENV PATH "$PATH:/kafka_2.13-2.7.0/bin"
RUN mkdir /etc/kafka
COPY server.properties /etc/kafka/server.properties
CMD ["kafka-server-start.sh", "/etc/kafka/server.properties"]

config file:
broker.id=2
log.dirs=/var/lib/kafka
log.segment.bytes=10485760
zookeeper.connect=zk-cs.default.svc.cluster.local:2181
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
inter.broker.listener.name=INTERNAL
listener.security.protocol.map=INTERNAL:SASL_PLAINTEXT,EXTERNAL:SASL_PLAINTEXT
listeners=INTERNAL://:9092,EXTERNAL://:30101
advertised.listeners=INTERNAL://kafka-2.kafka.default.svc.cluster.local:9092,EXTERNAL://192.168.0.13:30101
            Reporter: leiminghany

I created a topic like this:
{code:java}
kafka-topics.sh --create --zookeeper zk-cs.default.svc.cluster.local:2181 --partitions 64 --replication-factor 2 --topic signal --config retention.ms=60480000000
{code}
I then sent several messages to partition 2 of this topic. After that, I tried to consume the messages from this partition, but I could not get any messages.
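For scale, the `retention.ms` value above works out to 700 days, so by wall-clock time messages sent a few minutes earlier should be nowhere near expiry. A quick check of the conversion (plain arithmetic; the variable names are illustrative and not part of Kafka):

```python
from datetime import timedelta

# retention.ms as passed to kafka-topics.sh via --config above
retention_ms = 60_480_000_000

# Convert milliseconds to whole days
retention = timedelta(milliseconds=retention_ms)
retention_days = retention.days

print(retention_days)  # 700
```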
I looked at the Kafka data directory and found that the log had been rolled. Here are the files:
{code:java}
root@kafka-2:/var/lib/kafka/signal-2# ls
00000000000000000005.index  00000000000000000005.log  00000000000000000005.snapshot  00000000000000000005.timeindex  leader-epoch-checkpoint
{code}
and the dump output is:
{code:java}
root@kafka-2:/var/lib/kafka/signal-2# kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files 00000000000000000005.log
Dumping 00000000000000000005.log
Starting offset: 5
root@kafka-2:/var/lib/kafka/signal-2#
root@kafka-2:/var/lib/kafka/signal-2# kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files 00000000000000000005.index
Dumping 00000000000000000005.index
root@kafka-2:/var/lib/kafka/signal-2# kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files 00000000000000000005.snapshot
Dumping 00000000000000000005.snapshot
root@kafka-2:/var/lib/kafka/signal-2# kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files 00000000000000000005.timeindex
Dumping 00000000000000000005.timeindex
timestamp: 0 offset: 5
The following indexed offsets are not found in the log.
Indexed offset: 5, found log offset: -1
root@kafka-2:/var/lib/kafka/signal-2# cat leader-epoch-checkpoint
0
1
0 5
{code}
Here is the Kafka console log for this partition:
{code:java}
[2021-08-18 12:04:57,652] INFO [ProducerStateManager partition=signal-2] Writing producer snapshot at offset 5 (kafka.log.ProducerStateManager)
[2021-08-18 12:04:57,653] INFO [Log partition=signal-2, dir=/var/lib/kafka] Rolled new log segment at offset 5 in 7 ms. (kafka.log.Log)
[2021-08-18 12:04:57,653] INFO [Log partition=signal-2, dir=/var/lib/kafka] Deleting segment LogSegment(baseOffset=0, size=318, lastModifiedTime=1629288220552, largestRecordTimestamp=Some(0)) due to retention time 60480000000ms breach based on the largest record timestamp in the segment (kafka.log.Log)
[2021-08-18 12:04:57,653] INFO [Log partition=signal-2, dir=/var/lib/kafka] Incremented log start offset to 5 due to segment deletion (kafka.log.Log)
[2021-08-18 12:05:57,671] INFO [Log partition=signal-2, dir=/var/lib/kafka] Deleting segment files LogSegment(baseOffset=0, size=318, lastModifiedTime=1629288220552, largestRecordTimestamp=Some(0)) (kafka.log.Log)
[2021-08-18 12:05:57,672] INFO Deleted log /var/lib/kafka/signal-2/00000000000000000000.log.deleted. (kafka.log.LogSegment)
[2021-08-18 12:05:57,672] INFO Deleted offset index /var/lib/kafka/signal-2/00000000000000000000.index.deleted. (kafka.log.LogSegment)
[2021-08-18 12:05:57,673] INFO Deleted time index /var/lib/kafka/signal-2/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)
{code}
I think `largestRecordTimestamp=Some(0)` may be the clue to tracking down this problem, but I cannot find the exact cause. Can anyone help me? The problem happens only occasionally.
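The deletion message is at least arithmetically consistent with a largest record timestamp of 0. The sketch below assumes, going only by the wording of the log line ("retention time ... breach based on the largest record timestamp in the segment"), that a segment becomes eligible for deletion once the current time minus its largest record timestamp exceeds `retention.ms`. The function `breaches_retention` is a hypothetical helper written for this note, not Kafka's code, and the broker clock is assumed to be UTC (the report does not state the time zone; an offset of a few hours would not change the outcome).

```python
from datetime import datetime, timedelta, timezone

RETENTION_MS = 60_480_000_000  # retention.ms of the topic (700 days)


def breaches_retention(now_ms: int, largest_record_timestamp_ms: int,
                       retention_ms: int = RETENTION_MS) -> bool:
    """Assumed rule: deletable once the newest record is older than retention."""
    return now_ms - largest_record_timestamp_ms > retention_ms


# Time of the "Deleting segment" log line: 2021-08-18 12:04:57,653 (UTC assumed).
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
log_time = datetime(2021, 8, 18, 12, 4, 57, 653000, tzinfo=timezone.utc)
now_ms = (log_time - epoch) // timedelta(milliseconds=1)

# largestRecordTimestamp=Some(0) from the log, i.e. 1970-01-01.
with_zero_timestamp = breaches_retention(now_ms, 0)

# lastModifiedTime=1629288220552 from the same log line, used here as a
# stand-in for what a wall-clock record timestamp would have looked like.
with_wall_clock_timestamp = breaches_retention(now_ms, 1629288220552)

print(with_zero_timestamp, with_wall_clock_timestamp)  # True False
```

Under that assumption, a record timestamp of 0 makes the segment look about 51 years old, far beyond the 700-day retention, while a wall-clock timestamp would not trigger deletion. What the report does not show is why the records carried timestamp 0 in the first place, for example whether the producer sets the record timestamp explicitly.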