Hello,
We have a cluster of 3 Kafka brokers working in KRaft mode as a docker
containers, configured to clean up old messages with reteintion.ms parameter.
After some period of time Kafka suddenly stopped to clean up old messages in
several partitions in one topic, while other partitions of same topic are
cleaned up normally according to retention.ms.
After manual cleanup of affected partitions problem is gone and log segments
are rolled and deleted according to retention.ms and segment.ms again without
any issues.
Affected partitions are assigned to brokers correctly.
In broker logs there are no errors or warnings found.
>From logs attached below it can be seen that at some moment log segments just
>stopped to be deleted, but new segments are rolled normally according to
>segment.ms.
Topic configurations:
cleanup.policy=delete
compression.type=producer
delete.retention.ms=86400000
file.delete.delay.ms=60000
flush.messages=9223372036854775807
flush.ms=9223372036854775807
follower.replication.throttled.replicas=
index.interval.bytes=4096
leader.replication.throttled.replicas=
local.retention.bytes=-2
local.retention.ms=-2
max.compaction.lag.ms=9223372036854775807
max.message.bytes=104857600
message.downconversion.enable=true
message.format.version=3.0-IV1
message.timestamp.after.max.ms=9223372036854775807
message.timestamp.before.max.ms=9223372036854775807
message.timestamp.difference.max.ms=9223372036854775807
message.timestamp.type=CreateTime
min.cleanable.dirty.ratio=0.5
min.compaction.lag.ms=0
min.insync.replicas=1
preallocate=false
remote.storage.enable=false
retention.bytes=-1
retention.ms=43200000
segment.bytes=1073741824
segment.index.bytes=10485760
segment.jitter.ms=0
segment.ms=43200000
unclean.leader.election.enable=false
Part of log related to one of affected partitions lifecycle:
{"log":"[2025-02-17 09:35:27,681] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3053780 in 0 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-17T09:35:27.681757986Z"}
{"log":"[2025-02-17 09:35:27,681] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3053780 with 9
producer ids in 0 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-17T09:35:27.681904312Z"}
{"log":"[2025-02-17 21:00:40,200] INFO [UnifiedLog partition=topic.name-24,
dir=/bitnami/kafka/data] Incremented log start offset to 3053267 due to segment
deletion
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.200910181Z"}
{"log":"[2025-02-17 21:00:40,201] INFO [UnifiedLog partition=topic.name-24,
dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3048556,
size=92636371, lastModifiedTime=1739782794987,
largestRecordTimestamp=1739782794807) due to log retention time 43200000ms
breach based on the largest record timestamp in the segment
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.201699088Z"}
{"log":"[2025-02-17 21:01:40,201] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3048556,
size=92636371, lastModifiedTime=1739782794987,
largestRecordTimestamp=1739782794807)
(kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:01:40.202002422Z"}
{"log":"[2025-02-17 21:01:40,201] INFO Deleted producer state snapshot
/bitnami/kafka/data/topic.name-24/00000000000003048556.snapshot.deleted
(org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:01:40.202028871Z"}
{"log":"[2025-02-17 21:01:40,225] INFO Deleted log
/bitnami/kafka/data/topic.name-24/00000000000003048556.log.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.22611061Z"}
{"log":"[2025-02-17 21:01:40,226] INFO Deleted offset index
/bitnami/kafka/data/topic.name-24/00000000000003048556.index.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226186892Z"}
{"log":"[2025-02-17 21:01:40,226] INFO Deleted time index
/bitnami/kafka/data/topic.name-24/00000000000003048556.timeindex.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226203127Z"}
{"log":"[2025-02-17 21:35:40,198] INFO [UnifiedLog partition=topic.name-24,
dir=/bitnami/kafka/data] Incremented log start offset to 3053780 due to segment
deletion
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.198570382Z"}
{"log":"[2025-02-17 21:35:40,199] INFO [UnifiedLog partition=topic.name-24,
dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3053267,
size=7863464, lastModifiedTime=1739784914960,
largestRecordTimestamp=1739784914996) due to log retention time 43200000ms
breach based on the largest record timestamp in the segment
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.200056724Z"}
{"log":"[2025-02-17 21:36:40,200] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3053267,
size=7863464, lastModifiedTime=1739784914960,
largestRecordTimestamp=1739784914996)
(kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:36:40.20029636Z"}
{"log":"[2025-02-17 21:36:40,202] INFO Deleted producer state snapshot
/bitnami/kafka/data/topic.name-24/00000000000003053267.snapshot.deleted
(org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:36:40.202355814Z"}
{"log":"[2025-02-17 21:36:40,205] INFO Deleted log
/bitnami/kafka/data/topic.name-24/00000000000003053267.log.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.205770247Z"}
{"log":"[2025-02-17 21:36:40,205] INFO Deleted offset index
/bitnami/kafka/data/topic.name-24/00000000000003053267.index.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206006967Z"}
{"log":"[2025-02-17 21:36:40,206] INFO Deleted time index
/bitnami/kafka/data/topic.name-24/00000000000003053267.timeindex.deleted.
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206457446Z"}
{"log":"[2025-02-21 17:12:46,160] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3114191 in 2 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-21T17:12:46.161085611Z"}
{"log":"[2025-02-21 17:12:46,161] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3114191 with 7
producer ids in 1 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-21T17:12:46.161400854Z"}
{"log":"[2025-02-22 05:12:49,276] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3119890 in 0 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T05:12:49.27700611Z"}
{"log":"[2025-02-22 05:12:49,276] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3119890 with 7
producer ids in 1 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T05:12:49.277087412Z"}
{"log":"[2025-02-22 17:12:27,376] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3130184 in 1 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T17:12:27.377141504Z"}
{"log":"[2025-02-22 17:12:27,377] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3130184 with 7
producer ids in 0 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T17:12:27.37716661Z"}
{"log":"[2025-02-23 05:13:23,236] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3136583 in 1 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T05:13:23.236501907Z"}
{"log":"[2025-02-23 05:13:23,236] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3136583 with 7
producer ids in 0 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T05:13:23.236553325Z"}
{"log":"[2025-02-23 17:14:02,679] INFO [LocalLog partition=topic.name-24,
dir=/bitnami/kafka/data] Rolled new log segment at offset 3146102 in 1 ms.
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T17:14:02.679379715Z"}
{"log":"[2025-02-23 17:14:02,679] INFO [ProducerStateManager
partition=topic.name-24] Wrote producer snapshot at offset 3146102 with 7
producer ids in 0 ms.
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T17:14:02.67942178Z"}
LogCleaner JMX metrics:
kafka_log_logcleanermanager_max_dirty_percent 0.0
kafka_log_logcleanermanager_uncleanable_bytes{logdirectory="\"/bitnami/kafka/data\"",}
0.0
kafka_log_logcleaner_max_clean_time_secs 0.0
kafka_log_logcleaner_max_buffer_utilization_percent 0.0
kafka_log_logcleanermanager_uncleanable_partitions_count{logdirectory="\"/bitnami/kafka/data\"",}
0.0
kafka_log_logcleaner_deadthreadcount 0.0