[ https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909876#comment-16909876 ]
Richard Yu commented on KAFKA-8522: ----------------------------------- The below code is located in LogManager.scala. I noticed that in LogManager this was the following code that was used for the LogManager constructor: {code:java} new LogManager(logDirs = config.logDirs.map(new File(_).getAbsoluteFile), initialOfflineDirs = initialOfflineDirs.map(new File(_).getAbsoluteFile), topicConfigs = topicConfigs, initialDefaultConfig = defaultLogConfig, cleanerConfig = cleanerConfig, recoveryThreadsPerDataDir = config.numRecoveryThreadsPerDataDir, flushCheckMs = config.logFlushSchedulerIntervalMs, flushRecoveryOffsetCheckpointMs = config.logFlushOffsetCheckpointIntervalMs, flushStartOffsetCheckpointMs = config.logFlushStartOffsetCheckpointIntervalMs, retentionCheckMs = config.logCleanupIntervalMs, maxPidExpirationMs = config.transactionIdExpirationMs, scheduler = kafkaScheduler, brokerState = brokerState, brokerTopicStats = brokerTopicStats, logDirFailureChannel = logDirFailureChannel, time = time) {code} The variable {{config}} is a KafkaConfig instance. The {{logDirs}} you see here will be the ones that will eventually be used in {{LogCleanerManager}}. This was the place from which I drew the conclusion in my previous comment. > Tombstones can survive forever > ------------------------------ > > Key: KAFKA-8522 > URL: https://issues.apache.org/jira/browse/KAFKA-8522 > Project: Kafka > Issue Type: Improvement > Components: log cleaner > Reporter: Evelyn Bayes > Priority: Minor > > This is a bit grey zone as to whether it's a "bug" but it is certainly > unintended behaviour. > > Under specific conditions tombstones effectively survive forever: > * Small amount of throughput; > * min.cleanable.dirty.ratio near or at 0; and > * Other parameters at default. > What happens is all the data continuously gets cycled into the oldest > segment. Old records get compacted away, but the new records continuously > update the timestamp of the oldest segment reseting the countdown for > deleting tombstones. > So tombstones build up in the oldest segment forever. > > While you could "fix" this by reducing the segment size, this can be > undesirable as a sudden change in throughput could cause a dangerous number > of segments to be created. -- This message was sent by Atlassian JIRA (v7.6.14#76016)