[ https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909876#comment-16909876 ]

Richard Yu commented on KAFKA-8522:
-----------------------------------

The code below is located in LogManager.scala. It is the call that invokes the
LogManager constructor:

{code:scala}
new LogManager(logDirs = config.logDirs.map(new File(_).getAbsoluteFile),
  initialOfflineDirs = initialOfflineDirs.map(new File(_).getAbsoluteFile),
  topicConfigs = topicConfigs,
  initialDefaultConfig = defaultLogConfig,
  cleanerConfig = cleanerConfig,
  recoveryThreadsPerDataDir = config.numRecoveryThreadsPerDataDir,
  flushCheckMs = config.logFlushSchedulerIntervalMs,
  flushRecoveryOffsetCheckpointMs = config.logFlushOffsetCheckpointIntervalMs,
  flushStartOffsetCheckpointMs = config.logFlushStartOffsetCheckpointIntervalMs,
  retentionCheckMs = config.logCleanupIntervalMs,
  maxPidExpirationMs = config.transactionIdExpirationMs,
  scheduler = kafkaScheduler,
  brokerState = brokerState,
  brokerTopicStats = brokerTopicStats,
  logDirFailureChannel = logDirFailureChannel,
  time = time)
{code}
 

The variable {{config}} is a KafkaConfig instance. The {{logDirs}} you see here 
are the ones that will eventually be used in {{LogCleanerManager}}. This is the 
code from which I drew the conclusion in my previous comment.
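
To show what I mean, here is a minimal, self-contained sketch with stubbed 
classes (my own simplification, not the actual Kafka source). If I am reading 
the code correctly, LogManager hands its live log directories to LogCleaner, 
which constructs LogCleanerManager with them, so the directories configured on 
{{config.logDirs}} are the same ones the cleaner manager ends up holding:

{code:scala}
import java.io.File

// Stubbed sketch (not the real Kafka types): it only demonstrates that the
// directories given to LogManager are the same objects that
// LogCleanerManager ends up holding.
object LogDirsPropagationSketch {
  class LogCleanerManager(val logDirs: Seq[File])

  class LogCleaner(logDirs: Seq[File]) {
    // The cleaner builds its manager from the directories it was given
    val cleanerManager = new LogCleanerManager(logDirs)
  }

  class LogManager(val logDirs: Seq[File]) {
    // In the real LogManager this is the "live" subset of logDirs
    val cleaner = new LogCleaner(logDirs)
  }

  def main(args: Array[String]): Unit = {
    val dirs = Seq(new File("/tmp/kafka-logs").getAbsoluteFile)
    val logManager = new LogManager(dirs)
    // Prints the same absolute path that was passed in at the top
    logManager.cleaner.cleanerManager.logDirs.foreach(println)
  }
}
{code}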

> Tombstones can survive forever
> ------------------------------
>
>                 Key: KAFKA-8522
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8522
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log cleaner
>            Reporter: Evelyn Bayes
>            Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly 
> unintended behaviour.
>  
> Under specific conditions tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest 
> segment. Old records get compacted away, but the new records continuously 
> update the timestamp of the oldest segment, resetting the countdown for 
> deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be 
> undesirable as a sudden change in throughput could cause a dangerous number 
> of segments to be created.
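
To make the reported behaviour concrete, here is a toy simulation (my own 
simplification, not the actual cleaner code). It assumes the cleaner's rule 
is roughly "tombstones in the cleaned portion become deletable once the 
containing segment's last-modified time is older than 
{{delete.retention.ms}}", and shows that a cleaner which rewrites the oldest 
segment on every pass resets that clock indefinitely:

{code:scala}
// Toy model of the scenario in the issue description; names and the hourly
// cleaning cadence are assumptions, not Kafka internals.
object TombstoneSurvivalSketch {
  def main(args: Array[String]): Unit = {
    val deleteRetentionMs = 24L * 60 * 60 * 1000 // delete.retention.ms default (24h)
    val cleanerPassMs     = 60L * 60 * 1000      // assume one cleaning pass per hour

    var nowMs = 0L
    var oldestSegmentLastModifiedMs = 0L

    for (pass <- 1 to 1000) {
      nowMs += cleanerPassMs
      // With min.cleanable.dirty.ratio at or near 0 and low throughput,
      // every pass compacts new records into the oldest segment,
      // refreshing its last-modified time...
      oldestSegmentLastModifiedMs = nowMs
      // ...so the age that gates tombstone deletion never reaches the
      // retention threshold:
      val segmentAgeMs = nowMs - oldestSegmentLastModifiedMs
      assert(segmentAgeMs <= deleteRetentionMs,
        s"tombstones became deletable on pass $pass")
    }
    println("1000 cleaner passes later, the tombstones are still not deletable")
  }
}
{code}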



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
