James Cheng created KAFKA-3137:
----------------------------------

             Summary: Delete tombstones in log compacted topics may never get removed.
                 Key: KAFKA-3137
                 URL: https://issues.apache.org/jira/browse/KAFKA-3137
             Project: Kafka
          Issue Type: Bug
            Reporter: James Cheng


I spoke about this with [~junrao]. I haven't tried to reproduce this, but Jun 
said that it looks like this is possible, so I'm filing it.

Delete tombstones in log compacted topics are deleted after delete.retention.ms 
(at the topic level) or log.cleaner.delete.retention.ms (at the broker level).

However, Kafka doesn't have per-message timestamps (at least until KIP-32 is 
implemented), so the modification time of the log segment file is used as a 
proxy. That modification time changes whenever a compaction run happens.

It's therefore possible that, if log compaction happens very frequently, 
delete.retention.ms will never be reached. In that case, the delete tombstones 
would stay around longer than the user expected.
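
To make the mechanism concrete, here is a toy model of the behavior described 
above. It is only a sketch, not Kafka's cleaner code: the function name 
tombstone_drop_time and its parameters are hypothetical, and it assumes that 
every compaction run rewrites the segment holding the tombstone and resets its 
modification time.

```python
def tombstone_drop_time(compaction_interval_ms, delete_retention_ms, horizon_ms):
    """Return the time (ms) at which the tombstone is dropped,
    or None if it is still present when the simulated horizon ends.

    Assumption: each compaction run rewrites the segment that holds the
    tombstone, which resets the segment's modification time.
    """
    segment_mtime = 0  # segment holding the tombstone was last written at t=0
    now = compaction_interval_ms
    while now <= horizon_ms:
        # The tombstone counts as expired only if the segment has been
        # unmodified for longer than delete.retention.ms.
        if now - segment_mtime > delete_retention_ms:
            return now
        # Otherwise the tombstone is kept and the rewrite resets the mtime.
        segment_mtime = now
        now += compaction_interval_ms
    return None


ONE_DAY_MS = 24 * 60 * 60 * 1000

# Compaction every minute, retention of one day: never dropped within a week.
print(tombstone_drop_time(60_000, ONE_DAY_MS, 7 * ONE_DAY_MS))

# Compaction every two days, retention of one day: dropped at the first run.
print(tombstone_drop_time(2 * ONE_DAY_MS, ONE_DAY_MS, 7 * ONE_DAY_MS))
```

Under these assumptions the tombstone is dropped only when the gap between two 
consecutive compaction runs exceeds delete.retention.ms.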

I believe that means the interval between log compaction runs would have to 
stay shorter than delete.retention.ms. The frequency of log compaction depends 
on segment size, the criteria for segment roll (time or bytes), 
min.cleanable.dirty.ratio, and the amount of traffic coming into the log 
compacted topic. So it's possible, but I'm not sure how likely.
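
As a rough back-of-the-envelope aid for judging how likely this is, the sketch 
below estimates how long it takes for a log to become eligible for cleaning 
again. It is a simplification under stated assumptions, not the cleaner's 
actual scheduling logic: the function name ms_until_cleanable and its 
parameters are hypothetical, it assumes a steady ingest rate, it ignores 
segment roll entirely, and it treats a log as cleanable once dirty bytes 
divided by total bytes reaches min.cleanable.dirty.ratio.

```python
def ms_until_cleanable(clean_bytes, ingest_bytes_per_ms, min_cleanable_dirty_ratio):
    """Estimate the ms until dirty / (clean + dirty) reaches the ratio.

    Assumptions: constant ingest rate, no segment-roll delay, and every
    newly written byte counts as dirty.
    """
    if not 0 < min_cleanable_dirty_ratio < 1:
        raise ValueError("ratio must be strictly between 0 and 1")
    if ingest_bytes_per_ms <= 0:
        raise ValueError("ingest rate must be positive")
    # Solve dirty / (clean + dirty) >= ratio for dirty.
    dirty_needed = (min_cleanable_dirty_ratio * clean_bytes
                    / (1 - min_cleanable_dirty_ratio))
    return dirty_needed / ingest_bytes_per_ms


# Example: 1 GiB of clean data, roughly 1 MiB/s of writes, ratio of 0.5.
estimate_ms = ms_until_cleanable(1024 ** 3, 1024 ** 2 / 1000, 0.5)
print(estimate_ms / 60_000, "minutes between compaction runs (estimate)")
```

If an estimate like this comes out well below delete.retention.ms, the topic 
falls into the range where the problem described here could occur.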

And I would imagine that this can't be fixed until KIP-32 is available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
