[
https://issues.apache.org/jira/browse/KAFKA-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346113#comment-15346113
]
Ismael Juma commented on KAFKA-1981:
------------------------------------
Sorry for the delay [~ewasserman]. I will try to take a look, but it would
probably be good for [~junrao] to take a look too as he knows this area of the
code better.
> Make log compaction point configurable
> --------------------------------------
>
> Key: KAFKA-1981
> URL: https://issues.apache.org/jira/browse/KAFKA-1981
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.8.2.0
> Reporter: Jay Kreps
> Labels: newbie++
> Attachments: KIP for Kafka Compaction Patch.md
>
>
> Currently, if you enable log compaction, the compactor will kick in whenever
> you hit a certain "dirty ratio", i.e. when 50% of your data is uncompacted.
> We never compact the active segment (since it is still being written to),
> but beyond that we don't give you any fine-grained control over when
> compaction occurs. The result is that you can't guarantee that a consumer
> will get every update to a compacted topic--if the consumer falls behind a
> bit, it might just get the compacted version.
> This is usually fine, but it would be nice to make this more configurable so
> you could set a message-count, size, or time bound on compaction.
> This would let you say, for example, "any consumer that is no more than 1
> hour behind will get every message."
> This should be relatively easy to implement since it only affects the
> end-point the compactor considers available for compaction. I think we
> already have that concept, so this would just mean adding overrides when
> calculating that end-point.
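As a sketch of the time-bound idea, such a limit could be exposed as a per-topic override. The property name `min.compaction.lag.ms`, the topic name, and the ZooKeeper address below are illustrative assumptions, not something specified in this issue:

```
# Hypothetical: keep messages newer than 1 hour out of compaction,
# so any consumer no more than 1 hour behind sees every update.
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type topics --entity-name my-compacted-topic \
  --add-config min.compaction.lag.ms=3600000
```

Under a setting like this, the compactor would treat the tail of the log younger than the lag bound the same way it already treats the active segment: visible to consumers but excluded from compaction.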
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)