[
https://issues.apache.org/jira/browse/KAFKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548776#comment-15548776
]
ASF GitHub Bot commented on KAFKA-3224:
---------------------------------------
GitHub user bill-warshaw opened a pull request:
https://github.com/apache/kafka/pull/1972
KAFKA-3224: New log deletion policy based on timestamp
* adds a new topic-level broker configuration, `log.retention.min.timestamp`
* if unset, this setting is ignored
* setting this value to a Unix timestamp will allow the log cleaner to
  delete any segments for a given topic whose last timestamp is earlier than
  the set timestamp
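
The deletion rule described in the bullets above can be sketched roughly as follows. This is an illustrative model only, not the code in the pull request: the `Segment` class, its fields, and the `deletable_segments` function are hypothetical names, and the sketch assumes the segment timestamps and the configured value use the same unit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class)."""
    base_offset: int
    last_timestamp: int  # timestamp of the last message in the segment


def deletable_segments(segments: List[Segment],
                       min_timestamp: Optional[int]) -> List[Segment]:
    """Return the segments that the rule above would allow to be deleted.

    min_timestamp models log.retention.min.timestamp; None models "unset",
    in which case the setting is ignored and nothing is selected.
    """
    if min_timestamp is None:
        return []
    # A segment qualifies only if its last timestamp is strictly earlier
    # than the configured minimum timestamp.
    return [s for s in segments if s.last_timestamp < min_timestamp]
```

For example, with segments whose last timestamps are 100, 200 and 300, a minimum timestamp of 200 selects only the first segment, because the comparison is strict.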
--
### [KIP-47](https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy)
### [JIRA](https://issues.apache.org/jira/browse/KAFKA-3224)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bill-warshaw/kafka KAFKA-3224
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/1972.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1972
----
commit 008dc813d82e5938201f069712ba1bd44277e755
Author: Bill Warshaw <[email protected]>
Date: 2016-02-02T16:45:47Z
KAFKA-3224: New log deletion policy based on timestamp
* setting log.retention.min.timestamp will set a timestamp for a log,
and any message before that timestamp is eligible for deletion
----
> Add timestamp-based log deletion policy
> ---------------------------------------
>
> Key: KAFKA-3224
> URL: https://issues.apache.org/jira/browse/KAFKA-3224
> Project: Kafka
> Issue Type: Improvement
> Reporter: Bill Warshaw
> Labels: kafka
>
> One of Kafka's officially-described use cases is a distributed commit log
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case,
> for a distributed service that needed a commit log, there would be a topic
> with a single partition to guarantee log order. This service would use the
> commit log to re-sync failed nodes. Kafka is generally an excellent fit for
> such a system, but it does not expose an adequate mechanism for log cleanup
> in such a case. With a distributed commit log, data can only be deleted when
> the client application determines that it is no longer needed; this creates
> completely arbitrary ranges of time and size for messages, which the existing
> cleanup mechanisms can't handle smoothly.
> A new deletion policy based on the absolute timestamp of a message would work
> perfectly for this case. The client application will periodically update the
> minimum timestamp of messages to retain, and Kafka will delete all messages
> earlier than that timestamp using the existing log cleaner thread mechanism.
> This builds on the work being done in KIP-32 (Add timestamps to Kafka
> message).
> h3. Initial Approach
> https://github.com/apache/kafka/compare/trunk...bill-warshaw:KAFKA-3224