[ https://issues.apache.org/jira/browse/KAFKA-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Warshaw updated KAFKA-3178:
--------------------------------
Description:
h3. Description
One of Kafka's officially described use cases is a distributed commit log
(http://kafka.apache.org/documentation.html#uses_commitlog). In this case, for
a distributed service that needed a commit log, there would be a topic with a
single partition to guarantee log order. This service would use the commit log
to re-sync failed nodes. Kafka is generally an excellent fit for such a
system, but it does not expose an adequate mechanism for log cleanup in such a
case. The built-in log cleanup mechanisms are based on time / size thresholds,
which don't work well with a commit log; data can only be deleted from a
commit log when the client application determines that it is no longer needed.
Here we propose a new API exposed to clients through AdminUtils that will
delete all messages before a certain offset from a specific partition.
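
To make the intended semantics concrete, here is a minimal in-memory sketch of "delete all messages before a certain offset" on a single partition. It is not Kafka code and not the proposed AdminUtils signature; the class PartitionLogSketch and its methods (append, deleteBefore, logStartOffset, size) are hypothetical names used only for illustration. The assumption is that offsets are never reused and that deleting only advances the start of the log.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory stand-in for one partition's log; not a Kafka API. */
final class PartitionLogSketch {
    private final Deque<String> messages = new ArrayDeque<>();
    private long logStartOffset = 0; // offset of the oldest retained message
    private long nextOffset = 0;     // offset the next append will receive

    /** Appends a message and returns the offset assigned to it. */
    long append(String message) {
        messages.addLast(message);
        return nextOffset++;
    }

    /**
     * Removes every message whose offset is strictly less than the given
     * offset and returns how many were removed. Calling it again with the
     * same or a smaller offset removes nothing.
     */
    long deleteBefore(long offset) {
        if (offset > nextOffset) {
            throw new IllegalArgumentException(
                "offset " + offset + " is beyond the end of the log (" + nextOffset + ")");
        }
        long removed = 0;
        while (logStartOffset < offset) {
            messages.removeFirst();
            logStartOffset++;
            removed++;
        }
        return removed;
    }

    long logStartOffset() { return logStartOffset; }

    int size() { return messages.size(); }

    public static void main(String[] args) {
        PartitionLogSketch log = new PartitionLogSketch();
        for (int i = 0; i < 5; i++) {
            log.append("entry-" + i); // offsets 0..4
        }
        // The client has decided that everything before offset 3 is no longer needed.
        long removed = log.deleteBefore(3);
        System.out.println("removed=" + removed
            + " start=" + log.logStartOffset()
            + " retained=" + log.size());
    }
}
```

The point of the sketch is that the client, not a time or size threshold, picks the cut-off, and that the call returns only once the messages are gone, so the caller knows when cleanup is complete.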
h3. Rejected Alternatives
- Manually setting / resetting time intervals for log retention configs to
periodically flush messages older than a certain age from the logs.
Doing this involves several asynchronous processes, none of which provide any
hooks to know when they are actually complete.
- Rolling a new topic each time we want to clean up the log. This is the best
existing approach, but is not ideal. All incoming writes would be paused while
waiting for a new topic to be created.
was:
One of Kafka's officially described use cases is a distributed commit log
(http://kafka.apache.org/documentation.html#uses_commitlog). In this case, for
a distributed service that needed a commit log, there would be a topic with a
single partition to guarantee log order. This service would use the commit log
to re-sync failed nodes. Kafka is generally an excellent fit for such a
system, but it does not expose an adequate mechanism for log cleanup in such a
case. The built-in log cleanup mechanisms are based on time / size thresholds,
which don't work well with a commit log; data can only be deleted from a
commit log when the client application determines that it is no longer needed.
Here we propose a new API exposed to clients through AdminUtils that will
delete all messages before a certain offset from a specific partition.
> Expose a method in AdminUtils to manually truncate a specific partition to a
> particular offset
> ----------------------------------------------------------------------------------------------
>
> Key: KAFKA-3178
> URL: https://issues.apache.org/jira/browse/KAFKA-3178
> Project: Kafka
> Issue Type: Improvement
> Reporter: Bill Warshaw
> Labels: kafka
>
> h3. Description
> One of Kafka's officially described use cases is a distributed commit log
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case,
> for a distributed service that needed a commit log, there would be a topic
> with a single partition to guarantee log order. This service would use the
> commit log to re-sync failed nodes. Kafka is generally an excellent fit for
> such a system, but it does not expose an adequate mechanism for log cleanup
> in such a case. The built-in log cleanup mechanisms are based on time / size
> thresholds, which don't work well with a commit log; data can only be
> deleted from a commit log when the client application determines that it is
> no longer needed. Here we propose a new API exposed to clients through
> AdminUtils that will delete all messages before a certain offset from a
> specific partition.
> h3. Rejected Alternatives
> - Manually setting / resetting time intervals for log retention configs to
> periodically flush messages older than a certain age from the logs.
> Doing this involves several asynchronous processes, none of which provide any
> hooks to know when they are actually complete.
> - Rolling a new topic each time we want to clean up the log. This is the best
> existing approach, but is not ideal. All incoming writes would be paused
> while waiting for a new topic to be created.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)