[ 
https://issues.apache.org/jira/browse/KAFKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028184#comment-14028184
 ] 

Jay Kreps commented on KAFKA-1489:
----------------------------------

Go for it!

One slight oddity to consider: different nodes host different sets of 
partitions, so the amount of data retained for different replicas of the same 
partition may vary quite a lot. A replica on a node with lots of data will 
retain little, and one on an emptier broker will retain lots. The current 
per-partition retention strategies are only approximately the same across nodes 
as well, but this will potentially be much more extreme.

In fact, in steady state any partition movement will simultaneously cause data 
to get purged to free up space.

I don't think this is necessarily a problem, but we will need to warn people.
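
To make the replica-divergence point concrete, here is a minimal sketch, not Kafka code. It assumes one possible policy that the ticket does not specify: when a broker exceeds its global threshold, it deletes the globally oldest segment first and always keeps each partition's newest segment. The names enforce_global_retention, retained_bytes, broker_a and broker_b are hypothetical and exist only for this illustration.

```python
from collections import deque


def enforce_global_retention(logs, max_total_bytes):
    """Purge oldest segments broker-wide until the total fits the threshold.

    logs maps a partition name to a deque of (timestamp, size_bytes)
    segments, oldest first. The dict is mutated in place.
    ASSUMPTION: oldest-segment-first is one possible policy, not Kafka's.
    """
    total = sum(size for segs in logs.values() for _, size in segs)
    while total > max_total_bytes:
        # Never delete a partition's last (active) segment.
        candidates = [p for p, segs in logs.items() if len(segs) > 1]
        if not candidates:
            break
        # Pick the partition whose head segment is the oldest on this broker.
        victim = min(candidates, key=lambda p: logs[p][0][0])
        _, size = logs[victim].popleft()
        total -= size
    return total


def retained_bytes(logs, partition):
    """Bytes still retained for one partition on one broker."""
    return sum(size for _, size in logs[partition])


def make_log(timestamps, size=100):
    """Build a log of equally sized segments with the given timestamps."""
    return deque((ts, size) for ts in timestamps)


# Both brokers hold an identical replica of orders-0 (10 segments of 100 bytes).
# Broker A also hosts clicks-0; broker B hosts nothing else.
broker_a = {
    "orders-0": make_log(range(0, 20, 2)),
    "clicks-0": make_log(range(1, 20, 2)),
}
broker_b = {"orders-0": make_log(range(0, 20, 2))}

for broker in (broker_a, broker_b):
    enforce_global_retention(broker, max_total_bytes=1000)

print(retained_bytes(broker_a, "orders-0"))  # replica on the fuller broker
print(retained_bytes(broker_b, "orders-0"))  # replica on the emptier broker
```

Under these assumptions the fuller broker keeps only half as much of orders-0 as the emptier one, even though both run the same setting. The same function also shows the partition-movement effect: adding a new partition to a broker that is already at the threshold forces an immediate purge of existing data.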

> Global threshold on data retention size
> ---------------------------------------
>
>                 Key: KAFKA-1489
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1489
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.8.1.1
>            Reporter: Andras Sereny
>            Assignee: Jay Kreps
>              Labels: newbie
>
> Currently, Kafka has per-topic settings to control the size of a single log 
> (log.retention.bytes). With many topics of differing volume, and as they 
> grow in number, it could become tedious to maintain topic-level settings 
> that each apply to a single log. 
> Often, a fixed chunk of disk space is dedicated to Kafka and hosts all of its 
> logs, so it'd make sense to have a configurable threshold to control how 
> much space *all* data in Kafka can take up.
> See also:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201406.mbox/browser
> http://mail-archives.apache.org/mod_mbox/kafka-users/201311.mbox/%3c20131107015125.gc9...@jkoshy-ld.linkedin.biz%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
