[ 
https://issues.apache.org/jira/browse/KAFKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107844#comment-14107844
 ] 

Jay Kreps commented on KAFKA-1489:
----------------------------------

I agree. I think it generally makes the most sense to consider dropping from 
the end rather than rejecting new messages. If you buy that, then you can think 
about this feature as being about how to choose which partition to drop the 
last segment from when you are over your space allocation. The two obvious ways 
would be (a) drop the oldest segment among all logs or (b) drop from the 
partition that is taking up the most space. However, Jim points out a case 
that makes these slightly confusing: you can have different retention settings 
by space and time for each topic. So if you have one topic with retention 
30 days and one topic with retention 1 day, then this emergency discard would 
always discard from the 30 day topic. Jim's alternative actually makes some 
sense: assume all topics are in steady state (i.e. up against their maximum 
retention, be it size or time). Then you can just discard (say) 10% across the 
board. If that were the case, I think the only config you need is something 
like
  max.total.disk.space.bytes=12345
and we can probably just hard code the 10% discard when you hit this limit.
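
To make the across-the-board discard concrete, here is a minimal sketch in 
Python. It is not Kafka code: the function name, the dict-of-segment-sizes 
model, and the rule of never touching the newest segment are all assumptions 
made for illustration. It also assumes "dropping from the end" means dropping 
the oldest segments, and it uses integer arithmetic so the 10% target is exact.

```python
from typing import Dict, List

DISCARD_PERCENT = 10  # hard-coded discard share, as suggested in the comment


def emergency_discard(
    logs: Dict[str, List[int]],
    max_total_bytes: int,
    percent: int = DISCARD_PERCENT,
) -> Dict[str, List[int]]:
    """Pick segments to drop from every log once the global limit is exceeded.

    `logs` maps a (hypothetical) partition name to its segment sizes in
    bytes, ordered oldest first. Returns the sizes of the segments chosen for
    deletion per partition; the input is not modified.
    """
    total = sum(sum(segments) for segments in logs.values())
    if total <= max_total_bytes:
        return {}  # under the limit: nothing to discard

    dropped: Dict[str, List[int]] = {}
    for name, segments in logs.items():
        log_size = sum(segments)
        freed = 0
        victims: List[int] = []
        # Walk from the oldest segment; skip the newest one (assumed active).
        for size in segments[:-1]:
            # Stop once at least `percent` of this log has been freed.
            if freed * 100 >= log_size * percent:
                break
            victims.append(size)
            freed += size
        if victims:
            dropped[name] = victims
    return dropped
```

Because deletion happens in whole segments, each log gives up at least 10% 
of its size where possible, and possibly more when segments are large. A log 
with only one segment is left alone in this sketch.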

> Global threshold on data retention size
> ---------------------------------------
>
>                 Key: KAFKA-1489
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1489
>             Project: Kafka
>          Issue Type: New Feature
>          Components: log
>    Affects Versions: 0.8.1.1
>            Reporter: Andras Sereny
>            Assignee: Jay Kreps
>              Labels: newbie
>
> Currently, Kafka has per-topic settings to control the size of a single log 
> (log.retention.bytes). With many topics of differing volume, and as they 
> grow in number, it can become tedious to maintain topic-level settings 
> that each apply to a single log. 
> Often, a chunk of disk space is dedicated to Kafka to host all stored logs, 
> so it would make sense to have a configurable threshold to control how 
> much space *all* data in one Kafka log data directory can take up.
> See also:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201406.mbox/browser
> http://mail-archives.apache.org/mod_mbox/kafka-users/201311.mbox/%3c20131107015125.gc9...@jkoshy-ld.linkedin.biz%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
