Agree! That's a serious problem. We are trying to fix this in the upcoming release.

Gwen

On Fri, Apr 8, 2016 at 2:56 PM, Anandha L Ranganathan <analog.s...@gmail.com> wrote:

> Thanks.
>
> I have seen this in our system and would like to understand the behavior
> of the log segment.
>
> How will the log segment get deleted when one of the ISR replicas has
> moved to a new node?
> Say, for example, my current ISR nodes are {1,2,3} for partition-0. For
> some reason, after 2 days the new ISR nodes are {2,3,4}.
> Brokers {2,3} will contain a log segment with creation date T1 for
> partition-0.
> Broker {4} has a different log segment creation date, T2, for
> partition-0.
>
> Will the deletion of the log segment be based on broker {4} or brokers
> {2,3}? We noticed that the latest timestamp of the log segment applies,
> and it sometimes requires more disk space than anticipated.
>
> On Fri, Apr 8, 2016 at 1:07 PM Gwen Shapira <g...@confluent.io> wrote:
>
> > Yes. It is whichever is shorter :)
> >
> > Another clarification:
> > A segment is deleted as a whole, based on the newest event in the
> > segment. So if the newest event is too recent to delete, the older
> > events in the segment will also be kept around.
> >
> > On Fri, Apr 8, 2016 at 12:52 PM, Anandha L Ranganathan <analog.s...@gmail.com> wrote:
> >
> > > Just a clarification based on Gwen's reply.
> > >
> > > *log.segment.bytes* - by default this property is set to 1 GB.
> > > If we haven't set any value for *log.roll.ms*, again by default it
> > > is set to 168 hours. In that case, will it roll out a new log
> > > segment file after every 1 GB?
> > >
> > > On Fri, Apr 8, 2016 at 11:32 AM Heath Ivie <hi...@autoanything.com> wrote:
> > >
> > > > Gwen,
> > > >
> > > > Thanks for the detailed reply.
> > > >
> > > > That makes it more clear for me.
> > > >
> > > > Heath
> > > >
> > > > -----Original Message-----
> > > > From: Gwen Shapira [mailto:g...@confluent.io]
> > > > Sent: Tuesday, April 05, 2016 6:13 PM
> > > > To: users@kafka.apache.org
> > > > Subject: Re: Log Retention: What gets deleted
> > > >
> > > > I think you got it almost right. The missing part is that we only
> > > > delete whole partition segments, not individual messages.
> > > >
> > > > As you are writing messages, every X bytes or Y milliseconds, a
> > > > new file gets created for the partition to store new messages in.
> > > > Those files are called segments.
> > > > The segment you are currently writing to is the active segment.
> > > >
> > > > We will never delete an active segment, so in order to delete old
> > > > messages we will look for an inactive segment where the newest
> > > > message is older than our retention and delete the entire segment.
> > > >
> > > > So there are several parameters controlling when data will get
> > > > deleted (I'm looking at just the time-based ones, not the
> > > > size-based):
> > > > 1. log.retention.ms - how old messages should be before we
> > > > consider them for deletion
> > > > 2. log.roll.ms - how frequently we roll new segments. Messages
> > > > will not get deleted before a new segment is rolled
> > > > 3. log.retention.check.interval.ms - how frequently we check for
> > > > segments that we can delete.
> > > >
> > > > A message will be deleted if all 3 are true:
> > > > 1. It is older than log.retention.ms
> > > > 2. It is in an inactive segment, meaning enough time passed since
> > > > the message was written to roll a new segment
> > > > 3. Kafka checked for segments that can be deleted, meaning that
> > > > more than log.retention.check.interval.ms passed since the segment
> > > > was rolled.
> > > >
> > > > Hope this helps,
> > > >
> > > > Gwen
> > > >
> > > > On Fri, Apr 1, 2016 at 12:21 PM, Heath Ivie <hi...@autoanything.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I have some questions about the log retention and specifically
> > > > > what gets deleted.
> > > > >
> > > > > I have a test app where I am writing 10 logs to the topic every
> > > > > second.
> > > > >
> > > > > What I would expect is that the lag in a group would be
> > > > > somewhere around 10 if I have retention.ms at 1000.
> > > > >
> > > > > What I am seeing is that the lag continues to grow, but then at
> > > > > some point all messages are gone and the lag is at 0.
> > > > >
> > > > > I thought that the messages that are old would be deleted first.
> > > > >
> > > > > Am I misinterpreting how the log retention works?
> > > > >
> > > > > Heath Ivie
> > > > > Solutions Architect
> > > > >
> > > > > Warning: This e-mail may contain information proprietary to
> > > > > AutoAnything Inc. and is intended only for the use of the
> > > > > intended recipient(s). If the reader of this message is not the
> > > > > intended recipient(s), you have received this message in error
> > > > > and any review, dissemination, distribution or copying of this
> > > > > message is strictly prohibited. If you have received this
> > > > > message in error, please notify the sender immediately and
> > > > > delete all copies.
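
For anyone reading this thread later: the time-based rule Gwen describes above (only whole, inactive segments are deleted, and only once their newest message is older than the retention period) can be sketched roughly as below. This is just a toy model of the rule as stated in the thread, not Kafka's actual code; the Segment class, its fields, and deletable_segments are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment of a partition."""
    name: str
    newest_timestamp_ms: int  # timestamp of the newest message in the segment
    active: bool = False      # True for the segment currently being written


def deletable_segments(segments: List[Segment], now_ms: int,
                       retention_ms: int) -> List[Segment]:
    """Return the segments one retention check at now_ms would delete.

    Assumption (from the thread): a segment goes as a whole, only if it is
    inactive and its newest message is older than the retention period.
    """
    return [
        s for s in segments
        if not s.active and now_ms - s.newest_timestamp_ms > retention_ms
    ]


segments = [
    Segment("seg-0", newest_timestamp_ms=1_000),
    Segment("seg-1", newest_timestamp_ms=7_000),
    Segment("seg-2", newest_timestamp_ms=9_500, active=True),
]
expired = deletable_segments(segments, now_ms=10_000, retention_ms=4_000)
print([s.name for s in expired])  # prints ['seg-0']
```

Here seg-1 survives even if it also holds much older messages, because its newest message is still within retention, and seg-2 survives because it is active. That matches the behavior Heath saw: nothing is deleted for a while, and then a whole batch of messages disappears at once when a rolled segment expires.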