This is what Todd said:

"Retention is going to be based on a combination of both the retention and
segment size settings (as a side note, it's recommended to use
log.retention.ms and log.segment.ms, not the hours config. That's there for
legacy reasons, but the ms configs are more consistent). As messages are
received by Kafka, they are written to the current open log segment for
each partition. That segment is rotated when either the log.segment.bytes
or the log.segment.ms limit is reached. Once that happens, the log segment
is closed and a new one is opened. Only after a log segment is closed can
it be deleted via the retention settings. Once the log segment is closed
AND either all the messages in the segment are older than log.retention.ms
OR the total partition size is greater than log.retention.bytes, then the
log segment is purged.

As a note, the default segment limit is 1 gibibyte. So if you've only
written in 1k of messages, you have a long way to go before that segment
gets rotated. This is why the retention is referred to as a minimum time.
You can easily retain much more than you're expecting for slow topics."
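
The rule in the quote above can be written down as a small sketch. This is only an illustration of the rule as Todd states it, not Kafka's actual implementation; the Segment class, the function names and the parameter names are all made up here and simply mirror the settings named in the quote.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # the 1 gibibyte default segment limit mentioned in the quote


@dataclass
class Segment:
    size_bytes: int
    age_ms: int                 # time since the segment was opened
    newest_message_age_ms: int  # age of the most recent message in the segment


def should_roll(seg, segment_bytes=GIB, segment_ms=None):
    """The open segment is rotated when either the size or the time limit is reached."""
    if seg.size_bytes >= segment_bytes:
        return True
    return segment_ms is not None and seg.age_ms >= segment_ms


def should_purge(seg, closed, partition_bytes, retention_ms=None, retention_bytes=None):
    """Per the quote: purge only if the segment is closed AND either retention limit is exceeded."""
    if not closed:
        return False
    # "All messages older than the retention time" is the same as
    # "the newest message is older than the retention time".
    too_old = retention_ms is not None and seg.newest_message_age_ms > retention_ms
    too_big = retention_bytes is not None and partition_bytes > retention_bytes
    return too_old or too_big
```

Under this reading, a segment holding 1k of messages with only the default size limit never satisfies should_roll, so should_purge stays false however old the messages get. That is the "minimum time" behaviour described in the quote.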

On Dec 9, 2016 02:38, "Rodrigo Sandoval" <rodrigo.madfe...@gmail.com> wrote:

> Your understanding about segment.bytes and retention.ms is correct. But
> Todd Palino said that a segment is deleted only after it has reached the
> segment size (that is, when the segment is "closed") PLUS all messages
> within the closed segment are older than the defined retention policy (in
> this case retention.ms).
>
> At least according to my testing, it is not necessary to wait until the
> segment is closed to delete it. If all messages in a segment are older
> than the policy defined by retention.ms, the segment is deleted, no
> matter whether it has reached the size defined by segment.bytes.
>
> I have been playing a lot today with Kafka, and at least that is what I
> figured out.
>
> On Dec 9, 2016 02:13, "Sachin Mittal" <sjmit...@gmail.com> wrote:
>
>> I think segment.bytes defines the size of a single log file before a new
>> one is created.
>> retention.ms defines the number of ms to wait on a log file before
>> deleting it.
>>
>> So it is working as defined in docs.
>>
>>
>> On Fri, Dec 9, 2016 at 2:42 AM, Rodrigo Sandoval <rodrigo.madfe...@gmail.com> wrote:
>>
>> > How can it be that the segment is deleted only when the segment size is
>> > reached and every single message inside the segment is older than the
>> > retention time?
>> >
>> >
>> > I have been playing with Kafka and I have the following:
>> >
>> > bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1
>> > --config retention.ms=60000
>> >
>> > bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1
>> > --config file.delete.delay.ms=40000
>> >
>> > bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1
>> > --config segment.bytes=400000
>> >
>> > My understanding, according to your thoughts, is that a segment will be
>> > deleted when it reaches the segment size defined above
>> > (segment.bytes=400000) PLUS every single message within the segment is
>> > older than the retention time defined above (retention.ms=60000).
>> >
>> > What I noticed is that a segment of just 35 bytes, which contained just
>> > one message, was deleted after the minute (maybe a little more). So the
>> > segment was deleted without the segment size having been reached.
>> >
>>
>
