Messages in the 0.10.x format carry their own timestamps, so retention and
expiry of messages are no longer based entirely on the filesystem timestamp
of the log segments.

From KIP-33:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Enforcetimebasedlogrolling

"Enforce time based log rolling

Currently time based log rolling is based on the creating time of the log
segment. With this KIP, the time based rolling would be changed to only
based on the message timestamp. More specifically, if the first message in
the log segment has a timestamp, A new log segment will be rolled out if
timestamp in the message about to be appended is greater than the timestamp
of the first message in the segment + log.roll.ms. When
message.timestamp.type=CreateTime, user should set
max.message.time.difference.ms appropriately together with log.roll.ms to
avoid frequent log segment roll out.

During the migration phase, if the first message in a segment does not have
a timestamp, the log rolling will still be based on the (current time -
create time of the segment)."
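
To make the quoted rule concrete, here is a minimal sketch of the roll
decision as I read it from the KIP text above. This is not the broker's
actual code: the function name, the parameter names, and the use of a strict
">" in the fallback branch are my own assumptions for illustration.

```python
from typing import Optional


def should_roll(first_ts_ms: Optional[int],
                incoming_ts_ms: int,
                segment_create_ms: int,
                now_ms: int,
                log_roll_ms: int) -> bool:
    """Hypothetical helper mirroring the KIP-33 wording, not Kafka source."""
    if first_ts_ms is not None:
        # First message in the segment has a timestamp: roll when the
        # message about to be appended is newer than first + log.roll.ms.
        return incoming_ts_ms > first_ts_ms + log_roll_ms
    # Migration phase: no timestamp on the first message, so fall back to
    # (current time - create time of the segment). The strict ">" here is
    # an assumption; the KIP text does not spell out the comparison.
    return now_ms - segment_create_ms > log_roll_ms
```

Note that in the first branch the wall clock is not consulted at all, which
is why the KIP suggests setting max.message.time.difference.ms together with
log.roll.ms when message.timestamp.type=CreateTime.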

-hans

/**
 * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
 * h...@confluent.io (650)924-2670
 */

On Thu, May 25, 2017 at 12:44 AM, Milind Vaidya <kava...@gmail.com> wrote:

> I have a 6-broker cluster.
>
> I upgraded it from 0.8.1.1 to 0.10.0.0.
>
> The upgrade of the whole pipeline, Kafka producer to cluster to consumer
> (Apache Storm), went smoothly without any errors.
> Initially the protocol was kept at 0.8, and after the clients were upgraded
> it was promoted to 0.10.
>
> Out of the 6 brokers, 3 are honouring log.retention.hours. On the other 3,
> when the broker is restarted, the timestamp of each segment changes to the
> current time. As a result the segments are not deleted and the disk fills up.
>
> du -khc /disk1/kafka-broker/topic-1
>
> 71G     /disk1/kafka-broker/topic-1
>
> 71G     total
>
> Latest segment timestamp : May 25 07:34
>
> Oldest segment timestamp : May 25 07:16
>
>
> It is impossible that 71 GB of data was collected in a mere 15 minutes.
> log.retention.hours=24 is set, and this is not a new broker, so the oldest
> data should be around 24 hours old.
>
> As mentioned above, only 3 of the 6 brokers show this behaviour. Why is
> this happening?
>
