Hi,

> as far as i understand, log retention time in kafka will delete message
> that's older than the retention time.


  Log retention applies to log segment files, not to individual messages.
  In Kafka, each topic can have multiple partitions, and each partition's
  data is stored in multiple log segment files.
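
  To make the segment-level behaviour concrete, here is a small toy model
  in Python. It is only a sketch and not Kafka code: the Segment class,
  the apply_retention function and the timestamps are all made up for
  illustration. It assumes a closed segment is dropped as a whole once its
  last write is older than the retention window, and that the active
  segment (the one still being appended to) is kept.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment file of a partition."""
    base_offset: int
    offsets: List[int]
    last_modified: float  # seconds; plays the role of the file's last write time


def apply_retention(segments, now, retention_secs):
    """Drop whole closed segments whose last write is older than the window.

    The last segment in the list is treated as the active one and is always kept.
    """
    if not segments:
        return []
    *closed, active = segments
    kept = [s for s in closed if now - s.last_modified <= retention_secs]
    return kept + [active]


# Offsets 1,2,3 live in an old segment; 4,5 live in the active segment.
segments = [
    Segment(base_offset=1, offsets=[1, 2, 3], last_modified=100.0),
    Segment(base_offset=4, offsets=[4, 5], last_modified=900.0),
]

remaining = apply_retention(segments, now=1000.0, retention_secs=500.0)
available = [o for s in remaining for o in s.offsets]
print(available)  # [4, 5]
```

  Messages 1,2,3 disappear together because they share a segment; no
  message is removed on its own.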


> say i have a list of messages for a partition of topic:
>
> 1,2,3,4,5 are the message (offsets) associated with the partition in
> current time.
>
> if message 1,2,3 expired earlier and only 4,5 are left, does that mean
> consumer can only consume 4,5 and need a way to detect 1,2,3 has expired
> and make sure it never reads before the earliest offset for the partition
>

  You can consume only the messages that are currently retained. The
  SimpleConsumer API returns an OffsetOutOfRange error code for offsets
  that no longer exist.


https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
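
  The handling pattern can be sketched as follows. This is a toy model in
  Python and not the SimpleConsumer API: ToyPartition, OffsetOutOfRange
  and fetch_with_reset are hypothetical names, and an exception stands in
  for the error code in the fetch response. The reset_to parameter is also
  an assumption of the sketch; whether to resume from the earliest or the
  latest offset depends on whether you want to replay what is still
  retained or skip ahead.

```python
class OffsetOutOfRange(Exception):
    """Toy equivalent of the OffsetOutOfRange error code."""


class ToyPartition:
    """Hypothetical partition that retains offsets in [earliest, log_end)."""

    def __init__(self, earliest, log_end):
        self.earliest = earliest  # first offset still retained
        self.log_end = log_end    # offset the next message will receive

    def fetch(self, offset):
        # Anything before the earliest retained offset, or past the end of
        # the log, is out of range.
        if offset < self.earliest or offset > self.log_end:
            raise OffsetOutOfRange(offset)
        return list(range(offset, self.log_end))


def fetch_with_reset(partition, offset, reset_to="earliest"):
    """Fetch from offset; on an out-of-range error, reset and fetch again."""
    try:
        return offset, partition.fetch(offset)
    except OffsetOutOfRange:
        if reset_to == "earliest":
            new_offset = partition.earliest
        else:
            new_offset = partition.log_end
        return new_offset, partition.fetch(new_offset)


# Messages 1,2,3 have expired; 4 and 5 are still retained.
partition = ToyPartition(earliest=4, log_end=6)
start, batch = fetch_with_reset(partition, 1)
print(start, batch)  # 4 [4, 5]
```

  With reset_to="latest" the same call would resume at offset 6 and return
  nothing until new messages arrive.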


> if all 1,2,3,4,5 are expired, it seems the offset will become 0. i assume
> in this case consumer need to reset its consumed offsets to 0 for the
> consumer group.
>

   An offset is a sequential ID and is never reset to 0. The consumer
   needs to be reset to the latest offset available on the partition.
   Please check the SimpleConsumer example for details.
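
   A toy illustration of why the offset does not go back to 0 (Python,
   not Kafka code; ToyLog and its methods are hypothetical, and the
   numbering starts at 1 only to match the example in your mail): expiring
   messages removes data, but the counter that assigns the next offset
   keeps going.

```python
class ToyLog:
    """Hypothetical single-partition log with a monotonically increasing offset."""

    def __init__(self, first_offset=1):
        self.next_offset = first_offset
        self.offsets = []

    def append(self):
        """Store one message and return the offset assigned to it."""
        offset = self.next_offset
        self.offsets.append(offset)
        self.next_offset += 1
        return offset

    def expire_all(self):
        """Simulate retention removing every message; next_offset is untouched."""
        self.offsets.clear()

    def earliest(self):
        """First retained offset, or the next offset if the log is empty."""
        return self.offsets[0] if self.offsets else self.next_offset


log = ToyLog(first_offset=1)
for _ in range(5):
    log.append()          # messages 1..5

log.expire_all()          # all of 1,2,3,4,5 expire
print(log.earliest(), log.next_offset)  # 6 6
print(log.append())       # 6
```

   Once everything has expired, the earliest and the latest offset are the
   same value, and the next message simply continues the sequence.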

Regards,
Kumar
