[ https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiangtao Liu updated KAFKA-8270:
--------------------------------
    Issue Type: Bug  (was: Improvement)

> Kafka retention hour is not working
> -----------------------------------
>
>                 Key: KAFKA-8270
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8270
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>            Reporter: Jiangtao Liu
>            Assignee: Richard Yu
>            Priority: Major
>              Labels: storage
>
> Currently, when a consumer falls out of a consumer group, it restarts
> processing from the last checkpointed offset. However, this design can
> result in lag that some users cannot afford. For example, let's say a
> consumer crashed at offset 100, with the last checkpointed offset being 70.
> By the time it recovers, the log has advanced to a later offset (say, 120),
> so the consumer is behind by 50 offsets (120 - 70). This is because the
> consumer restarts at 70, forcing it to reprocess old data. To prevent this,
> one option would be to allow the current consumer to start processing not
> from the last checkpointed offset (70 in the example), but from 120, where
> it recovers. Meanwhile, a new KafkaConsumer would be instantiated to read
> from offset 70 concurrently with the old process, and would be terminated
> once it reaches 120. In this manner, a considerable amount of lag can be
> avoided, particularly since the old consumer can proceed as if nothing had
> happened.
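
The offset arithmetic in the proposal above can be sketched as follows. This is only an illustration of the example numbers (checkpoint at 70, recovery at 120) and does not touch any Kafka API: `RecoveryPlan`, `plan_recovery`, and `backlog` are hypothetical names, and treating the catch-up range as ending just before the recovery offset is an assumption about what "terminated once it reaches 120" means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPlan:
    """Offsets handled by each consumer after a restart (hypothetical helper)."""
    live_start: int     # offset where the recovered consumer resumes
    catchup_start: int  # first offset read by the temporary consumer
    catchup_end: int    # exclusive end (assumed): temporary consumer stops here


def plan_recovery(checkpointed_offset: int, recovery_offset: int) -> RecoveryPlan:
    """Split the work: the recovered consumer continues from the recovery
    offset, and a temporary consumer replays the gap since the checkpoint."""
    if recovery_offset < checkpointed_offset:
        raise ValueError("recovery offset cannot precede the checkpointed offset")
    return RecoveryPlan(
        live_start=recovery_offset,
        catchup_start=checkpointed_offset,
        catchup_end=recovery_offset,
    )


def backlog(plan: RecoveryPlan) -> int:
    """Number of offsets the temporary consumer has to replay."""
    return plan.catchup_end - plan.catchup_start


# Numbers from the example in the issue description.
plan = plan_recovery(checkpointed_offset=70, recovery_offset=120)
print(plan.live_start, backlog(plan))  # prints "120 50"
```

In a real client, the two ranges would presumably map onto two consumer instances positioned at different offsets; how they would coordinate commits and ordering is not specified in the issue.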