> On April 17, 2026, at 4:57 AM, Jun Rao via dev <[email protected]> wrote:
> 
> The main motivation of the KIP is for Scenario 1: Partition Expansion Data
> Loss. I am wondering if this case can be covered by the
> existing by_duration. With KIP-848, if a new partition is added, the
> consumer client can pick it up after group.consumer.heartbeat.interval.ms,
> which defaults to 5 seconds. If the consumer sets auto.offset.reset to
> by_duration:5 secs, it won't miss any new messages on new partitions. It
> does mean that the consumer needs to pick up an extra 5 seconds' worth of
> data on the first start, but it probably doesn't make a difference.

We actually evaluated using by_duration as a workaround during the KIP design, 
but found two major drawbacks in a distributed environment:

1. The calculated time (now() - duration) varies across nodes due to clock
skew. To prevent data loss, users must set a duration large enough to cover the
worst-case skew, which forces the consumer to reprocess too much historical
data.

2. If a network error occurs, the client recalculates the seek time on the next
retry. Because now() has advanced in the meantime, the target timestamp shifts
forward, and records written before the new target can be skipped.
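
To make the two points concrete, here is a minimal sketch in plain Python. It
is not Kafka client code: seek_timestamp() is a hypothetical helper that only
models the now() - duration arithmetic, and the 3-second skew and 2-second
retry delay are made-up numbers for illustration.

```python
from datetime import datetime, timedelta, timezone


def seek_timestamp(local_now: datetime, duration: timedelta) -> datetime:
    """Hypothetical model of by_duration: the target is local now() minus duration."""
    return local_now - duration


duration = timedelta(seconds=5)
true_now = datetime(2026, 4, 17, 4, 57, 0, tzinfo=timezone.utc)

# Drawback 1: a node whose clock runs 3 seconds fast (assumed skew).
skew = timedelta(seconds=3)
target_accurate = seek_timestamp(true_now, duration)
target_skewed = seek_timestamp(true_now + skew, duration)
# The skewed node's target is 3 seconds later than the accurate node's target.
gap_skew = target_skewed - target_accurate

# Drawback 2: the first attempt fails and the retry happens 2 seconds later
# (assumed delay). The target is recomputed from the new now().
retry_delay = timedelta(seconds=2)
target_first = seek_timestamp(true_now, duration)
target_retry = seek_timestamp(true_now + retry_delay, duration)
# The retry's target has moved forward by exactly the retry delay.
gap_retry = target_retry - target_first

print("skew gap:", gap_skew)
print("retry gap:", gap_retry)
```

In both cases the gap is the window of record timestamps that the later target
no longer covers. A fixed duration only hides the problem while it stays larger
than the skew plus the total retry delay.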

Best, Chia-Ping
