Hello Jun,

Thanks for the reply.

When the offset goes out of range, the user faces two options:

1. Skip to the end (latest behavior): risk losing data that was produced
   during the group's lifetime but not yet consumed.
2. Seek back to the group creation time (to_start_time behavior):
   potentially reprocess some data, but guarantee that no data from the
   group's lifetime is silently lost.

to_start_time chooses option 2 because its core promise is "never silently
lose data produced after the group started." If we fell back to latest on
out-of-range, we would break this guarantee.

Users who prefer option 1 can simply use auto.offset.reset=latest.

Best Regards,
Jiunn-Yang

> On Apr 18, 2026, at 1:57 AM, Jun Rao via dev <[email protected]> wrote:
>
> Hi, Jiunn-Yang and Chia-Ping,
>
> Thanks for the reply.
>
> "The core semantic of to_start_time is to read all records since the
> creation of the group."
>
> I am just questioning whether this actually covers a common use case. If
> the offset doesn't go out of range, the logic makes sense to me. I'm not
> sure about the logic if the offset is out of range. If a user chooses to
> skip the historical data when starting the group, it seems the user likely
> wants to do the same if the offset is out of range.
>
> Jun
>
> On Fri, Apr 17, 2026 at 5:23 AM 黃竣陽 <[email protected]> wrote:
>
>> Hello Jun,
>>
>> Thanks for the feedback.
>>
>> Adding to the points above:
>>
>> Regarding by_duration as an alternative to Scenario 1: beyond clock skew
>> and retry issues, there is also a usability concern. by_duration requires
>> users to reason about operational timing ("how long does partition
>> discovery take in my environment?") and then translate that into a
>> configuration value. to_start_time requires no such reasoning. It simply
>> anchors to the group creation time recorded by the broker.
>>
>> Regarding Scenario 2: I'd also like to clarify that to_start_time does not
>> branch between "use latest" and "use earliest." It applies the same
>> ListOffsetsRequest with the group creation timestamp in all cases. The
>> difference in outcome:
>> - skipping old data on first start
>> - consuming surviving data after truncation
>> is a natural consequence of what data exists in the partition at that
>> point, not a different policy being applied. The rule is always the same.
>>
>> Best Regards,
>> Jiunn-Yang
>>
>>> On Apr 17, 2026, at 9:48 AM, Chia-Ping Tsai <[email protected]> wrote:
>>>
>>>> On Apr 17, 2026, at 4:57 AM, Jun Rao via dev <[email protected]> wrote:
>>>>
>>>> Also, a group is deleted after the consumer has been idle longer
>>>> than offsets.retention.minutes. What's the semantic of to_start_time if
>>>> the group creation time is unavailable?
>>>
>>> If the group is recreated, a new creation time will be recorded. Hence,
>>> it acts like a new group. Plus, it throws an exception directly if the
>>> group truly has no creation time.
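
To illustrate the "one rule, different outcomes" point made in this thread, here is a minimal Java sketch of a timestamp-based reset. It is not the KIP's implementation and not a Kafka API: the class StartTimeResetSketch, the Entry type, and the resolve method are hypothetical names introduced for illustration, and returning the log end offset when no retained record is at or after the group creation time is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a timestamp-based offset reset; not a Kafka API.
final class StartTimeResetSketch {

    // One retained record: its offset and its timestamp in epoch milliseconds.
    static final class Entry {
        final long offset;
        final long timestampMs;

        Entry(long offset, long timestampMs) {
            this.offset = offset;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Returns the offset of the first retained record whose timestamp is at or
     * after groupCreationMs. If no retained record qualifies, returns
     * logEndOffset (assumption made for this sketch).
     * The same rule is used for a first start and for an out-of-range reset.
     */
    static long resolve(List<Entry> retained, long logEndOffset, long groupCreationMs) {
        for (Entry e : retained) { // entries are listed in offset order
            if (e.timestampMs >= groupCreationMs) {
                return e.offset;
            }
        }
        return logEndOffset;
    }

    public static void main(String[] args) {
        long groupCreationMs = 1_000L;

        // First start: every retained record predates the group, so old data is skipped.
        List<Entry> firstStart = Arrays.asList(new Entry(0, 100), new Entry(1, 200));
        System.out.println(resolve(firstStart, 2, groupCreationMs)); // prints 2

        // Out of range after truncation: offsets 0..4 are gone, and the surviving
        // records were produced after the group was created, so they are consumed.
        List<Entry> afterTruncation = Arrays.asList(new Entry(5, 1_500), new Entry(6, 1_600));
        System.out.println(resolve(afterTruncation, 7, groupCreationMs)); // prints 5
    }
}
```

In both calls the lookup is identical; only the retained data differs, which is what produces "skip old data" in the first case and "consume surviving data" in the second.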
