Hi Jun,

Nice point. Group GC is definitely an issue for to_start_time, but it is 
actually an issue for other policies as well.

For example, a consumer using the earliest policy will suddenly read all 
historical records from scratch if it sleeps for a long while and its group 
gets GC'd; otherwise, it just resumes from its previous offsets if the group 
still exists. It is equally hard to explain to users: "Oh, your group was 
GC'd, so your offset behavior changed."
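
To make this concrete, here is a toy Python model of the start-offset
decision. It is not Kafka client code: resolve_start_offset and its
parameters are hypothetical names for illustration, and the to_start_time
branch assumes the lookup described in this thread (the first record whose
timestamp is at or after the group creation time).

```python
from bisect import bisect_left
from typing import List, Optional


def resolve_start_offset(policy: str,
                         committed: Optional[int],
                         log_start: int,
                         timestamps: List[int],
                         group_created_at: Optional[int] = None) -> int:
    """Toy model of where a consumer starts reading one partition.

    timestamps[i] is the timestamp of the record at offset log_start + i
    and is assumed to be non-decreasing.
    """
    log_end = log_start + len(timestamps)
    # A committed offset that is still in range always wins; the reset
    # policy only applies when it is missing or out of range.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if policy == "earliest":
        return log_start
    if policy == "latest":
        return log_end
    if policy == "to_start_time":
        if group_created_at is None:
            raise ValueError("group has no creation time")
        # First record with timestamp >= creation time, else the log end.
        return log_start + bisect_left(timestamps, group_created_at)
    raise ValueError(f"unknown policy: {policy}")


# earliest: offsets 0..999 exist, consumer had committed offset 900.
ts = list(range(1000, 2000))
alive = resolve_start_offset("earliest", 900, 0, ts)   # group still exists
gcd = resolve_start_offset("earliest", None, 0, ts)    # group was GC'd

# to_start_time: log truncated to offsets 100..199, committed offset 50.
ts2 = list(range(1100, 1200))
truncated = resolve_start_offset("to_start_time", 50, 100, ts2,
                                 group_created_at=1000)  # old group
recreated = resolve_start_offset("to_start_time", None, 100, ts2,
                                 group_created_at=1200)  # GC'd, re-created
```

In this model the same consumer lands on a different offset purely because
the group was GC'd, under both earliest (alive vs. gcd) and to_start_time
(truncated vs. recreated).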

Therefore, it seems to me the right approach to fix this "inconsistency" is to 
offer a group-level GC timeout in a future KIP, allowing users to explicitly 
protect critical groups from GC. This would help not only to_start_time but 
all other reset policies too.

Best,
Chia-Ping

On 2026/04/20 20:19:47 Jun Rao via dev wrote:
> Hi, Jiunn-Yang and Chia-Ping,
> 
> Thanks for the reply.
> 
> The main concern I see with to_start_time is that its behavior on how much
> data to consume when the offset is out of range is not consistent and is
> hard to explain. If the group still exists, it will read from the earliest
> offset. Otherwise, it will read from the latest.
> 
> Jun
> 
> On Mon, Apr 20, 2026 at 10:13 AM Chia-Ping Tsai <[email protected]> wrote:
> 
> > Hi all,
> >
> > Just a note for a potential latest_v2:
> >
> > Since the purpose is to read all records from extended partitions, we
> > could leverage the group creation time to compare against the earliest
> > record of a partition when there is no committed offset. If the group
> > creation time is later than the earliest record's timestamp, we assume it
> > is not an extended partition. Otherwise, we treat it as an extended
> > partition.
> >
> > This approach allows us to catch all "possible" extended partitions, which
> > includes both "true" extended partitions and old but truncated partitions.
> > While there is a rare edge case where the cost is reprocessing some records
> > we don't necessarily want, it is very easy to implement and guarantees we
> > will never miss the actual extended partitions.
> >
> > Best,
> > Chia-Ping
> >
> > On 2026/04/20 13:33:31 黃竣陽 wrote:
> > > Hello all,
> > >
> > > I have added a new "Future Work: latest_strict Policy" section to the KIP.
> > > The idea is a future policy that uses latest semantics by default but falls
> > > back to the group creation timestamp specifically for newly added partitions
> > > during partition expansion. This would reuse the group creation time anchor
> > > introduced by this KIP, making it a natural extension with minimal additional
> > > protocol changes.
> > >
> > > Best Regards,
> > > Jiunn-Yang
> > >
> > > > On April 18, 2026, at 4:09 PM, Chia-Ping Tsai <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > It is practically NP-hard to guess everyone's ideal use case right now.
> > > > Also, I believe we all want to avoid falling back to the intricate
> > > > multi-policy approach proposed in KIP-842.
> > > >
> > > > I prefer to keep this KIP focused and discuss a "v2 latest" policy in a
> > > > separate KIP. That future policy could build upon the to_start_time anchor
> > > > to fix data loss specifically for extended partitions. We could call it
> > > > something like latest_strict.
> > > >
> > > > Thoughts?
> > > >
> > > >
> > > > On Sat, Apr 18, 2026 at 3:24 PM 黃竣陽 <[email protected]> wrote:
> > > >
> > > >> Hello Jun,
> > > >>
> > > >> Thanks for the reply.
> > > >>
> > > >> When the offset goes out of range, the user faces two options:
> > > >>
> > > >> 1. Skip to the end (latest behavior): risk losing data that was produced
> > > >>    during the group's lifetime but not yet consumed.
> > > >> 2. Seek back to the group creation time (to_start_time behavior):
> > > >>    potentially reprocess some data, but guarantee no data from the group's
> > > >>    lifetime is silently lost.
> > > >>
> > > >> to_start_time chooses option 2 because its core promise is "never silently
> > > >> lose data produced after the group started." If we fell back to latest on
> > > >> out-of-range, we would break this guarantee.
> > > >>
> > > >> I think users who prefer option 1 can simply use
> > > >> auto.offset.reset=latest.
> > > >>
> > > >> Best Regards,
> > > >> Jiunn-Yang
> > > >>
> > > >>> On April 18, 2026, at 1:57 AM, Jun Rao via dev <[email protected]> wrote:
> > > >>>
> > > >>> Hi, Jiunn-Yang and Chia-Ping,
> > > >>>
> > > >>> Thanks for the reply.
> > > >>>
> > > >>> "The core semantic of to_start_time is to read all records since the
> > > >>> creation of the group."
> > > >>>
> > > >>> I am just questioning whether this actually covers a common use case. If
> > > >>> the offset doesn't go out of range, the logic makes sense to me. I'm not
> > > >>> sure about the logic if the offset is out of range. If a user chooses to
> > > >>> skip the historical data when starting the group, it seems the user likely
> > > >>> wants to do the same if the offset is out of range.
> > > >>>
> > > >>> Jun
> > > >>>
> > > >>> On Fri, Apr 17, 2026 at 5:23 AM 黃竣陽 <[email protected]> wrote:
> > > >>>
> > > >>>> Hello Jun,
> > > >>>>
> > > >>>> Thanks for the feedback.
> > > >>>>
> > > >>>> Adding to the points above:
> > > >>>>
> > > >>>> Regarding by_duration as an alternative to Scenario 1: beyond clock skew
> > > >>>> and retry issues, there is also a usability concern. by_duration requires
> > > >>>> users to reason about operational timing ("how long does partition
> > > >>>> discovery take in my environment?") and then translate that into a
> > > >>>> configuration value. to_start_time requires no such reasoning. It simply
> > > >>>> anchors to the group creation time recorded by the broker.
> > > >>>>
> > > >>>> Regarding Scenario 2: I'd also like to clarify that to_start_time does not
> > > >>>> branch between "use latest" and "use earliest." It applies the same
> > > >>>> ListOffsetsRequest with the group creation timestamp in all cases. The
> > > >>>> difference in outcome:
> > > >>>> - skipping old data on first start
> > > >>>> - consuming surviving data after truncation
> > > >>>> is a natural consequence of what data exists in the partition at that
> > > >>>> point, not a different policy being applied. The rule is always the same.
> > > >>>>
> > > >>>> Best Regards,
> > > >>>> Jiunn-Yang
> > > >>>>
> > > >>>>> On April 17, 2026, at 9:48 AM, Chia-Ping Tsai <[email protected]> wrote:
> > > >>>>>
> > > >>>>>
> > > >>>>>> On April 17, 2026, at 4:57 AM, Jun Rao via dev <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>> Also, a group is deleted after the consumer has been idle longer
> > > >>>>>> than offsets.retention.minutes. What are the semantics of to_start_time
> > > >>>>>> if the group creation time is unavailable?
> > > >>>>>
> > > >>>>> If the group is recreated, a new creation time will be recorded. Hence,
> > > >>>>> it acts like a new group. Plus, it throws an exception directly if the
> > > >>>>> group truly has no creation time.
> > > >>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
> 
