Hello Jun,

You're right that group creation time is the more intuitive answer at first 
glance, 
the KIP's own motivation talks about partitions that "predate the group" vs 
partitions 
"created during group runtime," which directly points to a group-lifecycle 
classifier. 
I'd like to walk through why we landed on partition age, and the trade-offs we 
considered.

We evaluated three candidate signals:

1. `by_duration:5secs`

This covers the metadata blindness window, but has issues the KIP currently 
documents 
under "Why not use `by_duration`?": 

- Client-side `now() - duration` introduces clock skew across consumers. 
- `ListOffsets` retries shift the target forward, potentially missing records 
produced between 
attempts. 
- It applies uniformly to all partitions without committed offsets, including 
pre-existing partitions 
newly assigned to the group, causing unnecessary replay.

2. Group creation time as classifier

This works cleanly when the consumer is actively running. Our concern
is the idle / late-rejoin case:

T=0:         Group created.
T=1..T=100:  Consumer idle (down, disconnected, etc.).
T=50:        Partition added during the idle window.
T=100:       Consumer resumes.

Under group creation time, the new partition is classified as new
(`50 > 0`) and reset to `earliest`, replaying everything from T=50.
But during `[T=1, T=100]`, base partitions also accumulated data that
the consumer accepts as lost — that is precisely the contract of
`auto.offset.reset=latest`. There is no principled reason to treat
the new partition differently; both contain backlog accumulated during
the same idle window.

This aligns with the "backlog is backlog” principle you raised in
the KIP-1282 thread: a `latest` user has tolerated some backlog on
every other partition during the same idle period; forcing 0-backlog
tolerance only on new partitions would be inconsistent with that
tolerance.

3. Partition age vs threshold

Partition age corresponds to the actual silent data loss window, 
the gap between partition creation and the consumer’s metadata
refresh. Within this window, data loss is genuinely silent: the
consumer had no opportunity to know about the partition. Outside this
window, missing data reflects either:

- (a) the user’s tolerated cost of running with idle consumers, or
- (b) an operational issue to surface via monitoring, not via reset policy.

We did not choose partition age because it is more elegant than group 
creation time — we chose it because its failure mode (requires a threshold) is 
less invasive than the failure mode of group creation time (overrides 
user-stated 
`latest` intent during idle periods).

Best Regards,
Jiunn-Yang

> Chia-Ping Tsai <[email protected]> 於 2026年6月13日 上午11:52 寫道:
> 
> Hi Jun,
> 
> Relying on both creation times will create an inconsistent scenario. A
> consumer that lost all offsets due to a long sleep will seek to the
> beginning for the partitions created later than the group.
> 
> That is why we initially proposed KIP-1282 to fix the inconsistency using a
> whole new policy. Since KIP-1282 couldn't reach a consensus, KIP-1327 goes
> back to using flexible configurations to prevent users from falling into
> that pitfall.
> 
> Best, Chia-Ping
> 
> Jun Rao via dev <[email protected]> 於 2026年6月13日週六 上午6:49寫道:
> 
>> Hi, Jiunn-Yang,
>> 
>> Thanks for the reply and sorry for the late reply.
>> 
>> JR1. The design of auto.offset.reset.max.age.ms still feels weird to me.
>> It
>> categorizes partitions as new or existing based on the partition creation
>> time. Intuitively, the categorization should be based on the group creation
>> time: all partitions existing when the group is created are existing and
>> all partitions created after the group creation are new partitions.
>> 
>> Jun
>> 
>> 
>> 
>> On Tue, Jun 9, 2026 at 8:51 AM 黃竣陽 <[email protected]> wrote:
>> 
>>> Hi all,
>>> 
>>> Manually bumping this thread. If there is no further
>>> discussion, I will close the vote.
>>> 
>>> Best Regards,
>>> Jiunn-Yang
>>> 
>>>> 黃竣陽 <[email protected]> 於 2026年6月1日 晚上7:16 寫道:
>>>> 
>>>> Hello Jian,
>>>> 
>>>> Thanks for your feedback,
>>>> 
>>>> Agreed, partition expansion is a common operational task, not an edge
>>>> case. I've updated the Motivation section accordingly.
>>>> 
>>>> Best Regards,
>>>> Jiunn-Yang
>>>> 
>>>>> jian fu <[email protected]> 於 2026年6月1日 下午5:49 寫道:
>>>>> 
>>>>> Hi Jiunn-Yang:
>>>>> 
>>>>> Thanks for the KIP. I think it would be useful to clarify that this
>> is a
>>>>> common scenario rather than an edge case, which further demonstrates
>> the
>>>>> need for this optimization. For example:
>>>>> A partition expansion is a common operational task in Kafka: To
>> balance
>>>>> resource utilization and cost, topics are typically created with a
>>> moderate
>>>>> default partition count. However, as traffic grows over time, it is
>>> often
>>>>> necessary to increase the number of partitions to accommodate the
>> higher
>>>>> workload.
>>>>> 
>>>>> Regards
>>>>> Jian
>>>>> 
>>>>> 黃竣陽 <[email protected]> 于2026年5月30日周六 22:31写道:
>>>>> 
>>>>>> Hello chia,
>>>>>> 
>>>>>> Thanks for the comments, I have updated the KIP!
>>>>>> 
>>>>>> Best Regards,
>>>>>> Jiunn-Yang
>>>>>> 
>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月30日 晚上8:29 寫道:
>>>>>>> 
>>>>>>> Hi Jiunn-Yang,
>>>>>>> 
>>>>>>> Would you mind removing the terms "hot" and "cold" when describing
>>>>>>> partitions in the KIP? I understand you are using them to describe
>> the
>>>>>>> "freshness" or the users' need for the records, but applying these
>>> terms
>>>>>> to
>>>>>>> the partition itself feels a bit unnatural.
>>>>>>> 
>>>>>>> After all, in this scenario, users don't really care whether a
>>> partition
>>>>>> is
>>>>>>> newly expanded or not. Their only expectation is that they won't
>>> silently
>>>>>>> lose any live records produced to the topic during their active
>>>>>> consumption.
>>>>>>> 
>>>>>>> Best, Chia-Ping
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 黃竣陽 <[email protected]> 於 2026年5月30日週六 下午12:30寫道:
>>>>>>> 
>>>>>>>> Hello Jun,
>>>>>>>> 
>>>>>>>> Thanks for the feedback, I have updated the KIP motivation section.
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Jiunn-Yang
>>>>>>>> 
>>>>>>>>> Jun Rao via dev <[email protected]> 於 2026年5月30日 凌晨1:12 寫道:
>>>>>>>>> 
>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>> 
>>>>>>>>> Thanks for the reply. I think we need a stronger motivation for
>> the
>>>>>> KIP.
>>>>>>>>> 
>>>>>>>>> The KIP says "The core insight is that not all partitions without
>> a
>>>>>>>>> committed offset are the same. A newly expanded partition (hot) is
>>>>>>>>> fundamentally different from a partition the consumer has never
>> seen
>>>>>>>>> because it predates the group (cold)." Why is the hot partition
>>>>>>>>> fundamentally different from the cold?
>>>>>>>>> 
>>>>>>>>> The KIP says "The existing by_duration policy is also insufficient
>>>>>>>> because:
>>>>>>>>> 
>>>>>>>>> - The calculated seek time (now() - duration) varies across nodes
>>> due
>>>>>>>> to
>>>>>>>>> clock skew. To be safe, users must set an overly large duration,
>>>>>>>> causing
>>>>>>>>> unnecessary reprocessing.
>>>>>>>>> - On network errors, the client recalculates the seek time on
>> retry,
>>>>>>>>> shifting the target timestamp forward and risking data loss."
>>>>>>>>> 
>>>>>>>>> However, both of these situations are rare. If these issues
>> persist,
>>>>>> more
>>>>>>>>> severe problems likely exist elsewhere. Rare situations don't
>> need a
>>>>>>>> common
>>>>>>>>> solution. If users care about those rare situations, they can
>>> implement
>>>>>>>>> customized logic using
>>>>>> ConsumerRebalanceListener.onPartitionsAssigned().
>>>>>>>>> 
>>>>>>>>> Jun
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Sun, May 17, 2026 at 6:50 AM 黃竣陽 <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> Hello chia,
>>>>>>>>>> 
>>>>>>>>>> Thanks for the feedback,
>>>>>>>>>> 
>>>>>>>>>>> If the creation time exists, the returned value should always be
>>>>>>>> greater
>>>>>>>>>> than or equal to zero, right?
>>>>>>>>>> I have explicitly mentioned this in the KIP.
>>>>>>>>>> 
>>>>>>>>>>>> New  Old (MetadataResponse v0–13)    positive        any
>>> field
>>>>>>>>>> absent    UnsupportedVersionException
>>>>>>>>>> 
>>>>>>>>>> The earliest point at which we can detect the version mismatch is
>>>>>> during
>>>>>>>>>> the
>>>>>>>>>> first metadata fetch after assignment, which occurs inside
>> poll().
>>>>>>>>>> Therefore, the
>>>>>>>>>> user would encounter an UnsupportedVersionException from poll().
>>> I’ll
>>>>>>>>>> clarify this in the KIP.
>>>>>>>>>> 
>>>>>>>>>> Best Regards,
>>>>>>>>>> Jiunn-Yang
>>>>>>>>>> 
>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月17日 下午4:50 寫道:
>>>>>>>>>>> 
>>>>>>>>>>> hi Jiunn
>>>>>>>>>>> 
>>>>>>>>>>>> PartitionAgeMs (int64, default -1): The age of this partition
>> in
>>>>>>>>>> milliseconds, computed server-side by the broker as
>>>>>> broker_current_time
>>>>>>>> -
>>>>>>>>>> partition_creation_time. Returns -1 if the broker does not
>> support
>>>>>> this
>>>>>>>>>> feature or the partition creation time is unknown.
>>>>>>>>>>> 
>>>>>>>>>>> If the creation time exists, the returned value should always be
>>>>>>>> greater
>>>>>>>>>> than or equal to zero, right?
>>>>>>>>>>> 
>>>>>>>>>>>> New  Old (MetadataResponse v0–13)    positive        any
>>> field
>>>>>>>>>> absent    UnsupportedVersionException
>>>>>>>>>>> 
>>>>>>>>>>> Will user encounter UnsupportedVersionException when calling
>>>>>> `poll()`?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Chia-Ping
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 2026/05/16 04:30:49 黃竣陽 wrote:
>>>>>>>>>>>> Hello Jun, chia,
>>>>>>>>>>>> 
>>>>>>>>>>>> I've updated KIP-1327 with a design change based on the
>>> discussion
>>>>>>>>>>>> feedback.
>>>>>>>>>>>> 
>>>>>>>>>>>> The updated design decouples the new-partition reset behavior
>>> from
>>>>>>>>>>>> the base auto.offset.reset policy:
>>>>>>>>>>>> 
>>>>>>>>>>>> - auto.offset.reset.max.age.ms now applies to all
>>> auto.offset.reset
>>>>>>>>>> values
>>>>>>>>>>>> (latest, earliest, by_duration, none).
>>>>>>>>>>>> - For new ("hot") partitions, the consumer resets to
>>>>>>>>>> auto.offset.reset.new.partitions
>>>>>>>>>>>> config setting
>>>>>>>>>>>> - For existing ("cold") partitions, the base auto.offset.reset
>>>>>> policy
>>>>>>>>>> continues
>>>>>>>>>>>> to apply unchanged.
>>>>>>>>>>>> - The new-partition reset behavior is represented by a separate
>>>>>>>>>> internal config
>>>>>>>>>>>> (auto.offset.reset.new.partitions, currently fixed to
>> earliest).
>>>>>> This
>>>>>>>>>> decoupled design makes
>>>>>>>>>>>> it straightforward to promote the behavior to a public
>>> user-facing
>>>>>>>>>> configuration in a future KIP.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月16日 清晨7:46 寫道:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> hi Jun
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I see what you mean now. The proposal from me is listed below:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1) Add auto.offset.reset.new.partitions with a default value
>> of
>>>>>>>>>> earliest. It fixes the data loss from both by_duration and
>> latest,
>>> and
>>>>>>>> it
>>>>>>>>>> does not change the logic of auto.offset.reset=earliest.
>>>>>>>>>>>>> 2) Mark auto.offset.reset.new.partitions as an internal
>>>>>>>>>> configuration. auto.offset.reset.new.partitions=earliest already
>>>>>>>>>> addresses the issue, and we can discuss the use cases of other
>>> values
>>>>>>>> in a
>>>>>>>>>> separate KIP.
>>>>>>>>>>>>> 3) Both configs, auto.offset.reset.new.partitions and
>>>>>>>>>> auto.offset.reset.latest.max.age.ms, will be applied to all for
>>>>>>>>>> consistency.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 2026/05/15 20:53:20 Jun Rao via dev wrote:
>>>>>>>>>>>>>> Hi, Chia-Ping,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. In the motivation section, the KIP says "When a Kafka
>> topic
>>> is
>>>>>>>>>> expanded
>>>>>>>>>>>>>> with new partitions, consumers using the latest auto offset
>>> reset
>>>>>>>>>> policy
>>>>>>>>>>>>>> will silently miss all records produced to those partitions
>>> before
>>>>>>>> the
>>>>>>>>>>>>>> consumer discovers them.". If a user sets
>>>>>>>>>>>>>> auto.offset.reset=by_duration=1sec, the same record loss
>> issue
>>>>>> could
>>>>>>>>>> also
>>>>>>>>>>>>>> happen, right?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. I was thinking auto.offset.reset.new.partitions will take
>>> the
>>>>>>>> same
>>>>>>>>>>>>>> values as auto.offset.reset. So a user could set it
>>> by_duration if
>>>>>>>>>> needed.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jun
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai <
>>>>>> [email protected]
>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> hi Jun
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks for the feedback. I might be missing something
>>> important
>>>>>>>> from
>>>>>>>>>> your
>>>>>>>>>>>>>>> suggestion, so please bear with me as I try to clarify with
>> a
>>> few
>>>>>>>>>> questions:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 1. Is there a strong use case for extending this logic to
>>> other
>>>>>>>> reset
>>>>>>>>>>>>>>> policies? Unlike latest, policies like earliest or
>> by_duration
>>>>>>>> don't
>>>>>>>>>> seem
>>>>>>>>>>>>>>> to suffer from the same silent data loss issue when a
>>> partition
>>>>>> is
>>>>>>>>>> expanded.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 2. What values would we expect users to configure for
>>>>>>>>>>>>>>> auto.offset.reset.new.partitions? If they set it to
>> earliest
>>> or
>>>>>>>>>> latest,
>>>>>>>>>>>>>>> we might run into the exact same edge cases. For example,
>> if a
>>>>>>>>>> consumer is
>>>>>>>>>>>>>>> offline for a while and a new partition is created during
>> that
>>>>>>>>>> downtime,
>>>>>>>>>>>>>>> the user might actually want to skip to latest when
>> resuming,
>>>>>>>> rather
>>>>>>>>>> than
>>>>>>>>>>>>>>> reading from earliest just because the partition is
>>> technically
>>>>>>>>>> "new" to
>>>>>>>>>>>>>>> the group.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> This is exactly why we opted for introducing a max.age
>>> threshold.
>>>>>>>> It
>>>>>>>>>> gives
>>>>>>>>>>>>>>> users a time-bound way to define what is genuinely "hot/new"
>>> and
>>>>>>>>>> what is
>>>>>>>>>>>>>>> just an old partition they haven't seen yet.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 2026/05/14 20:48:09 Jun Rao via dev wrote:
>>>>>>>>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I find auto.offset.reset.latest.max.age a bit weird. It
>> only
>>>>>>>>>> applies when
>>>>>>>>>>>>>>>> auto.offset.reset is latest. However, it seems that the
>>>>>> motivation
>>>>>>>>>>>>>>> equally
>>>>>>>>>>>>>>>> applies when auto.offset.reset is set to other values like
>>>>>>>>>> by_duration.
>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>> intention is that we want to have a separate way to control
>>>>>> newly
>>>>>>>>>> created
>>>>>>>>>>>>>>>> partitions vs existing partitions when the group starts.
>>> Have we
>>>>>>>>>>>>>>> considered
>>>>>>>>>>>>>>>> adding a new config like auto.offset.reset.new.partitions?
>>> If
>>>>>>>> this
>>>>>>>>>> new
>>>>>>>>>>>>>>>> config is not set, the offset reset policy defaults to the
>>>>>> policy
>>>>>>>>>> used
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> existing partitions. The user could set it explicitly to
>>>>>> customize
>>>>>>>>>> the
>>>>>>>>>>>>>>>> behavior for new partitions.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Jun
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected]>
>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I’d like to manually bump this thread.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 黃竣陽 <[email protected]> 於 2026年5月1日 晚上10:37 寫道:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for the feedback.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> DJ01/DJ02:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> MetadataResponse bumps from v13 to v14. The
>>> PartitionMetadata
>>>>>>>>>> struct
>>>>>>>>>>>>>>>>> gains a new
>>>>>>>>>>>>>>>>>> field PartitionAgeMs (int64, default -1), computed
>>> server-side
>>>>>>>> by
>>>>>>>>>> the
>>>>>>>>>>>>>>>>> broker as
>>>>>>>>>>>>>>>>>> broker_current_time - partition_creation_time.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Also add the consumer heartbeat flow. when
>>> MembershipManager
>>>>>>>>>> detects
>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> newly assigned
>>>>>>>>>>>>>>>>>> partition, it explicitly invalidates the metadata for the
>>>>>>>> affected
>>>>>>>>>>>>>>> topic
>>>>>>>>>>>>>>>>> and forces a fresh MetadataRequest
>>>>>>>>>>>>>>>>>> before making the offset reset decision, even if the
>> topic
>>> ID
>>>>>> is
>>>>>>>>>>>>>>> already
>>>>>>>>>>>>>>>>> in the cache.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> MB0:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The consumer learns the broker's maximum supported
>>>>>>>>>> MetadataResponse
>>>>>>>>>>>>>>>>> version via the
>>>>>>>>>>>>>>>>>> ApiVersions negotiation at connection time. If the
>>> negotiated
>>>>>>>>>>>>>>> version is
>>>>>>>>>>>>>>>>> unsupported, the consumer
>>>>>>>>>>>>>>>>>> knows the broker does not support PartitionAgeMs at all
>> and
>>>>>> can
>>>>>>>>>>>>>>> throw an
>>>>>>>>>>>>>>>>> UnsupportedVersionException
>>>>>>>>>>>>>>>>>> immediately, rather than silently falling back to latest
>>> and
>>>>>>>>>> risking
>>>>>>>>>>>>>>>>> data loss without any operator-visible signal.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> MB1/MB2/MB3:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I have addressed these changes in the KIP.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年4月29日 下午4:04
>>> 寫道:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> hi David
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I agree with the direction of moving the 'age'
>> resolution
>>>>>> from
>>>>>>>>>> the
>>>>>>>>>>>>>>>>> Heartbeat API to the Metadata API to keep the control
>> plane
>>>>>>>> clean.
>>>>>>>>>> The
>>>>>>>>>>>>>>> main
>>>>>>>>>>>>>>>>> trade-off, as we noted before, is introducing inter-broker
>>>>>> clock
>>>>>>>>>> skew.
>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>> Group Coordinator approach provided a single source of
>> truth
>>>>>> for
>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> However, realistically, this time skew should be
>>> negligible.
>>>>>>>>>> Given
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> the max.age threshold will likely be configured in minutes
>>> or
>>>>>>>>>> hours, a
>>>>>>>>>>>>>>>>> typical NTP skew (in milliseconds) between brokers won't
>>> impact
>>>>>>>> the
>>>>>>>>>>>>>>>>> fallback decision.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> David Jacot via dev <[email protected]> 於
>> 2026年4月29日
>>>>>>>> 下午3:29
>>>>>>>>>> 寫道:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks for the KIP!
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Sorry, I haven't really followed the previous
>>> conversation
>>>>>>>> but I
>>>>>>>>>>>>>>> took a
>>>>>>>>>>>>>>>>>>>> quick look at this one.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> DJ01: I don't clearly understand the flow with the
>>>>>>>>>>>>>>>>> ConsumerGroupHeartbeat
>>>>>>>>>>>>>>>>>>>> API after reading the KIP. There is a new boolean; the
>>> KIP
>>>>>>>>>> states
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>> partition ages are returned only when this boolean is
>>> set.
>>>>>>>>>>>>>>> Implicitly,
>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> means that when the consumer receives a new partition,
>> it
>>>>>> will
>>>>>>>>>>>>>>> issue a
>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>> HB request with the boolean set to receive the ages. Is
>>> my
>>>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>>>>>> correct? We should perhaps clarify the flow and also
>>> explain
>>>>>>>>>> how it
>>>>>>>>>>>>>>>>> fits
>>>>>>>>>>>>>>>>>>>> into the existing flow (e.g. list offsets, fetch
>> offsets,
>>>>>>>> etc.).
>>>>>>>>>>>>>>>>>>>> DJ02: It my understanding is correct, I wonder if
>>>>>>>>>>>>>>>>>>>> the ConsumerGroupHeartbeat API is the right place for
>>> this
>>>>>>>> given
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> a new
>>>>>>>>>>>>>>>>>>>> round trip is done anyway. Alternatively, it could
>> simply
>>>>>>>>>> include
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> metadata. Generally, we should be rather cautious about
>>> not
>>>>>>>>>>>>>>> overloading
>>>>>>>>>>>>>>>>>>>> the ConsumerGroupHeartbeat API with unrelated concepts.
>>> The
>>>>>>>> API
>>>>>>>>>> is
>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>> control plane API for assigning or revoking partitions.
>>> The
>>>>>>>> fact
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>> don't want to add it to the corresponding Streams API
>>> also
>>>>>>>>>> suggests
>>>>>>>>>>>>>>>>>>>> something is not quite right. What would we do if we
>>> want to
>>>>>>>>>>>>>>> support
>>>>>>>>>>>>>>>>>>>> Streams in the future?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via
>>> dev
>>>>>> <
>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Hi Jiunn,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thank you for this great kip. Good to know about the
>>> gap.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> mb-0 - why a new v2 version bump for
>>> RequestPartitionAges
>>>>>>>>>> field.
>>>>>>>>>>>>>>> Can a
>>>>>>>>>>>>>>>>>>>>> tagged field (for ex: on response, PartitionAges on
>>>>>>>>>>>>>>> TopicPartitions)
>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> used here and avoid version bump?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> mb-1 - For the new config, is there a recommended
>> value
>>> or
>>>>>> a
>>>>>>>>>>>>>>> ConfigDef
>>>>>>>>>>>>>>>>>>>>> validator? Probably it should based on the
>>>>>>>> metadata.max.age.ms
>>>>>>>>>> ?
>>>>>>>>>>>>>>>>> Sizing
>>>>>>>>>>>>>>>>>>>>> instructions can be part of javadocs I guess.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> mb-2 - (minor) As there are no changes to Kafka
>> Streams,
>>>>>>>> would
>>>>>>>>>> it
>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>>>>>>>> to add this new config
>> auto.offset.reset.latest.max.age
>>> to
>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> StreamsConfig block list
>>>>>>>>>>>>>>> (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS)
>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>> clear warning, incase users configure it? This is the
>>> most
>>>>>>>>>>>>>>> familiar
>>>>>>>>>>>>>>>>>>>>> consumer config and users might easily mistakenly
>>> configure
>>>>>>>>>> it. Or
>>>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>>>>>>>> it's not worth it to add.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> mb-3 - (minor) The phrasing "the consumer falls back
>> to
>>>>>>>>>> earliest"
>>>>>>>>>>>>>>>>> reads as
>>>>>>>>>>>>>>>>>>>>> if the config were being changed per-partition which
>>> isn't
>>>>>>>>>>>>>>> supported.
>>>>>>>>>>>>>>>>> May
>>>>>>>>>>>>>>>>>>>>> be rephrasing to something like "consumer resolves the
>>>>>>>> initial
>>>>>>>>>>>>>>>>> position to
>>>>>>>>>>>>>>>>>>>>> start offset for that partition" as if earliest was
>>> applied
>>>>>>>> to
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>> partition only and auto.offset.reset config is
>>> unchanged.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Murali
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <
>>> [email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Hi chia,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> I have updated the KIP to include this change.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年4月28日
>>> 晚上8:03
>>>>>>>> 寫道:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> hi Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> chia_0: Should we expose the partition creation time
>>> via
>>>>>>>> the
>>>>>>>>>>>>>>> Admin
>>>>>>>>>>>>>>>>> API?
>>>>>>>>>>>>>>>>>>>>>> I assume it would be valuable for users to diagnose
>> and
>>>>>>>>>>>>>>> troubleshoot
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> behavior of auto.offset.reset.latest.max.age
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on KIP-1327
>>> Prevent
>>>>>> Hot
>>>>>>>>>> Data
>>>>>>>>>>>>>>>>> Loss
>>>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>> Partition Expansion for Latest Policy
>>>>>>>>>>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>> 
>> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/x/KY4mGQ__;!!Ayb5sqE7!qF4q1QzF1RRgP61D7A2xuEai1ky7fepKDKFFvpNBuePikH-ULmT87TvuuZzy5kau5E4y5zMZAmfQQiwZomM$
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> This proposal aims to introduces
>>>>>>>>>>>>>>> auto.offset.reset.latest.max.age,
>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>> consumer config that lets the
>>>>>>>>>>>>>>>>>>>>>>>> latest reset policy distinguish newly expanded
>> (hot)
>>>>>>>>>> partitions
>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>>> long-existing (cold) ones. Partitions
>>>>>>>>>>>>>>>>>>>>>>>> younger than the configured threshold automatically
>>> fall
>>>>>>>>>> back
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> earliest, preventing silent data loss
>>>>>>>>>>>>>>>>>>>>>>>> during topic expansion without forcing a full
>>> historical
>>>>>>>>>>>>>>> reprocess.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 

Reply via email to