I am not sure if we can/should change the behavior of the existing "latest/earliest" due to backward compatibility concerns. While I agree that many users might not know the fine details of how both behave, it would still be a change that might break other people who do understand the details and rely on them.

I also agree that having both "latest" and "safe_latest" might be difficult, as users might not know which one to choose?

Maybe we should have two configs instead of one? `auto.offset.reset`, as the name suggests, resets the offset automatically, and thus its current behavior is actually well defined and sound. -- What seems to be missing is an `init.offset` config that is only used if there are _no_ committed offsets, but is not used when the consumer has a position already (either via getting a committed offset or via seek())?
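To make the split concrete, here is a minimal sketch of how the two configs could divide the work. Everything in it is an assumption for illustration: `init.offset` is only the config proposed in this message, not an existing Kafka setting, and `StartOffsetSketch`, `Strategy`, and `resolve` are hypothetical names that model the decision rather than call the Kafka client. Positions set via seek() are left out of the model.

```java
import java.util.OptionalLong;

// Hypothetical model of the two-config idea; this is not Kafka client code.
final class StartOffsetSketch {
    enum Strategy { EARLIEST, LATEST }

    // Picks the starting offset for one partition whose log spans [logStart, logEnd].
    static long resolve(OptionalLong committed, long logStart, long logEnd,
                        Strategy initOffset, Strategy autoOffsetReset) {
        if (!committed.isPresent()) {
            // No committed offset at all: only the proposed init.offset applies.
            return pick(initOffset, logStart, logEnd);
        }
        long offset = committed.getAsLong();
        if (offset < logStart || offset > logEnd) {
            // A committed offset exists but is out of range: auto.offset.reset applies.
            return pick(autoOffsetReset, logStart, logEnd);
        }
        // A valid committed offset wins over both configs.
        return offset;
    }

    private static long pick(Strategy strategy, long logStart, long logEnd) {
        return strategy == Strategy.EARLIEST ? logStart : logEnd;
    }
}
```

Under this model, a group with no committed offset starts wherever `init.offset` says, while an out-of-range committed offset is still handled by `auto.offset.reset`, so the two cases can be configured independently.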


For the original use-case you mentioned, where you want to start from "latest" when the app starts but from "earliest" if a new partition is added, it seems that the right approach would be to actually configure "earliest", and when the app is deployed for the first time, use a `seekToEnd()` to avoid triggering auto-offset-reset?
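A rough sketch of how that would play out over time, as an in-memory model only. `FirstDeploySketch` and `start` are hypothetical names, the `firstDeployment` flag is an assumption about how the application knows it is being deployed for the first time, and the "seek to end" branch stands in for a real `seekToEnd()` call on the consumer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of one consumer group; this is not Kafka client code.
final class FirstDeploySketch {
    // Committed offsets per partition (empty before the first deployment).
    private final Map<Integer, Long> committed = new HashMap<>();

    // Returns the starting position for a partition whose log spans [logStart, logEnd].
    long start(int partition, long logStart, long logEnd, boolean firstDeployment) {
        Long previous = committed.get(partition);
        long position;
        if (previous != null) {
            position = previous;   // resume from the committed offset
        } else if (firstDeployment) {
            position = logEnd;     // explicit seek to end, so no auto reset is triggered
        } else {
            position = logStart;   // auto.offset.reset=earliest covers the new partition
        }
        // The model commits immediately; a real app must commit before any restart.
        committed.put(partition, position);
        return position;
    }
}
```

In this model the first deployment skips the existing history, later restarts resume from the committed offsets, and a partition added afterwards is read from its beginning.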


Thoughts?


-Matthias


On 7/1/22 6:03 AM, hudeqi wrote:
Thanks for your attention and reply.
Having chatted with Guozhang Wang on KAFKA-12478 before, I came up with an idea
similar to yours, except that it is implemented on the server side rather than
the client side. First, find all the groups subscribed to the topic before the
partitions are expanded, then have these groups commit an initial offset of 0
for the newly added partitions (also using adminClient). Finally, the actual
partition expansion is carried out. In this way, the problem can also be
completely solved.

Best,
hudeqi

"Matthew Howlett" <m...@confluent.io.INVALID> wrote:
My first reaction also is that the proposed configuration is surely too
complicated.

It seems like an ideal solution from a usability perspective (always a good
place to start) would be if the consumer just automatically behaved in this
way. To make that work:
1. auto.offset.reset=latest would need to behave like
auto.offset.reset=earliest in the case where a consumer is in a group, and
is assigned a newly created partition. This might seem a bit too "magic",
but from the perspective of the group, I think it makes conceptual sense
and people wouldn't find it surprising. Also, I don't think anyone would be
relying on the old behavior.
2. The group would need to detect the newly created partitions and
rebalance pretty quickly (this is not the case currently). The longer the
delay, the more tenuous the idea of changing the auto.offset.reset behavior
in this special circumstance.

I have a feeling this approach has implementation challenges (haven't
thought deeply), just throwing it out there.


On Wed, Jun 29, 2022 at 4:57 AM David Jacot <da...@apache.org> wrote:

Thanks for the KIP.

I read it and I am also worried by the complexity of the new
configurations. They are not easy to grasp. I need to digest it a bit more,
I think.

Best,
David

On Wed, Jun 29, 2022 at 02:25, Matthias J. Sax <mj...@apache.org> wrote:

Thanks for the KIP.

I don't think I fully digested the proposal yet, but my first reaction
is: this is quite complicated. Frankly, I am worried about complexity
and usability.

Especially the option `safe_latest` is a "weird" one IMHO, and `nearest`
is even more complex.

The problem at hand (as I understand it from the Jira) is a real one,
but I am wondering if it would be something that should be addressed by
the application? If you pass in strategy `none`, and a new partition is
added, you can react to it by custom code. For regular startup you can
still go with "latest" to avoid reprocessing the history.

Adding "latest/earliest_on_start" seems useful, as it seems to also
address https://issues.apache.org/jira/browse/KAFKA-3370


-Matthias


On 6/7/22 12:55 AM, hudeqi wrote:
I think so too. What about Guozhang Wang and Luke Chen? Can I initiate a
voting process?

Best,
hudeqi

> -----Original Message-----
> From: "邓子明" <dengzim...@growingio.com>
> Sent: 2022-06-07 10:23:37 (Tuesday)
> To: dev@kafka.apache.org
> Cc:
> Subject: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms
>

