Thanks for the KIP. Overall LGTM.

I think you can start a VOTE.


-Matthias

On 2/22/23 5:56 PM, Fq Public wrote:
Just wanted to bump this thread for visbility.
Thanks to everyone who has participated in the discussion so far.

Thanks,
Farooq

On 2023/02/14 19:32:53 Guozhang Wang wrote:
Thanks Farooq, that looks good to me.

Guozhang

On Sun, Feb 12, 2023 at 9:01 AM Dharin Shah <dh...@gmail.com> wrote:

Hello Farooq,

This is actually a great idea, we have dealt with this by using an array
instead of a set.
+1 to this :)

Thank you,
Dharin

On Sun, Feb 12, 2023 at 8:11 PM Fq Public <fq...@gmail.com> wrote:

Hi Guozhang,

Thanks for reading over my proposal!

Regarding the format, I'm just thinking if we can change the type of
"INT newDataLength" to UINT32?

Good idea, I've updated the KIP to reflect UINT32 since it makes clear the
value can never be less than zero.

`.equals` default implementation on Object is by reference, so if the
groupBy did not generate a new object, that may still pass. This means that
even if user does not implement the `.equals` function, if the same object
is returned then this feature would still be triggered, is that correct?

Correct, I've updated the KIP to call out this edge-case clearly as
follows:

Since the default `.equals` implementation for an `Object`  is by
reference, if a user's `groupBy` returns the same reference for the key,
then the oldKey and the newKey will naturally `.equals`  each other. This
will result in a single event being sent to the repartition topic. This
change in behaviour should be considered a "bug-fix" rather than a
"breaking change" as the semantics of the operation remain unchanged, the
only thing that changes for users is they no longer see transient
"inconsistent" states.  In the worst case, users in this situation will
need to update any strict tests that check specifically for the presence of
transient "inconsistent" states.

What do you think?

Thanks,
Farooq

On 2023/02/07 18:02:24 Guozhang Wang wrote:
Hello Farooq,

Thanks for the very detailed proposal! I think this is a great idea.
Just a few thoughts:

1. I regret that we over-optimized the Changed serde format for
footprint while making it less extensible. It seems to me that a two
rolling bounce migration is unavoidable.. Regarding the format, I'm
just thinking if we can change the type of "INT newDataLength" to
UINT32?

2. `.equals` default implementation on Object is by reference, so if
the groupBy did not generate a new object, that may still pass. This
means that even if user does not implement the `.equals` function, if
the same object is returned then this feature would still be
triggered, is that correct?


Guozhang

On Mon, Feb 6, 2023 at 5:05 AM Fq Public <fq...@gmail.com> wrote:

Hi everyone,

I'd like to share a new KIP for discussion:
https://cwiki.apache.org/confluence/x/P5VbDg

This could be considered mostly as a "bug fix" but we wanted to raise
a KIP
for discussion because it involves changes to the serialization format
of
an internal topic which raises backward compatibility considerations.

Please take a look and let me know what you think.

Thanks,
Farooq

Reply via email to