Hey Huangxiang,

The section of `Rejected Alternatives` may also need an update.

Current plan sounds like a reasonable one. I am fine with it.

Thanks for driving this.

Best
Yuan

On Tue, Oct 25, 2022 at 5:11 PM Hangxiang Yu <master...@gmail.com> wrote:

> (Resend the mail to fix the format issue)
> Hi, everyone.
>
> Thanks for your suggestions!
>
> Let me summarize the remaining questions in the thread and share my ideas
> based on your suggestions:
>
>
> 1. Should we put the new opposite interface in TypeSerializer or
> TypeSerializerSnapshot ?
>
>     Just as I replied to Dawid, I'd like to put it in
> TypeSerializerSnapshot so that we could still follow the contract between
> two classes and make later code migration easier based on current tools.
>
>     Thanks Dawid for the initial suggestion and Godfrey for the additional
> supplement.
>
>
>
> 2. Do we just break changes and make user codes incompatible or make sure
> compatibility using a more suitable migration plan ?
>
>     I agree with Yuan that we should make sure that user jobs still work
> without modifying any codes before removing the deprecated method.
>
>     Thanks Yuan for the migration plan. Let me try to add something to the
> suggestion of Yuan:
>
>
>           a. In Step 1, I prefer to make the new interface like:
>
>
>               default TypeSerializerSchemaCompatibility<T>
> resolveSchemaCompatibility(
>                       // Use 'oldSnapshot' not 'oldSerializer'
>                      TypeSerializerSnapshot<T> oldSnapshot) {
>                      return INCOMPATIBLE;
>                    }
>               }
>
>               I think using 'oldSnapshot' as the parameter will make the
> logic clear --- TypeSerializerSnapshot will take all responsibility for
> compatibility checks.
>               BTW, It's also easy to migrate original check logic to this
> interface.
>
>            b. In Step 1, In addition to introducing default implementations
> for both interfaces, we also need to implement the new interface in all
> inner TypeSerializerSnapshots.
>                Users may implement their own serializers based on inner
> serializers, we should make sure that the new interface of inner
> TypeSerializerSnapshots is usable.
>
>      Then I think it could work for both old custom serializers or new
> custom serializers.
>      No matter which interface the user implements, it could always work.
>      Of course, we will deprecate the old interface and encourage users to
> use the new one.
>
>
>  3. Do we need to squash this with
> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w ?
>      We will not break the compatibility based on 2, so it's not necessary
> to squash them together.
>
>
> Do you have any other suggestions ? Look forward to your reply!
>
> On Tue, Oct 25, 2022 at 1:31 PM Hangxiang Yu <master...@gmail.com> wrote:
>
> > Hi, everyone.
> >
> > Thanks for your suggestions!
> >
> > Let me summarize the remaining questions in the thread and share my ideas
> > based on your suggestions:
> >
> >    1. Should we put the new opposite interface in TypeSerializer or
> >    TypeSerializerSnapshot ?
> >
> > Just as I replied to Dawid, I'd like to put it in TypeSerializerSnapshot
> > so that we could still follow the contract between two classes and make
> > later code migration easier based on current tools.
> >
> > Thanks Dawid for the initial suggestion and Godfrey for the additional
> > supplement.
> >
> >
> >
> >    1. Do we just break changes and make user codes incompatible or make
> >    sure compatibility using a more suitable migration plan ?
> >
> > I agree with Yuan that we should make sure that user jobs still work
> > without modifying any codes before removing the deprecated method.
> >
> > Thanks Yuan for the migration plan. Let me try to add something to the
> > suggestion of Yuan:
> >
> >    1. In Step 1, I prefer to make the new interface like:
> >
> > default TypeSerializerSchemaCompatibility<T> resolveSchemaCompatibility(
> >
> >     // Use 'oldSnapshot' not 'oldSerializer'
> >     TypeSerializerSnapshot<T> oldSnapshot) {
> >
> >     return INCOMPATIBLE;
> >
> > }
> >
> > I think using 'oldSnapshot' as the parameter will make the logic clear
> ---
> > TypeSerializerSnapshot will take all responsibility for compatibility
> > checks.
> >
> > BTW, It's also easy to migrate original check logic to this interface.
> >
> >    1. In Step 1, In addition to introducing default implementations for
> >       both interfaces, we also need to implement the new interface in
> all inner
> >       TypeSerializerSnapshots.
> >
> > Users may implement their own serializers based on inner serializers, we
> > should make sure that the new interface of inner TypeSerializerSnapshots
> is
> > usable.
> >
> >
> > Then I think it could work for both old custom serializers or new custom
> > serializers.
> >
> > No matter which interface the user implements, it could always work.
> >
> > Of course, we will deprecate the old interface and encourage users to use
> > the new one.
> >
> >
> >
> >    1. Do we need to squash this with
> >    https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w ?
> >
> > We will not break the compatibility based on 2, so it's not necessary to
> > squash them together.
> >
> >
> > Do you have any other suggestions ? Look forward to your reply!
> >
> > On Mon, Oct 24, 2022 at 8:30 PM Yuan Mei <yuanmei.w...@gmail.com> wrote:
> >
> >> Hey Hangxiang,
> >>
> >> Thanks for driving this issue. I've read through all the discussions and
> >> suggestions in this thread, and here is my take:
> >>
> >> 1. I agree that the compatibility check should be done in the opposite
> >> direction.
> >>     The current interface *causes some real issues* for users using
> >> their own `TypeSerializer`, and we should help resolve this issue.
> >>
> >> 2. The new opposite interface would better still stay within the class
> >> `TypeSerializerSnapshot`.
> >>     I am suggesting this mostly considering the designed contract
> between
> >> `TypeSerializerSnapshot` and `TypeSerializer`. I do not think it is
> >> necessary to break the current contract because of this change.
> >>
> >> 3. It seems to me that "breaking changes" are unavoidable in each of the
> >> strategies proposed. However, *I'd be very concerned and worried *to
> >> make a breaking change *in the very first step*.
> >>     A more user-friendly way is to make sure of compatibility and mark
> >> the method to drop as `@Deprecated`. After several versions of adoption,
> >> carefully and publicly discussing with users to drop the method. By
> >> compatibilities, I mean the following things:
> >>     1). upgrading from a lower Flink version to the new version does not
> >> need the user to do extra work for the change.
> >>     2). In the new version, both the old and the new method should be
> >> supported, and of course, we would encourage users to use the new
> method.
> >>    For example, dropping "TypeSerializerConfigSnapshot" started to take
> >> into action after almost 10 releases the time it was marked deprecated.
> >>
> >>    Due to the above reason, I am hesitant to squash this FLIP with
> >> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w
> >>
> >> 4. What do I suggest doing?
> >>
> >>    *STEP 1: In `TypeSerializerSnapshot`, we do something like this, and
> >> keep everything else exactly the same*
> >>
> >>     @Deprecated
> >>     default TypeSerializerSchemaCompatibility<T>
> >> resolveSchemaCompatibility(
> >>             TypeSerializer<T> newSerializer) {
> >>         // use new method
> >>         return newSerializer.snapshotConfiguration().
> >>
> >> resolveSerializerSchemaCompatibility(this.restoreSerializer());
> >>
> >>     }
> >>
> >>     default TypeSerializerSchemaCompatibility<T>
> >> resolveSerializerSchemaCompatibility(
> >>             TypeSerializer<T> oldSerializer) {
> >>         return INCOMPATIBLE;
> >>     }
> >>
> >>   *STEP 2: (This is a breaking change), after several releases*
> >>   1. Remove the deprecated method
> >>   2. Substitute every usage with the new method
> >>   3. Make the new method not have a default implementation
> >>
> >> STEP1 is compatible, and STEP2 is a breaking change.
> >>
> >> Best
> >> Yuan
> >>
> >>
> >>
> >> On Fri, Oct 21, 2022 at 10:01 AM godfrey he <godfre...@gmail.com>
> wrote:
> >>
> >>> Hi Hangxiang, Dawid,
> >>>
> >>> I also prefer to add method into TypeSerializerSnapshot, which looks
> >>> more natural. TypeSerializerSnapshot has `Version` concept, which also
> >>> can be used for compatibility.
> >>> `
> >>> TypeSerializerSnapshot {
> >>>
> >>>         TypeSerializerSchemaCompatibility<T>
> resolveSchemaCompatibility(
> >>>              TypeSerializerSnapshot<T> oldSnapshot);
> >>>
> >>>     }
> >>> `
> >>>
> >>> Best,
> >>> Godfrey
> >>>
> >>> Dawid Wysakowicz <dwysakow...@apache.org> 于2022年10月20日周四 16:59写道:
> >>> >
> >>> > That's the final status we will arrive at.
> >>> >
> >>> > IIUC, we cannot just remove the original method but just mark it as
> >>> deprecated so two methods have to exist together in the short term.
> >>> >
> >>> > Users may also need to migrate their own external serializers which
> is
> >>> a long run.
> >>> >
> >>> > I'd like to make jobs where users could always work without modifying
> >>> any codes before removing the deprecated method so I provide a default
> >>> implementation in the proposal.
> >>> >
> >>> >
> >>> > I understand it works for old serializers, but it makes writing new
> >>> serializers more complicated. It's not as simple as extending an
> interface
> >>> and implementing all the methods it tells you to. You have to dig into
> >>> documentation and check that you must implement one of the methods
> (either
> >>> the old or the new one). Otherwise you end up in an ifinite loop.
> >>> >
> >>> >
> >>> > If we break the compatibility, yes we cause some annoyance, but I
> feel
> >>> it's a safer option. BTW, there is a proposal around serializer that
> also
> >>> causes some incompatibilities so we could squash this together:
> >>> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w
> >>> >
> >>> >
> >>> > I don't want to force that, but would be really happy to hear what
> >>> others think.
> >>> >
> >>> >
> >>> > Best,
> >>> >
> >>> > Dawid
> >>> >
> >>> > On 19/10/2022 04:05, Hangxiang Yu wrote:
> >>> >
> >>> > Hi, Dawid.
> >>> >
> >>> >
> >>> > Thanks for your reply.
> >>> >
> >>> > Should we introduce the new method to the TypeSerializerSnapshot
> >>> instead?
> >>> >
> >>> > You provided a valuable option, let me try to list the benefit and
> >>> cost.
> >>> >
> >>> > benefit:
> >>> >
> >>> > TypeSerializerSnapshot still owns the responsibility of resolving
> >>> schema compatibility so TypeSerializer could just pay attention to its
> >>> serialization and deserialization as before.
> >>> > It's very convenient to implement it based on current implementation
> >>> by all information in TypeSerializerSnapshot and tools which is also
> >>> helpful for users to migrate their external serializer.
> >>> >
> >>> > cost:
> >>> >
> >>> > The interface is a bit strange because we have to construct and use
> >>> the snapshot of the new serializer to check the compatibility.
> >>> >
> >>> > I appreciate this because users could migrate more conveniently.
> >>> >
> >>> > For its cost, We could see TypeSerializerSnapshot as a point-in-time
> >>> view of TypeSerializer used for forward compatibility checks and
> persisting
> >>> serializer within checkpoints. It seems reasonable to generate a
> >>> point-in-time view of the serializer when we need to resolve
> compatibility.
> >>> >
> >>> >
> >>> > I'd actually be fine with breaking some external serializers and not
> >>> provide a default implementation of the new method.
> >>> >
> >>> > That's the final status we will arrive at.
> >>> >
> >>> > IIUC, we cannot just remove the original method but just mark it as
> >>> deprecated so two methods have to exist together in the short term.
> >>> >
> >>> > Users may also need to migrate their own external serializers which
> is
> >>> a long run.
> >>> >
> >>> > I'd like to make jobs where users could always work without modifying
> >>> any codes before removing the deprecated method so I provide a default
> >>> implementation in the proposal.
> >>> >
> >>> >
> >>> > On Mon, Oct 17, 2022 at 10:10 PM Dawid Wysakowicz <
> >>> dwysakow...@apache.org> wrote:
> >>> >>
> >>> >> Hi Han,
> >>> >>
> >>> >> I think in principle your proposal makes sense and the compatibility
> >>> >> check indeed should be done in the opposite direction.
> >>> >>
> >>> >> However, I have two suggestions:
> >>> >>
> >>> >> 1. Should we introduce the new method to the TypeSerializerSnapshot
> >>> >> instead? E.g.
> >>> >>
> >>> >>     TypeSerializerSnapshot {
> >>> >>
> >>> >>         TypeSerializerSchemaCompatibility<T>
> >>> resolveSchemaCompatibility(
> >>> >>              TypeSerializerSnapshot<T> oldSnapshot);
> >>> >>
> >>> >>     }
> >>> >>
> >>> >> I know it has the downside that we'd need to create a snapshot for
> the
> >>> >> new serializer during a restore, but the there is a lot of tooling
> and
> >>> >> subclasses in the *Snapshot stack that it won't be possible to
> migrate
> >>> >> to TypeSerializer stack. I have classes such as
> >>> >> CompositeTypeSerializerSnapshot and alike in mind.
> >>> >>
> >>> >> 2. I'd actually be fine with breaking some external serializers and
> >>> not
> >>> >> provide a default implementation of the new method. If we don't do
> >>> that
> >>> >> we will have two methods with default implementation which call each
> >>> >> other. It makes it also a bit harder to figure out which methods are
> >>> >> mandatory to be implemented going forward.
> >>> >>
> >>> >> What do you think?
> >>> >>
> >>> >> Best,
> >>> >>
> >>> >> Dawid
> >>> >>
> >>> >> On 17/10/2022 04:43, Hangxiang Yu wrote:
> >>> >> > Hi, Han.
> >>> >> >
> >>> >> > Both the old method and the new method can get previous and new
> >>> inner
> >>> >> > information.
> >>> >> >
> >>> >> > The new serializer will decide it just like the old serializer did
> >>> before.
> >>> >> >
> >>> >> > The method just specify the schema compatibility result so that
> >>> other
> >>> >> > behaviours is same as before.
> >>> >> >
> >>> >> > On Fri, Oct 14, 2022 at 11:40 AM Han Yin <alexyin...@gmail.com>
> >>> wrote:
> >>> >> >
> >>> >> >> Hi Hangxiang,
> >>> >> >>
> >>> >> >> Thanks for the proposal. It seems more reasonable to let the new
> >>> >> >> serializer claim the compatibility in the cases you mentioned.
> >>> >> >>
> >>> >> >> I have but one question here. What happens in the case of
> >>> >> >> “compatibleAfterMigration” after we completely reverse the
> >>> direction (in
> >>> >> >> step 3)?  To be specific, migration from an old schema calls for
> >>> the
> >>> >> >> previous serializer to read bytes into state objects. How should
> a
> >>> new
> >>> >> >> serializer decide whether the migration is possible?
> >>> >> >>
> >>> >> >> Best,
> >>> >> >> Han
> >>> >> >>
> >>> >> >> On 2022/10/12 12:41:07 Hangxiang Yu wrote:
> >>> >> >>> Dear Flink developers,
> >>> >> >>>
> >>> >> >>> I would like to start a discussion thread on FLIP-263[1]
> >>> proposing to
> >>> >> >>> improve the usability of resolving schema compatibility.
> >>> >> >>>
> >>> >> >>> Currently, the place for compatibility checks is
> >>> >> >>> TypeSerializerSnapshot#resolveSchemaCompatibility
> >>> >> >>> which belongs to the old serializer, There are no ways for users
> >>> to
> >>> >> >> specify the
> >>> >> >>> compatibility with the old serializer in the new customized
> >>> serializer.
> >>> >> >>>
> >>> >> >>> The FLIP hopes to reverse the direction of resolving schema
> >>> compatibility
> >>> >> >>> to improve the usability of resolving schema compatibility.
> >>> >> >>>
> >>> >> >>> [1]
> >>> >> >>>
> >>> >> >>
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility
> >>> >> >
> >>> >> >
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Best,
> >>> > Hangxiang.
> >>>
> >>
> >
> > --
> > Best,
> > Hangxiang.
> >
>
>
> --
> Best,
> Hangxiang.
>

Reply via email to