Hey Huangxiang, The section of `Rejected Alternatives` may also need an update.
Current plan sounds like a reasonable one. I am fine with it. Thanks for driving this. Best Yuan On Tue, Oct 25, 2022 at 5:11 PM Hangxiang Yu <master...@gmail.com> wrote: > (Resend the mail to fix the format issue) > Hi, everyone. > > Thanks for your suggestions! > > Let me summarize the remaining questions in the thread and share my ideas > based on your suggestions: > > > 1. Should we put the new opposite interface in TypeSerializer or > TypeSerializerSnapshot ? > > Just as I replied to Dawid, I'd like to put it in > TypeSerializerSnapshot so that we could still follow the contract between > two classes and make later code migration easier based on current tools. > > Thanks Dawid for the initial suggestion and Godfrey for the additional > supplement. > > > > 2. Do we just break changes and make user codes incompatible or make sure > compatibility using a more suitable migration plan ? > > I agree with Yuan that we should make sure that user jobs still work > without modifying any codes before removing the deprecated method. > > Thanks Yuan for the migration plan. Let me try to add something to the > suggestion of Yuan: > > > a. In Step 1, I prefer to make the new interface like: > > > default TypeSerializerSchemaCompatibility<T> > resolveSchemaCompatibility( > // Use 'oldSnapshot' not 'oldSerializer' > TypeSerializerSnapshot<T> oldSnapshot) { > return INCOMPATIBLE; > } > } > > I think using 'oldSnapshot' as the parameter will make the > logic clear --- TypeSerializerSnapshot will take all responsibility for > compatibility checks. > BTW, It's also easy to migrate original check logic to this > interface. > > b. In Step 1, In addition to introducing default implementations > for both interfaces, we also need to implement the new interface in all > inner TypeSerializerSnapshots. > Users may implement their own serializers based on inner > serializers, we should make sure that the new interface of inner > TypeSerializerSnapshots is usable. > > Then I think it could work for both old custom serializers or new > custom serializers. > No matter which interface the user implements, it could always work. > Of course, we will deprecate the old interface and encourage users to > use the new one. > > > 3. Do we need to squash this with > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w ? > We will not break the compatibility based on 2, so it's not necessary > to squash them together. > > > Do you have any other suggestions ? Look forward to your reply! > > On Tue, Oct 25, 2022 at 1:31 PM Hangxiang Yu <master...@gmail.com> wrote: > > > Hi, everyone. > > > > Thanks for your suggestions! > > > > Let me summarize the remaining questions in the thread and share my ideas > > based on your suggestions: > > > > 1. Should we put the new opposite interface in TypeSerializer or > > TypeSerializerSnapshot ? > > > > Just as I replied to Dawid, I'd like to put it in TypeSerializerSnapshot > > so that we could still follow the contract between two classes and make > > later code migration easier based on current tools. > > > > Thanks Dawid for the initial suggestion and Godfrey for the additional > > supplement. > > > > > > > > 1. Do we just break changes and make user codes incompatible or make > > sure compatibility using a more suitable migration plan ? > > > > I agree with Yuan that we should make sure that user jobs still work > > without modifying any codes before removing the deprecated method. > > > > Thanks Yuan for the migration plan. Let me try to add something to the > > suggestion of Yuan: > > > > 1. In Step 1, I prefer to make the new interface like: > > > > default TypeSerializerSchemaCompatibility<T> resolveSchemaCompatibility( > > > > // Use 'oldSnapshot' not 'oldSerializer' > > TypeSerializerSnapshot<T> oldSnapshot) { > > > > return INCOMPATIBLE; > > > > } > > > > I think using 'oldSnapshot' as the parameter will make the logic clear > --- > > TypeSerializerSnapshot will take all responsibility for compatibility > > checks. > > > > BTW, It's also easy to migrate original check logic to this interface. > > > > 1. In Step 1, In addition to introducing default implementations for > > both interfaces, we also need to implement the new interface in > all inner > > TypeSerializerSnapshots. > > > > Users may implement their own serializers based on inner serializers, we > > should make sure that the new interface of inner TypeSerializerSnapshots > is > > usable. > > > > > > Then I think it could work for both old custom serializers or new custom > > serializers. > > > > No matter which interface the user implements, it could always work. > > > > Of course, we will deprecate the old interface and encourage users to use > > the new one. > > > > > > > > 1. Do we need to squash this with > > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w ? > > > > We will not break the compatibility based on 2, so it's not necessary to > > squash them together. > > > > > > Do you have any other suggestions ? Look forward to your reply! > > > > On Mon, Oct 24, 2022 at 8:30 PM Yuan Mei <yuanmei.w...@gmail.com> wrote: > > > >> Hey Hangxiang, > >> > >> Thanks for driving this issue. I've read through all the discussions and > >> suggestions in this thread, and here is my take: > >> > >> 1. I agree that the compatibility check should be done in the opposite > >> direction. > >> The current interface *causes some real issues* for users using > >> their own `TypeSerializer`, and we should help resolve this issue. > >> > >> 2. The new opposite interface would better still stay within the class > >> `TypeSerializerSnapshot`. > >> I am suggesting this mostly considering the designed contract > between > >> `TypeSerializerSnapshot` and `TypeSerializer`. I do not think it is > >> necessary to break the current contract because of this change. > >> > >> 3. It seems to me that "breaking changes" are unavoidable in each of the > >> strategies proposed. However, *I'd be very concerned and worried *to > >> make a breaking change *in the very first step*. > >> A more user-friendly way is to make sure of compatibility and mark > >> the method to drop as `@Deprecated`. After several versions of adoption, > >> carefully and publicly discussing with users to drop the method. By > >> compatibilities, I mean the following things: > >> 1). upgrading from a lower Flink version to the new version does not > >> need the user to do extra work for the change. > >> 2). In the new version, both the old and the new method should be > >> supported, and of course, we would encourage users to use the new > method. > >> For example, dropping "TypeSerializerConfigSnapshot" started to take > >> into action after almost 10 releases the time it was marked deprecated. > >> > >> Due to the above reason, I am hesitant to squash this FLIP with > >> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > >> > >> 4. What do I suggest doing? > >> > >> *STEP 1: In `TypeSerializerSnapshot`, we do something like this, and > >> keep everything else exactly the same* > >> > >> @Deprecated > >> default TypeSerializerSchemaCompatibility<T> > >> resolveSchemaCompatibility( > >> TypeSerializer<T> newSerializer) { > >> // use new method > >> return newSerializer.snapshotConfiguration(). > >> > >> resolveSerializerSchemaCompatibility(this.restoreSerializer()); > >> > >> } > >> > >> default TypeSerializerSchemaCompatibility<T> > >> resolveSerializerSchemaCompatibility( > >> TypeSerializer<T> oldSerializer) { > >> return INCOMPATIBLE; > >> } > >> > >> *STEP 2: (This is a breaking change), after several releases* > >> 1. Remove the deprecated method > >> 2. Substitute every usage with the new method > >> 3. Make the new method not have a default implementation > >> > >> STEP1 is compatible, and STEP2 is a breaking change. > >> > >> Best > >> Yuan > >> > >> > >> > >> On Fri, Oct 21, 2022 at 10:01 AM godfrey he <godfre...@gmail.com> > wrote: > >> > >>> Hi Hangxiang, Dawid, > >>> > >>> I also prefer to add method into TypeSerializerSnapshot, which looks > >>> more natural. TypeSerializerSnapshot has `Version` concept, which also > >>> can be used for compatibility. > >>> ` > >>> TypeSerializerSnapshot { > >>> > >>> TypeSerializerSchemaCompatibility<T> > resolveSchemaCompatibility( > >>> TypeSerializerSnapshot<T> oldSnapshot); > >>> > >>> } > >>> ` > >>> > >>> Best, > >>> Godfrey > >>> > >>> Dawid Wysakowicz <dwysakow...@apache.org> 于2022年10月20日周四 16:59写道: > >>> > > >>> > That's the final status we will arrive at. > >>> > > >>> > IIUC, we cannot just remove the original method but just mark it as > >>> deprecated so two methods have to exist together in the short term. > >>> > > >>> > Users may also need to migrate their own external serializers which > is > >>> a long run. > >>> > > >>> > I'd like to make jobs where users could always work without modifying > >>> any codes before removing the deprecated method so I provide a default > >>> implementation in the proposal. > >>> > > >>> > > >>> > I understand it works for old serializers, but it makes writing new > >>> serializers more complicated. It's not as simple as extending an > interface > >>> and implementing all the methods it tells you to. You have to dig into > >>> documentation and check that you must implement one of the methods > (either > >>> the old or the new one). Otherwise you end up in an ifinite loop. > >>> > > >>> > > >>> > If we break the compatibility, yes we cause some annoyance, but I > feel > >>> it's a safer option. BTW, there is a proposal around serializer that > also > >>> causes some incompatibilities so we could squash this together: > >>> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > >>> > > >>> > > >>> > I don't want to force that, but would be really happy to hear what > >>> others think. > >>> > > >>> > > >>> > Best, > >>> > > >>> > Dawid > >>> > > >>> > On 19/10/2022 04:05, Hangxiang Yu wrote: > >>> > > >>> > Hi, Dawid. > >>> > > >>> > > >>> > Thanks for your reply. > >>> > > >>> > Should we introduce the new method to the TypeSerializerSnapshot > >>> instead? > >>> > > >>> > You provided a valuable option, let me try to list the benefit and > >>> cost. > >>> > > >>> > benefit: > >>> > > >>> > TypeSerializerSnapshot still owns the responsibility of resolving > >>> schema compatibility so TypeSerializer could just pay attention to its > >>> serialization and deserialization as before. > >>> > It's very convenient to implement it based on current implementation > >>> by all information in TypeSerializerSnapshot and tools which is also > >>> helpful for users to migrate their external serializer. > >>> > > >>> > cost: > >>> > > >>> > The interface is a bit strange because we have to construct and use > >>> the snapshot of the new serializer to check the compatibility. > >>> > > >>> > I appreciate this because users could migrate more conveniently. > >>> > > >>> > For its cost, We could see TypeSerializerSnapshot as a point-in-time > >>> view of TypeSerializer used for forward compatibility checks and > persisting > >>> serializer within checkpoints. It seems reasonable to generate a > >>> point-in-time view of the serializer when we need to resolve > compatibility. > >>> > > >>> > > >>> > I'd actually be fine with breaking some external serializers and not > >>> provide a default implementation of the new method. > >>> > > >>> > That's the final status we will arrive at. > >>> > > >>> > IIUC, we cannot just remove the original method but just mark it as > >>> deprecated so two methods have to exist together in the short term. > >>> > > >>> > Users may also need to migrate their own external serializers which > is > >>> a long run. > >>> > > >>> > I'd like to make jobs where users could always work without modifying > >>> any codes before removing the deprecated method so I provide a default > >>> implementation in the proposal. > >>> > > >>> > > >>> > On Mon, Oct 17, 2022 at 10:10 PM Dawid Wysakowicz < > >>> dwysakow...@apache.org> wrote: > >>> >> > >>> >> Hi Han, > >>> >> > >>> >> I think in principle your proposal makes sense and the compatibility > >>> >> check indeed should be done in the opposite direction. > >>> >> > >>> >> However, I have two suggestions: > >>> >> > >>> >> 1. Should we introduce the new method to the TypeSerializerSnapshot > >>> >> instead? E.g. > >>> >> > >>> >> TypeSerializerSnapshot { > >>> >> > >>> >> TypeSerializerSchemaCompatibility<T> > >>> resolveSchemaCompatibility( > >>> >> TypeSerializerSnapshot<T> oldSnapshot); > >>> >> > >>> >> } > >>> >> > >>> >> I know it has the downside that we'd need to create a snapshot for > the > >>> >> new serializer during a restore, but the there is a lot of tooling > and > >>> >> subclasses in the *Snapshot stack that it won't be possible to > migrate > >>> >> to TypeSerializer stack. I have classes such as > >>> >> CompositeTypeSerializerSnapshot and alike in mind. > >>> >> > >>> >> 2. I'd actually be fine with breaking some external serializers and > >>> not > >>> >> provide a default implementation of the new method. If we don't do > >>> that > >>> >> we will have two methods with default implementation which call each > >>> >> other. It makes it also a bit harder to figure out which methods are > >>> >> mandatory to be implemented going forward. > >>> >> > >>> >> What do you think? > >>> >> > >>> >> Best, > >>> >> > >>> >> Dawid > >>> >> > >>> >> On 17/10/2022 04:43, Hangxiang Yu wrote: > >>> >> > Hi, Han. > >>> >> > > >>> >> > Both the old method and the new method can get previous and new > >>> inner > >>> >> > information. > >>> >> > > >>> >> > The new serializer will decide it just like the old serializer did > >>> before. > >>> >> > > >>> >> > The method just specify the schema compatibility result so that > >>> other > >>> >> > behaviours is same as before. > >>> >> > > >>> >> > On Fri, Oct 14, 2022 at 11:40 AM Han Yin <alexyin...@gmail.com> > >>> wrote: > >>> >> > > >>> >> >> Hi Hangxiang, > >>> >> >> > >>> >> >> Thanks for the proposal. It seems more reasonable to let the new > >>> >> >> serializer claim the compatibility in the cases you mentioned. > >>> >> >> > >>> >> >> I have but one question here. What happens in the case of > >>> >> >> “compatibleAfterMigration” after we completely reverse the > >>> direction (in > >>> >> >> step 3)? To be specific, migration from an old schema calls for > >>> the > >>> >> >> previous serializer to read bytes into state objects. How should > a > >>> new > >>> >> >> serializer decide whether the migration is possible? > >>> >> >> > >>> >> >> Best, > >>> >> >> Han > >>> >> >> > >>> >> >> On 2022/10/12 12:41:07 Hangxiang Yu wrote: > >>> >> >>> Dear Flink developers, > >>> >> >>> > >>> >> >>> I would like to start a discussion thread on FLIP-263[1] > >>> proposing to > >>> >> >>> improve the usability of resolving schema compatibility. > >>> >> >>> > >>> >> >>> Currently, the place for compatibility checks is > >>> >> >>> TypeSerializerSnapshot#resolveSchemaCompatibility > >>> >> >>> which belongs to the old serializer, There are no ways for users > >>> to > >>> >> >> specify the > >>> >> >>> compatibility with the old serializer in the new customized > >>> serializer. > >>> >> >>> > >>> >> >>> The FLIP hopes to reverse the direction of resolving schema > >>> compatibility > >>> >> >>> to improve the usability of resolving schema compatibility. > >>> >> >>> > >>> >> >>> [1] > >>> >> >>> > >>> >> >> > >>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility > >>> >> > > >>> >> > > >>> > > >>> > > >>> > > >>> > -- > >>> > Best, > >>> > Hangxiang. > >>> > >> > > > > -- > > Best, > > Hangxiang. > > > > > -- > Best, > Hangxiang. >