Hi Timo, thanks for updating the doc. I have a comment on plan migration: I think we may need to add a version field for every exec node when serialising. In earlier discussions, I think we concluded that the version of the plan could be treated as the version of each node, but in that case it would break. Take the following example from the FLIP into consideration; there is a bad case: if in 1.17 we introduce an incompatible version 3 and drop version 1, we can only update the version to 2, so the version should be per exec node.
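To make the suggestion concrete, a per-exec-node version could be serialized roughly like this (a sketch only; the field names are illustrative and not taken from the FLIP):

```json
{
  "flinkVersion": "1.17",
  "nodes": [
    { "id": 1, "type": "stream-exec-calc", "version": 1 },
    { "id": 2, "type": "stream-exec-sink", "version": 2 }
  ]
}
```

With a version on each node, dropping version 1 of a single node forces a migration only for plans that actually contain that node version, instead of requiring a bump of a global plan-wide version.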
ExecNode version *1* is not supported anymore, even though the state is actually compatible. The plan restore will fail with a helpful exception that forces users to perform plan migration: COMPILE PLAN '/mydir/plan_new.json' FROM '/mydir/plan_old.json'; The plan migration will safely replace the old version *1* with *2*. The JSON plan flinkVersion changes to 1.17. Best, Wenlong On Thu, 9 Dec 2021 at 18:36, Timo Walther <twal...@apache.org> wrote: > Hi Jing and Godfrey, > > I had another iteration over the document. There are two major changes: > > 1. Supported Flink Upgrade Versions > > I got the feedback via various channels that a step size of one minor > version is not very convenient. As you said, "because upgrading to a new > version is a time-consuming process". I rephrased this section: > > Upgrading usually involves work, which is why many users perform this > task rarely (e.g. only once per year). Also, skipping versions is > common until a new feature has been introduced for which it is worth > upgrading. We will support the upgrade to the most recent Flink version > from a set of previous versions. We aim to support upgrades from the > last 2-3 releases on a best-effort basis; maybe even more depending on > the maintenance overhead. However, in order to not grow the testing > matrix infinitely and to perform important refactorings if necessary, we > only guarantee upgrades with a step size of a single minor version (i.e. > a cascade of upgrades). > > 2. Annotation Design > > I also adopted the multiple-annotations design for the previous > supportPlanFormat, so no array of versions anymore. I reworked the > section; please have a look at the updated examples: > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489#FLIP190:SupportVersionUpgradesforTableAPI&SQLPrograms-ExecNodeTests > > I also got the feedback offline that `savepoint` might not be the right > terminology for the annotation. 
I changed that to minPlanVersion and > minStateVersion. > > Let me know what you think. > > Regards, > Timo > > > > On 09.12.21 08:44, Jing Zhang wrote: > > Hi Timo, > > Thanks a lot for driving this discussion. > > I believe it could solve many of the problems we are suffering from when > upgrading. > > > > I only have a small complaint about the following point. > > > >> For simplification of the design, we assume that upgrades use a step > size > > of a single minor version. We don't guarantee skipping minor versions > (e.g. > > 1.11 to > > 1.14). > > > > In our internal production environment, we follow up with the community's > > latest stable release version almost once a year because upgrading to a > new > > version is a time-consuming process. > > So we might have missed 1-3 versions by the time we upgrade to the latest version. > This > > might also be the case at other companies. > > Could we guarantee that FLIP-190 works if we skip fewer minor versions than a > > specified threshold? > > Then we could know which version is good for us when preparing an upgrade. > > > > Best, > > Jing Zhang > > > > godfrey he <godfre...@gmail.com> wrote on Wed, 8 Dec 2021 at 22:16: > > > >> Hi Timo, > >> > >> Thanks for the explanation, it's much clearer now. > >> > >> One thing I want to confirm about `supportedPlanFormat` > >> and `supportedSavepointFormat`: > >> `supportedPlanFormat` supports multiple versions, > >> while `supportedSavepointFormat` supports only one version? > >> A JSON plan can be deserialized by multiple versions > >> because default values will be set for new fields. > >> In theory, a savepoint can be restored by more than one version > >> of the operators even if a state layout is changed, > >> such as deleting a whole state and starting the job with > >> `allowNonRestoredState`=true. > >> I think this is a corner case, and it's hard to understand compared > >> to `supportedPlanFormat` supporting multiple versions. 
> >> So, for most cases, when the state layout is changed, the savepoint is > >> incompatible, > >> and `supportedSavepointFormat` and the version need to be changed. > >> > >> I think we need a detailed explanation of the annotation change story in > >> the Javadoc of the `ExecNodeMetadata` class for all developers > >> (esp. those unfamiliar with this part). > >> > >> Best, > >> Godfrey > >> > >> Timo Walther <twal...@apache.org> wrote on Wed, 8 Dec 2021 at 16:57: > >>> > >>> Hi Wenlong, > >>> > >>> thanks for the feedback. Great that we reached consensus here. I will > >>> update the entire document with my previous example shortly. > >>> > >>> > if we don't update the version when the plan format changes, we can't > >>> detect that the plan cannot be deserialized in 1.15 > >>> > >>> This should not be a problem as the entire plan file has a version as > >>> well. We should not allow reading a 1.16 plan in 1.15. We can throw a > >>> helpful exception early. > >>> > >>> Reading a 1.15 plan in 1.16 is possible until we drop the old > >>> `supportedPlanFormat` from one of the used ExecNodes. Afterwards, all > >>> `supportedPlanFormat` values of ExecNodes must be equal to or higher than the > plan > >>> version. > >>> > >>> Regards, > >>> Timo > >>> > >>> On 08.12.21 03:07, wenlong.lwl wrote: > >>>> Hi, Timo, +1 for multiple metadata annotations. > >>>> > >>>> The compatible change I meant in the last email is the slight state > >> change > >>>> example you gave, so we have actually reached consensus on this, IMO. > >>>> > >>>> Another question based on the example you gave: > >>>> In the example "JSON node gets an additional property in 1.16", if we > >> don't > >>>> update the version when the plan format changes, we can't detect that the > >> plan > >>>> cannot be deserialized in 1.15, although the savepoint state is > >>>> compatible. > >>>> The error message may not be so friendly if we just throw a > >> deserialization > >>>> failure. 
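The early, helpful version check Timo describes could look roughly like the following self-contained sketch. The class name, method names, and exception wording are illustrative and not Flink's actual implementation:

```java
// Sketch of an early plan-version guard: instead of failing with a raw
// deserialization error, compare the plan's Flink version against the
// current runtime version up front and fail with a helpful message.
final class PlanVersionGuard {

    /** Returns true if a plan written by planVersion can be read by currentVersion. */
    static boolean isReadable(String planVersion, String currentVersion) {
        return compare(planVersion, currentVersion) <= 0;
    }

    /** Throws early with a helpful message instead of a raw deserialization failure. */
    static void checkPlanVersion(String planVersion, String currentVersion) {
        if (!isReadable(planVersion, currentVersion)) {
            throw new IllegalStateException(
                    "The plan was compiled with Flink " + planVersion
                            + " and cannot be restored with Flink " + currentVersion
                            + ". Plans are only readable by the same or a newer version.");
        }
    }

    // Compares dotted version strings such as "1.15" and "1.16" numerically.
    private static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }
}
```

Under this scheme, reading a 1.15 plan in 1.16 passes the guard, while reading a 1.16 plan in 1.15 fails fast with an actionable error, which addresses Wenlong's concern about unfriendly deserialization failures.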
> >>>> > >>>> On Tue, 7 Dec 2021 at 16:49, Timo Walther <twal...@apache.org> wrote: > >>>> > >>>>> Hi Wenlong, > >>>>> > >>>>> > First, we add a newStateLayout because of some improvement in > >> state, in > >>>>> > order to keep compatibility we may still keep the old state for > >> the > >>>>> first > >>>>> > version. We need to update the version, so that we can generate > a > >> new > >>>>> > version plan for the new job and keep the exec node compatible > >> with > >>>>> the old > >>>>> > version plan. > >>>>> > >>>>> The problem that I see here for contributors is that the actual > update > >>>>> of a version is more complicated than just updating an integer value. > >> It > >>>>> means copying a lot of ExecNode code for a change that happens > locally > >>>>> in an operator. Let's assume multiple ExecNodes use a similar > >> operator. > >>>>> Why do we need to update all ExecNode versions if the operator > itself > >>>>> can deal with the incompatibility? The ExecNode version is meant for > >>>>> topology changes or fundamental state changes. > >>>>> > >>>>> If we don't find consensus on this topic, I would at least vote for > >>>>> supporting multiple annotations for an ExecNode class. This way, we > >> don't > >>>>> need to copy code but only add two ExecNode annotations with > different > >>>>> ExecNode versions. > >>>>> > >>>>> > Maybe we can add support for this case: > >>>>> > when an exec node is changed in 1.16, but is compatible with > 1.15, > >>>>> > we can use the node of 1.16 to deserialize the plan of 1.15. > >>>>> > >>>>> If the ExecNode is compatible, there is no reason to increase the > >>>>> ExecNode version. > >>>>> > >>>>> > >>>>> > >>>>> I tried to come up with a reworked solution to make all parties > happy: > >>>>> > >>>>> 1. 
Let's assume the following annotations: > >>>>> > >>>>> supportedPlanFormat = [1.15] > >>>>> > >>>>> supportedSavepointFormat = 1.15 > >>>>> > >>>>> we drop `added` as it is equal to `supportedSavepointFormat` > >>>>> > >>>>> 2. Multiple annotations over ExecNodes are possible: > >>>>> > >>>>> // operator state changes > >>>>> > >>>>> // initial introduction in 1.15 > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.15) > >>>>> > >>>>> // state layout changed slightly in 1.16 > >>>>> // - operator migration is possible > >>>>> // - operator supports state of both versions and will perform > >> operator > >>>>> state migration > >>>>> // - new plans will get new ExecNode version > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.15) > >>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.16) > >>>>> > >>>>> // we force a plan migration in 1.17 > >>>>> // - we assume that all operator states have been migrated in the > >>>>> previous version > >>>>> // - we can safely replace the old version `1` with `2` and only keep > >>>>> the new savepoint format > >>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.16) > >>>>> > >>>>> > >>>>> // plan changes > >>>>> > >>>>> // initial introduction in 1.15 > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.15) > >>>>> > >>>>> // JSON node gets an additional property in 1.16 > >>>>> // e.g. 
{ some-prop: 42 } -> { some-prop: 42, some-flag: false } > >>>>> // - ExecNode version does not change > >>>>> // - ExecNode version only changes when topology or state is affected > >>>>> // - we support both JSON plan formats, the old and the newest one > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=[1.15, > 1.16], > >>>>> supportedSavepointFormat=1.15) > >>>>> > >>>>> // we force a plan migration in 1.17 > >>>>> // - now we only support the 1.16 plan format > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.16, > >>>>> supportedSavepointFormat=1.15) > >>>>> > >>>>> > >>>>> // topology change > >>>>> > >>>>> // initial introduction in 1.15 > >>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.15) > >>>>> > >>>>> // completely new class structure in 1.16, annotated with > >>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15, > >>>>> supportedSavepointFormat=1.16) > >>>>> > >>>>> > >>>>> > >>>>> What do you think? > >>>>> > >>>>> > >>>>> Regards, > >>>>> Timo > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On 07.12.21 08:20, wenlong.lwl wrote: > >>>>>> Maybe we can add support for this case: > >>>>>> when an exec node is changed in 1.16, but is compatible > >> with > >>>>> 1.15, > >>>>>> we can use the node of 1.16 to deserialize the plan of 1.15. > >>>>>> In this way, we don't need to fork the code if the change is > >> compatible, > >>>>>> and can avoid forking code frequently. > >>>>>> > >>>>>> > >>>>>> Best, > >>>>>> Wenlong > >>>>>> > >>>>>> > >>>>>> On Tue, 7 Dec 2021 at 15:08, wenlong.lwl <wenlong88....@gmail.com> > >>>>> wrote: > >>>>>> > >>>>>>> hi, Timo, I would prefer to update the version every time we change > >> the > >>>>>>> state layout too. 
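The multi-annotation design from Timo's examples maps naturally onto Java's repeatable annotations. A minimal self-contained sketch follows; the attribute names mirror the thread, but Flink's real `ExecNodeMetadata` annotation has more attributes and different types:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Sketch of a repeatable ExecNode metadata annotation.
@Repeatable(ExecNodeMetadatas.class)
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ExecNodeMetadata {
    String name();
    int version();
    String[] supportedPlanFormat();
    String supportedSavepointFormat();
}

// Container annotation required by @Repeatable.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ExecNodeMetadatas {
    ExecNodeMetadata[] value();
}

// An ExecNode whose state layout changed slightly in 1.16 keeps both
// versions side by side, as in the "operator state changes" example,
// without forking the ExecNode class:
@ExecNodeMetadata(name = "A", version = 1,
        supportedPlanFormat = {"1.15"}, supportedSavepointFormat = "1.15")
@ExecNodeMetadata(name = "A", version = 2,
        supportedPlanFormat = {"1.15"}, supportedSavepointFormat = "1.16")
class ExecNodeA {}
```

A validation or test harness can then read all declared versions of a node reflectively via `ExecNodeA.class.getAnnotationsByType(ExecNodeMetadata.class)`, which is how the "powerful validation" and test-coverage checks discussed in the thread could be driven.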
> >>>>>>> > >>>>>>> It could be possible that we change the exec node in 2 steps: > >>>>>>> First, we add a newStateLayout because of some improvement in > >> state, in > >>>>>>> order to keep compatibility we may still keep the old state for the > >>>>> first > >>>>>>> version. We need to update the version, so that we can generate a > >> new > >>>>>>> version plan for the new job and keep the exec node compatible with > >> the > >>>>> old > >>>>>>> version plan. > >>>>>>> After some versions, we may remove the old version state layout and > >>>>> clean > >>>>>>> up the deprecated code. We still need to update the version, so > >> that we > >>>>> can > >>>>>>> verify that we are compatible with the plan after the first change, > >> but > >>>>> not > >>>>>>> compatible with the plan earlier. > >>>>>>> > >>>>>>> > >>>>>>> Best, > >>>>>>> Wenlong > >>>>>>> > >>>>>>> On Mon, 6 Dec 2021 at 21:27, Timo Walther <twal...@apache.org> > >> wrote: > >>>>>>> > >>>>>>>> Hi Godfrey, > >>>>>>>> > >>>>>>>> > design makes thing more complex. > >>>>>>>> > >>>>>>>> Yes, the design might be a bit more complex. But operator > >> migration is > >>>>>>>> way easier than ExecNode migration at a later point in time for > >> code > >>>>>>>> maintenance. We know that ExecNodes can become pretty complex. > Even > >>>>>>>> though we have put a lot of code into `CommonXXExecNode` it will > >> be a > >>>>>>>> lot of work to maintain multiple versions of ExecNodes. If we can > >> avoid > >>>>>>>> this with operator state migration, this should always be > preferred > >>>>> over > >>>>>>>> a new ExecNode version. > >>>>>>>> > >>>>>>>> I'm aware that operator state migration might only be important > for > >>>>>>>> roughly 10 % of all changes. A new ExecNode version will be used > >> for > >>>>> 90% > >>>>>>>> of all changes. > >>>>>>>> > >>>>>>>> > If there are multiple state layouts, which layout the > ExecNode > >>>>> should > >>>>>>>> use? 
> >>>>>>>> > >>>>>>>> It is not the responsibility of the ExecNode to decide this but > the > >>>>>>>> operator. Something like: > >>>>>>>> > >>>>>>>> class X extends ProcessFunction { > >>>>>>>> ValueState<A> oldStateLayout; > >>>>>>>> ValueState<B> newStateLayout; > >>>>>>>> > >>>>>>>> open() { > >>>>>>>> if (oldStateLayout.get() != null) { > >>>>>>>> performOperatorMigration(); > >>>>>>>> } > >>>>>>>> useNewStateLayout(); > >>>>>>>> } > >>>>>>>> } > >>>>>>>> > >>>>>>>> Operator migration is meant for smaller "more local" changes > >> without > >>>>>>>> touching the ExecNode layer. The CEP library and DataStream API > >> sources > >>>>>>>> are performing operator migration for years already. > >>>>>>>> > >>>>>>>> > >>>>>>>> > `supportedPlanChanges ` and `supportedSavepointChanges ` are > >> a bit > >>>>>>>> obscure. > >>>>>>>> > >>>>>>>> Let me try to come up with more examples why I think both > >> annotation > >>>>>>>> make sense and are esp. important *for test coverage*. > >>>>>>>> > >>>>>>>> supportedPlanChanges: > >>>>>>>> > >>>>>>>> Let's assume we have some JSON in Flink 1.15: > >>>>>>>> > >>>>>>>> { > >>>>>>>> some-prop: 42 > >>>>>>>> } > >>>>>>>> > >>>>>>>> And we want to extend the JSON in Flink 1.16: > >>>>>>>> > >>>>>>>> { > >>>>>>>> some-prop: 42, > >>>>>>>> some-flag: false > >>>>>>>> } > >>>>>>>> > >>>>>>>> Maybe we don't need to increase the ExecNode version but only > >> ensure > >>>>>>>> that the flag is set to `false` by default for the older versions. > >>>>>>>> > >>>>>>>> We need a location to track changes and document the changelog. > >> With > >>>>> the > >>>>>>>> help of the annotation supportedPlanChanges = [1.15, 1.16] we can > >>>>> verify > >>>>>>>> that we have tests for both JSON formats. > >>>>>>>> > >>>>>>>> And once we decide to drop the 1.15 format, we enforce plan > >> migration > >>>>>>>> and fill-in the default value `false` into the old plans and bump > >> their > >>>>>>>> JSON plan version to 1.16 or higher. 
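Timo's operator-migration pattern can be fleshed out as follows. This is a self-contained sketch: the `ValueState` class below is a tiny stand-in for Flink's `ValueState`, and the layout types and `performOperatorMigration` logic are illustrative, not taken from any real operator:

```java
// Minimal stand-in for Flink's ValueState; just enough to show the pattern.
class ValueState<T> {
    private T value;
    T value() { return value; }
    void update(T v) { value = v; }
    void clear() { value = null; }
}

// Hypothetical old and new state layouts; the new layout adds a field.
class OldLayout { long count; }
class NewLayout { long count; long lastUpdateMillis; }

// The operator handles the incompatibility locally, so the ExecNode
// version does not need to change.
class MigratingOperator {
    final ValueState<OldLayout> oldStateLayout = new ValueState<>();
    final ValueState<NewLayout> newStateLayout = new ValueState<>();

    void open() {
        if (oldStateLayout.value() != null) {
            performOperatorMigration();
        }
        // from here on, only newStateLayout is used
    }

    private void performOperatorMigration() {
        OldLayout old = oldStateLayout.value();
        NewLayout migrated = new NewLayout();
        migrated.count = old.count;
        migrated.lastUpdateMillis = 0L; // default value for the new field
        newStateLayout.update(migrated);
        oldStateLayout.clear();
    }
}
```

The migration runs once on restore: old state is copied into the new layout with defaults filled in, then cleared, exactly the "smaller, more local" change the thread argues should not require a new ExecNode version.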
> >>>>>>>> > >>>>>>>> > >>>>>>>> > once the state layout is changed, the ExecNode version also needs > >> to > >>>>> be > >>>>>>>> updated > >>>>>>>> > >>>>>>>> This will still be the majority of cases. But if we can avoid > >> this, we > >>>>>>>> should do it so that we do not have too much duplicate code to maintain. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Timo > >>>>>>>> > >>>>>>>> > >>>>>>>> On 06.12.21 09:58, godfrey he wrote: > >>>>>>>>> Hi, Timo, > >>>>>>>>> > >>>>>>>>> Thanks for the detailed explanation. > >>>>>>>>> > >>>>>>>>>> We change an operator state of B in Flink 1.16. We perform the > >> change > >>>>>>>> in the operator of B in a way to support both state layouts. Thus, > >> no > >>>>> need > >>>>>>>> for a new ExecNode version. > >>>>>>>>> > >>>>>>>>> I think this design makes things more complex. > >>>>>>>>> 1. If there are multiple state layouts, which layout should the ExecNode > >>>>>>>> use? > >>>>>>>>> It increases the cost of understanding for developers (especially > >> for > >>>>>>>>> Flink newcomers), > >>>>>>>>> making them prone to mistakes. > >>>>>>>>> 2. `supportedPlanChanges` and `supportedSavepointChanges` are a > >> bit > >>>>>>>> obscure. > >>>>>>>>> > >>>>>>>>> The purpose of the ExecNode annotations is not only to support > >> powerful > >>>>>>>> validation, > >>>>>>>>> but, more importantly, to make it easy for developers to understand > >>>>>>>>> and to ensure that every modification is easy and state-compatible. > >>>>>>>>> > >>>>>>>>> I would prefer that once the state layout is changed, the ExecNode version > >> also > >>>>>>>>> needs to be updated, > >>>>>>>>> which would keep things simple. 
How about > >>>>>>>>> rename `supportedPlanChanges ` to `planCompatibleVersion` > >>>>>>>>> (which means the plan is compatible with the plan generated by > the > >>>>>>>>> given version node) > >>>>>>>>> and rename `supportedSavepointChanges` to > >>>>> `savepointCompatibleVersion > >>>>>>>> ` > >>>>>>>>> (which means the state is compatible with the state generated by > >> the > >>>>>>>>> given version node) ? > >>>>>>>>> The names also indicate that only one version value can be set. > >>>>>>>>> > >>>>>>>>> WDYT? > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Godfrey > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Timo Walther <twal...@apache.org> 于2021年12月2日周四 下午11:42写道: > >>>>>>>>>> > >>>>>>>>>> Response to Marios's feedback: > >>>>>>>>>> > >>>>>>>>>> > there should be some good logging in place when the > >> upgrade is > >>>>>>>> taking > >>>>>>>>>> place > >>>>>>>>>> > >>>>>>>>>> Yes, I agree. I added this part to the FLIP. > >>>>>>>>>> > >>>>>>>>>> > config option instead that doesn't provide the > flexibility > >> to > >>>>>>>>>> overwrite certain plans > >>>>>>>>>> > >>>>>>>>>> One can set the config option also around sections of the > >>>>>>>>>> multi-statement SQL script. > >>>>>>>>>> > >>>>>>>>>> SET 'table.plan.force-recompile'='true'; > >>>>>>>>>> > >>>>>>>>>> COMPILE ... > >>>>>>>>>> > >>>>>>>>>> SET 'table.plan.force-recompile'='false'; > >>>>>>>>>> > >>>>>>>>>> But the question is why a user wants to run COMPILE multiple > >> times. > >>>>> If > >>>>>>>>>> it is during development, then running EXECUTE (or just the > >> statement > >>>>>>>>>> itself) without calling COMPILE should be sufficient. The file > >> can > >>>>> also > >>>>>>>>>> manually be deleted if necessary. > >>>>>>>>>> > >>>>>>>>>> What do you think? 
> >>>>>>>>>> > >>>>>>>>>> Regards, > >>>>>>>>>> Timo > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 02.12.21 16:09, Timo Walther wrote: > >>>>>>>>>>> Hi Till, > >>>>>>>>>>> > >>>>>>>>>>> Yes, you might have to. But not a new plan from the SQL query > >> but a > >>>>>>>>>>> migration from the old plan to the new plan. This will not > >> happen > >>>>>>>> often. > >>>>>>>>>>> But we need a way to evolve the format of the JSON plan itself. > >>>>>>>>>>> > >>>>>>>>>>> Maybe this confuses a bit, so let me clarify it again: Mostly > >>>>> ExecNode > >>>>>>>>>>> versions and operator state layouts will evolve. Not the plan > >> files, > >>>>>>>>>>> those will be pretty stable. But also not infinitely. > >>>>>>>>>>> > >>>>>>>>>>> Regards, > >>>>>>>>>>> Timo > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 02.12.21 16:01, Till Rohrmann wrote: > >>>>>>>>>>>> Then for migrating from Flink 1.10 to 1.12, I might have to > >> create > >>>>> a > >>>>>>>> new > >>>>>>>>>>>> plan using Flink 1.11 in order to migrate from Flink 1.11 to > >> 1.12, > >>>>>>>> right? > >>>>>>>>>>>> > >>>>>>>>>>>> Cheers, > >>>>>>>>>>>> Till > >>>>>>>>>>>> > >>>>>>>>>>>> On Thu, Dec 2, 2021 at 3:39 PM Timo Walther < > >> twal...@apache.org> > >>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Response to Till's feedback: > >>>>>>>>>>>>> > >>>>>>>>>>>>> > compiled plan won't be changed after being written > >> initially > >>>>>>>>>>>>> > >>>>>>>>>>>>> This is not entirely correct. We give guarantees for keeping > >> the > >>>>>>>> query > >>>>>>>>>>>>> up and running. We reserve us the right to force plan > >> migrations. > >>>>> In > >>>>>>>>>>>>> this case, the plan might not be created from the SQL > >> statement > >>>>> but > >>>>>>>> from > >>>>>>>>>>>>> the old plan. I have added an example in section 10.1.1. In > >>>>> general, > >>>>>>>>>>>>> both persisted entities "plan" and "savepoint" can evolve > >>>>>>>> independently > >>>>>>>>>>>>> from each other. 
> >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> Timo > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 02.12.21 15:10, Timo Walther wrote: > >>>>>>>>>>>>>> Response to Godfrey's feedback: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > "EXPLAIN PLAN EXECUTE STATEMENT SET BEGIN ... END" > is > >>>>> missing. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks for the hint. I added a dedicated section 7.1.3. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > it's hard to maintain the supported versions for > >>>>>>>>>>>>>> "supportedPlanChanges" and "supportedSavepointChanges" > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Actually, I think we are mostly on the same page. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> The annotation does not need to be updated for every Flink > >>>>>>>> version. As > >>>>>>>>>>>>>> the name suggests it is about "Changes" (in other words: > >>>>>>>>>>>>>> incompatibilities) that require some kind of migration. > >> Either > >>>>> plan > >>>>>>>>>>>>>> migration (= PlanChanges) or savepoint migration > >>>>>>>> (=SavepointChanges, > >>>>>>>>>>>>>> using operator migration or savepoint migration). > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Let's assume we introduced two ExecNodes A and B in Flink > >> 1.15. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> The annotations are: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @ExecNodeMetadata(name=A, supportedPlanChanges=1.15, > >>>>>>>>>>>>>> supportedSavepointChanges=1.15) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @ExecNodeMetadata(name=B, supportedPlanChanges=1.15, > >>>>>>>>>>>>>> supportedSavepointChanges=1.15) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> We change an operator state of B in Flink 1.16. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> We perform the change in the operator of B in a way to > >> support > >>>>> both > >>>>>>>>>>>>>> state layouts. Thus, no need for a new ExecNode version. 
> >>>>>>>>>>>>>> > >>>>>>>>>>>>>> The annotations in 1.16 are: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @ExecNodeMetadata(name=A, supportedPlanChanges=1.15, > >>>>>>>>>>>>>> supportedSavepointChanges=1.15) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> @ExecNodeMetadata(name=B, supportedPlanChanges=1.15, > >>>>>>>>>>>>>> supportedSavepointChanges=1.15, 1.16) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> So the versions in the annotations are "start version"s. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I don't think we need end versions? End version would mean > >> that > >>>>> we > >>>>>>>> drop > >>>>>>>>>>>>>> the ExecNode from the code base? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Please check the section 10.1.1 again. I added a more > complex > >>>>>>>> example. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 01.12.21 16:29, Timo Walther wrote: > >>>>>>>>>>>>>>> Response to Francesco's feedback: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > *Proposed changes #6*: Other than defining this > rule > >> of > >>>>>>>> thumb, we > >>>>>>>>>>>>>>> must also make sure that compiling plans with these objects > >> that > >>>>>>>>>>>>>>> cannot be serialized in the plan must fail hard > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Yes, I totally agree. We will fail hard with a helpful > >>>>> exception. > >>>>>>>> Any > >>>>>>>>>>>>>>> mistake e.g. using a inline object in Table API or an > >> invalid > >>>>>>>>>>>>>>> DataStream API source without uid should immediately fail a > >> plan > >>>>>>>>>>>>>>> compilation step. I added a remark to the FLIP again. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > What worries me is breaking changes, in particular > >>>>>>>> behavioural > >>>>>>>>>>>>>>> changes that might happen in connectors/formats > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Breaking changes in connectors and formats need to be > >> encoded in > >>>>>>>> the > >>>>>>>>>>>>>>> options. 
I could also imagine versioning the factory > >>>>>>>> identifier > >>>>>>>>>>>>>>> `connector=kafka` and `connector=kafka-2`, if this is > >> necessary. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> After thinking about your question again, I think we will > >> also > >>>>>>>> need > >>>>>>>>>>>>>>> the same testing infrastructure for our connectors and > >> formats, > >>>>>>>> esp. > >>>>>>>>>>>>>>> restore tests and completeness tests. I updated the document > >>>>>>>>>>>>>>> accordingly. Also, I added a way to generate UIDs for > >> DataStream > >>>>>>>> API > >>>>>>>>>>>>>>> providers. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > *Functions:* Are we talking about the function name > >> or the > >>>>>>>>>>>>>>> function's > >>>>>>>>>>>>>>> complete signature? > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> For catalog functions, the identifier contains the catalog name > >> and > >>>>>>>>>>>>>>> database name. For system functions, the identifier contains > >> only a > >>>>>>>> name, > >>>>>>>>>>>>>>> which makes the function name and identifier identical. I > >> reworked > >>>>> the > >>>>>>>>>>>>>>> section again and also fixed some of the naming conflicts > >> you > >>>>>>>>>>>>>>> mentioned. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > we should perhaps use a logically defined unique id > >> like > >>>>>>>>>>>>>>> /bigIntToTimestamp/ > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I added a concrete example for the resolution and > >> restoration. > >>>>> The > >>>>>>>>>>>>>>> unique id is composed of name + version. Internally, this > is > >>>>>>>>>>>>>>> represented as `$TO_TIMESTAMP_LTZ$1`. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > I think we should rather keep JSON out of the > concept > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Sounds ok to me. In SQL we also just call it "plan". I will > >>>>>>>> change the > >>>>>>>>>>>>>>> file sections. But I would suggest keeping the fromJsonString > >>>>>>>> method. 
> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > write it back in the original plan file > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I updated the terminology section for what we consider an > >>>>>>>> "upgrade". > >>>>>>>>>>>>>>> We might need to update the original plan file. This is > >> already > >>>>>>>>>>>>>>> considered in COMPILE PLAN ... FROM ..., even though > >> this is > >>>>>>>> future > >>>>>>>>>>>>>>> work. The same applies to savepoint migration. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks for all the feedback! > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 30.11.21 14:28, Timo Walther wrote: > >>>>>>>>>>>>>>>> Response to Wenlong's feedback: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > I would prefer not to provide such a shortcut; let > >> users > >>>>> use > >>>>>>>>>>>>>>>> COMPILE PLAN IF NOT EXISTS and EXECUTE explicitly, which > >> can be > >>>>>>>>>>>>>>>> understood by new users even without referring to the docs. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I would like to hear more opinions on this topic. > >> Personally, I > >>>>>>>> find > >>>>>>>>>>>>>>>> a combined statement very useful, not only for quicker > >>>>>>>> development > >>>>>>>>>>>>>>>> and debugging but also for readability. It helps in > >> keeping the > >>>>>>>> JSON > >>>>>>>>>>>>>>>> path and the query close to each other in order to know > the > >>>>>>>> origin of > >>>>>>>>>>>>>>>> the plan. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > but the plan and SQL are not matched. The result > >> would be > >>>>>>>> quite > >>>>>>>>>>>>>>>> confusing if we still execute the plan directly; we may > >> need to > >>>>>>>> add a > >>>>>>>>>>>>>>>> validation. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> You are right that there could be a mismatch. But we have > a > >>>>>>>> similar > >>>>>>>>>>>>>>>> problem when executing CREATE TABLE IF NOT EXISTS. 
The > >> schema > >>>>> or > >>>>>>>>>>>>>>>> options of a table could have changed completely in the > >> catalog > >>>>>>>> but > >>>>>>>>>>>>>>>> the CREATE TABLE IF NOT EXISTS is not executed again. So a > >>>>>>>> mismatch > >>>>>>>>>>>>>>>> could also occur there. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Regards, > >>>>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 30.11.21 14:17, Timo Walther wrote: > >>>>>>>>>>>>>>>>> Hi everyone, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> thanks for the feedback so far. Let me answer each email > >>>>>>>>>>>>>>>>> individually. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I will start with a response to Ingo's feedback: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Will the JSON plan's schema be considered an API? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> No, not in the first version. This is explicitly > >> mentioned in > >>>>>>>> the > >>>>>>>>>>>>>>>>> `General JSON Plan Assumptions`. I tried to improve the > >>>>> section > >>>>>>>> once > >>>>>>>>>>>>>>>>> more to make it clearer. However, the JSON plan is > >> definitely > >>>>>>>> stable > >>>>>>>>>>>>>>>>> per minor version. And since the plan is versioned by > >> Flink > >>>>>>>> version, > >>>>>>>>>>>>>>>>> external tooling could be built around it. We might make > >> it > >>>>>>>> public > >>>>>>>>>>>>>>>>> API once the design has settled. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Given that upgrades across multiple versions at > >> once are > >>>>>>>>>>>>>>>>> unsupported, do we verify this somehow? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Good question. I extended the `General JSON Plan > >> Assumptions`. > >>>>>>>> Now > >>>>>>>>>>>>>>>>> yes: the Flink version is part of the JSON plan and will > >> be > >>>>>>>> verified > >>>>>>>>>>>>>>>>> during restore. But keep in mind that we might support > >> more > >>>>> than > >>>>>>>>>>>>>>>>> just the last version, at least until the JSON plan has > >> been > >>>>>>>>>>>>>>>>> migrated. 
> >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Regards, > >>>>>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On 30.11.21 09:39, Marios Trivyzas wrote: > >>>>>>>>>>>>>>>>>> I have a question regarding the `COMPILE PLAN OVEWRITE`. > >> If > >>>>> we > >>>>>>>>>>>>>>>>>> choose to go > >>>>>>>>>>>>>>>>>> with the config option instead, > >>>>>>>>>>>>>>>>>> that doesn't provide the flexibility to overwrite > certain > >>>>>>>> plans but > >>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>> others, since the config applies globally, isn't that > >>>>>>>>>>>>>>>>>> something to consider? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Mon, Nov 29, 2021 at 10:15 AM Marios Trivyzas < > >>>>>>>> mat...@gmail.com> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Hi Timo! > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks a lot for taking all that time and effort to put > >>>>>>>> together > >>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>>> proposal! > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Regarding: > >>>>>>>>>>>>>>>>>>>> For simplification of the design, we assume that > >> upgrades > >>>>>>>> use a > >>>>>>>>>>>>>>>>>>>> step size > >>>>>>>>>>>>>>>>>>> of a single > >>>>>>>>>>>>>>>>>>> minor version. We don't guarantee skipping minor > >> versions > >>>>>>>> (e.g. > >>>>>>>>>>>>>>>>>>> 1.11 to > >>>>>>>>>>>>>>>>>>> 1.14). > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I think that for this first step we should make it > >>>>> absolutely > >>>>>>>>>>>>>>>>>>> clear to the > >>>>>>>>>>>>>>>>>>> users that they would need to go through all > >> intermediate > >>>>>>>> versions > >>>>>>>>>>>>>>>>>>> to end up with the target version they wish. If we are > >> to > >>>>>>>> support > >>>>>>>>>>>>>>>>>>> skipping > >>>>>>>>>>>>>>>>>>> versions in the future, i.e. 
upgrade from 1.14 to 1.17, this means that we need to have a testing infrastructure in place that would test all possible combinations of version upgrades, i.e. from 1.14 to 1.15, from 1.14 to 1.16 and so forth, while still testing and of course supporting all the upgrades from the previous minor version.

I like a lot the idea of introducing HINTS to define some behaviour in the programs!
- The hints live together with the SQL statements and consequently the (JSON) plans.
- If multiple queries are involved in a program, each one of them can define its own config (regarding plan optimisation, not-null enforcement, etc.).

I agree with Francesco on his argument regarding the *JSON* plan. I believe we should already provide flexibility here, since (who knows) in the future a JSON plan might not fulfil the desired functionality.

I also agree that we need some very obvious way (i.e. not a log entry) to show the users that their program doesn't support version upgrades, and prevent them from being negatively surprised in the future when trying to upgrade their production pipelines.
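[Editor's note] The testing-cost argument above can be made concrete with a small sketch (illustrative only, not Flink tooling): once skipping minor versions is allowed, the set of upgrade pairs to test grows quadratically instead of linearly.

```python
def upgrade_test_matrix(versions, allow_skipping):
    """Return the (source, target) upgrade pairs that would need testing.

    With allow_skipping=False only consecutive minor versions are paired;
    with allow_skipping=True every forward combination must be covered.
    """
    pairs = []
    for i, src in enumerate(versions):
        for j in range(i + 1, len(versions)):
            if allow_skipping or j == i + 1:
                pairs.append((src, versions[j]))
    return pairs

versions = ["1.14", "1.15", "1.16", "1.17"]
# Single-step upgrades only: 3 pairs to test.
print(len(upgrade_test_matrix(versions, allow_skipping=False)))  # 3
# With skipping: all 6 forward combinations.
print(len(upgrade_test_matrix(versions, allow_skipping=True)))   # 6
```

For n supported releases this is n-1 pairs versus n(n-1)/2, which is exactly why the FLIP limits the guaranteed step size.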
This is an implementation detail, but I'd like to add that there should be some good logging in place when the upgrade is taking place, to be able to track every restoration action and help debug any potential issues arising from that.

On Fri, Nov 26, 2021 at 2:54 PM Till Rohrmann <trohrm...@apache.org> wrote:

Thanks for writing this FLIP, Timo. I think this will be a very important improvement for Flink and our SQL users :-)

Similar to Francesco, I would like to understand the statement

> For simplification of the design, we assume that upgrades use a step size of a single minor version. We don't guarantee skipping minor versions (e.g. 1.11 to 1.14).

a bit better. Is it because Flink does not guarantee that a savepoint created by version 1.x can be directly recovered by version 1.y with x + 1 < y, but users might have to go through a cascade of upgrades? From how I understand your proposal, the compiled plan won't be changed after being written initially.
Hence, I would assume that for the plan alone Flink will have to give backwards compatibility guarantees for all versions. Am I understanding this part correctly?

Cheers,
Till

On Thu, Nov 25, 2021 at 4:55 PM Francesco Guardiani <france...@ververica.com> wrote:

Hi Timo,

Thanks for putting this amazing work together. I have some considerations/questions about the FLIP:

*Proposed changes #6*: Other than defining this rule of thumb, we must also make sure that compiling plans with objects that cannot be serialized in the plan fails hard, so users don't bite themselves with such issues, or at least we need to output warning logs. In general, whenever the user is using the CompiledPlan APIs and at the same time trying to do something "illegal" for the plan, we should immediately either log or fail depending on the issue, in order to avoid any surprises once the user upgrades.
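[Editor's note] The fail-hard behaviour suggested here could look roughly like this sketch (hypothetical names and a toy data model; Flink's actual `CompiledPlan` validation may differ): compilation refuses objects it cannot persist instead of silently dropping them.

```python
class PlanCompilationError(Exception):
    """Raised when a pipeline contains objects that cannot be
    persisted in the compiled plan."""

def compile_plan(operations):
    """Serialize only plan-representable operations; fail hard on the rest.

    In this toy model each operation is a (name, serializable) tuple;
    a real implementation would inspect the actual objects.
    """
    plan = []
    for name, serializable in operations:
        if not serializable:
            raise PlanCompilationError(
                f"'{name}' cannot be serialized into the plan; "
                "remove it or register it in a plan-compatible way."
            )
        plan.append(name)
    return plan

# Succeeds for a fully serializable pipeline:
print(compile_plan([("source", True), ("filter", True)]))  # ['source', 'filter']
```

Failing at compile time rather than at restore time is what prevents the surprise during an upgrade.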
I would also say the same for things like registering a function, registering a DataStream, and every other thing which won't end up in the plan; we should log such info to the user by default.

*General JSON Plan Assumptions #9:* When thinking of connectors and formats, I think it's reasonable to assume, and keep out of the feature design, that no feature/ability can be deleted from a connector/format. I also don't think new features/abilities can influence this FLIP, given the plan is static. So if, for example, MyCoolTableSink in the next Flink version implements SupportsProjectionPushDown, then it shouldn't be a problem for the upgrade story, since the plan is still configured as computed from the previous Flink version. What worries me is breaking changes, in particular behavioural changes that might happen in connectors/formats.
Although this argument doesn't seem relevant for the connectors shipped by the Flink project itself, because we try to keep them as stable as possible and avoid eventual breaking changes, it's compelling for external connectors and formats, which might be decoupled from the Flink release cycle and might have different backward compatibility guarantees. It's totally reasonable if we don't want to tackle it in this first iteration of the feature, but it's something we need to keep in mind for the future.

*Functions:* It's not clear to me what you mean by "identifier", because somewhere else in the same context you talk about "name". Are we talking about the function name or the function's complete signature?
Let's assume for example we have these function definitions:

* TO_TIMESTAMP_LTZ(BIGINT)
* TO_TIMESTAMP_LTZ(STRING)
* TO_TIMESTAMP_LTZ(STRING, STRING)

These for me are very different functions with different implementations, where each of them might evolve separately at a different pace. Hence, when we store them in the JSON plan, we should perhaps use a logically defined unique id like `bigIntToTimestamp`, `stringToTimestamp` and `stringToTimestampWithFormat`. This also solves the issue of correctly referencing the functions when restoring the plan, without running the inference logic again (which might have changed in the meantime), and it might also solve the versioning: the function identifier can contain the function version, like `stringToTimestampWithFormat_1_1` or `stringToTimestampWithFormat_1_2`.
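[Editor's note] The versioned-identifier idea can be sketched as a lookup that bypasses type inference entirely (illustrative only; the ids follow Francesco's examples and the implementations are placeholders, not the FLIP's actual scheme):

```python
import re

# Hypothetical registry keyed by (logical id, version), as suggested for
# the JSON plan. The lambdas stand in for real function implementations.
FUNCTION_REGISTRY = {
    ("bigIntToTimestamp", (1, 1)): lambda millis: f"ts({millis})",
    ("stringToTimestampWithFormat", (1, 1)): lambda s, fmt: f"ts({s},{fmt})",
    ("stringToTimestampWithFormat", (1, 2)): lambda s, fmt: f"ts2({s},{fmt})",
}

def resolve_function(identifier: str):
    """Resolve an identifier like 'stringToTimestampWithFormat_1_2' to an
    implementation without re-running type inference."""
    match = re.fullmatch(r"(.+)_(\d+)_(\d+)", identifier)
    if not match:
        raise KeyError(f"Malformed versioned identifier: {identifier}")
    name, major, minor = match.group(1), int(match.group(2)), int(match.group(3))
    return FUNCTION_REGISTRY[(name, (major, minor))]

fn = resolve_function("stringToTimestampWithFormat_1_2")
```

Because the plan stores the fully resolved id, changing the inference logic in a later release cannot silently rebind a restored plan to a different overload.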
An alternative could be to use the string signature representation, which might not be trivial to compute, given the complexity of our type inference logic.

*The term "JSON plan"*: I think we should rather keep JSON out of the concept and just name it "Compiled Plan" (like the proposed API) or something similar, as I can see how in the future we might decide to support or modify our persistence format to something more storage-efficient like BSON. For example, I would rename `CompiledPlan.fromJsonFile` to simply `CompiledPlan.fromFile`.

*Who is the owner of the plan file?* I asked myself this question when reading this:

> For simplification of the design, we assume that upgrades use a step size of a single minor version. We don't guarantee skipping minor versions (e.g. 1.11 to 1.14).

My understanding of this statement is that a user can upgrade between minors, and then, following all the minors, the same query can remain up and running. E.g.
I upgrade from 1.15 to 1.16, and then from 1.16 to 1.17, and I still expect my original query to work without recomputing the plan. This necessarily means that at some point in future releases we'll need some basic "migration" tool to keep the queries up and running, ending up modifying the compiled plan. So I guess Flink should write it back to the original plan file, perhaps doing a backup of the previous one? Can you please clarify this aspect?

Except for these considerations, the proposal looks good to me and I'm eagerly waiting to see it in play.
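[Editor's note] The backup-then-write-back behaviour Francesco asks about could work like this sketch (hypothetical tooling; `migrate_plan_file` and the in-place rewrite are assumptions, not the FLIP's decided design, which may instead write the migrated plan to a new file):

```python
import json
import shutil
from pathlib import Path

def migrate_plan_file(plan_path: Path, new_flink_version: str) -> Path:
    """Back up the old plan file, then write the migrated plan in place.

    Purely illustrative: a real migration would rewrite the exec nodes,
    not just bump the version field.
    """
    backup_path = plan_path.with_suffix(plan_path.suffix + ".bak")
    shutil.copy2(plan_path, backup_path)  # keep the previous plan recoverable
    plan = json.loads(plan_path.read_text())
    plan["flinkVersion"] = new_flink_version
    plan_path.write_text(json.dumps(plan))
    return backup_path

# Usage (assuming plan.json exists on disk):
# backup = migrate_plan_file(Path("plan.json"), "1.17")
```

Keeping a backup answers the ownership question conservatively: the tool owns the file during migration, but the user can always roll back.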
Thanks,
FG

--
Francesco Guardiani | Software Engineer
france...@ververica.com[1]

Follow us @VervericaData
--
Join Flink Forward[2] - The Apache Flink Conference
Stream Processing | Event Driven | Real Time
--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Karl Anton Wehner, Holger Temme, Yip Park Tung Jason, Jinwei (Kevin) Zhang

--------
[1] mailto:france...@ververica.com
[2] https://flink-forward.org/