Re: Re: [VOTE] FLIP-287: Extend Sink#InitContext to expose TypeSerializer, ObjectReuse and JobID
+1 (non-binding)

Thanks,
Roc

On 2023-06-19 10:55:05, "Zhu Zhu" wrote:
> +1 (binding)
>
> Thanks,
> Zhu
>
> Tzu-Li (Gordon) Tai wrote on Sat, Jun 17, 2023 at 11:32:
>> +1 (binding)
>>
>> On Fri, Jun 16, 2023, 09:53 Jing Ge wrote:
>>> +1 (binding)
>>>
>>> Best Regards,
>>> Jing
>>>
>>> On Fri, Jun 16, 2023 at 10:10 AM Lijie Wang wrote:
>>>> +1 (binding)
>>>>
>>>> Thanks for driving it, Joao.
>>>>
>>>> Best,
>>>> Lijie
>>>>
>>>> Joao Boto wrote on Fri, Jun 16, 2023 at 15:53:
>>>>> Hi all,
>>>>>
>>>>> Thank you to everyone for the feedback on FLIP-287 [1]. Based on the discussion thread [2], we have come to a consensus on the design and are ready to take a vote to contribute this to Flink.
>>>>>
>>>>> I'd like to start a vote for it. The vote will be open for at least 72 hours (excluding weekends) unless there is an objection or an insufficient number of votes.
>>>>>
>>>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
>>>>> [2] https://lists.apache.org/thread/wb3myhqsdz81h08ygwx057mkn1hc3s8f
>>>>>
>>>>> Best,
>>>>> Joao Boto
Re: [VOTE] FLIP-287: Extend Sink#InitContext to expose TypeSerializer, ObjectReuse and JobID
+1 (binding)

Thanks,
Zhu

Tzu-Li (Gordon) Tai wrote on Sat, Jun 17, 2023 at 11:32:
> +1 (binding)
>
> On Fri, Jun 16, 2023, 09:53 Jing Ge wrote:
>> +1 (binding)
>>
>> Best Regards,
>> Jing
>>
>> On Fri, Jun 16, 2023 at 10:10 AM Lijie Wang wrote:
>>> +1 (binding)
>>>
>>> Thanks for driving it, Joao.
>>>
>>> Best,
>>> Lijie
>>>
>>> Joao Boto wrote on Fri, Jun 16, 2023 at 15:53:
>>>> Hi all,
>>>>
>>>> Thank you to everyone for the feedback on FLIP-287 [1]. Based on the discussion thread [2], we have come to a consensus on the design and are ready to take a vote to contribute this to Flink.
>>>>
>>>> I'd like to start a vote for it. The vote will be open for at least 72 hours (excluding weekends) unless there is an objection or an insufficient number of votes.
>>>>
>>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
>>>> [2] https://lists.apache.org/thread/wb3myhqsdz81h08ygwx057mkn1hc3s8f
>>>>
>>>> Best,
>>>> Joao Boto
Re: [DISCUSS] FLIP-321: Introduce an API deprecation process
Hi Becket,

Thanks for the reply! I'd like to continue the conversation about compatibility outside of this FLIP thread, but for now, I can accept your decision. It's certainly an improvement.

Thanks again,
John

On Sun, Jun 18, 2023, at 21:42, Becket Qin wrote:
> Hi John,
>
> Completely agree with all you said.
>
>> Can we consider only dropping deprecated APIs in major releases across the board? I understand that Experimental and PublicEvolving APIs are by definition less stable, but it seems like this should be reflected in the required deprecation period alone. I.e., that we must keep them around for at least zero or one minor release, not that we can drop them in a minor or patch release.
>
> Personally speaking, I would love to do this, for exactly the reason you mentioned. However, I did not propose this due to the following reasons:
>
> 1. I am hesitating a little bit about changing the accepted FLIPs too soon.
> 2. More importantly, to avoid slowing down our development. At this point, Flink still lacks some design / routines to support good API evolvability / extensibility. Just like you said, it takes some time to be good at this. In this case, my concern is that only removing Experimental / PublicEvolving APIs in major version changes may result in too much overhead and dramatically slow down the development of Flink. So, I was thinking that we can start with the current status. Hopefully, after we are more comfortable with the maintenance overhead of deprecated APIs, we can then have a stronger guarantee for Experimental / PublicEvolving APIs.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sun, Jun 18, 2023 at 6:44 AM John Roesler wrote:
>> Hi Becket,
>>
>> Thanks for this FLIP! Having a deprecation process is really important. I understand some people's concerns about the additional burden for project maintainers, but my personal experience with Kafka has been that it's very liveable and that it's well worth the benefit to users. In fact, users being able to confidently upgrade is also a benefit to maintainers, as we will get fewer questions from people stuck on very old versions.
>>
>> One question: can we consider only dropping deprecated APIs in major releases across the board? I understand that Experimental and PublicEvolving APIs are by definition less stable, but it seems like this should be reflected in the required deprecation period alone. I.e., that we must keep them around for at least zero or one minor release, not that we can drop them in a minor or patch release.
>>
>> The advantage of forbidding the removal of any API in minor or patch releases is that users will get a strong guarantee that they can bump the minor or patch version and still be able to compile, or even just re-link and know that they won't face "MethodDef" exceptions at run time. This is a binary guarantee: if we allow removing even Experimental APIs outside of major releases, users can no longer confidently upgrade.
>>
>> Aside from that, I'd share my 2 cents on a couple of points:
>> * I'd use the official Deprecated annotation instead of introducing our own flavor (Retired, etc.), since Deprecated is well integrated into build tools and IDEs.
>> * I wouldn't worry about a demotion process in this FLIP; it seems orthogonal, and something that should probably be taken case-by-case anyway.
>> * Aside from deprecation and removal, there have been some discussions about how to evolve APIs and behavior in compatible ways. This is somewhat of an art, and if folks haven't wrestled with it before, it'll take some time to become good at it. I feel like this topic should also be orthogonal to this FLIP, but FWIW, my suggestion would be to adopt a simple policy not to break existing user programs, and leave the "how" up to implementers and reviewers.
>>
>> Thanks again,
>> John
>>
>> On Sat, Jun 17, 2023, at 11:03, Jing Ge wrote:
>>> Hi All,
>>>
>>> The @Public -> @PublicEvolving proposed by Xintong is a great idea. Especially after he suggested @PublicRetired, i.e. @PublicEvolving --(2 minor releases)--> @Public --> @Deprecated --(1 major release)--> @PublicRetired. It will provide a lot of flexibility without breaking any rules we had. @Public APIs are allowed to change between major releases. Changing annotations is acceptable and provides additional tolerance, i.e. user-friendliness, since the APIs themselves are not changed.
>>>
>>> I had similar thoughts when I was facing those issues. I want to move one step further and suggest introducing one more annotation, @Retired.
>>>
>>> Unlike @PublicRetired, which is a compromise of downgrading @Public to @PublicEvolving. As I mentioned earlier in my reply, the Java standard @Deprecated should be used in the early stage of the deprecation process and doesn't really meet our requirement. Since Java
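For reference on John's suggestion to reuse the official annotation: since Java 9, the standard `java.lang.Deprecated` carries `since` and `forRemoval` attributes and is retained at runtime, so compilers, IDEs, and any release-housekeeping tooling can read the same metadata that a custom @Retired annotation would carry. A minimal JDK-only sketch (the class name and version strings below are made up for illustration, not from the FLIP):

```java
import java.lang.reflect.Method;

public class DeprecationDemo {
    static class OldApi {
        // The standard annotation: javac warns at call sites, IDEs strike the
        // call through, and forRemoval = true makes it "terminally deprecated".
        @Deprecated(since = "1.18", forRemoval = true)
        public void oldMethod() {}
    }

    public static void main(String[] args) throws Exception {
        // @Deprecated has RUNTIME retention, so housekeeping scripts can
        // enumerate APIs that are due for removal.
        Method m = OldApi.class.getDeclaredMethod("oldMethod");
        Deprecated d = m.getAnnotation(Deprecated.class);
        if (d == null || !d.forRemoval() || !d.since().equals("1.18")) {
            throw new AssertionError("annotation metadata not visible at runtime");
        }
        System.out.println("deprecated since " + d.since() + ", forRemoval=" + d.forRemoval());
    }
}
```

Whether this replaces a dedicated @Retired annotation is exactly the open question in the thread; the sketch only shows that the standard annotation already exposes machine-readable deprecation metadata.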
Re: [DISCUSS] FLIP-321: Introduce an API deprecation process
Hi John,

Completely agree with all you said.

> Can we consider only dropping deprecated APIs in major releases across the board? I understand that Experimental and PublicEvolving APIs are by definition less stable, but it seems like this should be reflected in the required deprecation period alone. I.e., that we must keep them around for at least zero or one minor release, not that we can drop them in a minor or patch release.

Personally speaking, I would love to do this, for exactly the reason you mentioned. However, I did not propose this due to the following reasons:

1. I am hesitating a little bit about changing the accepted FLIPs too soon.
2. More importantly, to avoid slowing down our development. At this point, Flink still lacks some design / routines to support good API evolvability / extensibility. Just like you said, it takes some time to be good at this. In this case, my concern is that only removing Experimental / PublicEvolving APIs in major version changes may result in too much overhead and dramatically slow down the development of Flink. So, I was thinking that we can start with the current status. Hopefully, after we are more comfortable with the maintenance overhead of deprecated APIs, we can then have a stronger guarantee for Experimental / PublicEvolving APIs.

Thanks,

Jiangjie (Becket) Qin

On Sun, Jun 18, 2023 at 6:44 AM John Roesler wrote:
> Hi Becket,
>
> Thanks for this FLIP! Having a deprecation process is really important. I understand some people's concerns about the additional burden for project maintainers, but my personal experience with Kafka has been that it's very liveable and that it's well worth the benefit to users. In fact, users being able to confidently upgrade is also a benefit to maintainers, as we will get fewer questions from people stuck on very old versions.
>
> One question: can we consider only dropping deprecated APIs in major releases across the board? I understand that Experimental and PublicEvolving APIs are by definition less stable, but it seems like this should be reflected in the required deprecation period alone. I.e., that we must keep them around for at least zero or one minor release, not that we can drop them in a minor or patch release.
>
> The advantage of forbidding the removal of any API in minor or patch releases is that users will get a strong guarantee that they can bump the minor or patch version and still be able to compile, or even just re-link and know that they won't face "MethodDef" exceptions at run time. This is a binary guarantee: if we allow removing even Experimental APIs outside of major releases, users can no longer confidently upgrade.
>
> Aside from that, I'd share my 2 cents on a couple of points:
> * I'd use the official Deprecated annotation instead of introducing our own flavor (Retired, etc.), since Deprecated is well integrated into build tools and IDEs.
> * I wouldn't worry about a demotion process in this FLIP; it seems orthogonal, and something that should probably be taken case-by-case anyway.
> * Aside from deprecation and removal, there have been some discussions about how to evolve APIs and behavior in compatible ways. This is somewhat of an art, and if folks haven't wrestled with it before, it'll take some time to become good at it. I feel like this topic should also be orthogonal to this FLIP, but FWIW, my suggestion would be to adopt a simple policy not to break existing user programs, and leave the "how" up to implementers and reviewers.
>
> Thanks again,
> John
>
> On Sat, Jun 17, 2023, at 11:03, Jing Ge wrote:
>> Hi All,
>>
>> The @Public -> @PublicEvolving proposed by Xintong is a great idea. Especially after he suggested @PublicRetired, i.e. @PublicEvolving --(2 minor releases)--> @Public --> @Deprecated --(1 major release)--> @PublicRetired. It will provide a lot of flexibility without breaking any rules we had. @Public APIs are allowed to change between major releases. Changing annotations is acceptable and provides additional tolerance, i.e. user-friendliness, since the APIs themselves are not changed.
>>
>> I had similar thoughts when I was facing those issues. I want to move one step further and suggest introducing one more annotation, @Retired.
>>
>> Unlike @PublicRetired, which is a compromise of downgrading @Public to @PublicEvolving. As I mentioned earlier in my reply, the Java standard @Deprecated should be used in the early stage of the deprecation process and doesn't really meet our requirement. Since Java does not allow us to extend annotations, I think it would be feasible to have the new @Retired to help us monitor and manage the deprecation process, housekeeping, etc.
>>
>> Some ideas could be (open for discussion):
>>
>> @Retired:
>>
>> 1. There must be a replacement with functionality compatibility before APIs can be marked as @Retired, i.e. DISCUSS
[jira] [Created] (FLINK-32374) ExecNodeGraphInternalPlan#writeToFile should support TRUNCATE_EXISTING for overwriting
Jane Chan created FLINK-32374:
---------------------------------

             Summary: ExecNodeGraphInternalPlan#writeToFile should support TRUNCATE_EXISTING for overwriting
                 Key: FLINK-32374
                 URL: https://issues.apache.org/jira/browse/FLINK-32374
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Planner
    Affects Versions: 1.17.1, 1.16.2, 1.16.1, 1.17.0, 1.16.0, 1.18.0
            Reporter: Jane Chan
             Fix For: 1.18.0

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
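The failure mode behind this ticket is easy to reproduce with plain NIO: opening an existing file with `WRITE` alone overwrites from position 0 but keeps the old tail, so writing a shorter serialized plan leaves trailing garbage from the previous plan. Adding `StandardOpenOption.TRUNCATE_EXISTING` empties the file first. A JDK-only sketch (the file name and JSON payloads are hypothetical, not taken from the ticket):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TruncateDemo {
    public static void main(String[] args) throws Exception {
        String longPlan = "{\"plan\":\"a-long-compiled-plan\"}";
        String shortPlan = "{\"plan\":\"short\"}";
        Path plan = Files.createTempFile("plan", ".json");
        Files.writeString(plan, longPlan);

        // WRITE alone: bytes are overwritten from position 0, but the old
        // tail survives, producing a corrupted (unparseable) plan file.
        try (FileChannel ch = FileChannel.open(plan, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(shortPlan.getBytes(StandardCharsets.UTF_8)));
        }
        String corrupted = Files.readString(plan);
        if (!corrupted.startsWith(shortPlan) || corrupted.length() <= shortPlan.length()) {
            throw new AssertionError("expected stale tail after the new content: " + corrupted);
        }

        // WRITE + TRUNCATE_EXISTING: the file is emptied first, so the result
        // is exactly the new content.
        Files.writeString(plan, longPlan);
        try (FileChannel ch = FileChannel.open(plan, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap(shortPlan.getBytes(StandardCharsets.UTF_8)));
        }
        String fixed = Files.readString(plan);
        if (!fixed.equals(shortPlan)) {
            throw new AssertionError("expected exact overwrite, got: " + fixed);
        }
        Files.delete(plan);
        System.out.println("TRUNCATE_EXISTING overwrite behaves as expected");
    }
}
```

This is only an illustration of the open-option semantics, not the actual `ExecNodeGraphInternalPlan#writeToFile` code, which lives in the Table planner.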
Re: [NOTICE] Experimental Java 17 support now available on master
I think there is some confusion: Chesnay, not me, recently checked changes into master so that Flink will build + test + run with experimental support for Java 17, but with Kryo 2.x as-is, so this will error with Java records. Chesnay created this particular email thread related to that work.

I (Kurt) created a PR + FLIP several weeks ago for upgrading Kryo from 2.x to 5.x, with full backward compatibility for existing savepoints/checkpoints, which enables Flink to run on Java 17 with support for Java records. This isn't merged into master. I haven't gotten much feedback on it.

I recently rebased the Kryo upgrade PR onto the master branch, which includes Chesnay's commits. The PR branch was already running successfully on Java 17; Chesnay's changes enable Flink to build and run the CI test suite on Java 17 as well. However, without the Kryo upgrade, Flink isn't compatible with Java records.

I'd be happy to follow the standard process and do the FLIP vote, but before this is ready for a vote, the PR needs review + testing by someone other than me. Specifically, I'd like someone to try to create a Flink application that tries to break the upgrade process: either confirm that everything works or demonstrate an error scenario. The Kryo PR code is passing all automated CI tests, which include several tests covering backwards-compatibility scenarios. I also created this simple application, https://github.com/kurtostfeld/flink-kryo-upgrade-demo, to create state with Flink 1.17 and test the upgrade process. From what I can see it works, but this would definitely need more testing from people other than just me.

--- Original Message ---
On Sunday, June 18th, 2023 at 7:41 AM, Jing Ge wrote:
> Hi Kurt,
>
> Thanks for your contribution. I am a little bit confused about the email title, since your PR [1] is not merged into the master yet. I guess, with "Experimental Java 17 support", you meant it is available on your branch, which is based on the master.
>
> If I am not mistaken, there is no vote thread for FLIP-317 on the ML. Would you like to follow the standard process [2] defined by the Flink community? Thanks!
>
> Best regards,
> Jing
>
> [1] https://github.com/apache/flink/pull/22660
> [2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>
> On Sun, Jun 18, 2023 at 1:18 AM Kurt Ostfeld kurtostf...@proton.me.invalid wrote:
>> I built the Flink master branch and tried running this simple Flink app that uses a Java record:
>>
>> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java
>>
>> It fails with the normal exception that Kryo 2.x throws when you try to serialize a Java record. The full stack trace is here: https://pastebin.com/HGhGKUWt
>>
>> I tried removing this line:
>>
>> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java#L36
>>
>> and that had no impact; I got the same error.
>>
>> In the other thread, you said that the plan was to use PojoSerializer to serialize records rather than Kryo. Currently, the Flink code base uses Kryo 2.x by default for generic user data types, and that will fail when the data type is a record or contains records. Ultimately, if Flink wants to fully support Java records, it seems that it has to move off of Kryo 2.x. PojoSerializer is part of what is basically a custom serialization library internal to Flink that is an alternative to Kryo. That's one option: move off of Kryo to a Flink-internal serialization library. The other two options are upgrading to the new Kryo or using a different serialization library.
>>
>> The Kryo 5.5.0 upgrade PR I submitted (https://github.com/apache/flink/pull/22660) with FLIP-317 (https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0) works with records. The Flink app linked above that uses records works with the PR, and that's what I posted to this mailing list a few weeks ago. I rebased the pull request onto the latest master branch and it's passing all tests. From my testing, it supports stateful upgrades, including checkpoints. If you can demonstrate a scenario where stateful upgrades error, I can try to resolve that.
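The incompatibility Kurt describes follows from how records differ from ordinary POJOs: a record exposes no no-arg constructor and its fields are final, so a serializer that instantiates an empty object and then writes fields (the Kryo 2.x-style generic approach) cannot reconstruct one. A record-aware serializer instead reads the record components and replays them through the canonical constructor. A JDK-only (Java 17) sketch of both facts — the `Point` record here is a made-up example, not taken from the linked demo app:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.RecordComponent;

public class RecordProbe {
    record Point(int x, int y) {}

    public static void main(String[] args) throws Exception {
        // Fact 1: no no-arg constructor exists, so "instantiate empty,
        // then set fields" deserialization cannot work for records.
        boolean hasNoArgCtor = true;
        try {
            Point.class.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            hasNoArgCtor = false;
        }
        if (hasNoArgCtor) throw new AssertionError("records expose no no-arg constructor");

        // Fact 2: the record components and canonical constructor are
        // reflectively available, which is what record-capable serializers use.
        RecordComponent[] comps = Point.class.getRecordComponents();
        Class<?>[] paramTypes = new Class<?>[comps.length];
        for (int i = 0; i < comps.length; i++) paramTypes[i] = comps[i].getType();
        Constructor<Point> canonical = Point.class.getDeclaredConstructor(paramTypes);

        Point original = new Point(3, 4);
        Object[] values = new Object[comps.length];
        for (int i = 0; i < comps.length; i++) values[i] = comps[i].getAccessor().invoke(original);
        Point copy = canonical.newInstance(values);
        if (!copy.equals(original)) throw new AssertionError("round trip failed: " + copy);
        System.out.println("record round trip ok: " + copy);
    }
}
```

This is only a sketch of the reflection mechanics; how the Kryo 5.x upgrade PR actually wires record support into Flink's serializer stack is described in FLIP-317 and the PR itself.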
Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction
Hi all,

Sorry for the late reply. I had an offline discussion with Lincoln, mainly about the naming of the hint option. Thanks, Lincoln, for the valuable suggestions. Let me answer the last email inline.

> For `JavaAsyncTableFunc0` in the FLIP, can you use a scenario like an RPC call as an example?

Sure, I will give an example when adding the docs for the async udtf and will update the FLIP simultaneously.

> For the name of this query hint, 'LATERAL' (including its internal options) doesn't show any relevance to async, but I haven't thought of a suitable name at the moment.

After some discussion with Lincoln, we prefer to choose one of `ASYNC_TABLE_FUNC` and `ASYNC_LATERAL`. Besides, in my opinion the keyword `lateral` has a wider use scenario than the table function join, but in this case we only want to configure the async table function, so I lean a bit more toward `ASYNC_TABLE_FUNC`. Looking forward to some input if you have better suggestions on the naming.

For the usage of the hint's config options, I have updated the ConfigOption section; you can refer to the FLIP for more details.

> Also, the terms 'correlate join' and 'lateral join' are not the same as in the current joins page [1], so maybe it would be better if we unified them into 'join table function'.

Yes, we should unify on 'join table function'; updated.

Best,
Aitozi

Lincoln Lee wrote on Thu, Jun 15, 2023 at 09:15:
> Hi Aitozi,
>
> Thanks for your reply! Giving SQL users more flexibility to get asynchronous processing capabilities via lateral join table function: +1 for this.
>
> For `JavaAsyncTableFunc0` in the flip, can you use a scenario like an RPC call as an example?
>
> For the name of this query hint, 'LATERAL' (including its internal options) doesn't show any relevance to async, but I haven't thought of a suitable name at the moment; maybe we need to highlight the async keyword directly. We can also see if others have better candidates.
>
> For the hint option, "timeout = '180s'" should be "'timeout' = '180s'"; seems a typo in the flip. And use upper case for all keywords in SQL examples.
>
> Also, the terms 'correlate join' and 'lateral join' are not the same as in the current joins page [1], so maybe it would be better if we unified them into 'join table function'.
>
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#table-function
>
> Best,
> Lincoln Lee
>
> Aitozi wrote on Wed, Jun 14, 2023 at 16:11:
>> Hi Lincoln,
>>
>> Many thanks for your valuable questions. I will try to answer them inline.
>>
>>> Does the async udtf bring any additional benefits besides a lighter implementation?
>>
>> IMO, async udtf is more than a lighter implementation. It can act as a general way for SQL users to use the async operator. They don't have to bind the async function to a table (a LookupTable), they are not forced to join on an equality join condition, and they can use it to do more than enrich data.
>>
>> The async lookup join is more like a subset/specific usage of async udtf. That the specific version has more opportunity to be optimized (like push down) is acceptable. Async table functions should be categorized as user-defined functions.
>>
>>> Should users migrate to the lookup source when they encounter similar requirements or problems, or should we develop an additional set of similar mechanisms?
>>
>> As I clarified above, the lookup join is a specific usage of async udtf, so it deserves more refined optimization like caching / retry. But that may not all be suitable for the async udtf. As a function, it can be deterministic or non-deterministic, so caching is not suitable, and we also do not have a common cache for UDFs now. So I think optimizations like caching/retry should be handed over to the function implementor.
>>
>>> The newly added query hint needs a different name that can be more easily related to the lateral operation, as the current join hints [5] do.
>>
>> What about using LATERAL? As below:
>>
>> SELECT /*+ LATERAL('output-mode' = 'ordered', 'capacity' = '200', timeout = '180s') */ a, c1, c2
>> FROM T1
>> LEFT JOIN lateral TABLE (async_split(b)) AS T(c1, c2) ON true
>>
>>> For the async func example, since the target scenario is an external IO operation, it's better to add the `close` method to actively release resources, as a good example for users.
>>
>> Makes sense to me; will update the FLIP.
>>
>> Best,
>> Aitozi.
>>
>> Lincoln Lee wrote on Wed, Jun 14, 2023 at 14:24:
>>> Hi Aitozi,
>>>
>>> Sorry for the late reply here! Supporting async udtf (`AsyncTableFunction`) directly in SQL seems like an attractive feature, but there are two issues that need to be addressed before we can be sure to add it:
>>> 1. As mentioned in the flip [1], the current lookup function can already implement the
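Independent of the hint's final name, the semantics behind the discussed options ('output-mode' = 'ordered', 'capacity', 'timeout') can be illustrated without any Flink APIs: calls such as simulated RPCs may *complete* in any order, but with ordered output mode results are *emitted* in input order, each bounded by a timeout. A plain-Java sketch under those assumptions — the method name and the simulated RPC below are hypothetical, not Flink's actual async operator:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class OrderedAsyncSketch {
    // Apply an async function to every input, but emit results in input order:
    // completion order may vary, emission order must not. Each pending result
    // is bounded by a timeout, mirroring the 'timeout' hint option.
    static <I, O> List<O> orderedAsyncMap(List<I> inputs,
                                          Function<I, CompletableFuture<O>> fn,
                                          long timeoutMs) throws Exception {
        List<CompletableFuture<O>> pending = new ArrayList<>();
        for (I in : inputs) pending.add(fn.apply(in));       // fire all calls
        List<O> out = new ArrayList<>();
        for (CompletableFuture<O> f : pending) {             // drain in input order
            out.add(f.get(timeoutMs, TimeUnit.MILLISECONDS));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Simulated RPC: later inputs finish first, yet output stays in input order.
        List<Integer> results = orderedAsyncMap(List.of(30, 20, 10),
                ms -> CompletableFuture.supplyAsync(() -> {
                    try { Thread.sleep(ms); } catch (InterruptedException e) { throw new RuntimeException(e); }
                    return ms * 2;
                }, pool), 1000);
        if (!results.equals(List.of(60, 40, 20))) throw new AssertionError(results);
        System.out.println("ordered results: " + results);
        pool.shutdown();
    }
}
```

An 'unordered' mode would instead emit each result as its future completes, trading result order for lower latency; the 'capacity' option would cap how many futures may be pending at once.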
Re: [NOTICE] Experimental Java 17 support now available on master
Hi Kurt,

Thanks for your contribution. I am a little bit confused about the email title, since your PR [1] is not merged into the master yet. I guess, with "Experimental Java 17 support", you meant it is available on your branch, which is based on the master.

If I am not mistaken, there is no vote thread for FLIP-317 on the ML. Would you like to follow the standard process [2] defined by the Flink community? Thanks!

Best regards,
Jing

[1] https://github.com/apache/flink/pull/22660
[2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

On Sun, Jun 18, 2023 at 1:18 AM Kurt Ostfeld wrote:
> I built the Flink master branch and tried running this simple Flink app that uses a Java record:
>
> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java
>
> It fails with the normal exception that Kryo 2.x throws when you try to serialize a Java record. The full stack trace is here: https://pastebin.com/HGhGKUWt
>
> I tried removing this line:
>
> https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java#L36
>
> and that had no impact; I got the same error.
>
> In the other thread, you said that the plan was to use PojoSerializer to serialize records rather than Kryo. Currently, the Flink code base uses Kryo 2.x by default for generic user data types, and that will fail when the data type is a record or contains records. Ultimately, if Flink wants to fully support Java records, it seems that it has to move off of Kryo 2.x. PojoSerializer is part of what is basically a custom serialization library internal to Flink that is an alternative to Kryo. That's one option: move off of Kryo to a Flink-internal serialization library. The other two options are upgrading to the new Kryo or using a different serialization library.
>
> The Kryo 5.5.0 upgrade PR I submitted (https://github.com/apache/flink/pull/22660) with FLIP-317 (https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0) works with records. The Flink app linked above that uses records works with the PR, and that's what I posted to this mailing list a few weeks ago. I rebased the pull request onto the latest master branch and it's passing all tests. From my testing, it supports stateful upgrades, including checkpoints. If you can demonstrate a scenario where stateful upgrades error, I can try to resolve that.