Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable
Thanks Peter for driving this. The FLIP is a big improvement to the existing Sink V2 interfaces, and the overall API design and interface style look good to me. +1 (binding) from my side. I also left a few minor comments that we can address later. (1) The FLIP title doesn't cover the proposed change scope well, right? (2) The pseudocode filename of StatefulSinkWriter has a typo. (3) I noticed you have already opened a PR on GitHub; we'd better create the issue and prepare a formal PR after the FLIP has passed, rather than while it is still under discussion. Best, Leonard > On 2023-12-21 at 11:47 AM, Jiabao Sun wrote: > > Thanks Peter for driving this. > > +1 (non-binding) > > Best, > Jiabao > > > On 2023/12/18 12:06:05 Gyula Fóra wrote: >> +1 (binding) >> >> Gyula >> >> On Mon, 18 Dec 2023 at 13:04, Márton Balassi >> wrote: >> >>> +1 (binding) >>> >>> On Mon 18. Dec 2023 at 09:34, Péter Váry >>> wrote: >>> Hi everyone, Since there were no further comments on the discussion thread [1], I >>> would like to start the vote for FLIP-372 [2]. The FLIP started as a small new feature, but in the discussion thread and in a similar parallel thread [3] we opted for a somewhat bigger change in the Sink V2 API. Please read the FLIP and cast your vote. The vote will remain open for at least 72 hours and only concluded if >>> there are no objections and enough (i.e. at least 3) binding votes. Thanks, Peter [1] - https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd [2] - >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable [3] - https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57 >>>
Re: Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable
Thanks for the FLIP. +1 (non-binding) Best, Hang Jiabao Sun 于2023年12月21日周四 11:48写道: > Thanks Peter for driving this. > > +1 (non-binding) > > Best, > Jiabao > > > On 2023/12/18 12:06:05 Gyula Fóra wrote: > > +1 (binding) > > > > Gyula > > > > On Mon, 18 Dec 2023 at 13:04, Márton Balassi > > wrote: > > > > > +1 (binding) > > > > > > On Mon 18. Dec 2023 at 09:34, Péter Váry > > > wrote: > > > > > > > Hi everyone, > > > > > > > > Since there were no further comments on the discussion thread [1], I > > > would > > > > like to start the vote for FLIP-372 [2]. > > > > > > > > The FLIP started as a small new feature, but in the discussion > thread and > > > > in a similar parallel thread [3] we opted for a somewhat bigger > change in > > > > the Sink V2 API. > > > > > > > > Please read the FLIP and cast your vote. > > > > > > > > The vote will remain open for at least 72 hours and only concluded if > > > there > > > > are no objections and enough (i.e. at least 3) binding votes. > > > > > > > > Thanks, > > > > Peter > > > > > > > > [1] - > https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd > > > > [2] - > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable > > > > [3] - > https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57 > > > > > > > > >
Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5
+1 (non-binding) Best, Jiabao > 2023年12月21日 13:25,weijie guo 写道: > > +1(binding) > > Best regards, > > Weijie > > > Xin Jiang 于2023年12月21日周四 11:21写道: > >> 1.5 multiplier is indeed more reasonable. >> >> +1 (non-binding) >> >>> 2023年12月19日 下午5:17,Rui Fan <1996fan...@gmail.com> 写道: >>> >>> Hi everyone, >>> >>> Thank you to everyone for the feedback on FLIP-364: >>> Improve the restart-strategy[1] which has been voted in this thread[2]. >>> >>> After the vote on FLIP-364, there was some feedback on the user mail >>> list[3] >>> suggesting changing the default value of backoff-multiplier from 1.2 to >> 1.5. >>> >>> I would like to start a vote for it. The vote will be open for at least >> 72 >>> hours >>> unless there is an objection or not enough votes. >>> >>> [1] https://cwiki.apache.org/confluence/x/uJqzDw >>> [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9 >>> [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw >>> >>> Best, >>> Rui >> >>
Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5
+1(binding) Best regards, Weijie Xin Jiang 于2023年12月21日周四 11:21写道: > 1.5 multiplier is indeed more reasonable. > > +1 (non-binding) > > > 2023年12月19日 下午5:17,Rui Fan <1996fan...@gmail.com> 写道: > > > > Hi everyone, > > > > Thank you to everyone for the feedback on FLIP-364: > > Improve the restart-strategy[1] which has been voted in this thread[2]. > > > > After the vote on FLIP-364, there was some feedback on the user mail > > list[3] > > suggesting changing the default value of backoff-multiplier from 1.2 to > 1.5. > > > > I would like to start a vote for it. The vote will be open for at least > 72 > > hours > > unless there is an objection or not enough votes. > > > > [1] https://cwiki.apache.org/confluence/x/uJqzDw > > [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9 > > [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw > > > > Best, > > Rui > >
Re: [DISCUSS] Should Configuration support getting value based on String key?
That makes sense. We will do it in the FLIP. Best, Rui On Thu, 21 Dec 2023 at 10:20, Xintong Song wrote: > Ideally, public API changes should go through the FLIP process. > > I see the point that starting a FLIP for such a tiny change might be > overkill. However, one could also argue that anything that is too trivial > to go through a FLIP should also be easy enough to quickly get through the > process. > > In this particular case, since you and Xuannan are working on another FLIP > regarding the usage of string-keys in configurations, why not make removing > of the @Deprecated annotation part of that FLIP? > > Best, > > Xintong > > > > On Wed, Dec 20, 2023 at 9:27 PM Rui Fan <1996fan...@gmail.com> wrote: > > > Thanks Xuannan and Xintong for the quick feedback! > > > > > However, we should let the user know that we encourage using > > ConfigOptions over the string-based > > > configuration key, as Timo said, we should add the message to `String > > > getString(String key, String defaultValue)` method. > > > > Sure, we can add some comments to guide users to use ConfigOption. > > > > If so, I will remove the @Deprecated annotation for > > `getString(String key, String defaultValue)` method`, and add > > some comments for it. > > > > Also, it's indeed a small change related to Public class(or Api), > > is voting necessary? > > > > Best, > > Rui > > > > On Wed, 20 Dec 2023 at 16:55, Xintong Song > wrote: > > > > > Sounds good to me. > > > > > > Best, > > > > > > Xintong > > > > > > > > > > > > On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su > > wrote: > > > > > > > Hi Rui, > > > > > > > > I am fine with keeping the `String getString(String key, String > > > > defaultValue)` if more people favor it. However, we should let the > > > > user know that we encourage using ConfigOptions over the string-based > > > > configuration key, as Timo said, we should add the message to `String > > > > getString(String key, String defaultValue)` method.
> > > > > > > > Best, > > > > Xuannan > > > > > > > > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> > wrote: > > > > > > > > > > > > I noticed that Configuration is used in > > > > > > > DistributedCache#writeFileInfoToConfig and > readFileInfoFromConfig > > > > > > > to store some cacheFile meta-information. Their keys are > > > > > > > temporary(key name with number) and it is not convenient > > > > > > > to predefine ConfigOption. > > > > > > > > > > > > > > > > > > True, this one requires a bit more effort to migrate from > > string-key > > > to > > > > > > ConfigOption, but still should be doable. Looking at how the two > > > > mentioned > > > > > > methods are implemented and used, it seems what is really needed > is > > > > > > serialization and deserialization of `DistributedCacheEntry`-s. > And > > > > all the > > > > > > entries are always written / read at once. So I think we can > > > serialize > > > > the > > > > > > whole set of entries into a JSON string (or something similar), > and > > > > use one > > > > > > ConfigOption with a deterministic key for it, rather than having > > one > > > > > > ConfigOption for each field in each entry. WDYT? > > > > > > > > > > Hi Xintong, thanks for the good suggestion! Most of the entries can > > be > > > > > serialized to a json string, and we can only write/read them at > once. > > > > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to > > > > > store it by the setBytes/getBytes. > > > > > > > > > > Also, I have an offline discussion with @Xuannan Su : refactoring > all > > > > code > > > > > that uses String as key requires a separate FLIP. And we will > provide > > > > > detailed FLIP later. > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Hi all, thanks everyone for the lively discussion. It's really a > > > > trade-off to > > > > > keep "String getString(String key, String defaultValue)" or not. > > > > > (It's not a right or wrong thing.) 
> > > > > > > > > > Judging from the discussion, most discussants can accept that > keeping > > > > > `String getString(String key, String defaultValue)` and depreate > the > > > > > rest of `getXxx(String key, Xxx defaultValue)`. > > > > > > > > > > cc @Xintong Song @Xuannan Su , WDYT? > > > > > > > > > > Best, > > > > > Rui > > > > > > > > > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu > wrote: > > > > >> > > > > >> I think it's not clear whether forcing using ConfigOption would > hurt > > > > >> the user experience. > > > > >> > > > > >> Maybe it does at the beginning, because using string keys to > access > > > > >> Flink configuration can be simpler for new components/jobs. > > > > >> However, problems may happen later if the configuration usages > > become > > > > >> more complex, like key renaming, using types other than strings, > and > > > > >> other problems that ConfigOption was invented to address. > > > > >> > > > > >> Personally I
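The trade-off debated in the thread above — raw string keys versus typed ConfigOptions — comes down to what a typed option can carry that a bare string cannot: a declared type, a default value, and deprecated fallback keys for renames. The following is a minimal Python sketch of that pattern; the `Option` class, `get` helper, and the `parallelism` keys are illustrative assumptions, not Flink's actual `ConfigOption` API.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Option:
    """Minimal stand-in for a typed config option (loosely modelled on
    Flink's ConfigOption; names and behaviour here are assumptions)."""
    key: str
    default: Any
    deprecated_keys: Tuple[str, ...] = ()


def get(conf: dict, opt: Option) -> Any:
    # Typed lookup: try the current key, then any deprecated keys, then the
    # default -- the kind of migration logic a raw string key cannot carry.
    for k in (opt.key, *opt.deprecated_keys):
        if k in conf:
            return conf[k]
    return opt.default


PARALLELISM = Option("parallelism.default", 1, deprecated_keys=("parallelism",))

assert get({"parallelism": 4}, PARALLELISM) == 4   # old key still resolves
assert get({}, PARALLELISM) == 1                   # falls back to the default
```

A caller using `conf["parallelism"]` directly would silently break after the key rename; routing every read through the option object is what makes such renames safe, which is the argument made in the thread for encouraging ConfigOption over `getString(String key, String defaultValue)`.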
Re: [VOTE] Release 1.18.1, release candidate #2
Thanks Jing Ge for driving this release. +1 (non-binding), I have checked: [✓] The checksums and signatures are validated [✓] The tag checked is fine [✓] Built from source is passed [✓] The flink-web PR is reviewed and checked Best, Zhongqiang Gong
RE: Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable
Thanks Peter for driving this. +1 (non-binding) Best, Jiabao On 2023/12/18 12:06:05 Gyula Fóra wrote: > +1 (binding) > > Gyula > > On Mon, 18 Dec 2023 at 13:04, Márton Balassi > wrote: > > > +1 (binding) > > > > On Mon 18. Dec 2023 at 09:34, Péter Váry > > wrote: > > > > > Hi everyone, > > > > > > Since there were no further comments on the discussion thread [1], I > > would > > > like to start the vote for FLIP-372 [2]. > > > > > > The FLIP started as a small new feature, but in the discussion thread and > > > in a similar parallel thread [3] we opted for a somewhat bigger change in > > > the Sink V2 API. > > > > > > Please read the FLIP and cast your vote. > > > > > > The vote will remain open for at least 72 hours and only concluded if > > there > > > are no objections and enough (i.e. at least 3) binding votes. > > > > > > Thanks, > > > Peter > > > > > > [1] - https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd > > > [2] - > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable > > > [3] - https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57 > > > > > >
Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5
1.5 multiplier is indeed more reasonable. +1 (non-binding) > 2023年12月19日 下午5:17,Rui Fan <1996fan...@gmail.com> 写道: > > Hi everyone, > > Thank you to everyone for the feedback on FLIP-364: > Improve the restart-strategy[1] which has been voted in this thread[2]. > > After the vote on FLIP-364, there was some feedback on the user mail > list[3] > suggesting changing the default value of backoff-multiplier from 1.2 to 1.5. > > I would like to start a vote for it. The vote will be open for at least 72 > hours > unless there is an objection or not enough votes. > > [1] https://cwiki.apache.org/confluence/x/uJqzDw > [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9 > [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw > > Best, > Rui
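For context on the vote above: the exponential-delay restart strategy multiplies the delay between restart attempts by `backoff-multiplier` after each failure, capped at a maximum. The sketch below is an illustrative model only — not Flink's implementation; jitter and the reset-after-a-stable-run behaviour of the real strategy are omitted — but it shows why 1.5 backs off meaningfully faster than 1.2.

```python
def backoff_delays(initial_s, multiplier, max_s, attempts):
    """Exponential-delay schedule: delay_n = min(initial * multiplier**n, max)."""
    return [min(initial_s * multiplier ** n, max_s) for n in range(attempts)]


old = backoff_delays(1.0, 1.2, 60.0, 10)  # old default multiplier
new = backoff_delays(1.0, 1.5, 60.0, 10)  # proposed default multiplier

# After 10 restarts the 1.2 schedule still waits only ~5s between attempts,
# while 1.5 has already backed off to ~38s -- relieving pressure much sooner.
assert old[-1] < 6.0 and new[-1] > 35.0
```

With a failing external dependency, the 1.2 schedule keeps hammering the system for many more attempts before the delay becomes significant, which matches the user-list feedback that motivated the change.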
Re:Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures
Hi, Benchao. When Feng Jin and I tried the POC together, we found that when using a UDAF, Calcite directly uses the function's input parameters from SqlCall#getOperandList. But in fact, these input parameters may use named arguments, so the order of the parameters may be wrong, and they may not include optional parameters that need default values. Instead, we should use new SqlCallBinding(this, scope, call).operands() to let that method correct the order and add the default values. (You can see the modification in SqlToRelConverter in the POC branch [1].) We have not reported this bug to the Calcite community yet. Our original plan was to report it during the process of doing this FLIP, and to fix it separately in Flink's own copy of the Calcite file, because the time for Calcite to release a new version is uncertain, and the time to upgrade Flink to the latest Calcite version is also unknown. The link to the POC code is at the bottom of the FLIP [2]; I'm posting it here again [1]. [1] https://github.com/apache/flink/compare/master...hackergin:flink:poc_named_argument [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures -- Best! Xuyang At 2023-12-20 13:31:26, "Benchao Li" wrote: >I didn't see your POC code, so I assumed that you'll need to add >SqlStdOperatorTable#DEFAULT and >SqlStdOperatorTable#ARGUMENT_ASSIGNMENT to FlinkSqlOperatorTable, am I >right? > >If yes, this would enable many builtin functions to allow default and >optional arguments, for example, `select md5(DEFAULT)`, I guess this >is not what we want to support right? If so, I would suggest to throw >proper errors for these unexpected usages. > >Benchao Li wrote on Wed, 20 Dec 2023 at 13:16: >> >> Thanks Feng for driving this, it's a very useful feature. >> >> In the FLIP, you mentioned that >> > During POC verification, bugs were discovered in Calcite that caused >> > issues during the validation phase.
We need to modify the SqlValidatorImpl >> > and SqlToRelConverter to address these problems. >> >> Could you log bugs in Calcite, and reference the corresponding Jira >> number in your code. We want to upgrade Calcite version to latest as >> much as possible, and maintaining many bugfixes in Flink will add >> additional burdens for upgrading Calcite. By adding corresponding >> issue numbers, we can easily make sure that we can remove these Flink >> hosted bugfixes when we upgrade to a version that already contains the >> fix. >> >> Feng Jin 于2023年12月14日周四 19:30写道: >> > >> > Hi Timo >> > Thanks for your reply. >> > >> > > 1) ArgumentNames annotation >> > >> > I'm sorry for my incorrect expression. argumentNames is a method of >> > FunctionHints. We should introduce a new arguments method to replace this >> > method and return Argument[]. >> > I updated the FLIP doc about this part. >> > >> > > 2) Evolution of FunctionHint >> > >> > +1 define DataTypeHint as part of ArgumentHint. I'll update the FLIP doc. >> > >> > > 3) Semantical correctness >> > >> > I realized that I forgot to submit the latest modification of the FLIP >> > document. Xuyang and I had a prior discussion before starting this discuss. >> > Let's restrict it to supporting only one eval() function, which will >> > simplify the overall design. >> > >> > Therefore, I also concur with not permitting overloaded named parameters. >> > >> > >> > Best, >> > Feng >> > >> > On Thu, Dec 14, 2023 at 6:15 PM Timo Walther wrote: >> > >> > > Hi Feng, >> > > >> > > thank you for proposing this FLIP. This nicely completes FLIP-65 which >> > > is great for usability. >> > > >> > > I read the FLIP and have some feedback: >> > > >> > > >> > > 1) ArgumentNames annotation >> > > >> > > > Deprecate the ArgumentNames annotation as it is not user-friendly for >> > > specifying argument names with optional configuration. >> > > >> > > Which annotation does the FLIP reference here? 
I cannot find it in the >> > > Flink code base. >> > > >> > > 2) Evolution of FunctionHint >> > > >> > > Introducing @ArgumentHint makes a lot of sense to me. However, using it >> > > within @FunctionHint looks complex, because there is both `input=` and >> > > `arguments=`. Ideally, the @DataTypeHint can be defined inline as part >> > > of the @ArgumentHint. It could even be the `value` such that >> > > `@ArgumentHint(@DataTypeHint("INT"))` is valid on its own. >> > > >> > > We could deprecate `input=`. Or let both `input` and `arguments=` >> > > coexist but never be defined at the same time. >> > > >> > > 3) Semantical correctness >> > > >> > > As you can see in the `TypeInference` class, named parameters are >> > > prepared in the stack already. However, we need to watch out between >> > > helpful explanation (see `InputTypeStrategy#getExpectedSignatures`) and >> > > named parameters (see `TypeInference.Builder#namedArguments`) that can >> > > be used in SQL. >> > > >> > > If I remember correctly, named parameters can be
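The operand-ordering bug Xuyang describes boils down to one binding step: named arguments written in any order must be mapped back to the function's declared parameter order, with defaults filled in for omitted optional parameters. Here is a small, self-contained Python sketch of that step — the names are hypothetical and this is only the idea behind SqlCallBinding.operands(), not Calcite's actual API.

```python
def bind_arguments(param_names, defaults, positional, named):
    """Reorder named arguments into declared parameter order and fill defaults."""
    bound = dict(zip(param_names, positional))  # positional args bind first
    bound.update(named)                         # named args bind by name
    # Emit operands in *declared* order, substituting defaults where missing.
    return [bound.get(p, defaults.get(p)) for p in param_names]


# Declared signature: my_udf(a INT, b INT DEFAULT 10, c STRING DEFAULT 'x')
params, defaults = ["a", "b", "c"], {"b": 10, "c": "x"}

# SQL call: my_udf(c => 'y', a => 1) -- written order differs from declared
assert bind_arguments(params, defaults, [], {"c": "y", "a": 1}) == [1, 10, "y"]
```

Taking the operands straight from the written call (as SqlCall#getOperandList does) would hand the planner `['y', 1]` — wrong order and no default for `b` — which is exactly the failure mode reported for UDAFs above.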
Re: [DISCUSS] Should Configuration support getting value based on String key?
Ideally, public API changes should go through the FLIP process. I see the point that starting a FLIP for such a tiny change might be overkill. However, one could also argue that anything that is too trivial to go through a FLIP should also be easy enough to quickly get through the process. In this particular case, since you and Xuannan are working on another FLIP regarding the usage of string-keys in configurations, why not make removing the @Deprecated annotation part of that FLIP? Best, Xintong On Wed, Dec 20, 2023 at 9:27 PM Rui Fan <1996fan...@gmail.com> wrote: > Thanks Xuannan and Xintong for the quick feedback! > > > However, we should let the user know that we encourage using > ConfigOptions over the string-based > > configuration key, as Timo said, we should add the message to `String > > getString(String key, String defaultValue)` method. > > Sure, we can add some comments to guide users to use ConfigOption. > > If so, I will remove the @Deprecated annotation for > `getString(String key, String defaultValue)` method`, and add > some comments for it. > > Also, it's indeed a small change related to Public class(or Api), > is voting necessary? > > Best, > Rui > > On Wed, 20 Dec 2023 at 16:55, Xintong Song wrote: > > > Sounds good to me. > > > > Best, > > > > Xintong > > > > > > > > On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su > wrote: > > > > > Hi Rui, > > > > > > I am fine with keeping the `String getString(String key, String > > > defaultValue)` if more people favor it. However, we should let the > > > user know that we encourage using ConfigOptions over the string-based > > > configuration key, as Timo said, we should add the message to `String > > > getString(String key, String defaultValue)` method.
> > > > > > Best, > > > Xuannan > > > > > > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote: > > > > > > > > > > I noticed that Configuration is used in > > > > > > DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig > > > > > > to store some cacheFile meta-information. Their keys are > > > > > > temporary(key name with number) and it is not convenient > > > > > > to predefine ConfigOption. > > > > > > > > > > > > > > > True, this one requires a bit more effort to migrate from > string-key > > to > > > > > ConfigOption, but still should be doable. Looking at how the two > > > mentioned > > > > > methods are implemented and used, it seems what is really needed is > > > > > serialization and deserialization of `DistributedCacheEntry`-s. And > > > all the > > > > > entries are always written / read at once. So I think we can > > serialize > > > the > > > > > whole set of entries into a JSON string (or something similar), and > > > use one > > > > > ConfigOption with a deterministic key for it, rather than having > one > > > > > ConfigOption for each field in each entry. WDYT? > > > > > > > > Hi Xintong, thanks for the good suggestion! Most of the entries can > be > > > > serialized to a json string, and we can only write/read them at once. > > > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to > > > > store it by the setBytes/getBytes. > > > > > > > > Also, I have an offline discussion with @Xuannan Su : refactoring all > > > code > > > > that uses String as key requires a separate FLIP. And we will provide > > > > detailed FLIP later. > > > > > > > > > > > > > > -- > > > > > > > > Hi all, thanks everyone for the lively discussion. It's really a > > > trade-off to > > > > keep "String getString(String key, String defaultValue)" or not. > > > > (It's not a right or wrong thing.) 
> > > > > > > > Judging from the discussion, most discussants can accept that keeping > > > > `String getString(String key, String defaultValue)` and depreate the > > > > rest of `getXxx(String key, Xxx defaultValue)`. > > > > > > > > cc @Xintong Song @Xuannan Su , WDYT? > > > > > > > > Best, > > > > Rui > > > > > > > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu wrote: > > > >> > > > >> I think it's not clear whether forcing using ConfigOption would hurt > > > >> the user experience. > > > >> > > > >> Maybe it does at the beginning, because using string keys to access > > > >> Flink configuration can be simpler for new components/jobs. > > > >> However, problems may happen later if the configuration usages > become > > > >> more complex, like key renaming, using types other than strings, and > > > >> other problems that ConfigOption was invented to address. > > > >> > > > >> Personally I prefer to encourage the usage of ConfigOption. > > > >> Jobs should use GlobalJobParameter for custom config, which is > > different > > > >> from the Configuration interface. Therefore, Configuration is mostly > > > >> used in other components/plugins, in which case the long-term > > > maintenance > > > >> can be important. > > > >> > > > >> However, since it is not a right or wrong choice, I'd
Re:Re: [VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
Thanks for driving this FLIP, +1 for it (non-binding) -- Best! Xuyang At 2023-12-21 06:08:02, "Natea Eshetu Beshada" wrote: >Thanks for the FLIP Alan, this will be a great addition +1 (non binding) > >On Wed, Dec 20, 2023 at 11:41 AM Alan Sheinberg > wrote: > >> Hi everyone, >> >> I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF >> type, AsyncScalarFunction for completing invocations asynchronously. It >> has been discussed in this thread [2]. >> >> I would like to start a vote. The vote will be open for at least 72 hours >> (until December 28th 18:00 GMT) unless there is an objection or >> insufficient votes. >> >> [1] >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support >> [2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2 >> >> Thanks, >> Alan >>
Re: [DISCUSS] Release flink-connector-mongodb v1.1.0
Thanks Jiabao for driving this. +1 to release flink-connector-mongodb v1.1.0 which supports Flink 1.18 and Flink 1.17. I’d like to help review the pending PRs and manage the release as well. Best, Leonard > 2023年12月20日 下午9:59,Jiabao Sun 写道: > > Hi, > > Now Flink 1.18 is released. I propose to release flink-connector-mongodb > v1.1.0 > to support Flink 1.18 and nominate Leonard Xu as the release manager. > > This release includes support for filter pushdown[1] and JDK 17[2]. > We plan to proceed with the release after merging these two pull requests. > If you have time, please help review them as well. > > Please let me know if the proposal is a good idea. > > Best, > Jiabao Sun > > [1] https://github.com/apache/flink-connector-mongodb/pull/17 > [2] https://github.com/apache/flink-connector-mongodb/pull/21
Re: [DISCUSS] FLIP-389: Annotate SingleThreadFetcherManager and FutureCompletingBlockingQueue as PublicEvolving
Hi Hongshun, I think the proposal in the FLIP is basically fine. A few minor comments: 1. In FLIPs, we define all the user-sensible changes as public interfaces. The public interface section should list all of them. So, the code blocks currently in the proposed changes section should be put into the public interface section instead. 2. It would be good to put all the changes of one class together. For example, for SplitFetcherManager, we can say: - Change SplitFetcherManager from Internal to PublicEvolving. - Deprecate the old constructor exposing the FutureCompletingBlockingQueue, and add new constructors as replacements which create the FutureCompletingBlockingQueue instance internally. - Add a few new methods to expose the functionality of the internal FutureCompletingBlockingQueue via the SplitFetcherManager. And then follows the code block containing all the changes above. Ideally, the changes should come with something like "// <-- New", so that they are easier to find. 3. In the proposed changes section, it would be good to add some more detailed explanation of the idea behind the public interface changes. So even people new to Flink can understand better how exactly the interface changes will help fulfill the motivation. For example, regarding the constructor signature change, we can mention a few things in this section: - By exposing the SplitFetcherManager / SingleThreadFetcherManager, by implementing addSplits() and removeSplits(), connector developers can easily create their own threading models in the SourceReaderBase. - Note that the SplitFetcher constructor is package private, so users can only create SplitFetchers via SplitFetcherManager.createSplitFetcher(). This ensures each SplitFetcher is always owned by the SplitFetcherManager. - This FLIP essentially embeds the element queue (a FutureCompletingBlockingQueue) instance into the SplitFetcherManager.
This hides the element queue from the connector developers and simplifies the SourceReaderBase to consist of only SplitFetcherManager and RecordEmitter as major components. In short, the public interface section answers the question of "what". We should list all the user-sensible changes in the public interface section, without verbose explanation. The proposed changes section answers "how", where we can add more details to explain the changes listed in the public interface section. Thanks, Jiangjie (Becket) Qin On Wed, Dec 20, 2023 at 10:07 AM Hongshun Wang wrote: > Hi Becket, > > > It has been a long time since we last discussed. Are there any other > problems with this Flip from your side? I am looking forward to hearing > from you. > > > Thanks, > Hongshun Wang >
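The ownership and encapsulation idea Becket describes — the element queue held privately inside the manager, and fetchers creatable only through the manager — can be sketched as a toy model. This is a Python illustration of the pattern under stated assumptions, not Flink's Java classes; in the real FLIP the constructor restriction is enforced by package-private visibility rather than convention.

```python
import queue


class SplitFetcher:
    """Created only by the manager -- mirroring the package-private
    constructor described above (a toy model, not Flink's Java class)."""

    def __init__(self, element_queue):
        self._queue = element_queue

    def emit(self, record):
        self._queue.put(record)


class SplitFetcherManager:
    def __init__(self):
        # The element queue is now an internal detail of the manager,
        # no longer passed in (or exposed) through the constructor.
        self._element_queue = queue.Queue()
        self._fetchers = []

    def create_split_fetcher(self):
        # Every fetcher shares the manager's queue and is owned by it.
        fetcher = SplitFetcher(self._element_queue)
        self._fetchers.append(fetcher)
        return fetcher

    def poll(self):
        # Queue functionality is exposed only through the manager's methods.
        return self._element_queue.get_nowait()


mgr = SplitFetcherManager()
fetcher = mgr.create_split_fetcher()
fetcher.emit("record-1")
assert mgr.poll() == "record-1"
```

Connector developers then customize threading by deciding how many fetchers to create and which splits to assign to each, while the queue stays an implementation detail of SourceReaderBase.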
Re: [VOTE] Release flink-connector-pulsar 4.1.0, release candidate #1
+1 (non-binding) - Downloaded artifacts - Maven staging artifacts look good - Verified checksum && keys - Verified LICENSE and NOTICE files - Built from source On Wed, Dec 20, 2023 at 5:13 AM tison wrote: > Hi Leonard, > > You are a PMC member also. Perhaps you can check the candidate and > vote on what you do :D > > Best, > tison. > > Leonard Xu 于2023年12月20日周三 11:35写道: > > > > Bubble up, I need more votes, especially from PMC members. > > > > Best, > > Leonard > > > > > 2023年12月14日 下午11:03,Hang Ruan 写道: > > > > > > +1 (non-binding) > > > > > > - Validated checksum hash > > > - Verified signature > > > - Verified that no binaries exist in the source archive > > > - Build the source with jdk8 > > > - Verified web PR > > > - Make sure flink-connector-base have the provided scope > > > > > > Best, > > > Hang > > > > > > tison 于2023年12月14日周四 11:51写道: > > > > > >> Thanks Leonard for driving this release! > > >> > > >> +1 (non-binding) > > >> > > >> * Download link valid > > >> * Maven staging artifacts look good. > > >> * Checksum and gpg matches > > >> * LICENSE and NOTICE exist > > >> * Can build from source. > > >> > > >> Best, > > >> tison. > > >> > > >> Rui Fan <1996fan...@gmail.com> 于2023年12月14日周四 09:23写道: > > >>> > > >>> Thanks Leonard for driving this release! 
> > >>> > > >>> +1 (non-binding) > > >>> > > >>> - Validated checksum hash > > >>> - Verified signature > > >>> - Verified that no binaries exist in the source archive > > >>> - Build the source with Maven and jdk8 > > >>> - Verified licenses > > >>> - Verified web PRs, left a minor comment > > >>> > > >>> Best, > > >>> Rui > > >>> > > >>> On Wed, Dec 13, 2023 at 7:15 PM Leonard Xu > wrote: > > > > Hey all, > > > > Please review and vote on the release candidate #1 for the version > > >> 4.1.0 of the Apache Flink Pulsar Connector as follows: > > > > [ ] +1, Approve the release > > [ ] -1, Do not approve the release (please provide specific > comments) > > > > The complete staging area is available for your review, which > includes: > > * JIRA release notes [1], > > * The official Apache source release to be deployed to > dist.apache.org > > >> [2], which are signed with the key with fingerprint > > 5B2F6608732389AEB67331F5B197E1F1108998AD [3], > > * All artifacts to be deployed to the Maven Central Repository [4], > > * Source code tag v4.1.0-rc1 [5], > > * Website pull request listing the new release [6]. > > > > The vote will be open for at least 72 hours. It is adopted by > majority > > >> approval, with at least 3 PMC affirmative votes. > > > > > > Best, > > Leonard > > > > [1] > > >> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353431 > > [2] > > >> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-pulsar-4.1.0-rc1/ > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > [4] > > >> > https://repository.apache.org/content/repositories/orgapacheflink-1688/ > > [5] > https://github.com/apache/flink-connector-pulsar/tree/v4.1.0-rc1 > > [6] https://github.com/apache/flink-web/pull/703 > > >> > > > -- Best regards, Sergey
Re: [VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
Thanks for the FLIP Alan, this will be a great addition +1 (non binding) On Wed, Dec 20, 2023 at 11:41 AM Alan Sheinberg wrote: > Hi everyone, > > I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF > type, AsyncScalarFunction for completing invocations asynchronously. It > has been discussed in this thread [2]. > > I would like to start a vote. The vote will be open for at least 72 hours > (until December 28th 18:00 GMT) unless there is an objection or > insufficient votes. > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support > [2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2 > > Thanks, > Alan >
[jira] [Created] (FLINK-33919) AutoRescalingITCase hangs on AZP
Sergey Nuyanzin created FLINK-33919:
---------------------------------------

                Summary: AutoRescalingITCase hangs on AZP
                    Key: FLINK-33919
                    URL: https://issues.apache.org/jira/browse/FLINK-33919
                Project: Flink
             Issue Type: Improvement
             Components: Runtime / Checkpointing
       Affects Versions: 1.19.0
               Reporter: Sergey Nuyanzin

This build fails on AZP
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55700=logs=a657ddbf-d986-5381-9649-342d9c92e7fb=dc085d4a-05c8-580e-06ab-21f5624dab16=8608
because of waiting

{noformat}
Dec 20 02:07:46 "main" #1 [14299] prio=5 os_prio=0 cpu=12675.70ms elapsed=3115.94s tid=0x7f3f71481600 nid=14299 waiting on condition [0x7f3f74913000]
Dec 20 02:07:46    java.lang.Thread.State: TIMED_WAITING (sleeping)
Dec 20 02:07:46     at java.lang.Thread.sleep0(java.base@21.0.1/Native Method)
Dec 20 02:07:46     at java.lang.Thread.sleep(java.base@21.0.1/Thread.java:509)
Dec 20 02:07:46     at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:152)
Dec 20 02:07:46     at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
Dec 20 02:07:46     at org.apache.flink.runtime.testutils.CommonTestUtils.waitForOneMoreCheckpoint(CommonTestUtils.java:374)
Dec 20 02:07:46     at org.apache.flink.test.checkpointing.AutoRescalingITCase.testCheckpointRescalingKeyedState(AutoRescalingITCase.java:265)
Dec 20 02:07:46     at org.apache.flink.test.checkpointing.AutoRescalingITCase.testCheckpointRescalingInKeyedState(AutoRescalingITCase.java:196)
Dec 20 02:07:46     at java.lang.invoke.LambdaForm$DMH/0x7f3f0f201400.invokeVirtual(java.base@21.0.1/LambdaForm$DMH)
Dec 20 02:07:46     at java.lang.invoke.LambdaForm$MH/0x7f3f0f20c000.invoke(java.base@21.0.1/LambdaForm$MH)
Dec 20 02:07:46     at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@21.0.1/Invokers$Holder)
Dec 20 02:07:46     at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base@21.0.1/DirectMethodHandleAccessor.java:153)
Dec 20 02:07:46     at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base@21.0.1/DirectMethodHandleAccessor.java:103)
Dec 20 02:07:46     at java.lang.reflect.Method.invoke(java.base@21.0.1/Method.java:580)
Dec 20 02:07:46     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Dec 20 02:07:46     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Dec 20 02:07:46     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Dec 20 02:07:46     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Dec 20 02:07:46     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
Dec 20 02:07:46     at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
Dec 20 02:07:46     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
Dec 20 02:07:46     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
Hi everyone, I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF type, AsyncScalarFunction, for completing invocations asynchronously. It has been discussed in this thread [2]. The vote will be open for at least 72 hours (until December 28th 18:00 GMT) unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support [2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2 Thanks, Alan
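For readers skimming the archive: the core idea of the FLIP — a scalar UDF whose invocation completes asynchronously via a future — can be sketched with plain JDK types. This is an illustration only; `AsyncEvalSketch` and its `eval` method are hypothetical names mimicking the shape of the proposed API, not the actual `AsyncScalarFunction` class.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncEvalSketch {
    // Illustrative only: FLIP-400 proposes an eval(CompletableFuture<T> result, args...)
    // style method; here we mimic that shape with a plain static method that
    // completes the future on another thread.
    static void eval(CompletableFuture<String> result, String input) {
        CompletableFuture.runAsync(() -> {
            // e.g. a slow remote lookup would happen here
            result.complete(input.toUpperCase());
        });
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> result = new CompletableFuture<>();
        eval(result, "flink");
        // The caller is not blocked while the work runs; get() here is
        // just for the demo.
        System.out.println(result.get()); // FLINK
    }
}
```

The point of the asynchronous variant is that the operator can have many such futures in flight at once instead of blocking per row.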
[jira] [Created] (FLINK-33918) Fix AsyncSinkWriterThrottlingTest test failure
Jim Hughes created FLINK-33918:
----------------------------------

                Summary: Fix AsyncSinkWriterThrottlingTest test failure
                    Key: FLINK-33918
                    URL: https://issues.apache.org/jira/browse/FLINK-33918
                Project: Flink
             Issue Type: Bug
       Affects Versions: 1.19.0
               Reporter: Jim Hughes

From [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55700=logs=1c002d28-a73d-5309-26ee-10036d8476b4=d1c117a6-8f13-5466-55f0-d48dbb767fcd]

```
Dec 20 03:09:03 03:09:03.411 [ERROR] org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize -- Time elapsed: 0.879 s <<< ERROR!
Dec 20 03:09:03 java.lang.IllegalStateException: Illegal thread detected. This method must be called from inside the mailbox thread!
Dec 20 03:09:03     at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkIsMailboxThread(TaskMailboxImpl.java:262)
Dec 20 03:09:03     at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:137)
Dec 20 03:09:03     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.yield(MailboxExecutorImpl.java:84)
Dec 20 03:09:03     at org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.flush(AsyncSinkWriter.java:367)
Dec 20 03:09:03     at org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.lambda$registerCallback$3(AsyncSinkWriter.java:315)
Dec 20 03:09:03     at org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService$CallbackTask.onProcessingTime(TestProcessingTimeService.java:199)
Dec 20 03:09:03     at org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService.setCurrentTime(TestProcessingTimeService.java:76)
Dec 20 03:09:03     at org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize(AsyncSinkWriterThrottlingTest.java:64)
Dec 20 03:09:03     at java.lang.reflect.Method.invoke(Method.java:498)
```

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
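The "Illegal thread detected" failure above comes from a thread-confinement guard: the task mailbox may only be drained from the thread that owns it, and any call from a foreign thread fails fast instead of racing. A minimal stdlib-only sketch of that kind of guard (hypothetical class and names, not the actual `TaskMailboxImpl` code):

```java
public class MailboxGuardSketch {
    private final Thread mailboxThread;

    MailboxGuardSketch(Thread owner) {
        this.mailboxThread = owner;
    }

    // Mirrors the style of check that fires in the stack trace above:
    // identity-compare the calling thread against the owning thread.
    void checkIsMailboxThread() {
        if (Thread.currentThread() != mailboxThread) {
            throw new IllegalStateException(
                "Illegal thread detected. This method must be called from inside the mailbox thread!");
        }
    }

    public static void main(String[] args) throws Exception {
        MailboxGuardSketch guard = new MailboxGuardSketch(Thread.currentThread());
        guard.checkIsMailboxThread(); // fine on the owning thread

        Thread foreign = new Thread(() -> {
            try {
                guard.checkIsMailboxThread();
            } catch (IllegalStateException e) {
                System.out.println("caught: " + e.getMessage());
            }
        });
        foreign.start();
        foreign.join();
    }
}
```

In the test failure, the processing-time callback fires on the test's own thread rather than the mailbox thread, which is exactly what such a guard is designed to reject.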
Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
I'm delighted to see the progress on this. This is going to be a major enabler for some important use cases. The proposed simplifications (global config and ordered mode) for V1 make a lot of sense to me. +1 David On Wed, Dec 20, 2023 at 12:31 PM Alan Sheinberg wrote: > Thanks for that feedback Lincoln, > > Only one question with the async `timeout` parameter[1](since I > > haven't seen the POC code), current description is: 'The time which can > > pass before a restart strategy is triggered', > > but in the previous flip-232[2] and flip-234[3], in retry scenario, this > > timeout is the total time, do we keep the behavior of the parameter > > consistent? > > That's a good catch. I was intending to use *AsyncWaitOperator*, and to > pass this timeout directly. Looking through the code a bit, it appears > that it doesn't restart the timer on a retry, and this timeout is total, as > you're saying. I do intend on being consistent with the other FLIPs and > retaining this behavior, so I will update the wording on my FLIP to reflect > that. > > -Alan > > On Wed, Dec 20, 2023 at 1:36 AM Lincoln Lee > wrote: > > > +1 for this useful feature! > > Hope this reply isn't too late. Agree that we start with global > > async-scalar configuration and ordered mode first. > > > > @Alan Only one question with the async `timeout` parameter[1](since I > > haven't seen the POC code), current description is: 'The time which can > > pass before a restart strategy is triggered', > > but in the previous flip-232[2] and flip-234[3], in retry scenario, this > > timeout is the total time, do we keep the behavior of the parameter > > consistent? 
> > > > [1] > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support > > [2] > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963 > > [3] > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems > > > > Best, > > Lincoln Lee > > > > > > Alan Sheinberg 于2023年12月20日周三 08:41写道: > > > > > Thanks for the comments Timo. > > > > > > > > > > Can you remove the necessary parts? Esp.: > > > > > > @Override > > > > public Set<FunctionRequirement> getRequirements() { > > > > return Collections.singleton(FunctionRequirement.ORDERED); > > > > } > > > > > > > > > I removed this section from the FLIP since presumably, there's no use in > > > adding to the public API if it's ignored, with handling just ORDERED for > > > the first version. I'm not sure how quickly I'll want to add UNORDERED > > > support, but I guess I can always do another FLIP. > > > > > > Otherwise I have no objections to start a VOTE soonish. If others are > > > > fine as well? > > > > > > That would be great. Any areas that people are interested in > discussing > > > further before a vote? > > > > > > -Alan > > > > > > On Tue, Dec 19, 2023 at 5:49 AM Timo Walther > wrote: > > > > > > > > I would be totally fine with the first version only having ORDERED > > > > > mode. For a v2, we could attempt to do the next most conservative > > > > > thing > > > > > > > > Sounds good to me. > > > > > > > > I also checked AsyncWaitOperator and could not find any access of > > > > StreamRecord's timestamp but only watermarks. But as we said, let's > > > > focus on ORDERED first. > > > > > > > > Can you remove the necessary parts? Esp.: > > > > > > > > @Override > > > > public Set<FunctionRequirement> getRequirements() { > > > > return Collections.singleton(FunctionRequirement.ORDERED); > > > > } > > > > > > > > Otherwise I have no objections to start a VOTE soonish. 
If others are > > > > fine as well? > > > > > > > > Regards, > > > > Timo > > > > > > > > > > > > On 19.12.23 07:32, Alan Sheinberg wrote: > > > > > Thanks for the helpful comments, Xuyang and Timo. > > > > > > > > > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take > > kafka > > > as > > > > >> source and mysql as sink as an example. > > > > >> Although kafka is an append-only source, one of its fields is used > > as > > > pk > > > > >> when writing to mysql. If async udx is executed > > > > >> in an unordered mode, there may be problems with the data in > mysql > > > in > > > > the > > > > >> end. In this case, we need to ensure that > > > > >> the sink-based pk is in order actually. > > > > > > > > > > > > > > > @Xuyang: That's a great point. If some node downstream of my > > operator > > > > > cares about ordering, there's no way for it to reconstruct the > > original > > > > > ordering of the rows as they were input to my operator. So even if > > > they > > > > > want to preserve ordering by key, the order in which they see it > may > > > > > already be incorrect. Somehow I thought that maybe the analysis of > > the > > > > > changelog mode at a given operator was aware of downstream > > operations, > > > > but > > > > > it seems not. > > > > > >
[jira] [Created] (FLINK-33917) IllegalArgumentException: hostname can't be null
Tom created FLINK-33917:
---------------------------

                Summary: IllegalArgumentException: hostname can't be null
                    Key: FLINK-33917
                    URL: https://issues.apache.org/jira/browse/FLINK-33917
                Project: Flink
             Issue Type: Bug
             Components: Kubernetes Operator
               Reporter: Tom

In certain scenarios, if the hostname contains certain characters it will throw an exception when it tries to initialize the InetSocketAddress

```
@Override
public boolean isJobManagerPortReady(Configuration config) {
    final URI uri;
    try (var clusterClient = getClusterClient(config)) {
        uri = URI.create(clusterClient.getWebInterfaceURL());
    } catch (Exception ex) {
        throw new FlinkRuntimeException(ex);
    }
    SocketAddress socketAddress = new InetSocketAddress(uri.getHost(), uri.getPort());
    Socket socket = new Socket();
    try {
        socket.connect(socketAddress, 1000);
        socket.close();
        return true;
    } catch (IOException e) {
        return false;
    }
}
```

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
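One way this can surface (an assumption about the root cause, not something the ticket confirms): `java.net.URI#getHost()` returns `null` when the authority is not a syntactically valid host name — an underscore in a service name is a common trigger — and `new InetSocketAddress(null, port)` then throws `IllegalArgumentException: hostname can't be null`. A small demonstration with illustrative host names:

```java
import java.net.URI;

public class NullHostDemo {
    public static void main(String[] args) {
        // A hyphen is a legal host character; an underscore is not,
        // so the URI parses but getHost() yields null.
        URI ok = URI.create("http://flink-jobmanager:8081");
        URI bad = URI.create("http://flink_jobmanager:8081");

        System.out.println(ok.getHost());  // flink-jobmanager
        System.out.println(bad.getHost()); // null -> InetSocketAddress would throw
    }
}
```

Guarding the `uri.getHost()` result for `null` (or using `uri.getAuthority()` as a fallback) would turn the opaque `IllegalArgumentException` into a clearer error.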
Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
Thanks for that feedback Lincoln, Only one question with the async `timeout` parameter[1](since I > haven't seen the POC code), current description is: 'The time which can > pass before a restart strategy is triggered', > but in the previous flip-232[2] and flip-234[3], in retry scenario, this > timeout is the total time, do we keep the behavior of the parameter > consistent? That's a good catch. I was intending to use *AsyncWaitOperator*, and to pass this timeout directly. Looking through the code a bit, it appears that it doesn't restart the timer on a retry, and this timeout is total, as you're saying. I do intend on being consistent with the other FLIPs and retaining this behavior, so I will update the wording on my FLIP to reflect that. -Alan On Wed, Dec 20, 2023 at 1:36 AM Lincoln Lee wrote: > +1 for this useful feature! > Hope this reply isn't too late. Agree that we start with global > async-scalar configuration and ordered mode first. > > @Alan Only one question with the async `timeout` parameter[1](since I > haven't seen the POC code), current description is: 'The time which can > pass before a restart strategy is triggered', > but in the previous flip-232[2] and flip-234[3], in retry scenario, this > timeout is the total time, do we keep the behavior of the parameter > consistent? > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support > [2] > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963 > [3] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems > > Best, > Lincoln Lee > > > Alan Sheinberg 于2023年12月20日周三 08:41写道: > > > Thanks for the comments Timo. > > > > > > > Can you remove the necessary parts? 
Esp.: > > > > @Override > > > public Set<FunctionRequirement> getRequirements() { > > > return Collections.singleton(FunctionRequirement.ORDERED); > > > } > > > > > > I removed this section from the FLIP since presumably, there's no use in > > adding to the public API if it's ignored, with handling just ORDERED for > > the first version. I'm not sure how quickly I'll want to add UNORDERED > > support, but I guess I can always do another FLIP. > > > > Otherwise I have no objections to start a VOTE soonish. If others are > > > fine as well? > > > > That would be great. Any areas that people are interested in discussing > > further before a vote? > > > > -Alan > > > > On Tue, Dec 19, 2023 at 5:49 AM Timo Walther wrote: > > > > > > I would be totally fine with the first version only having ORDERED > > > > mode. For a v2, we could attempt to do the next most conservative > > > > thing > > > > > > Sounds good to me. > > > > > > I also checked AsyncWaitOperator and could not find any access of > > > StreamRecord's timestamp but only watermarks. But as we said, let's > > > focus on ORDERED first. > > > > > > Can you remove the necessary parts? Esp.: > > > > > > @Override > > > public Set<FunctionRequirement> getRequirements() { > > > return Collections.singleton(FunctionRequirement.ORDERED); > > > } > > > > > > Otherwise I have no objections to start a VOTE soonish. If others are > > > fine as well? > > > > > > Regards, > > > Timo > > > > > > > > > On 19.12.23 07:32, Alan Sheinberg wrote: > > > > Thanks for the helpful comments, Xuyang and Timo. > > > > > > > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take > kafka > > as > > > >> source and mysql as sink as an example. > > > >> Although kafka is an append-only source, one of its fields is used > as > > pk > > > >> when writing to mysql. If async udx is executed > > > >> in an unordered mode, there may be problems with the data in mysql > > in > > > the > > > >> end. 
In this case, we need to ensure that > > > >> the sink-based pk is in order actually. > > > > > > > > > > > > @Xuyang: That's a great point. If some node downstream of my > operator > > > > cares about ordering, there's no way for it to reconstruct the > original > > > > ordering of the rows as they were input to my operator. So even if > > they > > > > want to preserve ordering by key, the order in which they see it may > > > > already be incorrect. Somehow I thought that maybe the analysis of > the > > > > changelog mode at a given operator was aware of downstream > operations, > > > but > > > > it seems not. > > > > > > > > Clear "no" on this. Changelog semantics make the planner complex and > we > > > >> need to be careful. Therefore I would strongly suggest we introduce > > > >> ORDERED and slowly enable UNORDERED whenever we see a good fit for > it > > in > > > >> plans with appropriate planner rules that guard it. > > > > > > > > > > > > @Timo: The better I understand the complexity, the more I agree with > > > this. > > > > I would be totally fine with the first version only having ORDERED > > mode. > > > > For a v2, we could attempt to do the next
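The timeout semantics Alan confirms earlier in this thread — the timeout is a total budget across retry attempts, not a timer restarted per attempt — can be illustrated with a small deterministic sketch (a hypothetical helper, not `AsyncWaitOperator` code):

```java
import java.time.Duration;

public class TotalTimeoutRetry {
    // Hypothetical sketch of the "total time" semantics kept from
    // FLIP-232/FLIP-234: every attempt spends from one shared budget,
    // so the number of retries is bounded by the overall timeout.
    static int attemptsWithin(Duration total, Duration perAttemptCost) {
        long remainingMs = total.toMillis();
        int attempts = 0;
        while (remainingMs >= perAttemptCost.toMillis()) {
            attempts++;
            remainingMs -= perAttemptCost.toMillis(); // budget shrinks; the timer is never reset
        }
        return attempts;
    }

    public static void main(String[] args) {
        // With a 1s total budget and 300ms per attempt, only 3 attempts fit;
        // a per-attempt timeout would instead allow unbounded retries.
        System.out.println(attemptsWithin(Duration.ofSeconds(1), Duration.ofMillis(300))); // 3
    }
}
```

Once the budget is exhausted, the restart strategy is triggered, matching the FLIP's description of the `timeout` option.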
Re: [VOTE] Release 1.18.1, release candidate #2
+1 (non-binding) - Validated hashes - Verified signature - Build the source with Maven - Test with the kafka connector 3.0.2: read and write records from kafka in sql client - Verified web PRs Best, Hang Jiabao Sun 于2023年12月20日周三 16:57写道: > Thanks Jing for driving this release. > > +1 (non-binding) > > - Validated hashes > - Verified signature > - Checked the tag > - Build the source with Maven > - Verified web PRs > > Best, > Jiabao > > > > 2023年12月20日 07:38,Jing Ge 写道: > > > > Hi everyone, > > > > The release candidate #1 has been skipped. Please review and vote on the > > release candidate #2 for the version 1.18.1, > > > > as follows: > > > > [ ] +1, Approve the release > > > > [ ] -1, Do not approve the release (please provide specific comments) > > > > > > The complete staging area is available for your review, which includes: > > * JIRA release notes [1], > > > > * the official Apache source release and binary convenience releases to > be > > deployed to dist.apache.org [2], which are signed with the key with > > fingerprint 96AE0E32CBE6E0753CE6 [3], > > > > * all artifacts to be deployed to the Maven Central Repository [4], > > > > * source code tag "release-1.18.1-rc2" [5], > > > > * website pull request listing the new release and adding announcement > blog > > post [6]. > > > > > > The vote will be open for at least 72 hours. It is adopted by majority > > approval, with at least 3 PMC affirmative votes. > > > > > > [1] > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353640 > > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.1-rc2/ > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > > > [4] > https://repository.apache.org/content/repositories/orgapacheflink-1689 > > > > [5] https://github.com/apache/flink/releases/tag/release-1.18.1-rc2 > > > > [6] https://github.com/apache/flink-web/pull/706 > > > > Thanks, > > Release Manager > >
Re: [DISCUSS] FLIP-377: Support configuration to disable filter push down for Table/SQL Sources
Hi, Thank you to everyone for the discussion on this FLIP, especially Becket for providing guidance that made it more reasonable. The FLIP document[1] has been updated with the recently discussed content. Please take a look to double-check it when you have time. If we can reach a consensus on this, I will open the voting thread in the coming days. Best, Jiabao [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768 > 2023年12月20日 11:38,Jiabao Sun 写道: > > Thanks Becket, > > The behavior description has been added to the Public Interfaces section. > > Best, > Jiabao > > >> 2023年12月20日 08:17,Becket Qin 写道: >> >> Hi Jiabao, >> >> Thanks for updating the FLIP. >> Can you add the behavior of the policies that are only applicable to some >> but not all of the databases? This is a part of the intended behavior of >> the proposed configuration. So, we should include that in the FLIP. >> Otherwise, the FLIP looks good to me. >> >> Cheers, >> >> Jiangjie (Becket) Qin >> >> On Tue, Dec 19, 2023 at 11:00 PM Jiabao Sun >> wrote: >> >>> Hi Becket, >>> >>> I share the same view as you regarding the prefix for this configuration >>> option. >>> >>> For the JDBC connector, I prefer setting 'filter.handling.policy' = 'FOO' >>> and throwing an exception when the database does not support that specific >>> policy. >>> >>> Not using a prefix can reduce the learning curve for users and avoid >>> introducing a new set of configuration options for every supported JDBC >>> database. >>> I think the policies we provide can be compatible with most databases that >>> follow the JDBC protocol. >>> However, there may be cases where certain databases cannot support some >>> policies. >>> Nevertheless, we can ensure fast failure and allow users to choose a >>> suitable policy in such situations. >>> >>> I have removed the contents about the configuration prefix. >>> Please help review it again. 
>>> >>> Thanks, >>> Jiabao >>> >>> 2023年12月19日 19:46,Becket Qin 写道: Hi Jiabao, Thanks for updating the FLIP. One more question regarding the JDBC connector, since it is a connector shared by multiple databases, what if there is a filter handling policy that is only applicable to one of the databases, but not the others? In that case, how would the users specify that policy? Unlike the example of orc format with 2nd+ level config, JDBC connector only looks at the URL to decide which driver to use. For example, MySql supports policy FOO while other databases do not. If users want to use FOO for MySql, what should they do? Will they set '*mysql.filter.handling.policy' = 'FOO', *which will only be picked up when the MySql driver is used? Or should they just set* 'filter.handling.policy' = 'FOO', *and throw exceptions when other JDBC drivers are used? Personally, I prefer the latter. If we pick that, do we still need to mention the following? *The prefix is needed when the option is for a 2nd+ level. * > 'connector' = 'filesystem', > 'format' = 'orc', > 'orc.filter.handling.policy' = 'NUMERIC_ONLY' > > *In this case, the values of this configuration may be different >>> depending > on the format option. For example, orc format may have INDEXED_ONLY >>> while > parquet format may have something else. * > I found this somewhat misleading, because the example here is not a >>> part of the proposal of this FLIP. It is just an example explaining when a prefix is needed, which seems orthogonal to the proposal in this FLIP. Thanks, Jiangjie (Becket) Qin On Tue, Dec 19, 2023 at 10:09 AM Jiabao Sun >> .invalid> wrote: > Thanks Becket for the suggestions, > > Updated. > Please help review it again when you have time. > > Best, > Jiabao > > >> 2023年12月19日 09:06,Becket Qin 写道: >> >> Hi Jiabao, >> >> Thanks for updating the FLIP. It looks better. Some suggestions / > questions: >> >> 1. 
In the motivation section: >> >>> *Currently, Flink Table/SQL does not expose fine-grained control for > users >>> to control filter pushdown. **However, filter pushdown has some side >>> effects, such as additional computational pressure on external >>> systems. Moreover, improper queries can lead to issues such as full > table >>> scans, which in turn can impact the stability of external systems.* >> >> This statement sounds like the side effects are there for all the > systems, >> which is inaccurate. Maybe we can say: >> *Currently, Flink Table/SQL does not expose fine-grained control for > users >> to control filter pushdown. **However, filter pushdown may have side >> effects in some cases, **such as additional computational pressure on >> external systems. The JDBC
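To make the policy idea discussed in this thread concrete: a `filter.handling.policy` decides which predicates a source accepts for pushdown and which are left for Flink to apply. The following is a stdlib-only sketch; the enum values and class names are illustrative, not the FLIP's final API (the FLIP defines the real policy set).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterPolicySketch {
    // Illustrative policy values only; e.g. NUMERIC_ONLY mimics the idea of
    // pushing down just predicates over numeric (likely indexed) fields.
    enum FilterHandlingPolicy { ALWAYS, NEVER, NUMERIC_ONLY }

    record Filter(String field, boolean numeric) {}

    // Returns the filters the source accepts for pushdown;
    // the remainder stays in the Flink plan.
    static List<Filter> accepted(FilterHandlingPolicy policy, List<Filter> filters) {
        Predicate<Filter> accept = switch (policy) {
            case ALWAYS       -> f -> true;
            case NEVER        -> f -> false;
            case NUMERIC_ONLY -> Filter::numeric;
        };
        List<Filter> pushed = new ArrayList<>();
        for (Filter f : filters) {
            if (accept.test(f)) pushed.add(f);
        }
        return pushed;
    }

    public static void main(String[] args) {
        List<Filter> filters = List.of(new Filter("id", true), new Filter("name", false));
        // Only the numeric predicate is pushed down under NUMERIC_ONLY.
        System.out.println(accepted(FilterHandlingPolicy.NUMERIC_ONLY, filters)); // [Filter[field=id, numeric=true]]
    }
}
```

A connector rejecting an unsupported policy with an exception — the fail-fast behavior Jiabao prefers above — would simply throw before building such a predicate.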
[DISCUSS] Release flink-connector-mongodb v1.1.0
Hi, Now that Flink 1.18 is released, I propose releasing flink-connector-mongodb v1.1.0 to support Flink 1.18, and nominating Leonard Xu as the release manager. This release includes support for filter pushdown[1] and JDK 17[2]. We plan to proceed with the release after merging these two pull requests. If you have time, please help review them as well. Please let me know what you think of this proposal. Best, Jiabao Sun [1] https://github.com/apache/flink-connector-mongodb/pull/17 [2] https://github.com/apache/flink-connector-mongodb/pull/21
Re: [DISCUSS] Should Configuration support getting value based on String key?
Thanks Xuannan and Xintong for the quick feedback! > However, we should let the user know that we encourage using ConfigOptions over the string-based > configuration key, as Timo said, we should add the message to `String > getString(String key, String defaultValue)` method. Sure, we can add some comments to guide users to use ConfigOption. If so, I will remove the @Deprecated annotation from the `getString(String key, String defaultValue)` method, and add some comments for it. Also, it's indeed a small change related to a public class (or API); is voting necessary? Best, Rui On Wed, 20 Dec 2023 at 16:55, Xintong Song wrote: > Sounds good to me. > > Best, > > Xintong > > > > On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su wrote: > > > Hi Rui, > > > > I am fine with keeping the `String getString(String key, String > > defaultValue)` if more people favor it. However, we should let the > > user know that we encourage using ConfigOptions over the string-based > > configuration key, as Timo said, we should add the message to `String > > getString(String key, String defaultValue)` method. > > > > Best, > > Xuannan > > > > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote: > > > > > > > > I noticed that Configuration is used in > > > > > DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig > > > > > to store some cacheFile meta-information. Their keys are > > > > > temporary (key names with numbers) and it is not convenient > > > > > to predefine ConfigOption. > > > > > > > > > > > > True, this one requires a bit more effort to migrate from string-key > to > > > > ConfigOption, but still should be doable. Looking at how the two > > mentioned > > > > methods are implemented and used, it seems what is really needed is > > > > serialization and deserialization of `DistributedCacheEntry`-s. And > > all the > > > > entries are always written / read at once. 
So I think we can > serialize > > the > > > > whole set of entries into a JSON string (or something similar), and > > use one > > > > ConfigOption with a deterministic key for it, rather than having one > > > > ConfigOption for each field in each entry. WDYT? > > > > > > Hi Xintong, thanks for the good suggestion! Most of the entries can be > > > serialized to a json string, and we can only write/read them at once. > > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to > > > store it by the setBytes/getBytes. > > > > > > Also, I had an offline discussion with @Xuannan Su : refactoring all > > code > > > that uses String as key requires a separate FLIP. And we will provide > > > a detailed FLIP later. > > > > > > -- > > > > > > Hi all, thanks everyone for the lively discussion. It's really a > > trade-off to > > > keep "String getString(String key, String defaultValue)" or not. > > > (It's not a right or wrong thing.) > > > > > > Judging from the discussion, most discussants can accept keeping > > > `String getString(String key, String defaultValue)` and deprecating the > > > rest of `getXxx(String key, Xxx defaultValue)`. > > > > > > cc @Xintong Song @Xuannan Su , WDYT? > > > > > > Best, > > > Rui > > > > > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu wrote: > > >> > > >> I think it's not clear whether forcing using ConfigOption would hurt > > >> the user experience. > > >> > > >> Maybe it does at the beginning, because using string keys to access > > >> Flink configuration can be simpler for new components/jobs. > > >> However, problems may happen later if the configuration usages become > > >> more complex, like key renaming, using types other than strings, and > > >> other problems that ConfigOption was invented to address. > > >> > > >> Personally I prefer to encourage the usage of ConfigOption. > > >> Jobs should use GlobalJobParameter for custom config, which is > different > > >> from the Configuration interface. 
Therefore, Configuration is mostly > > >> used in other components/plugins, in which case the long-term > > maintenance > > >> can be important. > > >> > > >> However, since it is not a right or wrong choice, I'd also be fine > > >> to keep the `getString()` method if more devs/users are in favor of > it. > > >> > > >> Thanks, > > >> Zhu > > >> > > >> Timo Walther 于2023年12月14日周四 17:41写道: > > >> > > >> > The configuration in Flink is complicated and I fear we won't have > > >> > enough capacity to substantially fix it. The introduction of > > >> > ReadableConfig, WritableConfig, and typed ConfigOptions was a right > > step > > >> > into making the code more maintainable. From the Flink side, every > > read > > >> > access should go through ConfigOption. > > >> > > > >> > However, I also understand Gyula's pragmatism here because (practically > > >> > speaking) users get access to `getString()` via `toMap().get()`. So I'm > > >> > fine with removing the deprecation for functionality that is available > > >> >
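The trade-off this thread debates — raw string keys versus typed `ConfigOption`s — can be made concrete with a stdlib-only miniature. This mimics the shape of Flink's `ConfigOption` pattern; it is not the real class, and the option key used is only an example.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedOptionSketch {
    // A miniature of the ConfigOption idea: the key, type, and default
    // live in one declared place, so renames and type changes have a
    // single owner instead of being scattered across call sites.
    record Option<T>(String key, T defaultValue) {}

    static final Option<Integer> PARALLELISM = new Option<>("parallelism.default", 1);

    static <T> T get(Map<String, Object> conf, Option<T> option) {
        @SuppressWarnings("unchecked")
        T value = (T) conf.getOrDefault(option.key(), option.defaultValue());
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("parallelism.default", 4);

        // Typed access: no stringly-typed key or unchecked cast at the call site.
        System.out.println(get(conf, PARALLELISM)); // 4

        // The string-key style under discussion: convenient but unchecked.
        System.out.println(conf.getOrDefault("parallelism.default", 1)); // 4
    }
}
```

The string-key form is exactly what `toMap().get()` gives users today, which is why removing the typed option's string-based sibling is more a nudge than a hard guarantee.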
Re: [DISCUSS] FLIP-398: Improve Serialization Configuration And Usage In Flink
Hi Ken, Thanks for your feedback. The purpose of this FLIP is to improve the use of serialization, including configurable serializer for users, providing serializer for composite data types, and resolving the default enabling of Kryo, etc. Introducing a better serialization framework would be a great help for Flink's performance, and it's great to see your tests on Fury. However, as @Xintong mentioned, this could be a huge work and beyond the scope of this FLIP. If you're interested, I think we could create a new FLIP for it and discuss it further. What do you think? Thanks. Best, Fang Yong On Mon, Dec 18, 2023 at 11:16 AM Xintong Song wrote: > Hi Ken, > > I think the main purpose of this FLIP is to change how users interact with > the knobs for customizing the serialization behaviors, from requiring code > changes to working with pure configurations. Redesigning the knobs (i.e., > names, semantics, etc.), on the other hand, is not the purpose of this > FLIP. Preserving the existing names and semantics should also help minimize > the migration cost for existing users. Therefore, I'm in favor of not > changing them. > > Concerning decoupling from Kryo, and introducing other serialization > frameworks like Fury, I think that's a bigger topic that is worth further > discussion. At the moment, I'm not aware of any community consensus on > doing so. And even if in the future we decide to do so, the changes needed > should be the same w/ or w/o this FLIP. So I'd suggest not to block this > FLIP on these issues. > > WDYT? > > Best, > > Xintong > > > > On Fri, Dec 15, 2023 at 1:40 AM Ken Krugler > wrote: > > > Hi Yong, > > > > Looks good, thanks for creating this. > > > > One comment - related to my recent email about Fury, I would love to see > > the v2 serialization decoupled from Kryo. > > > > As part of that, instead of using xxxKryo in methods, call them > xxxGeneric. > > > > A more extreme change would be to totally rely on Fury (so no more POJO > > serializer). 
Fury is faster than the POJO serializer in my tests, but > this > > would be a much bigger change. > > > > Though it could dramatically simplify the Flink serialization support. > > > > — Ken > > > > PS - a separate issue is how to migrate state from Kryo to something like > > Fury, which supports schema evolution. I think this might be possible, by > > having a smarter deserializer that identifies state as being created by > > Kryo, and using (shaded) Kryo to deserialize, while still writing as > Fury. > > > > > On Dec 6, 2023, at 6:35 PM, Yong Fang wrote: > > > > > > Hi devs, > > > > > > I'd like to start a discussion about FLIP-398: Improve Serialization > > > Configuration And Usage In Flink [1]. > > > > > > Currently, users can register custom data types and serializers in > Flink > > > jobs through various methods, including registration in code, > > > configuration, and annotations. These lead to difficulties in upgrading > > > Flink jobs and priority issues. > > > > > > In flink-2.0 we would like to manage job data types and serializers > > through > > > configurations. This FLIP will introduce a unified option for data type > > and > > > serializer and users can configure all custom data types and > > > pojo/kryo/custom serializers. In addition, this FLIP will add more > > built-in > > > serializers for complex data types such as List and Map, and optimize > the > > > management of Avro Serializers. > > > > > > Looking forward to hearing from you, thanks! > > > > > > [1] > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink > > > > > > Best, > > > Fang Yong > > > > -- > > Ken Krugler > > http://www.scaleunlimited.com > > Custom big data solutions > > Flink & Pinot > > > > > > > > >
[jira] [Created] (FLINK-33916) Workflow: Add nightly build for release-1.18
Matthias Pohl created FLINK-33916:
-------------------------------------

                Summary: Workflow: Add nightly build for release-1.18
                    Key: FLINK-33916
                    URL: https://issues.apache.org/jira/browse/FLINK-33916
                Project: Flink
             Issue Type: Sub-task
             Components: Build System / CI
       Affects Versions: 1.19.0
               Reporter: Matthias Pohl

Add nightly workflow for {{release-1.18}}.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (FLINK-33915) Workflow: Add nightly build for the dev version (currently called "master")
Matthias Pohl created FLINK-33915:
-------------------------------------

                Summary: Workflow: Add nightly build for the dev version (currently called "master")
                    Key: FLINK-33915
                    URL: https://issues.apache.org/jira/browse/FLINK-33915
                Project: Flink
             Issue Type: Sub-task
             Components: Build System / CI
       Affects Versions: 1.19.0
               Reporter: Matthias Pohl

The nightly builds run on master and the two most-recently released versions of Flink as those are the supported versions. This logic is currently captured in [flink-ci/git-repo-sync:sync_repo.sh|https://github.com/flink-ci/git-repo-sync/blob/master/sync_repo.sh#L28]. In [FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure] we decided to go ahead and provide nightly builds for {{master}} and {{release-1.18}}. This issue is about providing the nightly workflow for {{master}}.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (FLINK-33913) Template: Add CI template for running Flink's test suite
Matthias Pohl created FLINK-33913: - Summary: Template: Add CI template for running Flink's test suite Key: FLINK-33913 URL: https://issues.apache.org/jira/browse/FLINK-33913 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl We want to have a template that runs the entire Flink test suite.
[jira] [Created] (FLINK-33914) Workflow: Add basic CI that will run with the default configuration
Matthias Pohl created FLINK-33914: - Summary: Workflow: Add basic CI that will run with the default configuration Key: FLINK-33914 URL: https://issues.apache.org/jira/browse/FLINK-33914 Project: Flink Issue Type: Sub-task Reporter: Matthias Pohl Runs the Flink CI template with the default configuration (Java 8) and can be triggered on each push to the branch.
[jira] [Created] (FLINK-33912) Template: Add CI template for pre-compile steps
Matthias Pohl created FLINK-33912: - Summary: Template: Add CI template for pre-compile steps Key: FLINK-33912 URL: https://issues.apache.org/jira/browse/FLINK-33912 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl We want to have a template that triggers all checks that do not require compilation. Those quick checks (e.g. code format) can run without waiting for the compilation step to succeed.
[jira] [Created] (FLINK-33911) Custom Action: Select workflow configuration
Matthias Pohl created FLINK-33911: - Summary: Custom Action: Select workflow configuration Key: FLINK-33911 URL: https://issues.apache.org/jira/browse/FLINK-33911 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl During experiments, we noticed that the GHA UI cannot handle an arbitrary number of nested workflow compositions. If we get into the 3rd level of composite workflows, the job name will be cut off in the left menu, which makes navigating the jobs harder (because you have duplicates of the same job, e.g. Compile, belonging to different job profiles). As a workaround, we came up with Flink CI workflow profiles to configure the CI template yaml that is used in every job. A profile configuration can be specified through a JSON file that lives in the {{.github/workflow}} folder.
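The ticket only says the profile is a JSON file; as a purely hypothetical illustration (every field name below is invented for this sketch, not taken from the actual prototype), such a profile could parametrize the CI template like this:

```json
{
  "profile_name": "default",
  "jdk_version": 8,
  "enable_e2e_tests": false,
  "maven_profiles": []
}
```

The selector action would read this file and pass the values as inputs to the shared CI template workflow.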
[jira] [Created] (FLINK-33910) Custom Action: Enable Java version in Flink's CI Docker image
Matthias Pohl created FLINK-33910: - Summary: Custom Action: Enable Java version in Flink's CI Docker image Key: FLINK-33910 URL: https://issues.apache.org/jira/browse/FLINK-33910 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl Flink's CI Docker image comes with multiple Java versions which can be enabled through environment variables. We should have a custom action that sets these variables properly.
[jira] [Created] (FLINK-33909) Custom Action: Select the right branch and commit hash
Matthias Pohl created FLINK-33909: - Summary: Custom Action: Select the right branch and commit hash Key: FLINK-33909 URL: https://issues.apache.org/jira/browse/FLINK-33909 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl For nightly builds, we want to select the release branches dynamically (rather than using the automatic selection through the GHA schedule). We want to do this dynamically because the GHA feature for branch selection seems to be rather limited right now, e.g.: * Don't run a branch that hasn't had any changes in the past day (or any other time period) * Run only the most recent release branches and ignore older release branches (similar to what we're doing in [flink-ci/git-repo-sync:sync_repo.sh|https://github.com/flink-ci/git-repo-sync/blob/master/sync_repo.sh#L28] right now) A custom action that selects the branch and commit hash enables us to overwrite this setting.
[jira] [Created] (FLINK-33908) Custom Action: Move files within the Docker image to the root folder to match the user
Matthias Pohl created FLINK-33908: - Summary: Custom Action: Move files within the Docker image to the root folder to match the user Key: FLINK-33908 URL: https://issues.apache.org/jira/browse/FLINK-33908 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl The way the CI template is set up (right now) is to work in the root user's home folder. For this we're copying the checkout into /root. This copying is done in multiple places, which makes it a candidate for a custom action.
[jira] [Created] (FLINK-33907) Makes copying test jars being done earlier
Matthias Pohl created FLINK-33907: - Summary: Makes copying test jars being done earlier Key: FLINK-33907 URL: https://issues.apache.org/jira/browse/FLINK-33907 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl We experienced an issue in GHA which is due to how test resources are pre-computed in GHA:
{code:java}
This fixes the following error when compiling flink-clients:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.2.0:copy-dependencies (copy-dependencies) on project flink-clients: Artifact has not been packaged yet. When used on reactor artifact, copy should be executed after packaging: see MDEP-187. -> [Help 1]
{code}
We need to move this goal to a later phase.
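Per the MDEP-187 hint in the error, `copy-dependencies` must run after the reactor artifact has been packaged. A sketch of the relevant `pom.xml` fragment — the execution id and plugin version are taken from the error message above, but the exact phase chosen in the actual fix may differ:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <version>3.2.0</version>
  <executions>
    <execution>
      <id>copy-dependencies</id>
      <!-- bind to a phase at or after packaging so the reactor artifact exists -->
      <phase>package</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```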
[jira] [Created] (FLINK-33906) tools/azure-pipelines/debug_files_utils.sh should support GHA output as well
Matthias Pohl created FLINK-33906: - Summary: tools/azure-pipelines/debug_files_utils.sh should support GHA output as well Key: FLINK-33906 URL: https://issues.apache.org/jira/browse/FLINK-33906 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl {{tools/azure-pipelines/debug_files_utils.sh}} sets variables to reference the debug output. This is backend-specific and only supports Azure CI right now. We should add support for GHA.
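The two backends expose variables through different mechanisms: Azure Pipelines uses a `##vso[task.setvariable]` logging command, while GitHub Actions appends `key=value` lines to the file named by `$GITHUB_OUTPUT`. A minimal sketch of a backend-aware helper — the function name and the variable it sets are invented here, not the actual contents of `debug_files_utils.sh`:

```shell
#!/usr/bin/env bash
# Illustrative sketch: emit a CI variable on whichever backend we run on.
set_output_variable() {
  local name="$1"
  local value="$2"
  if [ -n "${GITHUB_ACTIONS:-}" ]; then
    # GitHub Actions: append key=value to the file named by $GITHUB_OUTPUT
    echo "${name}=${value}" >> "${GITHUB_OUTPUT}"
  else
    # Azure Pipelines: use the task.setvariable logging command
    echo "##vso[task.setvariable variable=${name}]${value}"
  fi
}

set_output_variable "DEBUG_FILES_OUTPUT_DIR" "/tmp/debug-files"
```

Keeping the branch inside one helper means the call sites in the script stay backend-agnostic.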
[jira] [Created] (FLINK-33905) FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs
Wencong Liu created FLINK-33905: --- Summary: FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs Key: FLINK-33905 URL: https://issues.apache.org/jira/browse/FLINK-33905 Project: Flink Issue Type: Improvement Components: API / Core Affects Versions: 1.19.0 Reporter: Wencong Liu This ticket is proposed for [FLIP-382|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
[jira] [Created] (FLINK-33904) Add zip as a package to GitHub Actions runners
Matthias Pohl created FLINK-33904: - Summary: Add zip as a package to GitHub Actions runners Key: FLINK-33904 URL: https://issues.apache.org/jira/browse/FLINK-33904 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl FLINK-33253 shows that {{test_pyflink.sh}} fails in GHA because it doesn't find {{zip}}. We should add this as a dependency in the e2e test.
{code:java}
/root/flink/flink-end-to-end-tests/test-scripts/test_pyflink.sh: line 107: zip: command not found
{code}
[jira] [Created] (FLINK-33903) Reenable tests that edit file permissions
Matthias Pohl created FLINK-33903: - Summary: Reenable tests that edit file permissions Key: FLINK-33903 URL: https://issues.apache.org/jira/browse/FLINK-33903 Project: Flink Issue Type: Sub-task Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl In GitHub Actions, file permissions seem to work differently from how they work in Azure CI. In both cases, we run the test suite as root. But on GHA runners, file permission changes won't have any effect because tests started as the root user have permission to adapt the files in any case. This issue is about enabling the tests again by rewriting them (ideally, because we shouldn't rely on OS features in the tests). Alternatively, we find a way to make those tests work in GHA as well.
[jira] [Created] (FLINK-33902) Switch to OpenSSL legacy algorithms
Matthias Pohl created FLINK-33902: - Summary: Switch to OpenSSL legacy algorithms Key: FLINK-33902 URL: https://issues.apache.org/jira/browse/FLINK-33902 Project: Flink Issue Type: Sub-task Components: Build System Affects Versions: 1.19.0 Reporter: Matthias Pohl In FLINK-33550 we discovered that the GHA runners provided by GitHub have a newer version of OpenSSL installed, which caused errors in the SSL tests:
{code:java}
Certificate was added to keystore
Certificate was added to keystore
Certificate reply was installed in keystore
Error outputting keys and certificates
40F767F1D97F:error:0308010C:digital envelope routines:inner_evp_generic_fetch:unsupported:../crypto/evp/evp_fetch.c:349:Global default library context, Algorithm (RC2-40-CBC : 0), Properties ()
Nov 14 15:39:21 [FAIL] Test script contains errors.
{code}
The workaround is to enable legacy algorithms using the {{-legacy}} parameter in 3.0.0+. We might need to check whether that works for older OpenSSL versions (in Azure CI).
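OpenSSL 3.x moved RC2-40-CBC (which the failing PKCS#12 export relies on) into the separate "legacy" provider, and older OpenSSL versions don't know the `-legacy` flag at all, so the flag has to be applied conditionally. A sketch of that version guard — the certificate and keystore file names here are examples, not the actual test-script contents:

```shell
#!/usr/bin/env bash
set -e

# Generate a throwaway self-signed certificate (example names only).
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem \
  -days 1 -nodes -subj "/CN=flink-test" 2>/dev/null

# OpenSSL 3.x needs -legacy for RC2-40-CBC; older versions reject the flag.
LEGACY_FLAG=""
if openssl version | grep -q "^OpenSSL 3"; then
  LEGACY_FLAG="-legacy"
fi

# Export a PKCS#12 keystore, enabling the legacy provider only where needed.
openssl pkcs12 -export ${LEGACY_FLAG} \
  -in cert.pem -inkey key.pem \
  -out keystore.p12 -passout pass:password -name flink

echo "created keystore.p12"
```

The guard keeps the same script working on both the newer GHA runners and the older OpenSSL in Azure CI.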
[jira] [Created] (FLINK-33901) Trial Period: GitHub Actions
Matthias Pohl created FLINK-33901: - Summary: Trial Period: GitHub Actions Key: FLINK-33901 URL: https://issues.apache.org/jira/browse/FLINK-33901 Project: Flink Issue Type: New Feature Components: Build System / CI Affects Versions: 1.19.0 Reporter: Matthias Pohl This issue collects all the subtasks that are necessary to initiate the trial phase for GitHub Actions (as discussed in [FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure]), in contrast to FLINK-27075, which is used for issues that were collected while preparing [FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure].
[jira] [Created] (FLINK-33900) Multiple failures in WindowRankITCase due to NoResourceAvailableException
Matthias Pohl created FLINK-33900: - Summary: Multiple failures in WindowRankITCase due to NoResourceAvailableException Key: FLINK-33900 URL: https://issues.apache.org/jira/browse/FLINK-33900 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Matthias Pohl [https://github.com/XComp/flink/actions/runs/7244405295/job/19733011527#step:12:14989] There are multiple tests in {{WindowRankITCase}} that fail, supposedly due to a {{NoResourceAvailableException}}:
{code:java}
[...]
09:19:32.966 [ERROR] WindowRankITCase.testTumbleWindowTVFWithOffset  Time elapsed: 300.072 s  <<< FAILURE!
org.opentest4j.MultipleFailuresError:
Multiple Failures (2 failures)
  org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
  java.lang.AssertionError:
at org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:200)
at org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:248)
at org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:214)
at org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:88)
at org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
at org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
at org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
at org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
{code}
[jira] [Created] (FLINK-33899) Java 17 and 21 support for mongodb connector
Jiabao Sun created FLINK-33899: -- Summary: Java 17 and 21 support for mongodb connector Key: FLINK-33899 URL: https://issues.apache.org/jira/browse/FLINK-33899 Project: Flink Issue Type: Improvement Components: Connectors / MongoDB Affects Versions: mongodb-1.0.2 Reporter: Jiabao Sun Fix For: mongodb-1.1.0 Now that FLINK-33302 is finished, it is possible to specify the JDK version. That allows us to add JDK 17 and JDK 21 support.
[jira] [Created] (FLINK-33898) Allow triggering unaligned checkpoint via REST api
Zakelly Lan created FLINK-33898: --- Summary: Allow triggering unaligned checkpoint via REST api Key: FLINK-33898 URL: https://issues.apache.org/jira/browse/FLINK-33898 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing, Runtime / REST Reporter: Zakelly Lan Assignee: Zakelly Lan See FLINK-33897.
[jira] [Created] (FLINK-33897) Allow triggering unaligned checkpoint via CLI
Zakelly Lan created FLINK-33897: --- Summary: Allow triggering unaligned checkpoint via CLI Key: FLINK-33897 URL: https://issues.apache.org/jira/browse/FLINK-33897 Project: Flink Issue Type: Improvement Components: Command Line Client, Runtime / Checkpointing Reporter: Zakelly Lan Since FLINK-6755, users can trigger a checkpoint through the CLI. However, I noticed there would be value in supporting triggering it in an unaligned way, since the job may encounter high back-pressure and an aligned checkpoint would fail. I suggest we provide an option '-unaligned' in the CLI to support that. A similar option would also be useful for the REST API.
Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support
+1 for this useful feature! Hope this reply isn't too late. Agree that we start with the global async-scalar configuration and ordered mode first.

@Alan: Only one question about the async `timeout` parameter [1] (since I haven't seen the POC code). The current description is: 'The time which can pass before a restart strategy is triggered', but in the earlier FLIP-232 [2] and FLIP-234 [3], in the retry scenario, this timeout is the total time. Should we keep the behavior of the parameter consistent?

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
[2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems

Best,
Lincoln Lee

On Wed, Dec 20, 2023 at 08:41, Alan Sheinberg wrote:
> Thanks for the comments Timo.
>
> > Can you remove the unnecessary parts? Esp.:
> > @Override
> > public Set<FunctionRequirement> getRequirements() {
> >     return Collections.singleton(FunctionRequirement.ORDERED);
> > }
>
> I removed this section from the FLIP since presumably there's no use in adding it to the public API if it's ignored, with handling just ORDERED for the first version. I'm not sure how quickly I'll want to add UNORDERED support, but I guess I can always do another FLIP.
>
> > Otherwise I have no objections to start a VOTE soonish. If others are fine as well?
>
> That would be great. Any areas that people are interested in discussing further before a vote?
>
> -Alan
>
> On Tue, Dec 19, 2023 at 5:49 AM Timo Walther wrote:
> > > I would be totally fine with the first version only having ORDERED mode. For a v2, we could attempt to do the next most conservative thing
> >
> > Sounds good to me.
> >
> > I also checked AsyncWaitOperator and could not find an access of StreamRecord's timestamp, but only watermarks. But as we said, let's focus on ORDERED first.
> >
> > Can you remove the unnecessary parts? Esp.:
> >
> > @Override
> > public Set<FunctionRequirement> getRequirements() {
> >     return Collections.singleton(FunctionRequirement.ORDERED);
> > }
> >
> > Otherwise I have no objections to start a VOTE soonish. If others are fine as well?
> >
> > Regards,
> > Timo
> >
> > On 19.12.23 07:32, Alan Sheinberg wrote:
> > > Thanks for the helpful comments, Xuyang and Timo.
> > >
> > > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take Kafka as source and MySQL as sink as an example. Although Kafka is an append-only source, one of its fields is used as pk when writing to MySQL. If the async udx is executed in an unordered mode, there may be problems with the data in MySQL in the end. In this case, we need to ensure that the sink-based pk is in order actually.
> > >
> > > @Xuyang: That's a great point. If some node downstream of my operator cares about ordering, there's no way for it to reconstruct the original ordering of the rows as they were input to my operator. So even if they want to preserve ordering by key, the order in which they see it may already be incorrect. Somehow I thought that maybe the analysis of the changelog mode at a given operator was aware of downstream operations, but it seems not.
> > >
> > > > Clear "no" on this. Changelog semantics make the planner complex and we need to be careful. Therefore I would strongly suggest we introduce ORDERED and slowly enable UNORDERED whenever we see a good fit for it in plans with appropriate planner rules that guard it.
> > >
> > > @Timo: The better I understand the complexity, the more I agree with this. I would be totally fine with the first version only having ORDERED mode. For a v2, we could attempt to do the next most conservative thing and only allow UNORDERED when the whole graph is in *INSERT* changelog mode. The next best type of optimization might understand what's the key required downstream, and allow breaking the original order only between unrelated keys, while maintaining it between rows of the same key. Of course, if the key used downstream is computed in some manner, that makes it all the harder to know this beforehand.
> > >
> > > > So unordering should be fine *within* watermarks. This is also what watermarks are good for, a trade-off between strict ordering and making progress. The async operator from DataStream API also supports this if I remember correctly. However, it assumes a timestamp is present in StreamRecord on which it can work. But this is not the case within the SQL engine.
> > >
> > > *AsyncWaitOperator* and *UnorderedStreamElementQueue* (the implementations I plan on using) seem to support exactly this behavior. I don't
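The ORDERED semantics discussed above can be illustrated with a small self-contained sketch (this is not Flink's AsyncWaitOperator; the class and method names are invented for illustration): async calls may complete in any order, but results are only emitted once every earlier element has also completed, so input order is preserved downstream.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch of ORDERED async emission: hold back results until
// the whole completed prefix (in input order) can be emitted.
public class OrderedEmitSketch {
    final Queue<CompletableFuture<String>> inFlight = new ArrayDeque<>();
    final List<String> emitted = new ArrayList<>();

    // Register an element in input order; returns its result slot.
    CompletableFuture<String> register() {
        CompletableFuture<String> future = new CompletableFuture<>();
        inFlight.add(future);
        return future;
    }

    // Emit only the completed prefix, preserving input order.
    void emitCompletedPrefix() {
        while (!inFlight.isEmpty() && inFlight.peek().isDone()) {
            emitted.add(inFlight.poll().join());
        }
    }

    public static void main(String[] args) {
        OrderedEmitSketch op = new OrderedEmitSketch();
        CompletableFuture<String> a = op.register();
        CompletableFuture<String> b = op.register();

        b.complete("b");              // the second element finishes first...
        op.emitCompletedPrefix();
        System.out.println(op.emitted); // [] -- "b" is held back

        a.complete("a");
        op.emitCompletedPrefix();
        System.out.println(op.emitted); // [a, b] -- input order preserved
    }
}
```

UNORDERED mode would emit each result as soon as its future completes, which is exactly what makes it unsafe when a downstream operator depends on ordering by key.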
[jira] [Created] (FLINK-33896) Implement restore tests for Correlate node
Jacky Lau created FLINK-33896: - Summary: Implement restore tests for Correlate node Key: FLINK-33896 URL: https://issues.apache.org/jira/browse/FLINK-33896 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Jacky Lau Fix For: 1.19.0
[jira] [Created] (FLINK-33895) Implement restore tests for PythonGroupWindowAggregate node
Jacky Lau created FLINK-33895: - Summary: Implement restore tests for PythonGroupWindowAggregate node Key: FLINK-33895 URL: https://issues.apache.org/jira/browse/FLINK-33895 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Jacky Lau Fix For: 1.19.0
[jira] [Created] (FLINK-33894) Implement restore tests for PythonGroupAggregate node
Jacky Lau created FLINK-33894: - Summary: Implement restore tests for PythonGroupAggregate node Key: FLINK-33894 URL: https://issues.apache.org/jira/browse/FLINK-33894 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Jacky Lau Fix For: 1.19.0
[jira] [Created] (FLINK-33893) Implement restore tests for PythonCorrelate node
Jacky Lau created FLINK-33893: - Summary: Implement restore tests for PythonCorrelate node Key: FLINK-33893 URL: https://issues.apache.org/jira/browse/FLINK-33893 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Jacky Lau Fix For: 1.19.0
Re: [VOTE] Release 1.18.1, release candidate #2
Thanks Jing for driving this release.

+1 (non-binding)

- Validated hashes
- Verified signature
- Checked the tag
- Built the source with Maven
- Verified web PRs

Best,
Jiabao

> On Dec 20, 2023 at 07:38, Jing Ge wrote:
>
> Hi everyone,
>
> The release candidate #1 has been skipped. Please review and vote on the release candidate #2 for the version 1.18.1, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.18.1-rc2" [5],
> * website pull request listing the new release and adding the announcement blog post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353640
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.1-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1689
> [5] https://github.com/apache/flink/releases/tag/release-1.18.1-rc2
> [6] https://github.com/apache/flink-web/pull/706
>
> Thanks,
> Release Manager
Re: [DISCUSS] Should Configuration support getting value based on String key?
Sounds good to me.

Best,
Xintong

On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su wrote:
> Hi Rui,
>
> I am fine with keeping `String getString(String key, String defaultValue)` if more people favor it. However, we should let the user know that we encourage using ConfigOptions over the string-based configuration key. As Timo said, we should add the message to the `String getString(String key, String defaultValue)` method.
>
> Best,
> Xuannan
>
> On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote:
> > > > I noticed that Configuration is used in DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig to store some cache-file meta-information. Their keys are temporary (key names with numbers) and it is not convenient to predefine a ConfigOption.
> > >
> > > True, this one requires a bit more effort to migrate from string key to ConfigOption, but it still should be doable. Looking at how the two mentioned methods are implemented and used, it seems what is really needed is serialization and deserialization of `DistributedCacheEntry`-s. And all the entries are always written / read at once. So I think we can serialize the whole set of entries into a JSON string (or something similar), and use one ConfigOption with a deterministic key for it, rather than having one ConfigOption for each field in each entry. WDYT?
> >
> > Hi Xintong, thanks for the good suggestion! Most of the entries can be serialized to a JSON string, and we can only write/read them at once. The CACHE_FILE_BLOB_KEY is special: its type is byte[], so we need to store it via setBytes/getBytes.
> >
> > Also, I had an offline discussion with @Xuannan Su: refactoring all code that uses String as key requires a separate FLIP. We will provide a detailed FLIP later.
> >
> > --
> >
> > Hi all, thanks everyone for the lively discussion. It's really a trade-off whether to keep "String getString(String key, String defaultValue)" or not. (It's not a right or wrong thing.)
> >
> > Judging from the discussion, most discussants can accept keeping `String getString(String key, String defaultValue)` and deprecating the rest of the `getXxx(String key, Xxx defaultValue)` methods.
> >
> > cc @Xintong Song @Xuannan Su, WDYT?
> >
> > Best,
> > Rui
> >
> > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu wrote:
> > > I think it's not clear whether forcing the use of ConfigOption would hurt the user experience.
> > >
> > > Maybe it does at the beginning, because using string keys to access the Flink configuration can be simpler for new components/jobs. However, problems may happen later if the configuration usages become more complex, like key renaming, using types other than strings, and other problems that ConfigOption was invented to address.
> > >
> > > Personally I prefer to encourage the usage of ConfigOption. Jobs should use GlobalJobParameter for custom config, which is different from the Configuration interface. Therefore, Configuration is mostly used in other components/plugins, in which case long-term maintenance can be important.
> > >
> > > However, since it is not a right or wrong choice, I'd also be fine with keeping the `getString()` method if more devs/users are in favor of it.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > On Thu, Dec 14, 2023 at 17:41, Timo Walther wrote:
> > > > The configuration in Flink is complicated and I fear we won't have enough capacity to substantially fix it. The introduction of ReadableConfig, WritableConfig, and typed ConfigOptions was a right step towards making the code more maintainable. From the Flink side, every read access should go through ConfigOption.
> > > >
> > > > However, I also understand Gyula's pragmatism here because (practically speaking) users get access to `getString()` via `toMap().get()`. So I'm fine with removing the deprecation for functionality that is available anyway. We should, however, add a message to `getString()` that this method is discouraged and `get(ConfigOption)` should be the preferred way of accessing Configuration.
> > > >
> > > > In any case we should remove getInt and related methods.
> > > >
> > > > Cheers,
> > > > Timo
> > > >
> > > > On 14.12.23 09:56, Gyula Fóra wrote:
> > > > > I see a strong value for user-facing configs to use ConfigOption, and this should definitely be an enforced convention.
> > > > >
> > > > > However, with the Flink project growing and many other components and even users using the Configuration object, I really don't think that we should "force" this on the users/developers.
> > > > >
> > > > > If we make fromMap / toMap free with basically no overhead, that is fine, but otherwise I think it would hurt the
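The trade-off debated in the thread — typed `ConfigOption` access vs. the stringly-typed `getString(String, String)` — can be made concrete with a small self-contained sketch. This is a deliberately simplified stand-in, not Flink's actual `Configuration`/`ConfigOption` classes: the point is that the key, type, and default live in one declaration, so renames and type changes happen in a single place.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative mini version of typed options vs. raw string keys.
public class TypedOptionSketch {
    static final class Option<T> {
        final String key;
        final T defaultValue;
        Option(String key, T defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }
    }

    static final class Config {
        private final Map<String, Object> raw = new HashMap<>();

        <T> void set(Option<T> option, T value) {
            raw.put(option.key, value);
        }

        @SuppressWarnings("unchecked")
        <T> T get(Option<T> option) {
            return (T) raw.getOrDefault(option.key, option.defaultValue);
        }

        // Discouraged string-based access, kept for pragmatic reasons.
        String getString(String key, String defaultValue) {
            Object value = raw.get(key);
            return value == null ? defaultValue : value.toString();
        }
    }

    // Key, type, and default declared once.
    static final Option<Integer> PARALLELISM = new Option<>("parallelism.default", 1);

    public static void main(String[] args) {
        Config config = new Config();
        config.set(PARALLELISM, 4);
        int typed = config.get(PARALLELISM);                       // typed, no parsing at call site
        String stringly = config.getString("parallelism.default", "1"); // key repeated, type lost
        System.out.println(typed + " " + stringly); // 4 4
    }
}
```

With the string-based call, every call site repeats the key and re-parses the value, which is exactly the maintenance problem ConfigOption was invented to address.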
[RESULT][VOTE] FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs
The vote for FLIP-382 [1]: Unify the Provision of Diverse Metadata for Context-like APIs (discussion thread [2]) has concluded, and the voting thread is now closed [3].

There are 3 binding votes and 1 non-binding vote:
Xintong Song (binding)
Lijie Wang (binding)
Weijie Guo (binding)
Yuxin Tan (non-binding)

There were no -1 votes. Therefore, FLIP-382 has been accepted. I will prepare the necessary changes for Flink 1.19. Thanks everyone for the discussion!

Wencong Liu

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs
[2] https://lists.apache.org/thread/3mgsc31odtpmzzl32s4oqbhlhxd0mn6b
[3] https://lists.apache.org/thread/5vlf3klv131x8oj45qohvg9c53qkd87c
[RESULT][VOTE] FLIP-380: Support Full Partition Processing On Non-keyed DataStream
The vote for FLIP-380 [1]: Support Full Partition Processing On Non-keyed DataStream (discussion thread [2]) has concluded, and the voting thread is now closed [3].

There are 3 binding votes and 1 non-binding vote:
Xintong Song (binding)
YingJie Cao (binding)
Weijie Guo (binding)
Yuxin Tan (non-binding)

There were no -1 votes. Therefore, FLIP-380 has been accepted. I will prepare the necessary changes for Flink 1.19. Thanks everyone for the discussion!

Wencong Liu

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-380%3A+Support+Full+Partition+Processing+On+Non-keyed+DataStream
[2] https://lists.apache.org/thread/nn7myj7vsvytbkdrnbvj5h0homsjrn1h
[3] https://lists.apache.org/thread/ns1my6ydctjfl9w89hm8gvldh00lqtq3