Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2023-12-20 Thread Leonard Xu
Thanks Peter for driving this.

The FLIP is a big improvement to the existing Sink V2 interfaces, and the
overall API design and interface style look good to me.

+1(binding) from my side.

I also left a few minor comments that we can improve later.
(1) The FLIP title doesn't cover the proposed change scope well, right?
(2) The pseudocode filename of StatefulSinkWriter has a typo.
(3) I’ve found you’ve opened a PR on GitHub; we’d better create an issue and prepare 
a formal PR 
after the FLIP has passed, rather than while it is still in discussion status.


Best,
Leonard



> On Dec 21, 2023, at 11:47 AM, Jiabao Sun  wrote:
> 
> Thanks Peter for driving this. 
> 
> +1 (non-binding)
> 
> Best,
> Jiabao
> 
> 
> On 2023/12/18 12:06:05 Gyula Fóra wrote:
>> +1 (binding)
>> 
>> Gyula
>> 
>> On Mon, 18 Dec 2023 at 13:04, Márton Balassi 
>> wrote:
>> 
>>> +1 (binding)
>>> 
>>> On Mon 18. Dec 2023 at 09:34, Péter Váry 
>>> wrote:
>>> 
 Hi everyone,
 
 Since there were no further comments on the discussion thread [1], I
>>> would
 like to start the vote for FLIP-372 [2].
 
 The FLIP started as a small new feature, but in the discussion thread and
 in a similar parallel thread [3] we opted for a somewhat bigger change in
 the Sink V2 API.
 
 Please read the FLIP and cast your vote.
 
 The vote will remain open for at least 72 hours and only concluded if
>>> there
 are no objections and enough (i.e. at least 3) binding votes.
 
 Thanks,
 Peter
 
 [1] - https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd
 [2] -
 
 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
 [3] - https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57
 
>>> 



Re: Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2023-12-20 Thread Hang Ruan
Thanks for the FLIP.

+1 (non-binding)

Best,
Hang

Jiabao Sun  wrote on Thu, Dec 21, 2023 at 11:48:

> Thanks Peter for driving this.
>
> +1 (non-binding)
>
> Best,
> Jiabao
>
>
> On 2023/12/18 12:06:05 Gyula Fóra wrote:
> > +1 (binding)
> >
> > Gyula
> >
> > On Mon, 18 Dec 2023 at 13:04, Márton Balassi 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > On Mon 18. Dec 2023 at 09:34, Péter Váry 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Since there were no further comments on the discussion thread [1], I
> > > would
> > > > like to start the vote for FLIP-372 [2].
> > > >
> > > > The FLIP started as a small new feature, but in the discussion
> thread and
> > > > in a similar parallel thread [3] we opted for a somewhat bigger
> change in
> > > > the Sink V2 API.
> > > >
> > > > Please read the FLIP and cast your vote.
> > > >
> > > > The vote will remain open for at least 72 hours and only concluded if
> > > there
> > > > are no objections and enough (i.e. at least 3) binding votes.
> > > >
> > > > Thanks,
> > > > Peter
> > > >
> > > > [1] -
> https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd
> > > > [2] -
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
> > > > [3] -
> https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57
> > > >
> > >
> >


Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5

2023-12-20 Thread Jiabao Sun
+1 (non-binding)

Best,
Jiabao


> On Dec 21, 2023, at 13:25, weijie guo  wrote:
> 
> +1(binding)
> 
> Best regards,
> 
> Weijie
> 
> 
> Xin Jiang  wrote on Thu, Dec 21, 2023 at 11:21:
> 
>> A 1.5 multiplier is indeed more reasonable.
>> 
>> +1 (non-binding)
>> 
>>> On Dec 19, 2023, at 5:17 PM, Rui Fan <1996fan...@gmail.com> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> Thank you to everyone for the feedback on FLIP-364:
>>> Improve the restart-strategy[1] which has been voted in this thread[2].
>>> 
>>> After the vote on FLIP-364, there was some feedback on the user mail
>>> list[3]
>>> suggesting changing the default value of backoff-multiplier from 1.2 to
>> 1.5.
>>> 
>>> I would like to start a vote for it. The vote will be open for at least
>> 72
>>> hours
>>> unless there is an objection or not enough votes.
>>> 
>>> [1] https://cwiki.apache.org/confluence/x/uJqzDw
>>> [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9
>>> [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw
>>> 
>>> Best,
>>> Rui
>> 
>> 



Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5

2023-12-20 Thread weijie guo
+1(binding)

Best regards,

Weijie


Xin Jiang  wrote on Thu, Dec 21, 2023 at 11:21:

> A 1.5 multiplier is indeed more reasonable.
>
> +1 (non-binding)
>
> > On Dec 19, 2023, at 5:17 PM, Rui Fan <1996fan...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > Thank you to everyone for the feedback on FLIP-364:
> > Improve the restart-strategy[1] which has been voted in this thread[2].
> >
> > After the vote on FLIP-364, there was some feedback on the user mail
> > list[3]
> > suggesting changing the default value of backoff-multiplier from 1.2 to
> 1.5.
> >
> > I would like to start a vote for it. The vote will be open for at least
> 72
> > hours
> > unless there is an objection or not enough votes.
> >
> > [1] https://cwiki.apache.org/confluence/x/uJqzDw
> > [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9
> > [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw
> >
> > Best,
> > Rui
>
>


Re: [DISCUSS] Should Configuration support getting value based on String key?

2023-12-20 Thread Rui Fan
That makes sense. We will do it in the FLIP.

Best,
Rui

On Thu, 21 Dec 2023 at 10:20, Xintong Song  wrote:

> Ideally, public API changes should go through the FLIP process.
>
> I see the point that starting a FLIP for such a tiny change might be
> overkill. However, one could also argue that anything that is too trivial
> to go through a FLIP should also be easy enough to quickly get through the
> process.
>
> In this particular case, since you and Xuannan are working on another FLIP
> regarding the usage of string-keys in configurations, why not make removing
> of the @Deprecated annotation part of that FLIP?
>
> Best,
>
> Xintong
>
>
>
> On Wed, Dec 20, 2023 at 9:27 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Thanks Xuannan and Xintong for the quick feedback!
> >
> > > However, we should let the user know that we encourage using
> > ConfigOptions over the string-based
> > > configuration key, as Timo said, we should add the message to `String
> > > getString(String key, String defaultValue)` method.
> >
> > Sure, we can add some comments to guide users to use ConfigOption.
> >
> > If so, I will remove the @Deprecated annotation for
> > `getString(String key, String defaultValue)` method, and add
> > some comments for it.
> >
> > Also, it's indeed a small change related to Public class(or Api),
> > is voting necessary?
> >
> > Best,
> > Rui
> >
> > On Wed, 20 Dec 2023 at 16:55, Xintong Song 
> wrote:
> >
> > > Sounds good to me.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su 
> > wrote:
> > >
> > > > Hi Rui,
> > > >
> > > > I am fine with keeping the `String getString(String key, String
> > > > defaultValue)` if more people favor it. However, we should let the
> > > > user know that we encourage using ConfigOptions over the string-based
> > > > configuration key, as Timo said, we should add the message to `String
> > > > getString(String key, String defaultValue)` method.
> > > >
> > > > Best,
> > > > Xuannan
> > > >
> > > > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com>
> wrote:
> > > > >
> > > > > > > I noticed that Configuration is used in
> > > > > > > DistributedCache#writeFileInfoToConfig and
> readFileInfoFromConfig
> > > > > > > to store some cacheFile meta-information. Their keys are
> > > > > > > temporary(key name with number) and it is not convenient
> > > > > > > to predefine ConfigOption.
> > > > > >
> > > > > >
> > > > > > True, this one requires a bit more effort to migrate from
> > string-key
> > > to
> > > > > > ConfigOption, but still should be doable. Looking at how the two
> > > > mentioned
> > > > > > methods are implemented and used, it seems what is really needed
> is
> > > > > > serialization and deserialization of `DistributedCacheEntry`-s.
> And
> > > > all the
> > > > > > entries are always written / read at once. So I think we can
> > > serialize
> > > > the
> > > > > > whole set of entries into a JSON string (or something similar),
> and
> > > > use one
> > > > > > ConfigOption with a deterministic key for it, rather than having
> > one
> > > > > > ConfigOption for each field in each entry. WDYT?
> > > > >
> > > > > Hi Xintong, thanks for the good suggestion! Most of the entries can
> > be
> > > > > serialized to a JSON string, and we can write/read them all at once.
> once.
> > > > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to
> > > > > store it by the setBytes/getBytes.
> > > > >
> > > > > Also, I have an offline discussion with @Xuannan Su : refactoring
> all
> > > > code
> > > > > that uses String as key requires a separate FLIP. And we will
> provide
> > > > > detailed FLIP  later.
> > > > >
> > > > >
> > > >
> > >
> >
> --
> > > > >
> > > > > Hi all, thanks everyone for the lively discussion. It's really a
> > > > trade-off to
> > > > > keep "String getString(String key, String defaultValue)" or not.
> > > > > (It's not a right or wrong thing.)
> > > > >
> > > > > Judging from the discussion, most discussants can accept that
> keeping
> > > > > `String getString(String key, String defaultValue)` and deprecate
> the
> > > > > rest of `getXxx(String key, Xxx defaultValue)`.
> > > > >
> > > > > cc @Xintong Song @Xuannan Su , WDYT?
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu 
> wrote:
> > > > >>
> > > > >> I think it's not clear whether forcing using ConfigOption would
> hurt
> > > > >> the user experience.
> > > > >>
> > > > >> Maybe it does at the beginning, because using string keys to
> access
> > > > >> Flink configuration can be simpler for new components/jobs.
> > > > >> However, problems may happen later if the configuration usages
> > become
> > > > >> more complex, like key renaming, using types other than strings,
> and
> > > > >> other problems that ConfigOption was invented to address.
> > > > >>
> > > > >> Personally I prefer to encourage the usage of ConfigOption.

Re: [VOTE] Release 1.18.1, release candidate #2

2023-12-20 Thread gongzhongqiang
Thanks Jing Ge for driving this release.

+1 (non-binding), I have checked:
[✓] The checksums and signatures are validated
[✓] The tag checked is fine
[✓] Built from source is passed
[✓] The flink-web PR is reviewed and checked


Best,
Zhongqiang Gong


RE: Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2023-12-20 Thread Jiabao Sun
Thanks Peter for driving this. 

+1 (non-binding)

Best,
Jiabao


On 2023/12/18 12:06:05 Gyula Fóra wrote:
> +1 (binding)
> 
> Gyula
> 
> On Mon, 18 Dec 2023 at 13:04, Márton Balassi 
> wrote:
> 
> > +1 (binding)
> >
> > On Mon 18. Dec 2023 at 09:34, Péter Váry 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Since there were no further comments on the discussion thread [1], I
> > would
> > > like to start the vote for FLIP-372 [2].
> > >
> > > The FLIP started as a small new feature, but in the discussion thread and
> > > in a similar parallel thread [3] we opted for a somewhat bigger change in
> > > the Sink V2 API.
> > >
> > > Please read the FLIP and cast your vote.
> > >
> > > The vote will remain open for at least 72 hours and only concluded if
> > there
> > > are no objections and enough (i.e. at least 3) binding votes.
> > >
> > > Thanks,
> > > Peter
> > >
> > > [1] - https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd
> > > [2] -
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
> > > [3] - https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57
> > >
> >
> 

Re: [VOTE] FLIP-364: Update the default value of backoff-multiplier from 1.2 to 1.5

2023-12-20 Thread Xin Jiang
A 1.5 multiplier is indeed more reasonable.

+1 (non-binding)

> On Dec 19, 2023, at 5:17 PM, Rui Fan <1996fan...@gmail.com> wrote:
> 
> Hi everyone,
> 
> Thank you to everyone for the feedback on FLIP-364:
> Improve the restart-strategy[1] which has been voted in this thread[2].
> 
> After the vote on FLIP-364, there was some feedback on the user mail
> list[3]
> suggesting changing the default value of backoff-multiplier from 1.2 to 1.5.
> 
> I would like to start a vote for it. The vote will be open for at least 72
> hours
> unless there is an objection or not enough votes.
> 
> [1] https://cwiki.apache.org/confluence/x/uJqzDw
> [2] https://lists.apache.org/thread/xo03tzw6d02w1vbcj5y9ccpqyc7bqrh9
> [3] https://lists.apache.org/thread/6glz0d57r8gtpzq4f71vf9066c5x6nyw
> 
> Best,
> Rui
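As context for the change being voted on: with an exponential-delay restart strategy, each retry delay is the previous delay times the backoff multiplier, capped at a maximum. A minimal sketch of that arithmetic (illustrative code only, not Flink's actual restart-strategy implementation):

```java
// Illustrative exponential-backoff computation; NOT Flink's actual
// restart-strategy code, just a sketch of how the multiplier acts.
final class BackoffSketch {
    static long[] delays(long initialMs, double multiplier, long maxMs, int attempts) {
        long[] out = new long[attempts];
        double delay = initialMs;
        for (int i = 0; i < attempts; i++) {
            out[i] = (long) Math.min(delay, maxMs);
            delay *= multiplier; // the next delay grows by the backoff-multiplier
        }
        return out;
    }
}
```

With an initial delay of 1s, four attempts give roughly 1s, 1.5s, 2.25s, 3.4s under a 1.5 multiplier, versus 1s, 1.2s, 1.44s, 1.73s under 1.2 — the proposed default backs off noticeably faster.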



Re:Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

2023-12-20 Thread Xuyang
Hi, Benchao.


When Feng Jin and I tried the POC together, we found that when using a UDAF, 
Calcite directly uses the function's input parameters from 
SqlCall#getOperandList. But in fact, these input parameters may use named 
arguments, so the order of parameters may be wrong, and they may not include 
optional parameters whose default values need to be set. Instead, we should use 
new SqlCallBinding(this, scope, call).operands() to let that method correct the 
order and add default values. (You can see the modification in 
SqlToRelConverter in the POC branch[2].)


We have not reported this bug to the Calcite community yet. Our original plan 
was to report it during the process of doing this FLIP, and to fix it 
separately in Flink's own copy of the Calcite file, because the timing of 
Calcite releases is uncertain, and the time to upgrade Flink to the latest 
Calcite version is also unknown.


The link to the POC code is at the bottom of the FLIP[1]. I'm posting it here 
again[2].


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
[2] 
https://github.com/apache/flink/compare/master...hackergin:flink:poc_named_argument
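For readers unfamiliar with the operand-reordering issue above, a generic sketch of what a binder has to do with named arguments — reorder call-site operands into declared parameter order and fill in defaults. This is conceptually what SqlCallBinding#operands() provides; the code below is only an illustration, not Calcite's implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative binder for named arguments — NOT Calcite code.
final class NamedArgBinder {
    // Declared parameters in order, each mapped to its default (null = required).
    private final LinkedHashMap<String, Object> declared;

    NamedArgBinder(LinkedHashMap<String, Object> declared) {
        this.declared = declared;
    }

    /** Reorders named operands into declared order and fills in defaults. */
    List<Object> bind(Map<String, Object> namedOperands) {
        Map<String, Object> merged = new HashMap<>(declared);
        for (Map.Entry<String, Object> e : namedOperands.entrySet()) {
            if (!declared.containsKey(e.getKey())) {
                throw new IllegalArgumentException("unknown argument: " + e.getKey());
            }
            merged.put(e.getKey(), e.getValue());
        }
        List<Object> ordered = new ArrayList<>();
        for (String name : declared.keySet()) {
            ordered.add(merged.get(name)); // declared order, not call-site order
        }
        return ordered;
    }
}
```

Reading operands straight from the call (as SqlCall#getOperandList does) skips both the reordering and the default filling, which is exactly the bug described.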



--

Best!
Xuyang





On 2023-12-20 13:31:26, "Benchao Li"  wrote:
>I didn't see your POC code, so I assumed that you'll need to add
>SqlStdOperatorTable#DEFAULT and
>SqlStdOperatorTable#ARGUMENT_ASSIGNMENT to FlinkSqlOperatorTable, am I
>right?
>
>If yes, this would enable many builtin functions to allow default and
>optional arguments, for example, `select md5(DEFAULT)`, I guess this
>is not what we want to support right? If so, I would suggest to throw
>proper errors for these unexpected usages.
>
Benchao Li  wrote on Wed, Dec 20, 2023 at 13:16:
>>
>> Thanks Feng for driving this, it's a very useful feature.
>>
>> In the FLIP, you mentioned that
>> > During POC verification, bugs were discovered in Calcite that caused 
>> > issues during the validation phase. We need to modify the SqlValidatorImpl 
>> > and SqlToRelConverter to address these problems.
>>
>> Could you log bugs in Calcite, and reference the corresponding Jira
>> number in your code. We want to upgrade Calcite version to latest as
>> much as possible, and maintaining many bugfixes in Flink will add
>> additional burdens for upgrading Calcite. By adding corresponding
>> issue numbers, we can easily make sure that we can remove these Flink
>> hosted bugfixes when we upgrade to a version that already contains the
>> fix.
>>
>> Feng Jin  wrote on Thu, Dec 14, 2023 at 19:30:
>> >
>> > Hi Timo
>> > Thanks for your reply.
>> >
>> > >  1) ArgumentNames annotation
>> >
>> > I'm sorry for my incorrect expression. argumentNames is a method of
>> > FunctionHint. We should introduce a new `arguments` method to replace this
>> > method and return Argument[].
>> > I updated the FLIP doc about this part.
>> >
>> > >  2) Evolution of FunctionHint
>> >
>> > +1 define DataTypeHint as part of ArgumentHint. I'll update the FLIP doc.
>> >
>> > > 3)  Semantical correctness
>> >
>> > I realized that I forgot to submit the latest modification of the FLIP
>> > document. Xuyang and I had a prior discussion before starting this discuss.
>> > Let's restrict it to supporting only one eval() function, which will
>> > simplify the overall design.
>> >
>> > Therefore, I also concur with not permitting overloaded named parameters.
>> >
>> >
>> > Best,
>> > Feng
>> >
>> > On Thu, Dec 14, 2023 at 6:15 PM Timo Walther  wrote:
>> >
>> > > Hi Feng,
>> > >
>> > > thank you for proposing this FLIP. This nicely completes FLIP-65 which
>> > > is great for usability.
>> > >
>> > > I read the FLIP and have some feedback:
>> > >
>> > >
>> > > 1) ArgumentNames annotation
>> > >
>> > >  > Deprecate the ArgumentNames annotation as it is not user-friendly for
>> > > specifying argument names with optional configuration.
>> > >
>> > > Which annotation does the FLIP reference here? I cannot find it in the
>> > > Flink code base.
>> > >
>> > > 2) Evolution of FunctionHint
>> > >
>> > > Introducing @ArgumentHint makes a lot of sense to me. However, using it
>> > > within @FunctionHint looks complex, because there is both `input=` and
>> > > `arguments=`. Ideally, the @DataTypeHint can be defined inline as part
>> > > of the @ArgumentHint. It could even be the `value` such that
>> > > `@ArgumentHint(@DataTypeHint("INT"))` is valid on its own.
>> > >
>> > > We could deprecate `input=`. Or let both `input` and `arguments=`
>> > > coexist but never be defined at the same time.
>> > >
>> > > 3) Semantical correctness
>> > >
>> > > As you can see in the `TypeInference` class, named parameters are
>> > > prepared in the stack already. However, we need to watch out between
>> > > helpful explanation (see `InputTypeStrategy#getExpectedSignatures`) and
>> > > named parameters (see `TypeInference.Builder#namedArguments`) that can
>> > > be used in SQL.
>> > >
>> > > If I remember correctly, named parameters can be 

Re: [DISCUSS] Should Configuration support getting value based on String key?

2023-12-20 Thread Xintong Song
Ideally, public API changes should go through the FLIP process.

I see the point that starting a FLIP for such a tiny change might be
overkill. However, one could also argue that anything that is too trivial
to go through a FLIP should also be easy enough to quickly get through the
process.

In this particular case, since you and Xuannan are working on another FLIP
regarding the usage of string-keys in configurations, why not make removing
of the @Deprecated annotation part of that FLIP?

Best,

Xintong



On Wed, Dec 20, 2023 at 9:27 PM Rui Fan <1996fan...@gmail.com> wrote:

> Thanks Xuannan and Xintong for the quick feedback!
>
> > However, we should let the user know that we encourage using
> ConfigOptions over the string-based
> > configuration key, as Timo said, we should add the message to `String
> > getString(String key, String defaultValue)` method.
>
> Sure, we can add some comments to guide users to use ConfigOption.
>
> If so, I will remove the @Deprecated annotation for
> `getString(String key, String defaultValue)` method, and add
> some comments for it.
>
> Also, it's indeed a small change related to Public class(or Api),
> is voting necessary?
>
> Best,
> Rui
>
> On Wed, 20 Dec 2023 at 16:55, Xintong Song  wrote:
>
> > Sounds good to me.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su 
> wrote:
> >
> > > Hi Rui,
> > >
> > > I am fine with keeping the `String getString(String key, String
> > > defaultValue)` if more people favor it. However, we should let the
> > > user know that we encourage using ConfigOptions over the string-based
> > > configuration key, as Timo said, we should add the message to `String
> > > getString(String key, String defaultValue)` method.
> > >
> > > Best,
> > > Xuannan
> > >
> > > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote:
> > > >
> > > > > > I noticed that Configuration is used in
> > > > > > DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig
> > > > > > to store some cacheFile meta-information. Their keys are
> > > > > > temporary(key name with number) and it is not convenient
> > > > > > to predefine ConfigOption.
> > > > >
> > > > >
> > > > > True, this one requires a bit more effort to migrate from
> string-key
> > to
> > > > > ConfigOption, but still should be doable. Looking at how the two
> > > mentioned
> > > > > methods are implemented and used, it seems what is really needed is
> > > > > serialization and deserialization of `DistributedCacheEntry`-s. And
> > > all the
> > > > > entries are always written / read at once. So I think we can
> > serialize
> > > the
> > > > > whole set of entries into a JSON string (or something similar), and
> > > use one
> > > > > ConfigOption with a deterministic key for it, rather than having
> one
> > > > > ConfigOption for each field in each entry. WDYT?
> > > >
> > > > Hi Xintong, thanks for the good suggestion! Most of the entries can
> be
> > > > serialized to a JSON string, and we can write/read them all at once.
> > > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to
> > > > store it by the setBytes/getBytes.
> > > >
> > > > Also, I have an offline discussion with @Xuannan Su : refactoring all
> > > code
> > > > that uses String as key requires a separate FLIP. And we will provide
> > > > detailed FLIP  later.
> > > >
> > > >
> > >
> >
> --
> > > >
> > > > Hi all, thanks everyone for the lively discussion. It's really a
> > > trade-off to
> > > > keep "String getString(String key, String defaultValue)" or not.
> > > > (It's not a right or wrong thing.)
> > > >
> > > > Judging from the discussion, most discussants can accept that keeping
> > > > `String getString(String key, String defaultValue)` and deprecate the
> > > > rest of `getXxx(String key, Xxx defaultValue)`.
> > > >
> > > > cc @Xintong Song @Xuannan Su , WDYT?
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu  wrote:
> > > >>
> > > >> I think it's not clear whether forcing using ConfigOption would hurt
> > > >> the user experience.
> > > >>
> > > >> Maybe it does at the beginning, because using string keys to access
> > > >> Flink configuration can be simpler for new components/jobs.
> > > >> However, problems may happen later if the configuration usages
> become
> > > >> more complex, like key renaming, using types other than strings, and
> > > >> other problems that ConfigOption was invented to address.
> > > >>
> > > >> Personally I prefer to encourage the usage of ConfigOption.
> > > >> Jobs should use GlobalJobParameter for custom config, which is
> > different
> > > >> from the Configuration interface. Therefore, Configuration is mostly
> > > >> used in other components/plugins, in which case the long-term
> > > maintenance
> > > >> can be important.
> > > >>
> > > >> However, since it is not a right or wrong choice, I'd 
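The trade-off debated in this thread can be made concrete with a small sketch. The classes below are simplified stand-ins, not Flink's actual Configuration or ConfigOption, and only illustrate the two access styles:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a typed option constant — NOT Flink's ConfigOption.
final class ConfigOptionSketch<T> {
    final String key;
    final T defaultValue;
    ConfigOptionSketch(String key, T defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

// Simplified stand-in for Flink's Configuration.
final class MiniConfiguration {
    private final Map<String, Object> data = new HashMap<>();

    void set(String key, Object value) {
        data.put(key, value);
    }

    // String-key access: convenient, but key renames and type changes
    // silently break call sites — the problems ConfigOption was invented for.
    String getString(String key, String defaultValue) {
        Object v = data.get(key);
        return v == null ? defaultValue : v.toString();
    }

    // Typed access: key, type, and default live in one shared constant.
    @SuppressWarnings("unchecked")
    <T> T get(ConfigOptionSketch<T> option) {
        Object v = data.get(option.key);
        return v == null ? option.defaultValue : (T) v;
    }
}
```

With the typed style, renaming a key or changing its default is a one-line change in the shared constant instead of a hunt through every string literal.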

Re:Re: [VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Xuyang
Thanks for driving this FLIP, +1 for it (non binding)




--

Best!
Xuyang





On 2023-12-21 06:08:02, "Natea Eshetu Beshada"  wrote:
>Thanks for the FLIP Alan, this will be a great addition +1 (non binding)
>
>On Wed, Dec 20, 2023 at 11:41 AM Alan Sheinberg
> wrote:
>
>> Hi everyone,
>>
>> I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF
>> type, AsyncScalarFunction for completing invocations asynchronously.  It
>> has been discussed in this thread [2].
>>
>> I would like to start a vote.  The vote will be open for at least 72 hours
>> (until December 28th 18:00 GMT) unless there is an objection or
>> insufficient votes.
>>
>> [1]
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
>> [2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2
>>
>> Thanks,
>> Alan
>>
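For context, the general shape of an asynchronous scalar UDF is that eval() completes a future on a worker thread instead of returning a value directly, so many invocations can be in flight at once. A hypothetical sketch in plain Java — not FLIP-400's actual AsyncScalarFunction interface:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical async scalar function shape; see FLIP-400 for the real API.
class AsyncUpperSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // The runtime would hand in the future and collect the result later,
    // allowing the calling thread to issue further invocations meanwhile.
    public void eval(CompletableFuture<String> result, String input) {
        pool.submit(() -> result.complete(input.toUpperCase()));
    }

    public void close() {
        pool.shutdown();
    }
}
```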


Re: [DISCUSS] Release flink-connector-mongodb v1.1.0

2023-12-20 Thread Leonard Xu
Thanks Jiabao for driving this.

+1 to release flink-connector-mongodb v1.1.0 which supports Flink 1.18 and 
Flink 1.17.

I’d like to help review the pending PRs and manage the release as well.

Best,
Leonard


> On Dec 20, 2023, at 9:59 PM, Jiabao Sun  wrote:
> 
> Hi,
> 
> Now Flink 1.18 is released. I propose to release flink-connector-mongodb 
> v1.1.0 
> to support Flink 1.18 and nominate Leonard Xu as the release manager.
> 
> This release includes support for filter pushdown[1] and JDK 17[2]. 
> We plan to proceed with the release after merging these two pull requests. 
> If you have time, please help review them as well.
> 
> Please let me know if the proposal is a good idea.
> 
> Best,
> Jiabao Sun
> 
> [1] https://github.com/apache/flink-connector-mongodb/pull/17
> [2] https://github.com/apache/flink-connector-mongodb/pull/21



Re: [DISCUSS] FLIP-389: Annotate SingleThreadFetcherManager and FutureCompletingBlockingQueue as PublicEvolving

2023-12-20 Thread Becket Qin
Hi Hongshun,

I think the proposal in the FLIP is basically fine. A few minor comments:

1. In FLIPs, we define all the user-sensible changes as public interfaces.
The public interface section should list all of them. So, the code blocks
currently in the proposed changes section should be put into the public
interface section instead.

2. It would be good to put all the changes of one class together. For
example, for SplitFetcherManager, we can say:
- Change SplitFetcherManager from Internal to PublicEvolving.
- Deprecate the old constructor exposing the
FutureCompletingBlockingQueue, and add new constructors as replacements
which creates the FutureCompletingBlockingQueue instance internally.
- Add a few new methods to expose the functionality of the internal
FutureCompletingBlockingQueue via the SplitFetcherManager.
   And then follows the code block containing all the changes above.
Ideally, the changes should come with something like "// <-- New", so
that they are easier to find.

3. In the proposed changes section, it would be good to add some more
detailed explanation of the idea behind the public interface changes. So
even people new to Flink can understand better how exactly the interface
changes will help fulfill the motivation. For example, regarding the
constructor signature change, we can say the following. We can mention a
few things in this section:
- By exposing the SplitFetcherManager / SingleThreadFetcheManager, by
implementing addSplits() and removeSplits(), connector developers can
easily create their own threading models in the SourceReaderBase.
- Note that the SplitFetcher constructor is package private, so users
can only create SplitFetchers via SplitFetcherManager.createSplitFetcher().
This ensures each SplitFetcher is always owned by the SplitFetcherManager.
- This FLIP essentially embedded the element queue (a
FutureCompletingBlockingQueue) instance into the SplitFetcherManager. This
hides the element queue from the connector developers and simplifies the
SourceReaderBase to consist of only SplitFetcherManager and RecordEmitter
as major components.

In short, the public interface section answers the question of "what". We
should list all the user-sensible changes in the public interface section,
without verbose explanation. The proposed changes section answers "how",
where we can add more details to explain the changes listed in the public
interface section.

Thanks,

Jiangjie (Becket) Qin



On Wed, Dec 20, 2023 at 10:07 AM Hongshun Wang 
wrote:

> Hi Becket,
>
>
> It has been a long time since we last discussed. Are there any other
> problems with this Flip from your side? I am looking forward to hearing
> from you.
>
>
> Thanks,
> Hongshun Wang
>


Re: [VOTE] Release flink-connector-pulsar 4.1.0, release candidate #1

2023-12-20 Thread Sergey Nuyanzin
+1 (non-binding)

- Downloaded artifacts
- Maven staging artifacts look good
- Verified checksum && keys
- Verified LICENSE and NOTICE files
- Built from source

On Wed, Dec 20, 2023 at 5:13 AM tison  wrote:

> Hi Leonard,
>
> You are a PMC member also. Perhaps you can check the candidate and
> vote on what you do :D
>
> Best,
> tison.
>
> Leonard Xu  wrote on Wed, Dec 20, 2023 at 11:35:
> >
> > Bubble up, I need more votes, especially from PMC members.
> >
> > Best,
> > Leonard
> >
> > > On Dec 14, 2023, at 11:03 PM, Hang Ruan  wrote:
> > >
> > > +1 (non-binding)
> > >
> > > - Validated checksum hash
> > > - Verified signature
> > > - Verified that no binaries exist in the source archive
> > > - Build the source with jdk8
> > > - Verified web PR
> > > - Make sure flink-connector-base have the provided scope
> > >
> > > Best,
> > > Hang
> > >
> > > tison  wrote on Thu, Dec 14, 2023 at 11:51:
> > >
> > >> Thanks Leonard for driving this release!
> > >>
> > >> +1 (non-binding)
> > >>
> > >> * Download link valid
> > >> * Maven staging artifacts look good.
> > >> * Checksum and gpg matches
> > >> * LICENSE and NOTICE exist
> > >> * Can build from source.
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >> Rui Fan <1996fan...@gmail.com> wrote on Thu, Dec 14, 2023 at 09:23:
> > >>>
> > >>> Thanks Leonard for driving this release!
> > >>>
> > >>> +1 (non-binding)
> > >>>
> > >>> - Validated checksum hash
> > >>> - Verified signature
> > >>> - Verified that no binaries exist in the source archive
> > >>> - Build the source with Maven and jdk8
> > >>> - Verified licenses
> > >>> - Verified web PRs, left a minor comment
> > >>>
> > >>> Best,
> > >>> Rui
> > >>>
> > >>> On Wed, Dec 13, 2023 at 7:15 PM Leonard Xu 
> wrote:
> > 
> >  Hey all,
> > 
> >  Please review and vote on the release candidate #1 for the version
> > >> 4.1.0 of the Apache Flink Pulsar Connector as follows:
> > 
> >  [ ] +1, Approve the release
> >  [ ] -1, Do not approve the release (please provide specific
> comments)
> > 
> >  The complete staging area is available for your review, which
> includes:
> >  * JIRA release notes [1],
> >  * The official Apache source release to be deployed to
> dist.apache.org
> > >> [2], which are signed with the key with fingerprint
> >  5B2F6608732389AEB67331F5B197E1F1108998AD [3],
> >  * All artifacts to be deployed to the Maven Central Repository [4],
> >  * Source code tag v4.1.0-rc1 [5],
> >  * Website pull request listing the new release [6].
> > 
> >  The vote will be open for at least 72 hours. It is adopted by
> majority
> > >> approval, with at least 3 PMC affirmative votes.
> > 
> > 
> >  Best,
> >  Leonard
> > 
> >  [1]
> > >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353431
> >  [2]
> > >>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-pulsar-4.1.0-rc1/
> >  [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >  [4]
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1688/
> >  [5]
> https://github.com/apache/flink-connector-pulsar/tree/v4.1.0-rc1
> >  [6] https://github.com/apache/flink-web/pull/703
> > >>
> >
>


-- 
Best regards,
Sergey


Re: [VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Natea Eshetu Beshada
Thanks for the FLIP Alan, this will be a great addition +1 (non binding)

On Wed, Dec 20, 2023 at 11:41 AM Alan Sheinberg
 wrote:

> Hi everyone,
>
> I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF
> type, AsyncScalarFunction for completing invocations asynchronously.  It
> has been discussed in this thread [2].
>
> I would like to start a vote.  The vote will be open for at least 72 hours
> (until December 28th 18:00 GMT) unless there is an objection or
> insufficient votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
> [2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2
>
> Thanks,
> Alan
>


[jira] [Created] (FLINK-33919) AutoRescalingITCase hangs on AZP

2023-12-20 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-33919:
---

 Summary: AutoRescalingITCase hangs on AZP
 Key: FLINK-33919
 URL: https://issues.apache.org/jira/browse/FLINK-33919
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Affects Versions: 1.19.0
Reporter: Sergey Nuyanzin


This build fails on AZP
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55700&view=logs&j=a657ddbf-d986-5381-9649-342d9c92e7fb&t=dc085d4a-05c8-580e-06ab-21f5624dab16&l=8608
because of waiting
{noformat}
Dec 20 02:07:46 "main" #1 [14299] prio=5 os_prio=0 cpu=12675.70ms 
elapsed=3115.94s tid=0x7f3f71481600 nid=14299 waiting on condition  
[0x7f3f74913000]
Dec 20 02:07:46    java.lang.Thread.State: TIMED_WAITING (sleeping)
Dec 20 02:07:46 at java.lang.Thread.sleep0(java.base@21.0.1/Native 
Method)
Dec 20 02:07:46 at 
java.lang.Thread.sleep(java.base@21.0.1/Thread.java:509)
Dec 20 02:07:46 at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:152)
Dec 20 02:07:46 at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
Dec 20 02:07:46 at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitForOneMoreCheckpoint(CommonTestUtils.java:374)
Dec 20 02:07:46 at 
org.apache.flink.test.checkpointing.AutoRescalingITCase.testCheckpointRescalingKeyedState(AutoRescalingITCase.java:265)
Dec 20 02:07:46 at 
org.apache.flink.test.checkpointing.AutoRescalingITCase.testCheckpointRescalingInKeyedState(AutoRescalingITCase.java:196)
Dec 20 02:07:46 at 
java.lang.invoke.LambdaForm$DMH/0x7f3f0f201400.invokeVirtual(java.base@21.0.1/LambdaForm$DMH)
Dec 20 02:07:46 at 
java.lang.invoke.LambdaForm$MH/0x7f3f0f20c000.invoke(java.base@21.0.1/LambdaForm$MH)
Dec 20 02:07:46 at 
java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@21.0.1/Invokers$Holder)
Dec 20 02:07:46 at 
jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base@21.0.1/DirectMethodHandleAccessor.java:153)
Dec 20 02:07:46 at 
jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base@21.0.1/DirectMethodHandleAccessor.java:103)
Dec 20 02:07:46 at 
java.lang.reflect.Method.invoke(java.base@21.0.1/Method.java:580)
Dec 20 02:07:46 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Dec 20 02:07:46 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Dec 20 02:07:46 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Dec 20 02:07:46 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Dec 20 02:07:46 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
Dec 20 02:07:46 at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
Dec 20 02:07:46 at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
Dec 20 02:07:46 at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)

{noformat}
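For context, `waitUntilCondition` in the trace above is a poll-and-sleep loop; a simplified, self-contained sketch of that shape is below, with an explicit deadline so the wait fails instead of hanging a build (names are illustrative, not Flink's actual `CommonTestUtils`):

```java
import java.time.Duration;
import java.util.function.Supplier;

public class WaitUtil {
    // Simplified stand-in for CommonTestUtils.waitUntilCondition: poll a
    // condition, sleeping between checks, and fail after a deadline rather
    // than blocking the build forever.
    static void waitUntilCondition(Supplier<Boolean> condition, Duration timeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.get()) {
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            Thread.sleep(10);
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        waitUntilCondition(() -> System.currentTimeMillis() - start > 50,
                Duration.ofSeconds(5));
        System.out.println("condition met");
    }
}
```

In the hanging build the condition (one more checkpoint) is never satisfied, so a bounded deadline like this turns an open-ended AZP hang into a fast, diagnosable failure.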



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Alan Sheinberg
Hi everyone,

I'd like to start a vote on FLIP-400 [1]. It covers introducing a new UDF
type, AsyncScalarFunction for completing invocations asynchronously.  It
has been discussed in this thread [2].

I would like to start a vote.  The vote will be open for at least 72 hours
(until December 28th 18:00 GMT) unless there is an objection or
insufficient votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
[2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2

Thanks,
Alan
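The core mechanic behind an asynchronous scalar function, completing an invocation by fulfilling a future, can be sketched with plain `CompletableFuture`. The `eval` shape below is only an illustration of the idea, not Flink's actual `AsyncScalarFunction` API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncEvalDemo {
    static final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Shaped like an async scalar eval: the result is delivered by completing
    // the future, freeing the caller to issue more invocations meanwhile.
    static void eval(CompletableFuture<String> result, String input) {
        pool.submit(() -> result.complete("hello, " + input));
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> f = new CompletableFuture<>();
        eval(f, "flink");
        System.out.println(f.get()); // hello, flink
        pool.shutdown();
    }
}
```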


[jira] [Created] (FLINK-33918) Fix AsyncSinkWriterThrottlingTest test failure

2023-12-20 Thread Jim Hughes (Jira)
Jim Hughes created FLINK-33918:
--

 Summary: Fix AsyncSinkWriterThrottlingTest test failure
 Key: FLINK-33918
 URL: https://issues.apache.org/jira/browse/FLINK-33918
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0
Reporter: Jim Hughes


From
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55700&view=logs&j=1c002d28-a73d-5309-26ee-10036d8476b4&t=d1c117a6-8f13-5466-55f0-d48dbb767fcd

 

```
Dec 20 03:09:03 03:09:03.411 [ERROR] 
org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize
 -- Time elapsed: 0.879 s <<< ERROR! 
Dec 20 03:09:03 java.lang.IllegalStateException: Illegal thread detected. This 
method must be called from inside the mailbox thread! 
Dec 20 03:09:03 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkIsMailboxThread(TaskMailboxImpl.java:262)
 
Dec 20 03:09:03 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:137)
 
Dec 20 03:09:03 at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.yield(MailboxExecutorImpl.java:84)
 
Dec 20 03:09:03 at 
org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.flush(AsyncSinkWriter.java:367)
 
Dec 20 03:09:03 at 
org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.lambda$registerCallback$3(AsyncSinkWriter.java:315)
 
Dec 20 03:09:03 at 
org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService$CallbackTask.onProcessingTime(TestProcessingTimeService.java:199)
 
Dec 20 03:09:03 at 
org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService.setCurrentTime(TestProcessingTimeService.java:76)
 
Dec 20 03:09:03 at 
org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize(AsyncSinkWriterThrottlingTest.java:64)
 
Dec 20 03:09:03 at java.lang.reflect.Method.invoke(Method.java:498) 
```
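The `Illegal thread detected` failure comes from a thread-confinement check: the mailbox may only be drained from the thread that owns it. A minimal stand-in for that check (not the real `TaskMailboxImpl`):

```java
public class MailboxThreadCheckDemo {
    // Simplified version of the check that fails in the stack trace: a
    // mailbox that only allows interaction from the thread that created it.
    static final class Mailbox {
        private final Thread owner = Thread.currentThread();

        void checkIsMailboxThread() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException(
                        "Illegal thread detected. This method must be called "
                                + "from inside the mailbox thread!");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Mailbox mailbox = new Mailbox();
        mailbox.checkIsMailboxThread(); // fine: same thread as the owner

        Thread other = new Thread(() -> {
            try {
                mailbox.checkIsMailboxThread();
            } catch (IllegalStateException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        });
        other.start();
        other.join();
    }
}
```

In the failing test, the processing-time callback ends up calling `flush` (and so `mailbox.take`) from the test's timer path rather than the mailbox thread, tripping exactly this kind of guard.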



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread David Anderson
I'm delighted to see the progress on this. This is going to be a major
enabler for some important use cases.

The proposed simplifications (global config and ordered mode) for V1 make a
lot of sense to me. +1

David

On Wed, Dec 20, 2023 at 12:31 PM Alan Sheinberg
 wrote:

> Thanks for that feedback Lincoln,
>
> Only one question with the async `timeout` parameter[1](since I
> > haven't seen the POC code), current description is: 'The time which can
> > pass before a restart strategy is triggered',
> > but in the previous flip-232[2] and flip-234[3], in retry scenario, this
> > timeout is the total time, do we keep the behavior of the parameter
> > consistent?
>
> That's a good catch.  I was intending to use *AsyncWaitOperator*, and to
> pass this timeout directly.  Looking through the code a bit, it appears
> that it doesn't restart the timer on a retry, and this timeout is total, as
> you're saying.  I do intend on being consistent with the other FLIPs and
> retaining this behavior, so I will update the wording on my FLIP to reflect
> that.
>
> -Alan
>
> On Wed, Dec 20, 2023 at 1:36 AM Lincoln Lee 
> wrote:
>
> > +1 for this useful feature!
> > Hope this reply isn't too late. Agree that we start with global
> > async-scalar configuration and ordered mode first.
> >
> > @Alan Only one question with the async `timeout` parameter[1](since I
> > haven't seen the POC code), current description is: 'The time which can
> > pass before a restart strategy is triggered',
> > but in the previous flip-232[2] and flip-234[3], in retry scenario, this
> > timeout is the total time, do we keep the behavior of the parameter
> > consistent?
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
> > [2]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
> > [3]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Alan Sheinberg  于2023年12月20日周三 08:41写道:
> >
> > > Thanks for the comments Timo.
> > >
> > >
> > > > Can you remove the necessary parts? Esp.:
> > >
> > >  @Override
> > > >  public Set<FunctionRequirement> getRequirements() {
> > > >  return Collections.singleton(FunctionRequirement.ORDERED);
> > > >  }
> > >
> > >
> > > I removed this section from the FLIP since presumably, there's no use
> in
> > > adding to the public API if it's ignored, with handling just ORDERED
> for
> > > the first version.  I'm not sure how quickly I'll want to add UNORDERED
> > > support, but I guess I can always do another FLIP.
> > >
> > > Otherwise I have no objections to start a VOTE soonish. If others are
> > > > fine as well?
> > >
> > > That would be great.  Any areas that people are interested in
> discussing
> > > further before a vote?
> > >
> > > -Alan
> > >
> > > On Tue, Dec 19, 2023 at 5:49 AM Timo Walther 
> wrote:
> > >
> > > >  > I would be totally fine with the first version only having ORDERED
> > > >  > mode. For a v2, we could attempt to do the next most conservative
> > > >  > thing
> > > >
> > > > Sounds good to me.
> > > >
> > > > I also checked AsyncWaitOperator and could not find an access of
> > > > StreamRecord's timestamp but only watermarks. But as we said, let's
> > > > focus on ORDERED first.
> > > >
> > > > Can you remove the necessary parts? Esp.:
> > > >
> > > >  @Override
> > > >  public Set<FunctionRequirement> getRequirements() {
> > > >  return Collections.singleton(FunctionRequirement.ORDERED);
> > > >  }
> > > >
> > > > Otherwise I have no objections to start a VOTE soonish. If others are
> > > > fine as well?
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > >
> > > > On 19.12.23 07:32, Alan Sheinberg wrote:
> > > > > Thanks for the helpful comments, Xuyang and Timo.
> > > > >
> > > > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take
> > kafka
> > > as
> > > > >> source and mysql as sink as an example.
> > > > >> Although kafka is an append-only source, one of its fields is used
> > as
> > > pk
> > > > >> when writing to mysql. If async udx is executed
> > > > >>   in an unordered mode, there may be problems with the data in
> mysql
> > > in
> > > > the
> > > > >> end. In this case, we need to ensure that
> > > > >> the sink-based pk is in order actually.
> > > > >
> > > > >
> > > > > @Xuyang: That's a great point.  If some node downstream of my
> > operator
> > > > > cares about ordering, there's no way for it to reconstruct the
> > original
> > > > > ordering of the rows as they were input to my operator.  So even if
> > > they
> > > > > want to preserve ordering by key, the order in which they see it
> may
> > > > > already be incorrect.  Somehow I thought that maybe the analysis of
> > the
> > > > > changelog mode at a given operator was aware of downstream
> > operations,
> > > > but
> > > > > it seems not.
> > > > >
> 

[jira] [Created] (FLINK-33917) IllegalArgumentException: hostname can't be null

2023-12-20 Thread Tom (Jira)
Tom created FLINK-33917:
---

 Summary: IllegalArgumentException: hostname can't be null
 Key: FLINK-33917
 URL: https://issues.apache.org/jira/browse/FLINK-33917
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Reporter: Tom


In certain scenarios, if the hostname contains certain characters, it will throw
an exception when it tries to initialize the InetSocketAddress.

 

    @Override
    public boolean isJobManagerPortReady(Configuration config) {
        final URI uri;
        try (var clusterClient = getClusterClient(config)) {
            uri = URI.create(clusterClient.getWebInterfaceURL());
        } catch (Exception ex) {
            throw new FlinkRuntimeException(ex);
        }
        SocketAddress socketAddress = new InetSocketAddress(uri.getHost(), 
uri.getPort());
        Socket socket = new Socket();
        try {
            socket.connect(socketAddress, 1000);
            socket.close();
            return true;
        } catch (IOException e) {
            return false;
        }
    }
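One way `uri.getHost()` can be null here: `java.net.URI` returns a null host when the authority is not a valid RFC 3986 host, for example a name containing an underscore, and `new InetSocketAddress((String) null, port)` then throws exactly this exception. A small demonstration (hostnames are made up):

```java
import java.net.URI;

public class NullHostDemo {
    public static void main(String[] args) {
        // A hyphen is legal in a host, so getHost() works as expected.
        System.out.println(URI.create("http://flink-jm:8081").getHost()); // flink-jm

        // An underscore is not a legal host character per RFC 3986, so the
        // parsed host is null even though the authority is present.
        URI bad = URI.create("http://flink_jm:8081");
        System.out.println(bad.getHost());      // null
        System.out.println(bad.getAuthority()); // flink_jm:8081

        // Passing the null host on would reproduce the reported failure:
        // new InetSocketAddress(bad.getHost(), bad.getPort())
        //   -> IllegalArgumentException: hostname can't be null
    }
}
```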

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Alan Sheinberg
Thanks for that feedback Lincoln,

Only one question with the async `timeout` parameter[1](since I
> haven't seen the POC code), current description is: 'The time which can
> pass before a restart strategy is triggered',
> but in the previous flip-232[2] and flip-234[3], in retry scenario, this
> timeout is the total time, do we keep the behavior of the parameter
> consistent?

That's a good catch.  I was intending to use *AsyncWaitOperator*, and to
pass this timeout directly.  Looking through the code a bit, it appears
that it doesn't restart the timer on a retry, and this timeout is total, as
you're saying.  I do intend on being consistent with the other FLIPs and
retaining this behavior, so I will update the wording on my FLIP to reflect
that.

-Alan
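The semantics being aligned on, one total timeout budget shared across all retry attempts rather than a per-attempt timer, can be sketched as follows (a simplified stand-in, not `AsyncWaitOperator` itself):

```java
import java.time.Duration;
import java.util.function.Supplier;

public class TotalTimeoutDemo {
    // One deadline for the whole call, shared by every retry attempt,
    // mirroring the behavior described above: the timer is not restarted
    // on retry. A null result models a retryable failure.
    static <T> T callWithRetries(Supplier<T> attempt, Duration totalTimeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + totalTimeout.toNanos();
        while (true) {
            T result = attempt.get();
            if (result != null) {
                return result;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("total timeout exceeded");
            }
            Thread.sleep(5); // backoff between attempts
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Succeeds on the third attempt, well within the total budget.
        String r = callWithRetries(
                () -> ++calls[0] < 3 ? null : "ok", Duration.ofSeconds(1));
        System.out.println(r + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```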

On Wed, Dec 20, 2023 at 1:36 AM Lincoln Lee  wrote:

> +1 for this useful feature!
> Hope this reply isn't too late. Agree that we start with global
> async-scalar configuration and ordered mode first.
>
> @Alan Only one question with the async `timeout` parameter[1](since I
> haven't seen the POC code), current description is: 'The time which can
> pass before a restart strategy is triggered',
> but in the previous flip-232[2] and flip-234[3], in retry scenario, this
> timeout is the total time, do we keep the behavior of the parameter
> consistent?
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
> [2]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
> [3]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems
>
> Best,
> Lincoln Lee
>
>
> Alan Sheinberg  于2023年12月20日周三 08:41写道:
>
> > Thanks for the comments Timo.
> >
> >
> > > Can you remove the necessary parts? Esp.:
> >
> >  @Override
> > >  public Set<FunctionRequirement> getRequirements() {
> > >  return Collections.singleton(FunctionRequirement.ORDERED);
> > >  }
> >
> >
> > I removed this section from the FLIP since presumably, there's no use in
> > adding to the public API if it's ignored, with handling just ORDERED for
> > the first version.  I'm not sure how quickly I'll want to add UNORDERED
> > support, but I guess I can always do another FLIP.
> >
> > Otherwise I have no objections to start a VOTE soonish. If others are
> > > fine as well?
> >
> > That would be great.  Any areas that people are interested in discussing
> > further before a vote?
> >
> > -Alan
> >
> > On Tue, Dec 19, 2023 at 5:49 AM Timo Walther  wrote:
> >
> > >  > I would be totally fine with the first version only having ORDERED
> > >  > mode. For a v2, we could attempt to do the next most conservative
> > >  > thing
> > >
> > > Sounds good to me.
> > >
> > > I also checked AsyncWaitOperator and could not find an access of
> > > StreamRecord's timestamp but only watermarks. But as we said, let's
> > > focus on ORDERED first.
> > >
> > > Can you remove the necessary parts? Esp.:
> > >
> > >  @Override
> > >  public Set<FunctionRequirement> getRequirements() {
> > >  return Collections.singleton(FunctionRequirement.ORDERED);
> > >  }
> > >
> > > Otherwise I have no objections to start a VOTE soonish. If others are
> > > fine as well?
> > >
> > > Regards,
> > > Timo
> > >
> > >
> > > On 19.12.23 07:32, Alan Sheinberg wrote:
> > > > Thanks for the helpful comments, Xuyang and Timo.
> > > >
> > > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take
> kafka
> > as
> > > >> source and mysql as sink as an example.
> > > >> Although kafka is an append-only source, one of its fields is used
> as
> > pk
> > > >> when writing to mysql. If async udx is executed
> > > >>   in an unordered mode, there may be problems with the data in mysql
> > in
> > > the
> > > >> end. In this case, we need to ensure that
> > > >> the sink-based pk is in order actually.
> > > >
> > > >
> > > > @Xuyang: That's a great point.  If some node downstream of my
> operator
> > > > cares about ordering, there's no way for it to reconstruct the
> original
> > > > ordering of the rows as they were input to my operator.  So even if
> > they
> > > > want to preserve ordering by key, the order in which they see it may
> > > > already be incorrect.  Somehow I thought that maybe the analysis of
> the
> > > > changelog mode at a given operator was aware of downstream
> operations,
> > > but
> > > > it seems not.
> > > >
> > > > Clear "no" on this. Changelog semantics make the planner complex and
> we
> > > >> need to be careful. Therefore I would strongly suggest we introduce
> > > >> ORDERED and slowly enable UNORDERED whenever we see a good fit for
> it
> > in
> > > >> plans with appropriate planner rules that guard it.
> > > >
> > > >
> > > > @Timo: The better I understand the complexity, the more I agree with
> > > this.
> > > > I would be totally fine with the first version only having ORDERED
> > mode.
> > > > For a v2, we could attempt to do the next most conservative thing.

Re: [VOTE] Release 1.18.1, release candidate #2

2023-12-20 Thread Hang Ruan
+1 (non-binding)

- Validated hashes
- Verified signature
- Build the source with Maven
- Test with the kafka connector 3.0.2: read and write records from kafka in
sql client
- Verified web PRs
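The "Validated hashes" step boils down to recomputing the artifact's SHA-512 and comparing it with the published digest; a self-contained sketch (the artifact bytes here are a stand-in, and for a real RC the expected digest comes from the published `*.sha512` file next to the tarball):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

public class HashCheckDemo {
    // Recompute the artifact's SHA-512 locally and compare it with the
    // digest the release manager published.
    static String sha512(byte[] data) throws Exception {
        return HexFormat.of().formatHex(
                MessageDigest.getInstance("SHA-512").digest(data));
    }

    public static void main(String[] args) throws Exception {
        byte[] artifact = "release artifact contents".getBytes(StandardCharsets.UTF_8);
        String published = sha512(artifact); // stand-in for the *.sha512 file
        String local = sha512(artifact);     // what a voter recomputes
        System.out.println(published.equals(local) ? "OK" : "MISMATCH");
    }
}
```

Signature verification is analogous: import the KEYS file and run `gpg --verify` against the `.asc` file shipped with the artifact.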

Best,
Hang

Jiabao Sun  于2023年12月20日周三 16:57写道:

> Thanks Jing for driving this release.
>
> +1 (non-binding)
>
> - Validated hashes
> - Verified signature
> - Checked the tag
> - Build the source with Maven
> - Verified web PRs
>
> Best,
> Jiabao
>
>
> > 2023年12月20日 07:38,Jing Ge  写道:
> >
> > Hi everyone,
> >
> > The release candidate #1 has been skipped. Please review and vote on the
> > release candidate #2 for the version 1.18.1,
> >
> > as follows:
> >
> > [ ] +1, Approve the release
> >
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> >
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint 96AE0E32CBE6E0753CE6 [3],
> >
> > * all artifacts to be deployed to the Maven Central Repository [4],
> >
> > * source code tag "release-1.18.1-rc2" [5],
> >
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353640
> >
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.1-rc2/
> >
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1689
> >
> > [5] https://github.com/apache/flink/releases/tag/release-1.18.1-rc2
> >
> > [6] https://github.com/apache/flink-web/pull/706
> >
> > Thanks,
> > Release Manager
>
>


Re: [DISCUSS] FLIP-377: Support configuration to disable filter push down for Table/SQL Sources

2023-12-20 Thread Jiabao Sun
Hi,

Thank you to everyone for the discussion on this FLIP, 
especially Becket for providing guidance that made it more reasonable. 

The FLIP document[1] has been updated with the recent discussed content. 
Please take a look to double-check it when you have time.

If we can reach a consensus on this, I will open the voting thread in the
coming days.

Best,
Jiabao

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
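For illustration, the proposed option as it might appear in a table DDL; the option name `filter.handling.policy` is taken from the FLIP, while the connector options and the `NEVER` value shown here are illustrative:

```sql
CREATE TABLE orders (
  order_id BIGINT,
  amount DECIMAL(10, 2)
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',
  'table-name' = 'orders',
  -- illustrative value: keep all filters in Flink instead of pushing
  -- them down to the database
  'filter.handling.policy' = 'NEVER'
);
```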


> 2023年12月20日 11:38,Jiabao Sun  写道:
> 
> Thanks Becket,
> 
> The behavior description has been added to the Public Interfaces section.
> 
> Best,
> Jiabao
> 
> 
>> 2023年12月20日 08:17,Becket Qin  写道:
>> 
>> Hi Jiabao,
>> 
>> Thanks for updating the FLIP.
>> Can you add the behavior of the policies that are only applicable to some
>> but not all of the databases? This is a part of the intended behavior of
>> the proposed configuration. So, we should include that in the FLIP.
>> Otherwise, the FLIP looks good to me.
>> 
>> Cheers,
>> 
>> Jiangjie (Becket) Qin
>> 
>> On Tue, Dec 19, 2023 at 11:00 PM Jiabao Sun 
>> wrote:
>> 
>>> Hi Becket,
>>> 
>>> I share the same view as you regarding the prefix for this configuration
>>> option.
>>> 
>>> For the JDBC connector, I prefer setting 'filter.handling.policy' = 'FOO'
>>> and throwing an exception when the database do not support that specific
>>> policy.
>>> 
>>> Not using a prefix can reduce the learning curve for users and avoid
>>> introducing a new set of configuration options for every supported JDBC
>>> database.
>>> I think the policies we provide can be compatible with most databases that
>>> follow the JDBC protocol.
>>> However, there may be cases where certain databases cannot support some
>>> policies.
>>> Nevertheless, we can ensure fast failure and allow users to choose a
>>> suitable policy in such situations.
>>> 
>>> I have removed the contents about the configuration prefix.
>>> Please help review it again.
>>> 
>>> Thanks,
>>> Jiabao
>>> 
>>> 
 2023年12月19日 19:46,Becket Qin  写道:
 
 Hi Jiabao,
 
 Thanks for updating the FLIP.
 
 One more question regarding the JDBC connector, since it is a connector
 shared by multiple databases, what if there is a filter handling policy
 that is only applicable to one of the databases, but not the others? In
 that case, how would the users specify that policy?
 Unlike the example of orc format with 2nd+ level config, JDBC connector
 only looks at the URL to decide which driver to use.
 
 For example, MySql supports policy FOO while other databases do not. If
 users want to use FOO for MySql, what should they do? Will they set
 '*mysql.filter.handling.policy'
 = 'FOO', *which will only be picked up when the MySql driver is used?
 Or they should just set* 'filter.handling.policy' = 'FOO', *and throw
 exceptions when other JDBC drivers are used? Personally, I prefer the
 latter. If we pick that, do we still need to mention the following?
 
 *The prefix is needed when the option is for a 2nd+ level. *
> 'connector' = 'filesystem',
> 'format' = 'orc',
> 'orc.filter.handling.policy' = 'NUMERIC_ONLY'
> 
> *In this case, the values of this configuration may be different
>>> depending
> on the format option. For example, orc format may have INDEXED_ONLY
>>> while
> parquet format may have something else. *
> 
 
 I found this is somewhat misleading, because the example here is not a
>>> part
 of the proposal of this FLIP. It is just an example explaining when a
 prefix is needed, which seems orthogonal to the proposal in this FLIP.
 
 Thanks,
 
 Jiangjie (Becket) Qin
 
 
 On Tue, Dec 19, 2023 at 10:09 AM Jiabao Sun wrote:
 
> Thanks Becket for the suggestions,
> 
> Updated.
> Please help review it again when you have time.
> 
> Best,
> Jiabao
> 
> 
>> 2023年12月19日 09:06,Becket Qin  写道:
>> 
>> Hi JIabao,
>> 
>> Thanks for updating the FLIP. It looks better. Some suggestions /
> questions:
>> 
>> 1. In the motivation section:
>> 
>>> *Currently, Flink Table/SQL does not expose fine-grained control for
> users
>>> to control filter pushdown. **However, filter pushdown has some side
>>> effects, such as additional computational pressure on external
>>> systems. Moreover, Improper queries can lead to issues such as full
> table
>>> scans, which in turn can impact the stability of external systems.*
>> 
>> This statement sounds like the side effects are there for all the
> systems,
>> which is inaccurate. Maybe we can say:
>> *Currently, Flink Table/SQL does not expose fine-grained control for
> users
>> to control filter pushdown. **However, filter pushdown may have side
>> effects in some cases, **such as additional computational pressure on
>> external systems. The JDBC 

[DISCUSS] Release flink-connector-mongodb v1.1.0

2023-12-20 Thread Jiabao Sun
Hi,

Now that Flink 1.18 is released, I propose releasing flink-connector-mongodb v1.1.0
to support Flink 1.18, and nominate Leonard Xu as the release manager.

This release includes support for filter pushdown[1] and JDK 17[2]. 
We plan to proceed with the release after merging these two pull requests. 
If you have time, please help review them as well.

Please let me know if the proposal is a good idea.

Best,
Jiabao Sun

[1] https://github.com/apache/flink-connector-mongodb/pull/17
[2] https://github.com/apache/flink-connector-mongodb/pull/21

Re: [DISCUSS] Should Configuration support getting value based on String key?

2023-12-20 Thread Rui Fan
Thanks Xuannan and Xintong for the quick feedback!

> However, we should let the user know that we encourage using
ConfigOptions over the string-based
> configuration key, as Timo said, we should add the message to `String
> getString(String key, String defaultValue)` method.

Sure, we can add some comments to guide users to use ConfigOption.

If so, I will remove the @Deprecated annotation for
`getString(String key, String defaultValue)` method, and add
some comments for it.
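The trade-off under discussion, in miniature: with a `ConfigOption`-style accessor the key, type, and default live in one place, while a raw string key repeats them at every call site. A toy sketch with plain Java stand-ins (not Flink's actual `ConfigOption` API):

```java
import java.util.HashMap;
import java.util.Map;

public class TypedOptionDemo {
    // Toy typed option: key, type, and default bundled together.
    record Option<T>(String key, T defaultValue) {}

    static <T> T get(Map<String, Object> conf, Option<T> opt) {
        Object v = conf.get(opt.key());
        @SuppressWarnings("unchecked")
        T typed = v == null ? opt.defaultValue() : (T) v;
        return typed;
    }

    public static void main(String[] args) {
        Option<Integer> parallelism = new Option<>("parallelism.default", 1);
        Map<String, Object> conf = new HashMap<>();
        conf.put("parallelism.default", 4);
        // Typed access: no parsing at call sites, default lives with the key.
        System.out.println(get(conf, parallelism)); // 4
    }
}
```

A string-keyed `getString("parallelism.default", "1")` would force every caller to repeat the key and default and to parse the value, which is exactly the maintenance cost the thread weighs against the convenience of raw keys.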

Also, it's indeed a small change related to a public class (or API);
is voting necessary?

Best,
Rui

On Wed, 20 Dec 2023 at 16:55, Xintong Song  wrote:

> Sounds good to me.
>
> Best,
>
> Xintong
>
>
>
> On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su  wrote:
>
> > Hi Rui,
> >
> > I am fine with keeping the `String getString(String key, String
> > defaultValue)` if more people favor it. However, we should let the
> > user know that we encourage using ConfigOptions over the string-based
> > configuration key, as Timo said, we should add the message to `String
> > getString(String key, String defaultValue)` method.
> >
> > Best,
> > Xuannan
> >
> > On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > > I noticed that Configuration is used in
> > > > > DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig
> > > > > to store some cacheFile meta-information. Their keys are
> > > > > temporary(key name with number) and it is not convenient
> > > > > to predefine ConfigOption.
> > > >
> > > >
> > > > True, this one requires a bit more effort to migrate from string-key
> to
> > > > ConfigOption, but still should be doable. Looking at how the two
> > mentioned
> > > > methods are implemented and used, it seems what is really needed is
> > > > serialization and deserialization of `DistributedCacheEntry`-s. And
> > all the
> > > > entries are always written / read at once. So I think we can
> serialize
> > the
> > > > whole set of entries into a JSON string (or something similar), and
> > use one
> > > > ConfigOption with a deterministic key for it, rather than having one
> > > > ConfigOption for each field in each entry. WDYT?
> > >
> > > Hi Xintong, thanks for the good suggestion! Most of the entries can be
> > > serialized to a json string, and we can only write/read them at once.
> > > The CACHE_FILE_BLOB_KEY is special, its type is byte[], we need to
> > > store it by the setBytes/getBytes.
> > >
> > > Also, I had an offline discussion with @Xuannan Su: refactoring all
> > > code
> > > that uses String as a key requires a separate FLIP. And we will provide
> > > a detailed FLIP later.
> > >
> > >
> >
> --
> > >
> > > Hi all, thanks everyone for the lively discussion. It's really a
> > trade-off to
> > > keep "String getString(String key, String defaultValue)" or not.
> > > (It's not a right or wrong thing.)
> > >
> > > Judging from the discussion, most discussants can accept keeping
> > > `String getString(String key, String defaultValue)` and deprecating the
> > > rest of the `getXxx(String key, Xxx defaultValue)` methods.
> > >
> > > cc @Xintong Song @Xuannan Su , WDYT?
> > >
> > > Best,
> > > Rui
> > >
> > > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu  wrote:
> > >>
> > >> I think it's not clear whether forcing using ConfigOption would hurt
> > >> the user experience.
> > >>
> > >> Maybe it does at the beginning, because using string keys to access
> > >> Flink configuration can be simpler for new components/jobs.
> > >> However, problems may happen later if the configuration usages become
> > >> more complex, like key renaming, using types other than strings, and
> > >> other problems that ConfigOption was invented to address.
> > >>
> > >> Personally I prefer to encourage the usage of ConfigOption.
> > >> Jobs should use GlobalJobParameter for custom config, which is
> different
> > >> from the Configuration interface. Therefore, Configuration is mostly
> > >> used in other components/plugins, in which case the long-term
> > maintenance
> > >> can be important.
> > >>
> > >> However, since it is not a right or wrong choice, I'd also be fine
> > >> to keep the `getString()` method if more devs/users are in favor of
> it.
> > >>
> > >> Thanks,
> > >> Zhu
> > >>
> > >> Timo Walther  于2023年12月14日周四 17:41写道:
> > >>
> > >> > The configuration in Flink is complicated and I fear we won't have
> > >> > enough capacity to substantially fix it. The introduction of
> > >> > ReadableConfig, WritableConfig, and typed ConfigOptions was a right
> > step
> > >> > into making the code more maintainable. From the Flink side, every
> > read
> > >> > access should go through ConfigOption.
> > >> >
> > >> > However, I also understand Gyula pragmatism here because
> (practically
> > >> > speaking) users get access `getString()` via `toMap().get()`. So I'm
> > >> > fine with removing the deprecation for functionality that is
> available
> > >> > 

Re: [DISCUSS] FLIP-398: Improve Serialization Configuration And Usage In Flink

2023-12-20 Thread Yong Fang
Hi Ken,

Thanks for your feedback. The purpose of this FLIP is to improve the use of
serialization, including configurable serializers for users, providing
serializers for composite data types, and addressing Kryo being enabled by
default. Introducing a better serialization framework would be a great
help for Flink's performance, and it's great to see your tests on Fury.
However, as @Xintong mentioned, this could be a huge amount of work and beyond the
scope of this FLIP. If you're interested, I think we could create a new
FLIP for it and discuss it further. What do you think? Thanks.

Best,
Fang Yong

On Mon, Dec 18, 2023 at 11:16 AM Xintong Song  wrote:

> Hi Ken,
>
> I think the main purpose of this FLIP is to change how users interact with
> the knobs for customizing the serialization behaviors, from requiring code
> changes to working with pure configurations. Redesigning the knobs (i.e.,
> names, semantics, etc.), on the other hand, is not the purpose of this
> FLIP. Preserving the existing names and semantics should also help minimize
> the migration cost for existing users. Therefore, I'm in favor of not
> changing them.
>
> Concerning decoupling from Kryo, and introducing other serialization
> frameworks like Fury, I think that's a bigger topic that is worth further
> discussion. At the moment, I'm not aware of any community consensus on
> doing so. And even if in the future we decide to do so, the changes needed
> should be the same w/ or w/o this FLIP. So I'd suggest not to block this
> FLIP on these issues.
>
> WDYT?
>
> Best,
>
> Xintong
>
>
>
> On Fri, Dec 15, 2023 at 1:40 AM Ken Krugler 
> wrote:
>
> > Hi Yong,
> >
> > Looks good, thanks for creating this.
> >
> > One comment - related to my recent email about Fury, I would love to see
> > the v2 serialization decoupled from Kryo.
> >
> > As part of that, instead of using xxxKryo in methods, call them
> xxxGeneric.
> >
> > A more extreme change would be to totally rely on Fury (so no more POJO
> > serializer). Fury is faster than the POJO serializer in my tests, but
> this
> > would be a much bigger change.
> >
> > Though it could dramatically simplify the Flink serialization support.
> >
> > — Ken
> >
> > PS - a separate issue is how to migrate state from Kryo to something like
> > Fury, which supports schema evolution. I think this might be possible, by
> > having a smarter deserializer that identifies state as being created by
> > Kryo, and using (shaded) Kryo to deserialize, while still writing as
> Fury.
> >
> > > On Dec 6, 2023, at 6:35 PM, Yong Fang  wrote:
> > >
> > > Hi devs,
> > >
> > > I'd like to start a discussion about FLIP-398: Improve Serialization
> > > Configuration And Usage In Flink [1].
> > >
> > > Currently, users can register custom data types and serializers in
> Flink
> > > jobs through various methods, including registration in code,
> > > configuration, and annotations. These lead to difficulties in upgrading
> > > Flink jobs and priority issues.
> > >
> > > In flink-2.0 we would like to manage job data types and serializers
> > through
> > > configurations. This FLIP will introduce a unified option for data type
> > and
> > > serializer and users can configure all custom data types and
> > > pojo/kryo/custom serializers. In addition, this FLIP will add more
> > built-in
> > > serializers for complex data types such as List and Map, and optimize
> the
> > > management of Avro Serializers.
> > >
> > > Looking forward to hearing from you, thanks!
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink
> > >
> > > Best,
> > > Fang Yong
> >
> > --
> > Ken Krugler
> > http://www.scaleunlimited.com
> > Custom big data solutions
> > Flink & Pinot
> >
> >
> >
> >
>
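[Editor's note: Ken's postscript above suggests a "smarter deserializer" that recognizes legacy-format state and falls back to the old serializer while always writing the new format. A JDK-only sketch of that dispatch idea, under the assumption that the new format can be tagged with a magic prefix — all names are illustrative; neither Kryo nor Fury is actually involved:]

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch: new-format state carries a magic prefix; bytes without it are treated
// as legacy state. In the real idea, the two branches would delegate to Fury
// and (shaded) Kryo respectively.
public class FormatDispatchSketch {
    static final byte[] NEW_FORMAT_MAGIC = {(byte) 0xF0, (byte) 0x4F};

    static byte[] writeNew(String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[NEW_FORMAT_MAGIC.length + payload.length];
        System.arraycopy(NEW_FORMAT_MAGIC, 0, out, 0, NEW_FORMAT_MAGIC.length);
        System.arraycopy(payload, 0, out, NEW_FORMAT_MAGIC.length, payload.length);
        return out;
    }

    static String read(byte[] bytes) {
        if (bytes.length >= 2 && bytes[0] == NEW_FORMAT_MAGIC[0] && bytes[1] == NEW_FORMAT_MAGIC[1]) {
            // New-format path (would delegate to the new serializer).
            return new String(Arrays.copyOfRange(bytes, 2, bytes.length), StandardCharsets.UTF_8);
        }
        // Legacy path (would delegate to the legacy serializer).
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(read(writeNew("state-v2")));                       // state-v2
        System.out.println(read("legacy".getBytes(StandardCharsets.UTF_8))); // legacy
    }
}
```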


[jira] [Created] (FLINK-33916) Workflow: Add nightly build for release-1.18

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33916:
-

 Summary: Workflow: Add nightly build for release-1.18
 Key: FLINK-33916
 URL: https://issues.apache.org/jira/browse/FLINK-33916
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


Add nightly workflow for {{{}release-1.18{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33915) Workflow: Add nightly build for the dev version (currently called "master")

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33915:
-

 Summary: Workflow: Add nightly build for the dev version 
(currently called "master")
 Key: FLINK-33915
 URL: https://issues.apache.org/jira/browse/FLINK-33915
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


The nightly builds run on master and the two most-recently released versions of 
Flink as those are the supported versions. This logic is currently captured in 
[flink-ci/git-repo-sync:sync_repo.sh|https://github.com/flink-ci/git-repo-sync/blob/master/sync_repo.sh#L28].

In 
[FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure]
 we decided to go ahead and provide nightly builds for {{master}} and 
{{{}release-1.18{}}}. This issue is about providing the nightly workflow for 
{{master}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33913) Template: Add CI template for running Flink's test suite

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33913:
-

 Summary: Template: Add CI template for running Flink's test suite
 Key: FLINK-33913
 URL: https://issues.apache.org/jira/browse/FLINK-33913
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


We want to have a template that runs the entire Flink test suite.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33914) Workflow: Add basic CI that will run with the default configuration

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33914:
-

 Summary: Workflow: Add basic CI that will run with the default 
configuration
 Key: FLINK-33914
 URL: https://issues.apache.org/jira/browse/FLINK-33914
 Project: Flink
  Issue Type: Sub-task
Reporter: Matthias Pohl


Runs the Flink CI template with the default configuration (Java 8) and can be 
enabled on each push to the branch.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33912) Template: Add CI template for pre-compile steps

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33912:
-

 Summary: Template: Add CI template for pre-compile steps
 Key: FLINK-33912
 URL: https://issues.apache.org/jira/browse/FLINK-33912
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


We want to have a template that triggers all checks that do not require 
compilation. Those quick checks (e.g. code format) can run without waiting for 
the compilation step to succeed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33911) Custom Action: Select workflow configuration

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33911:
-

 Summary: Custom Action: Select workflow configuration
 Key: FLINK-33911
 URL: https://issues.apache.org/jira/browse/FLINK-33911
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


During experiments, we noticed that the GHA UI isn't capable of handling an 
arbitrary number of nested composite workflows. If we get into the 3rd level of 
composite workflows, the job name will be cut off in the left menu, which makes 
navigating the jobs harder (because you end up with duplicates of the same job, 
e.g. Compile, belonging to different job profiles).

As a workaround, we came up with Flink CI workflow profiles to configure the CI 
template yaml that is used in every job. A profile configuration can be 
specified through a JSON file that lives in the {{.github/workflow}} folder. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33910) Custom Action: Enable Java version in Flink's CI Docker image

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33910:
-

 Summary: Custom Action: Enable Java version in Flink's CI Docker 
image
 Key: FLINK-33910
 URL: https://issues.apache.org/jira/browse/FLINK-33910
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


Flink's CI Docker image comes with multiple Java versions which can be enabled 
through environment variables. We should have a custom action that sets these 
variables properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33909) Custom Action: Select the right branch and commit hash

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33909:
-

 Summary: Custom Action: Select the right branch and commit hash
 Key: FLINK-33909
 URL: https://issues.apache.org/jira/browse/FLINK-33909
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


For nightly builds, we want to select the release branches dynamically (rather 
than using the automatic selection through GHA schedule). We want to do this 
dynamically because the GHA feature for branch selection seems to be kind of 
limited right now, e.g.:
 * Don't run a branch that hasn't had any changes in the past day (or any 
other time period)
 * Run only the most-recent release branches and ignore older release branches 
(similar to what we're doing in 
[flink-ci/git-repo-sync:sync_repo.sh|https://github.com/flink-ci/git-repo-sync/blob/master/sync_repo.sh#L28]
 right now)

A custom action that selects the branch and commit hash enables us to overwrite 
this setting.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33908) Custom Action: Move files within the Docker image to the root folder to match the user

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33908:
-

 Summary: Custom Action: Move files within the Docker image to the 
root folder to match the user
 Key: FLINK-33908
 URL: https://issues.apache.org/jira/browse/FLINK-33908
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


The way the CI template is set up (right now) is to work in the root user's home 
folder. For this we're copying the checkout into /root. This copying is done in 
multiple places which makes it a candidate for a custom action.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33907) Makes copying test jars being done earlier

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33907:
-

 Summary: Makes copying test jars being done earlier
 Key: FLINK-33907
 URL: https://issues.apache.org/jira/browse/FLINK-33907
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


We experienced an issue in GHA which is due to the fact how test resources are 
pre-computed in GHA:
{code:java}
This fixes the following error when compiling flink-clients:
Error: 2.054 [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-dependency-plugin:3.2.0:copy-dependencies 
(copy-dependencies) on project flink-clients: Artifact has not been packaged 
yet. When used on reactor artifact, copy should be executed after packaging: 
see MDEP-187. -> [Help 1] {code}
We need to move this goal to a later phase.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33906) tools/azure-pipelines/debug_files_utils.sh should support GHA output as well

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33906:
-

 Summary: tools/azure-pipelines/debug_files_utils.sh should support 
GHA output as well
 Key: FLINK-33906
 URL: https://issues.apache.org/jira/browse/FLINK-33906
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


{{tools/azure-pipelines/debug_files_utils.sh}} sets variables to reference the 
debug output. This is backend-specific and only supports Azure CI right now. We 
should add support for GHA.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33905) FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs

2023-12-20 Thread Wencong Liu (Jira)
Wencong Liu created FLINK-33905:
---

 Summary: FLIP-382: Unify the Provision of Diverse Metadata for 
Context-like APIs
 Key: FLINK-33905
 URL: https://issues.apache.org/jira/browse/FLINK-33905
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.19.0
Reporter: Wencong Liu


This ticket is proposed for 
[FLIP-382|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33904) Add zip as a package to GitHub Actions runners

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33904:
-

 Summary: Add zip as a package to GitHub Actions runners
 Key: FLINK-33904
 URL: https://issues.apache.org/jira/browse/FLINK-33904
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


FLINK-33253 shows that {{test_pyflink.sh}} fails in GHA because it doesn't find 
{{{}zip{}}}. We should add this as a dependency in the e2e test.
{code:java}
/root/flink/flink-end-to-end-tests/test-scripts/test_pyflink.sh: line 107: zip: 
command not found {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33903) Reenable tests that edit file permissions

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33903:
-

 Summary: Reenable tests that edit file permissions
 Key: FLINK-33903
 URL: https://issues.apache.org/jira/browse/FLINK-33903
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


In GitHub Actions, permissions seem to work differently from how they work in 
Azure CI. In both cases, we run the test suite as root. But on GHA runners, the 
file permission changes won't have any effect because tests started as the 
root user have permission to modify the files in any case.

This issue is about enabling the tests again by rewriting them (ideally, 
because we shouldn't rely on OS features in the tests). Alternatively, we could 
find a way to make those tests work in GHA as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33902) Switch to OpenSSL legacy algorithms

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33902:
-

 Summary: Switch to OpenSSL legacy algorithms
 Key: FLINK-33902
 URL: https://issues.apache.org/jira/browse/FLINK-33902
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Affects Versions: 1.19.0
Reporter: Matthias Pohl


In FLINK-33550 we discovered that the GHA runners provided by GitHub have a 
newer version of OpenSSL installed which caused errors in the SSL tests:
{code:java}
Certificate was added to keystore
Certificate was added to keystore
Certificate reply was installed in keystore
Error outputting keys and certificates
40F767F1D97F:error:0308010C:digital envelope 
routines:inner_evp_generic_fetch:unsupported:../crypto/evp/evp_fetch.c:349:Global
 default library context, Algorithm (RC2-40-CBC : 0), Properties ()
Nov 14 15:39:21 [FAIL] Test script contains errors. {code}
The workaround is to enable legacy algorithms using the {{-legacy}} parameter 
in 3.0.0+. We might need to check whether that works for older OpenSSL versions 
(in Azure CI).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33901) Trial Period: GitHub Actions

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33901:
-

 Summary: Trial Period: GitHub Actions
 Key: FLINK-33901
 URL: https://issues.apache.org/jira/browse/FLINK-33901
 Project: Flink
  Issue Type: New Feature
  Components: Build System / CI
Affects Versions: 1.19.0
Reporter: Matthias Pohl


In contrast to FLINK-27075 (which is used for issues that were collected while 
preparing 
[FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure]),
 this issue collects all the subtasks that are necessary to initiate the trial 
phase for GitHub Actions (as discussed in 
[FLIP-396|https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure]).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33900) Multiple failures in WindowRankITCase due to NoResourceAvailableException

2023-12-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33900:
-

 Summary: Multiple failures in WindowRankITCase due to 
NoResourceAvailableException
 Key: FLINK-33900
 URL: https://issues.apache.org/jira/browse/FLINK-33900
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Matthias Pohl


[https://github.com/XComp/flink/actions/runs/7244405295/job/19733011527#step:12:14989]

There are multiple tests in {{WindowRankITCase}} that supposedly fail due to a 
{{NoResourceAvailableException}}:
{code:java}
[...]
Error: 09:19:33 09:19:32.966 [ERROR] 
WindowRankITCase.testTumbleWindowTVFWithOffset  Time elapsed: 300.072 s  <<< 
FAILURE!
14558Dec 18 09:19:33 org.opentest4j.MultipleFailuresError: 
14559Dec 18 09:19:33 Multiple Failures (2 failures)
14560Dec 18 09:19:33org.apache.flink.runtime.client.JobExecutionException: 
Job execution failed.
14561Dec 18 09:19:33java.lang.AssertionError: 
14562Dec 18 09:19:33at 
org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:200)
14563Dec 18 09:19:33at 
org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:248)
14564Dec 18 09:19:33at 
org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:214)
14565Dec 18 09:19:33at 
org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:88)
14566Dec 18 09:19:33at 
org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
14567Dec 18 09:19:33at 
org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
14568Dec 18 09:19:33at 
org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
14569Dec 18 09:19:33at 
org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
14570Dec 18 09:19:33at 
org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
14571Dec 18 09:19:33at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
14572Dec 18 09:19:33at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
14573Dec 18 09:19:33at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
14574Dec 18 09:19:33at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
14575Dec 18 09:19:33at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
14576Dec 18 09:19:33at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
14577Dec 18 09:19:33at 
org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
14578Dec 18 09:19:33at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
14579Dec 18 09:19:33at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413)
14580Dec 18 09:19:33at org.junit.runners.Suite.runChild(Suite.java:128)
14581Dec 18 09:19:33at org.junit.runners.Suite.runChild(Suite.java:27)
14582Dec 18 09:19:33at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
14583Dec 18 09:19:33at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
14584Dec 18 09:19:33at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
14585Dec 18 09:19:33at 
org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
14586Dec 18 09:19:33at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
14587Dec 18 09:19:33at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
14588Dec 18 09:19:33at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
14589Dec 18 09:19:33at org.junit.rules.RunRules.evaluate(RunRules.java:20)
14590Dec 18 09:19:33at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
14591Dec 18 09:19:33at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413)
14592Dec 18 09:19:33at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
14593Dec 18 09:19:33at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
14594Dec 18 09:19:33at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
14595Dec 18 09:19:33at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
14596Dec 18 09:19:33at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
14597Dec 18 09:19:33at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
14598Dec 18 09:19:33at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
14599Dec 18 09:19:33at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
14600Dec 18 09:19:33at 

[jira] [Created] (FLINK-33899) Java 17 and 21 support for mongodb connector

2023-12-20 Thread Jiabao Sun (Jira)
Jiabao Sun created FLINK-33899:
--

 Summary: Java 17 and 21 support for mongodb connector
 Key: FLINK-33899
 URL: https://issues.apache.org/jira/browse/FLINK-33899
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / MongoDB
Affects Versions: mongodb-1.0.2
Reporter: Jiabao Sun
 Fix For: mongodb-1.1.0


After FLINK-33302 is finished, it is now possible to specify the JDK version.
That allows adding JDK 17 and JDK 21 support.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33898) Allow triggering unaligned checkpoint via REST api

2023-12-20 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-33898:
---

 Summary: Allow triggering unaligned checkpoint via REST api
 Key: FLINK-33898
 URL: https://issues.apache.org/jira/browse/FLINK-33898
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / REST
Reporter: Zakelly Lan
Assignee: Zakelly Lan


See FLINK-33897. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33897) Allow triggering unaligned checkpoint via CLI

2023-12-20 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-33897:
---

 Summary: Allow triggering unaligned checkpoint via CLI
 Key: FLINK-33897
 URL: https://issues.apache.org/jira/browse/FLINK-33897
 Project: Flink
  Issue Type: Improvement
  Components: Command Line Client, Runtime / Checkpointing
Reporter: Zakelly Lan


After FLINK-6755, users can trigger a checkpoint through the CLI. However, I 
noticed there would be value in supporting triggering it in an unaligned way, 
since the job may encounter high back-pressure, in which case an aligned 
checkpoint would fail.

I suggest we provide an option '-unaligned' in the CLI to support that.

A similar option would also be useful for the REST API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2023-12-20 Thread Lincoln Lee
+1 for this useful feature!
Hope this reply isn't too late. I agree that we should start with the global
async-scalar configuration and ordered mode first.

@Alan Just one question about the async `timeout` parameter [1] (since I
haven't seen the POC code). The current description is: 'The time which can
pass before a restart strategy is triggered'.
But in the previous FLIP-232 [2] and FLIP-234 [3], in the retry scenario, this
timeout is the total time. Should we keep the behavior of the parameter
consistent?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
[2]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883963
[3]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems

Best,
Lincoln Lee
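[Editor's note: the "ordered" async mode discussed in this thread — emitting results in input order even when the underlying futures complete out of order — can be illustrated with a plain-JDK sketch. This mirrors the idea behind AsyncWaitOperator's ordered queue but is purely illustrative, not Flink code:]

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Sketch: results are drained strictly head-first, so an element that completed
// early must wait behind its uncompleted predecessors before being emitted.
public class OrderedAsyncSketch {

    public static List<String> emitInOrder(List<CompletableFuture<String>> inputs) {
        Queue<CompletableFuture<String>> pending = new ArrayDeque<>(inputs);
        List<String> output = new ArrayList<>();
        while (!pending.isEmpty()) {
            // join() blocks until the head completes, preserving input order.
            output.add(pending.poll().join());
        }
        return output;
    }

    public static void main(String[] args) {
        CompletableFuture<String> a = new CompletableFuture<>();
        CompletableFuture<String> b = new CompletableFuture<>();
        b.complete("second"); // completes before its predecessor...
        a.complete("first");
        // ...but output still follows input order.
        System.out.println(emitInOrder(List.of(a, b))); // [first, second]
    }
}
```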


Alan Sheinberg  于2023年12月20日周三 08:41写道:

> Thanks for the comments Timo.
>
>
> > Can you remove the necessary parts? Esp.:
>
>  @Override
> >  public Set getRequirements() {
> >  return Collections.singleton(FunctionRequirement.ORDERED);
> >  }
>
>
> I removed this section from the FLIP since presumably, there's no use in
> adding to the public API if it's ignored, with handling just ORDERED for
> the first version.  I'm not sure how quickly I'll want to add UNORDERED
> support, but I guess I can always do another FLIP.
>
> Otherwise I have no objections to start a VOTE soonish. If others are
> > fine as well?
>
> That would be great.  Any areas that people are interested in discussing
> further before a vote?
>
> -Alan
>
> On Tue, Dec 19, 2023 at 5:49 AM Timo Walther  wrote:
>
> >  > I would be totally fine with the first version only having ORDERED
> >  > mode. For a v2, we could attempt to do the next most conservative
> >  > thing
> >
> > Sounds good to me.
> >
> > I also cheked AsyncWaitOperator and could not find n access of
> > StreamRecord's timestamp but only watermarks. But as we said, let's
> > focus on ORDERED first.
> >
> > Can you remove the necessary parts? Esp.:
> >
> >  @Override
> >  public Set getRequirements() {
> >  return Collections.singleton(FunctionRequirement.ORDERED);
> >  }
> >
> > Otherwise I have no objections to start a VOTE soonish. If others are
> > fine as well?
> >
> > Regards,
> > Timo
> >
> >
> > On 19.12.23 07:32, Alan Sheinberg wrote:
> > > Thanks for the helpful comments, Xuyang and Timo.
> > >
> > > @Timo, @Alan: IIUC, there seems to be something wrong here. Take kafka
> as
> > >> source and mysql as sink as an example.
> > >> Although kafka is an append-only source, one of its fields is used as
> pk
> > >> when writing to mysql. If async udx is executed
> > >>   in an unordered mode, there may be problems with the data in mysql
> in
> > the
> > >> end. In this case, we need to ensure that
> > >> the sink-based pk is in order actually.
> > >
> > >
> > > @Xuyang: That's a great point.  If some node downstream of my operator
> > > cares about ordering, there's no way for it to reconstruct the original
> > > ordering of the rows as they were input to my operator.  So even if
> they
> > > want to preserve ordering by key, the order in which they see it may
> > > already be incorrect.  Somehow I thought that maybe the analysis of the
> > > changelog mode at a given operator was aware of downstream operations,
> > but
> > > it seems not.
> > >
> > > Clear "no" on this. Changelog semantics make the planner complex and we
> > >> need to be careful. Therefore I would strongly suggest we introduce
> > >> ORDERED and slowly enable UNORDERED whenever we see a good fit for it
> in
> > >> plans with appropriate planner rules that guard it.
> > >
> > >
> > > @Timo: The better I understand the complexity, the more I agree with
> > this.
> > > I would be totally fine with the first version only having ORDERED
> mode.
> > > For a v2, we could attempt to do the next most conservative thing and
> > only
> > > allow UNORDERED when the whole graph is in *INSERT *changelog mode.
> The
> > > next best type of optimization might understand what's the key required
> > > downstream, and allow breaking the original order only between
> unrelated
> > > keys, but maintaining it between rows of the same key.  Of course if
> the
> > > key used downstream is computed in some manner, that makes it all the
> > > harder to know this beforehand.
> > >
> > > So unordering should be fine *within* watermarks. This is also what
> > >> watermarks are good for, a trade-off between strict ordering and
> making
> > >> progress. The async operator from DataStream API also supports this
> if I
> > >> remember correctly. However, it assumes a timestamp is present in
> > >> StreamRecord on which it can work. But this is not the case within the
> > >> SQL engine.
> > >
> > >
> > > *AsyncWaitOperator* and *UnorderedStreamElementQueue* (the
> > implementations
> > > I plan on using) seem to support exactly this behavior.  I don't 

[jira] [Created] (FLINK-33896) Implement restore tests for Correlate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33896:
-

 Summary: Implement restore tests for Correlate node
 Key: FLINK-33896
 URL: https://issues.apache.org/jira/browse/FLINK-33896
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33895) Implement restore tests for PythonGroupWindowAggregate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33895:
-

 Summary: Implement restore tests for PythonGroupWindowAggregate 
node
 Key: FLINK-33895
 URL: https://issues.apache.org/jira/browse/FLINK-33895
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33894) Implement restore tests for PythonGroupAggregate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33894:
-

 Summary: Implement restore tests for PythonGroupAggregate node
 Key: FLINK-33894
 URL: https://issues.apache.org/jira/browse/FLINK-33894
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33893) Implement restore tests for PythonCorrelate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33893:
-

 Summary: Implement restore tests for PythonCorrelate node
 Key: FLINK-33893
 URL: https://issues.apache.org/jira/browse/FLINK-33893
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release 1.18.1, release candidate #2

2023-12-20 Thread Jiabao Sun
Thanks Jing for driving this release.

+1 (non-binding)

- Validated hashes
- Verified signature
- Checked the tag
- Build the source with Maven
- Verified web PRs

Best,
Jiabao


> 2023年12月20日 07:38,Jing Ge  写道:
> 
> Hi everyone,
> 
> The release candidate #1 has been skipped. Please review and vote on the
> release candidate #2 for the version 1.18.1,
> 
> as follows:
> 
> [ ] +1, Approve the release
> 
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> 
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 96AE0E32CBE6E0753CE6 [3],
> 
> * all artifacts to be deployed to the Maven Central Repository [4],
> 
> * source code tag "release-1.18.1-rc2" [5],
> 
> * website pull request listing the new release and adding announcement blog
> post [6].
> 
> 
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> 
> 
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353640
> 
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.1-rc2/
> 
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> 
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1689
> 
> [5] https://github.com/apache/flink/releases/tag/release-1.18.1-rc2
> 
> [6] https://github.com/apache/flink-web/pull/706
> 
> Thanks,
> Release Manager



Re: [DISCUSS] Should Configuration support getting value based on String key?

2023-12-20 Thread Xintong Song
Sounds good to me.

Best,

Xintong
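[Editor's note: the trade-off discussed below — raw string keys vs. typed ConfigOptions — can be illustrated with a minimal, Flink-independent sketch. The classes below only mirror the shape of Flink's ConfigOption and are purely illustrative:]

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a typed option bundles key, type, and default in one definition, so
// call sites cannot disagree on any of them, whereas string-key access repeats
// the key and default everywhere and lets typos compile silently.
public class TypedOptionSketch {

    static final class Option<T> {
        final String key;
        final T defaultValue;
        Option(String key, T defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }
    }

    static final Map<String, Object> CONF = new HashMap<>();

    @SuppressWarnings("unchecked")
    static <T> T get(Option<T> option) {
        return (T) CONF.getOrDefault(option.key, option.defaultValue);
    }

    public static void main(String[] args) {
        Option<Integer> parallelism = new Option<>("parallelism.default", 1);
        // String-key access: the misspelled key below still compiles and quietly
        // falls back to the inline default.
        Object raw = CONF.getOrDefault("parallelism.defaut", 1);
        // Typed-option access: one definition, reused everywhere.
        System.out.println(get(parallelism)); // 1
        System.out.println(raw);              // 1
    }
}
```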



On Wed, Dec 20, 2023 at 9:40 AM Xuannan Su  wrote:

> Hi Rui,
>
> I am fine with keeping the `String getString(String key, String
> defaultValue)` if more people favor it. However, we should let the
> user know that we encourage using ConfigOptions over the string-based
> configuration key, as Timo said, we should add the message to `String
> getString(String key, String defaultValue)` method.
>
> Best,
> Xuannan
>
> On Tue, Dec 19, 2023 at 7:55 PM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > > I noticed that Configuration is used in
> > > > DistributedCache#writeFileInfoToConfig and readFileInfoFromConfig
> > > > to store some cacheFile meta-information. Their keys are
> > > > temporary(key name with number) and it is not convenient
> > > > to predefine ConfigOption.
> > >
> > >
> > > True, this one requires a bit more effort to migrate from string keys
> > > to ConfigOptions, but it should still be doable. Looking at how the
> > > two mentioned methods are implemented and used, it seems what is
> > > really needed is serialization and deserialization of
> > > `DistributedCacheEntry`-s. And all the entries are always written /
> > > read at once. So I think we can serialize the whole set of entries
> > > into a JSON string (or something similar), and use one ConfigOption
> > > with a deterministic key for it, rather than having one ConfigOption
> > > for each field in each entry. WDYT?
> >
> > Hi Xintong, thanks for the good suggestion! Most of the entries can be
> > serialized to a JSON string, and they are always written/read at once.
> > The CACHE_FILE_BLOB_KEY is special: its type is byte[], so we need to
> > store it via setBytes/getBytes.
> >
> > Also, I had an offline discussion with @Xuannan Su: refactoring all code
> > that uses String keys requires a separate FLIP, and we will provide a
> > detailed FLIP later.
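The serialize-everything-at-once idea discussed above can be sketched roughly as follows. This is a hypothetical illustration, not Flink's actual API: `DistributedCacheCodec` and `CacheEntry` are illustrative stand-ins for Flink's `DistributedCacheEntry` handling, and a real implementation would likely use JSON rather than the Base64-delimited format used here to sidestep escaping.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: serialize the whole set of cache-file entries into
// ONE string value under a single deterministic config key, instead of
// numbered string keys per field. Names are illustrative, not Flink API.
public class DistributedCacheCodec {

    // Minimal stand-in for Flink's DistributedCacheEntry.
    public static final class CacheEntry {
        public final String filePath;
        public final boolean isExecutable;
        public final boolean isZipped;

        public CacheEntry(String filePath, boolean isExecutable, boolean isZipped) {
            this.filePath = filePath;
            this.isExecutable = isExecutable;
            this.isZipped = isZipped;
        }
    }

    // Encode all entries at once. Base64 avoids delimiter-escaping issues.
    // The byte[] blob value would still go through setBytes/getBytes
    // separately, as noted in the thread.
    public static String encode(Map<String, CacheEntry> entries) {
        StringJoiner joiner = new StringJoiner(";");
        for (Map.Entry<String, CacheEntry> e : entries.entrySet()) {
            CacheEntry c = e.getValue();
            joiner.add(String.join(",",
                    Base64.getEncoder().encodeToString(e.getKey().getBytes()),
                    Base64.getEncoder().encodeToString(c.filePath.getBytes()),
                    Boolean.toString(c.isExecutable),
                    Boolean.toString(c.isZipped)));
        }
        return joiner.toString();
    }

    // Decode all entries at once.
    public static Map<String, CacheEntry> decode(String value) {
        Map<String, CacheEntry> out = new LinkedHashMap<>();
        if (value.isEmpty()) {
            return out;
        }
        for (String part : value.split(";")) {
            String[] fields = part.split(",");
            out.put(new String(Base64.getDecoder().decode(fields[0])),
                    new CacheEntry(
                            new String(Base64.getDecoder().decode(fields[1])),
                            Boolean.parseBoolean(fields[2]),
                            Boolean.parseBoolean(fields[3])));
        }
        return out;
    }
}
```

With this shape, the entry set round-trips through one config value, and only the blob bytes need a separate typed accessor.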
> >
> >
> --
> >
> > Hi all, thanks everyone for the lively discussion. It's really a
> > trade-off whether to keep "String getString(String key, String
> > defaultValue)" or not. (It's not a right-or-wrong thing.)
> >
> > Judging from the discussion, most discussants can accept keeping
> > `String getString(String key, String defaultValue)` and deprecating the
> > rest of the `getXxx(String key, Xxx defaultValue)` methods.
> >
> > cc @Xintong Song @Xuannan Su , WDYT?
> >
> > Best,
> > Rui
> >
> > On Fri, Dec 15, 2023 at 11:38 AM Zhu Zhu  wrote:
> >>
> >> I think it's not clear whether forcing the use of ConfigOption would hurt
> >> the user experience.
> >>
> >> Maybe it does at the beginning, because using string keys to access
> >> Flink configuration can be simpler for new components/jobs.
> >> However, problems may happen later if the configuration usages become
> >> more complex, like key renaming, using types other than strings, and
> >> other problems that ConfigOption was invented to address.
> >>
> >> Personally I prefer to encourage the usage of ConfigOption.
> >> Jobs should use GlobalJobParameter for custom config, which is different
> >> from the Configuration interface. Therefore, Configuration is mostly
> >> used in other components/plugins, in which case the long-term
> >> maintenance can be important.
> >>
> >> However, since it is not a right or wrong choice, I'd also be fine
> >> to keep the `getString()` method if more devs/users are in favor of it.
> >>
> >> Thanks,
> >> Zhu
> >>
> >> Timo Walther  于2023年12月14日周四 17:41写道:
> >>
> >> > The configuration in Flink is complicated and I fear we won't have
> >> > enough capacity to substantially fix it. The introduction of
> >> > ReadableConfig, WritableConfig, and typed ConfigOptions was a right
> >> > step towards making the code more maintainable. From the Flink side,
> >> > every read access should go through ConfigOption.
> >> >
> >> > However, I also understand Gyula's pragmatism here because
> >> > (practically speaking) users get access to `getString()` via
> >> > `toMap().get()`. So I'm fine with removing the deprecation for
> >> > functionality that is available anyway. We should, however, add a
> >> > message to `getString()` that this method is discouraged and that
> >> > `get(ConfigOption)` should be the preferred way of accessing
> >> > Configuration.
> >> >
> >> > In any case we should remove getInt and the related methods.
> >> >
> >> > Cheers,
> >> > Timo
> >> >
> >> >
> >> > On 14.12.23 09:56, Gyula Fóra wrote:
> >> > > I see a strong value for user-facing configs to use ConfigOption,
> >> > > and this should definitely be an enforced convention.
> >> > >
> >> > > However, with the Flink project growing and many other components
> >> > > and even users using the Configuration object, I really don't think
> >> > > that we should "force" this on the users/developers.
> >> > >
> >> > > If we make fromMap / toMap free with basically no overhead, that is
> >> > > fine, but otherwise I think it would hurt the
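The two access styles contrasted throughout this thread can be illustrated with a minimal sketch. These are hypothetical mini versions of the ConfigOption/Configuration pattern, not Flink's actual classes; the point is only that a `ConfigOption` bundles the key name, type, and default in one place, while string-keyed access scatters them.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for a typed config option; NOT Flink's ConfigOption.
final class ConfigOption<T> {
    final String key;
    final T defaultValue;

    ConfigOption(String key, T defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

// Illustrative stand-in for a Configuration-like holder.
public class TypedConfig {
    private final Map<String, Object> values = new HashMap<>();

    // Preferred style: typed access through a ConfigOption, so key name,
    // type, and default live together and renames touch one definition.
    @SuppressWarnings("unchecked")
    public <T> T get(ConfigOption<T> option) {
        Object v = values.get(option.key);
        return v == null ? option.defaultValue : (T) v;
    }

    public <T> void set(ConfigOption<T> option, T value) {
        values.put(option.key, value);
    }

    // Discouraged escape hatch: string-keyed access, roughly what users
    // can already get via toMap().get(key).
    public String getString(String key, String defaultValue) {
        Object v = values.get(key);
        return v == null ? defaultValue : v.toString();
    }
}
```

A short usage comparison, under the same assumptions:

```java
ConfigOption<Integer> parallelism = new ConfigOption<>("parallelism.default", 1);
TypedConfig conf = new TypedConfig();
conf.set(parallelism, 4);
int typed = conf.get(parallelism);                       // typed, with default
String raw = conf.getString("parallelism.default", "?"); // stringly-typed
```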

[RESULT][VOTE] FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs

2023-12-20 Thread Wencong Liu
The vote for FLIP-382 [1]: Unify the Provision of Diverse Metadata for
Context-like APIs (discussion thread [2]) has concluded, and the vote is
now closed [3].

There are 3 binding votes and 1 non-binding vote:

Xintong Song (binding)
Lijie Wang (binding)
Weijie Guo (binding)
Yuxin Tan (non-binding)

There were no -1 votes. Therefore, FLIP-382 was accepted. I will prepare
the necessary changes for Flink 1.19.

Thanks everyone for the discussion!

Wencong Liu

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs
[2] https://lists.apache.org/thread/3mgsc31odtpmzzl32s4oqbhlhxd0mn6b
[3] https://lists.apache.org/thread/5vlf3klv131x8oj45qohvg9c53qkd87c

[RESULT][VOTE] FLIP-380: Support Full Partition Processing On Non-keyed DataStream

2023-12-20 Thread Wencong Liu
The vote for FLIP-380 [1]: Support Full Partition Processing On Non-keyed
DataStream (discussion thread [2]) has concluded, and the vote is now
closed [3].

There are 3 binding votes and 1 non-binding vote:

Xintong Song (binding)
YingJie Cao (binding)
Weijie Guo (binding)
Yuxin Tan (non-binding)

There were no -1 votes. Therefore, FLIP-380 was accepted. I will prepare
the necessary changes for Flink 1.19.

Thanks everyone for the discussion!

Wencong Liu

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-380%3A+Support+Full+Partition+Processing+On+Non-keyed+DataStream
[2] https://lists.apache.org/thread/nn7myj7vsvytbkdrnbvj5h0homsjrn1h
[3] https://lists.apache.org/thread/ns1my6ydctjfl9w89hm8gvldh00lqtq3