Re: [DISCUSSION] FLIP-442: General Improvement to Configuration for Flink 2.0

2024-04-09 Thread Muhammet Orazov

Hey Xuannan,

Thanks for the FLIP and your efforts!

Minor clarification from my side:


We will relocate these ConfigOptions to a class that is included
in the documentation generation.


Would it make sense to also define in the FLIP the options class for
these variables? For example, GPUDriverOptions?
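For illustration, the pattern Muhammet suggests might look like the sketch below (a hedged Python mock, not Flink's actual API: `ConfigOption` here is a stand-in, `GPUDriverOptions` is the hypothetical class name from the thread, and the option keys follow Flink's `external-resource.<name>.*` naming). The point is that a documentation generator can only pick up options it can discover by introspecting a registered options class.

```python
# Hedged sketch: grouping related config options in a dedicated options
# class so a doc generator can discover them by introspection.
class ConfigOption:
    """Stand-in for a Flink-style ConfigOption (illustrative only)."""
    def __init__(self, key, default, description):
        self.key, self.default, self.description = key, default, description

class GPUDriverOptions:  # hypothetical class name from the thread
    DRIVER_FACTORY = ConfigOption(
        "external-resource.gpu.driver-factory.class", None,
        "Factory class for the GPU driver.")
    AMOUNT = ConfigOption(
        "external-resource.gpu.amount", 0,
        "Number of GPUs per TaskManager.")

def generate_docs(options_class):
    """A doc generator only sees options reachable from a registered class."""
    return [(o.key, o.description) for o in vars(options_class).values()
            if isinstance(o, ConfigOption)]

for key, desc in generate_docs(GPUDriverOptions):
    print(key, "-", desc)
```

Options defined in classes outside the generator's scan (the situation FLIP-442 wants to fix) would simply never appear in the generated docs.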

Best,
Muhammet

On 2024-04-09 08:20, Xuannan Su wrote:

Hi all,

I'd like to start a discussion on FLIP-442: General Improvement to
Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we aim to
provide users with a better experience with the existing
configuration. This FLIP proposes several general improvements to the
current configuration.

Looking forward to everyone's feedback and suggestions. Thank you!

Best regards,
Xuannan

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-442%3A+General+Improvement+to+Configuration+for+Flink+2.0


Re: [VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread Zhu Zhu
+1

Thanks,
Zhu

gongzhongqiang wrote on Wed, Apr 10, 2024 at 13:11:

> +1 (non binding)
>
>
> Bests,
>
> Zhongqiang Gong
>
> Rui Fan <1996fan...@gmail.com> wrote on Wed, Apr 10, 2024 at 12:36:
>
> > Hi devs,
> >
> > Thank you to everyone for the feedback on FLIP-441: Show
> > the JobType and remove Execution Mode on Flink WebUI[1]
> > which has been discussed in this thread [2].
> >
> > I would like to start a vote for it. The vote will be open for at least
> 72
> > hours unless there is an objection or not enough votes.
> >
> > [1] https://cwiki.apache.org/confluence/x/agrPEQ
> > [2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd
> >
> > Best,
> > Rui
> >
>


Re: [VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread gongzhongqiang
+1 (non binding)


Bests,

Zhongqiang Gong

Rui Fan <1996fan...@gmail.com> wrote on Wed, Apr 10, 2024 at 12:36:

> Hi devs,
>
> Thank you to everyone for the feedback on FLIP-441: Show
> the JobType and remove Execution Mode on Flink WebUI[1]
> which has been discussed in this thread [2].
>
> I would like to start a vote for it. The vote will be open for at least 72
> hours unless there is an objection or not enough votes.
>
> [1] https://cwiki.apache.org/confluence/x/agrPEQ
> [2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd
>
> Best,
> Rui
>


Re: [VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread Samrat Deb
+1 (non binding)

Bests,
Samrat

On Wed, 10 Apr 2024 at 10:26 AM, Feifan Wang  wrote:

>
> +1 (non-binding)
>
> --
> ——
> Best regards,
> Feifan Wang
>
> At 2024-04-10 12:36:00, "Rui Fan" <1996fan...@gmail.com> wrote:
> >Hi devs,
> >
> >Thank you to everyone for the feedback on FLIP-441: Show
> >the JobType and remove Execution Mode on Flink WebUI[1]
> >which has been discussed in this thread [2].
> >
> >I would like to start a vote for it. The vote will be open for at least 72
> >hours unless there is an objection or not enough votes.
> >
> >[1] https://cwiki.apache.org/confluence/x/agrPEQ
> >[2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd
> >
> >Best,
> >Rui
>


Re:[VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread Feifan Wang


+1 (non-binding)

--
——
Best regards,

Feifan Wang
At 2024-04-10 12:36:00, "Rui Fan" <1996fan...@gmail.com> wrote:
>Hi devs,
>
>Thank you to everyone for the feedback on FLIP-441: Show
>the JobType and remove Execution Mode on Flink WebUI[1]
>which has been discussed in this thread [2].
>
>I would like to start a vote for it. The vote will be open for at least 72
>hours unless there is an objection or not enough votes.
>
>[1] https://cwiki.apache.org/confluence/x/agrPEQ
>[2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd
>
>Best,
>Rui


[VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread Rui Fan
Hi devs,

Thank you to everyone for the feedback on FLIP-441: Show
the JobType and remove Execution Mode on Flink WebUI[1]
which has been discussed in this thread [2].

I would like to start a vote for it. The vote will be open for at least 72
hours unless there is an objection or not enough votes.

[1] https://cwiki.apache.org/confluence/x/agrPEQ
[2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd

Best,
Rui


Re: [DISCUSSION] FLIP-442: General Improvement to Configuration for Flink 2.0

2024-04-09 Thread Zakelly Lan
Thanks Xuannan for driving this! +1 for cleaning these up.

And one minor comment: it seems StateBackendOptions is already annotated
with @PublicEvolving.


Best,
Zakelly


On Tue, Apr 9, 2024 at 4:21 PM Xuannan Su  wrote:

> Hi all,
>
> I'd like to start a discussion on FLIP-442: General Improvement to
> Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we aim to
> provide users with a better experience with the existing
> configuration. This FLIP proposes several general improvements to the
> current configuration.
>
> Looking forward to everyone's feedback and suggestions. Thank you!
>
> Best regards,
> Xuannan
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-442%3A+General+Improvement+to+Configuration+for+Flink+2.0
>


Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-04-09 Thread Zakelly Lan
Thanks Rui for driving this! +1 for this idea.


Best,
Zakelly

On Mon, Apr 8, 2024 at 7:17 PM Ahmed Hamdy  wrote:

> Acknowledged, +1 to start a vote.
> Best Regards
> Ahmed Hamdy
>
>
> On Mon, 8 Apr 2024 at 12:04, Rui Fan <1996fan...@gmail.com> wrote:
>
> > Sorry, it's a typo. It should be FLINK-32558[1].
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-32558
> >
> > Best,
> > Rui
> >
> > On Mon, Apr 8, 2024 at 6:44 PM Ahmed Hamdy  wrote:
> >
> > > Hi Rui,
> > > Thanks for the proposal.
> > > Is the deprecation Jira mentioned (FLINK-32258) correct?
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Sun, 7 Apr 2024 at 03:37, Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > If there are no extra comments, I will start voting in three days,
> > thank
> > > > you~
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Thu, Mar 28, 2024 at 4:46 PM Muhammet Orazov
> > > >  wrote:
> > > >
> > > > > Hey Rui,
> > > > >
> > > > > Thanks for the detailed explanation and updating the FLIP!
> > > > >
> > > > > It is much clearer definitely, thanks for the proposal.
> > > > >
> > > > > Best,
> > > > > Muhammet
> > > > >
> > > > > On 2024-03-28 07:37, Rui Fan wrote:
> > > > > > Hi Muhammet,
> > > > > >
> > > > > > Thanks for your reply!
> > > > > >
> > > > > >> The execution mode is also used for the DataStream API [1],
> > > > > >> would that also affect/hide the DataStream execution mode
> > > > > >> if we remove it from the WebUI?
> > > > > >
> > > > > > Sorry, I didn't describe it clearly in FLIP-441[2], I have
> updated
> > > it.
> > > > > > Let me clarify the Execution Mode here:
> > > > > >
> > > > > > 1. Flink 1.19 website[3] also mentions the Execution mode, but it
> > > > > > actually matches the JobType[4] in the Flink code. Both of them
> > > > > > have 2 types: STREAMING and BATCH.
> > > > > > 2. execution.runtime-mode can be set to 3 types: STREAMING,
> > > > > > BATCH and AUTOMATIC. But the jobType will be inferred as
> > > > > > STREAMING or BATCH when execution.runtime-mode is set
> > > > > > to AUTOMATIC.
> > > > > > 3. The ExecutionMode I describe is: code link[5] , as we can
> > > > > > see, ExecutionMode has 4 enums: PIPELINED,
> > > > > > PIPELINED_FORCED, BATCH and BATCH_FORCED.
> > > > > > And when we look at a Flink streaming job on the Flink WebUI,
> > > > > > the Execution mode shown is PIPELINED instead of STREAMING.
> > > > > > I attached a screenshot to the FLIP doc[2], you can see it there.
> > > > > > 4. What this proposal wants to do is to remove the ExecutionMode
> > > > > > with four enumerations on Flink WebUI and introduce the
> > > > > > JobType with two enumerations (STREAMING or BATCH).
> > > > > > STREAMING or BATCH is clearer and more accurate for users.
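Rui's mapping between execution.runtime-mode and the JobType can be sketched as follows (a hedged Python illustration; the enum and function names are invented here, not Flink's actual classes — the grounded behavior is that AUTOMATIC resolves to BATCH only when every source is bounded):

```python
# Hedged sketch: execution.runtime-mode (3 values) vs. the inferred
# JobType (2 values) that FLIP-441 proposes to show on the WebUI.
from enum import Enum

class RuntimeMode(Enum):       # execution.runtime-mode
    STREAMING = "STREAMING"
    BATCH = "BATCH"
    AUTOMATIC = "AUTOMATIC"

class JobType(Enum):           # what the WebUI would display
    STREAMING = "STREAMING"
    BATCH = "BATCH"

def infer_job_type(mode: RuntimeMode, all_sources_bounded: bool) -> JobType:
    """AUTOMATIC resolves to BATCH only when every source is bounded."""
    if mode is RuntimeMode.BATCH:
        return JobType.BATCH
    if mode is RuntimeMode.STREAMING:
        return JobType.STREAMING
    return JobType.BATCH if all_sources_bounded else JobType.STREAMING

print(infer_job_type(RuntimeMode.AUTOMATIC, True).value)   # BATCH
print(infer_job_type(RuntimeMode.AUTOMATIC, False).value)  # STREAMING
```

This also shows why JobType (two values) is less ambiguous for users than the four-value ExecutionMode enum currently displayed.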
> > > > > >
> > > > > > Please let me know if it's not clear or anything is wrong,
> thanks a
> > > > > > lot!
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/execution_mode/
> > > > > > [2] https://cwiki.apache.org/confluence/x/agrPEQ
> > > > > > [3]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/execution_mode/
> > > > > > [4]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobType.java#L22
> > > > > > [5]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/f31c128bfc457b64dd7734f71123b74faa2958ba/flink-core/src/main/java/org/apache/flink/api/common/ExecutionMode.java#L54
> > > > > >
> > > > > > Best,
> > > > > > Rui
> > > > > >
> > > > > > On Thu, Mar 28, 2024 at 1:33 AM Venkatakrishnan Sowrirajan
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > >> Rui,
> > > > > >>
> > > > > >> I assume the current proposal would also handle the case of
> mixed
> > > mode
> > > > > >> (BATCH + STREAMING within the same app) in the future, right?
> > > > > >>
> > > > > >> Regards
> > > > > >> Venkat
> > > > > >>
> > > > > >> On Wed, Mar 27, 2024 at 10:15 AM Venkatakrishnan Sowrirajan <
> > > > > >> vsowr...@asu.edu> wrote:
> > > > > >>
> > > > > >>> This will be a very useful addition to Flink UI. Thanks Rui for
> > > > > >>> starting
> > > > > >>> a FLIP for this improvement.
> > > > > >>>
> > > > > >>> Regards
> > > > > >>> Venkata krishnan
> > > > > >>>
> > > > > >>>
> > > > > >>> On Wed, Mar 27, 2024 at 4:49 AM Muhammet Orazov
> > > > > >>>  wrote:
> > > > > >>>
> > > > >  Hello Rui,
> > > > > 
> > > > >  Thanks for the proposal! It looks good!
> > > > > 
> > > > >  I have minor clarification from my side:
> > > > > 
> > > > >  The execution mode is also used for the DataStream API [1],
> > > > >  would that also affect/hide the DataStream execution mode
> > > > >  if we remove it from the WebUI?
> > > > > 
> > > > >  Best,
> > > > >  Muhammet
> > > > > 

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Jark Wu
I have been following up on the discussion, it's a great FLIP to further
unify stream and batch ETL pipelines. Thanks for the proposal!

Here is my ranking:

1. Materialized Table  -> "The table materializes the results of a query
that you specify", this can reflect what we want and doesn't conflict with
any SQL standard.
2. Derived Table -> easy to understand and write, but need to extend the
standard
3. Live Table ->  looks too much like Databrick's Delta Live Table.
4. Materialized View -> looks weird to insert/update a view.


Best,
Jark




On Wed, 10 Apr 2024 at 09:57, Becket Qin  wrote:

> Thanks for the proposal. I like the FLIP.
>
> My ranking:
>
> 1. Refresh(ing) / Live Table -> easy to understand and implies the dynamic
> characteristic
>
> 2. Derived Table -> easy to understand.
>
> 3. Materialized Table -> sounds like just a table with physical data stored
> somewhere.
>
> 4. Materialized View -> modifying a view directly is a little weird.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Tue, Apr 9, 2024 at 5:46 AM Lincoln Lee  wrote:
>
> > Thanks Ron and Timo for your proposal!
> >
> > Here is my ranking:
> >
> > 1. Derived table -> extend the persistent semantics of derived table in
> SQL
> >standard, with a strong association with query, and has industry
> > precedents
> >such as Google Looker.
> >
> > 2. Live Table ->  an alternative for 'dynamic table'
> >
> > 3. Materialized Table -> combination of the Materialized View and Table,
> > but
> > still a table which accepts data changes
> >
> > 4. Materialized View -> need to extend understanding of the view to
> accept
> > data changes
> >
> > The reason for not adding 'Refresh Table' is that I don't want to tell
> > the user to 'refresh a refresh table'.
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Ron liu wrote on Tue, Apr 9, 2024 at 20:11:
> >
> > > Hi, Dev
> > >
> > > My rankings are:
> > >
> > > 1. Derived Table
> > > 2. Materialized Table
> > > 3. Live Table
> > > 4. Materialized View
> > >
> > > Best,
> > > Ron
> > >
> > >
> > >
> > > Ron liu wrote on Tue, Apr 9, 2024 at 20:07:
> > >
> > > > Hi, Dev
> > > >
> > > > After several rounds of discussion, there is currently no consensus
> on
> > > the
> > > > name of the new concept. Timo has proposed that we decide the name
> > > through
> > > > a vote. This is a good solution when there is no clear preference, so
> > we
> > > > will adopt this approach.
> > > >
> > > > Regarding the name of the new concept, there are currently five
> > > candidates:
> > > > 1. Derived Table -> taken by SQL standard
> > > > 2. Materialized Table -> similar to SQL materialized view but a table
> > > > 3. Live Table -> similar to dynamic tables
> > > > 4. Refresh Table -> states what it does
> > > > 5. Materialized View -> needs to extend the standard to support
> > modifying
> > > > data
> > > >
> > > > For the above five candidates, everyone can give your rankings based
> on
> > > > your preferences. You can choose up to five options or only choose
> some
> > > of
> > > > them.
> > > > We will use a scoring rule, where the* first rank gets 5 points,
> second
> > > > rank gets 4 points, third rank gets 3 points, fourth rank gets 2
> > points,
> > > > and fifth rank gets 1 point*.
> > > > After the voting closes, I will score all the candidates based on
> > > > everyone's votes, and the candidate with the highest score will be
> > chosen
> > > > as the name for the new concept.
> > > >
> > > > The voting will last up to 72 hours and is expected to close this
> > Friday.
> > > > I look forward to everyone voting on the name in this thread. Of
> > course,
> > > we
> > > > also welcome new input regarding the name.
> > > >
> > > > Best,
> > > > Ron
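The scoring rule Ron describes can be sketched as a small ranked-ballot tally (an illustrative Python sketch; the ballots below are invented examples, not actual votes from the thread):

```python
# Hedged sketch of the scoring rule: rank 1 earns 5 points, rank 2 earns 4,
# ... rank 5 earns 1; voters may rank fewer than five candidates; the
# candidate with the highest total wins.
from collections import defaultdict

def tally(ballots):
    """Each ballot is an ordered list of up to five candidate names."""
    scores = defaultdict(int)
    for ballot in ballots:
        for rank, candidate in enumerate(ballot):
            scores[candidate] += 5 - rank  # rank 0 -> 5 points, rank 4 -> 1
    return max(scores, key=scores.get), dict(scores)

# Invented example ballots (partial rankings are allowed):
ballots = [
    ["Derived Table", "Materialized Table", "Live Table", "Materialized View"],
    ["Materialized Table", "Derived Table", "Live Table", "Materialized View"],
    ["Refresh Table", "Materialized Table", "Live Table",
     "Materialized View", "Derived Table"],
]
winner, scores = tally(ballots)
print(winner)  # Materialized Table
```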
> > > >
> > > > Ron liu wrote on Tue, Apr 9, 2024 at 19:49:
> > > >
> > > >> Hi, Dev
> > > >>
> > > >> Sorry, my previous statement was not quite accurate. We will hold a
> > > >> vote for the name within this thread.
> > > >>
> > > >> Best,
> > > >> Ron
> > > >>
> > > >>
> > > >> Ron liu wrote on Tue, Apr 9, 2024 at 19:29:
> > > >>
> > > >>> Hi, Timo
> > > >>>
> > > >>> Thanks for your reply.
> > > >>>
> > > >>> I agree with you that sometimes naming is more difficult. When no
> one
> > > >>> has a clear preference, voting on the name is a good solution, so
> > I'll
> > > send
> > > >>> a separate email for the vote, clarify the rules for the vote, then
> > let
> > > >>> everyone vote.
> > > >>>
> > > >>> One other point to confirm, in your ranking there is an option for
> > > >>> Materialized View, does it stand for the UPDATING Materialized View
> > > that
> > > >>> you mentioned earlier in the discussion? If using Materialized
> View I
> > > think
> > > >>> it is needed to extend it.
> > > >>>
> > > >>> Best,
> > > >>> Ron
> > > >>>
> > > >>> Timo Walther wrote on Tue, Apr 9, 2024 at 17:20:
> > > >>>
> > >  Hi Ron,
> > > 
> > >  yes naming is hard. But it will have large impact on trainings,
> > >  presentations, and the mental model of users. Maybe the easiest is
> > to
> > >  collect 

Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Jing Ge
+1(binding)

Best regards,
Jing

On Tue, Apr 9, 2024 at 8:54 PM Feng Jin  wrote:

> +1 (non-binding)
>
> Best,
> Feng
>
> On Tue, Apr 9, 2024 at 5:56 PM gongzhongqiang 
> wrote:
>
> > +1 (non-binding)
> >
> > Best,
> >
> > Zhongqiang Gong
> >
> > > wudi <676366...@qq.com.invalid> wrote on Tue, Apr 9, 2024 at 10:48:
> >
> > > Hi devs,
> > >
> > > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > > contributing the Flink Doris Connector[2] to the Flink community.
> > > Discussion thread [3].
> > >
> > > The vote will be open for at least 72 hours unless there is an
> objection
> > or
> > > insufficient votes.
> > >
> > >
> > > Thanks,
> > > Di.Wu
> > >
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > [2] https://github.com/apache/doris-flink-connector
> > > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> > >
> > >
> >
>


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Becket Qin
Thanks for the proposal. I like the FLIP.

My ranking:

1. Refresh(ing) / Live Table -> easy to understand and implies the dynamic
characteristic

2. Derived Table -> easy to understand.

3. Materialized Table -> sounds like just a table with physical data stored
somewhere.

4. Materialized View -> modifying a view directly is a little weird.

Thanks,

Jiangjie (Becket) Qin



On Tue, Apr 9, 2024 at 5:46 AM Lincoln Lee  wrote:

> Thanks Ron and Timo for your proposal!
>
> Here is my ranking:
>
> 1. Derived table -> extend the persistent semantics of derived table in SQL
>standard, with a strong association with query, and has industry
> precedents
>such as Google Looker.
>
> 2. Live Table ->  an alternative for 'dynamic table'
>
> 3. Materialized Table -> combination of the Materialized View and Table,
> but
> still a table which accepts data changes
>
> 4. Materialized View -> need to extend understanding of the view to accept
> data changes
>
> The reason for not adding 'Refresh Table' is that I don't want to tell the
> user to 'refresh a refresh table'.
>
>
> Best,
> Lincoln Lee
>
>
> Ron liu wrote on Tue, Apr 9, 2024 at 20:11:
>
> > Hi, Dev
> >
> > My rankings are:
> >
> > 1. Derived Table
> > 2. Materialized Table
> > 3. Live Table
> > 4. Materialized View
> >
> > Best,
> > Ron
> >
> >
> >
> > Ron liu wrote on Tue, Apr 9, 2024 at 20:07:
> >
> > > Hi, Dev
> > >
> > > After several rounds of discussion, there is currently no consensus on
> > the
> > > name of the new concept. Timo has proposed that we decide the name
> > through
> > > a vote. This is a good solution when there is no clear preference, so
> we
> > > will adopt this approach.
> > >
> > > Regarding the name of the new concept, there are currently five
> > candidates:
> > > 1. Derived Table -> taken by SQL standard
> > > 2. Materialized Table -> similar to SQL materialized view but a table
> > > 3. Live Table -> similar to dynamic tables
> > > 4. Refresh Table -> states what it does
> > > 5. Materialized View -> needs to extend the standard to support
> modifying
> > > data
> > >
> > > For the above five candidates, everyone can give your rankings based on
> > > your preferences. You can choose up to five options or only choose some
> > of
> > > them.
> > > We will use a scoring rule, where the* first rank gets 5 points, second
> > > rank gets 4 points, third rank gets 3 points, fourth rank gets 2
> points,
> > > and fifth rank gets 1 point*.
> > > After the voting closes, I will score all the candidates based on
> > > everyone's votes, and the candidate with the highest score will be
> chosen
> > > as the name for the new concept.
> > >
> > > The voting will last up to 72 hours and is expected to close this
> Friday.
> > > I look forward to everyone voting on the name in this thread. Of
> course,
> > we
> > > also welcome new input regarding the name.
> > >
> > > Best,
> > > Ron
> > >
> > > Ron liu wrote on Tue, Apr 9, 2024 at 19:49:
> > >
> > >> Hi, Dev
> > >>
> > >> Sorry, my previous statement was not quite accurate. We will hold a
> > >> vote for the name within this thread.
> > >>
> > >> Best,
> > >> Ron
> > >>
> > >>
> > >> Ron liu wrote on Tue, Apr 9, 2024 at 19:29:
> > >>
> > >>> Hi, Timo
> > >>>
> > >>> Thanks for your reply.
> > >>>
> > >>> I agree with you that sometimes naming is more difficult. When no one
> > >>> has a clear preference, voting on the name is a good solution, so
> I'll
> > send
> > >>> a separate email for the vote, clarify the rules for the vote, then
> let
> > >>> everyone vote.
> > >>>
> > >>> One other point to confirm, in your ranking there is an option for
> > >>> Materialized View, does it stand for the UPDATING Materialized View
> > that
> > >>> you mentioned earlier in the discussion? If using Materialized View I
> > think
> > >>> it is needed to extend it.
> > >>>
> > >>> Best,
> > >>> Ron
> > >>>
> > >>> Timo Walther wrote on Tue, Apr 9, 2024 at 17:20:
> > >>>
> >  Hi Ron,
> > 
> >  yes naming is hard. But it will have large impact on trainings,
> >  presentations, and the mental model of users. Maybe the easiest is
> to
> >  collect ranking by everyone with some short justification:
> > 
> > 
> >  My ranking (from good to not so good):
> > 
> >  1. Refresh Table -> states what it does
> >  2. Materialized Table -> similar to SQL materialized view but a
> table
> >  3. Live Table -> nice buzzword, but maybe still too close to dynamic
> >  tables?
> >  4. Materialized View -> a bit broader than standard but still very
> >  similar
> >  5. Derived table -> taken by standard
> > 
> >  Regards,
> >  Timo
> > 
> > 
> > 
> >  On 07.04.24 11:34, Ron liu wrote:
> >  > Hi, Dev
> >  >
> >  > This is a summary letter. After several rounds of discussion,
> there
> >  is a
> >  > strong consensus about the FLIP proposal and the issues it aims to
> >  address.
> >  > The current point of disagreement is the naming of the new
> concept.
> > I
> > 

Re: [DISCUSS] FLIP-438: Amazon SQS Sink Connector

2024-04-09 Thread Dhingra, Priya


On 4/9/24, 3:57 PM, "Dhingra, Priya" <dhipr...@amazon.com.invalid> wrote:








Hi Ahmed and Samrat,


Thanks a lot for all the feedback. This is my first ever contribution to
Apache Flink, so I was a bit unfamiliar with a few things, but I have updated
all of them as per your suggestions. Thanks again for all the support here,
much appreciated!


1) I am not sure why we need to suppress warnings in the sink example in the
FLIP?


Removed and updated the FLIP.


2) You provided the sink example as it is the Public Interface, however the
actual AsyncSink logic is mostly in the writer, so would be helpful to
provide a brief of the writer or the "submitRequestEntries"


Added in FLIP


3) I am not sure what customDimensions are or how are they going to be used
by the client, (that's why I thought the writer example should be helpful).


Removed. This is no longer required; we had added it in our code to support a
specific use case, and it is not needed for the Apache PR.


4) Are we going to use the existing aws client providers to handle the
authentication and async client creation similar to Kinesis/Firehose and
DDB. I would strongly recommend we do.


Yes


5) Given you suggest implementing this in "flink-connector-aws" repo, it
should probably follow the same versioning of AWS connectors, hence
targeting 4.3.0. Also I am not sure why we are targeting support for 1.16
given that it is out of support and 4.2 supports 1.17+.


Sorry, I was not aware of the versioning we should use here. I tested the
SQS sink with Flink 1.16 and so proposed the same, but was not aware it is
out of support. Updated now with 4.3.0 and 1.17+.


6) Are we planning to support Table API & SQL as well? This should not be
much of an effort IMO so I think we should.


No, that is not in scope for the first iteration. We can take it as a fast
follow-up.


7) It would be great if we also added an implementation of Element converter 
given that SQS message bodies are mainly Strings if I remember correctly. We 
can extend it to other types for MessageAttributeValue augmentation,this should 
be more valuable on table API as well to use it as default Element Converter.


Updated in FLIP


8. Different connectors provide different types of fault
tolerant guarantees[1]. What type of fault tolerant sink guarantees
flink-connector-sqs will provide ?
Could you elaborate on the fault-tolerant capabilities that the
flink-connector-sqs will provide?


 at-least-once




9) Can you help with what the minimal configuration required for
instantiating the sink ?


SQSSink.builder()
.setSqsUrl(sqsUrl)
.setSqsClientProperties(getSQSClientProperties())
.setSerializationSchema(serializationSchema)
.build();




10) Amazon SQS offers various data types [2]. Could you outline the types of 
SQS data the sink plans to support?

SendMessageBatchRequestEntry














Hi Priya,




Thank you for the FLIP. sqs connector would be a great addition to the
flink connector aws.




+1 for all the queries raised by Ahmed.




Adding to Ahmed's queries, I have a few more:




1. Different connectors provide different types of fault
tolerant guarantees[1]. What type of fault tolerant sink guarantees
flink-connector-sqs will provide ?
Could you elaborate on the fault-tolerant capabilities that the
flink-connector-sqs will provide?




2. Can you help with what the minimal configuration required for
instantiating the sink ?




3. Amazon SQS offers various data types [2]. Could you outline the types of
SQS data the sink plans to support?




[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/guarantees/

[2]
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_Types.html





Bests,
Samrat




On Sat, Apr 6, 2024 at 6:27 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:




> Hi Dhingra, thanks for raising the FLIP
> I am in favour of this addition in general. I have a couple of
> comments/questions on the FLIP.
> 
> - I am not sure why we need to suppress warnings in the sink example in the
> FLIP?
> - You provided the sink example as it is the Public Interface, however the
> actual AsyncSink logic is mostly in the writer, so 

Re: [DISCUSS] FLIP-438: Amazon SQS Sink Connector

2024-04-09 Thread Dhingra, Priya
 Hi Ahmed and Samrat, 

Thanks a lot for all the feedback. This is my first ever contribution to
Apache Flink, so I was a bit unfamiliar with a few things, but I have updated
all of them as per your suggestions. Thanks again for all the support here,
much appreciated!

1)  I am not sure why we need to suppress warnings in the sink example in the
FLIP?
> Removed and updated the FLIP.
 
2)  You provided the sink example as it is the Public Interface, however the
actual AsyncSink logic is mostly in the writer, so would be helpful to
provide a brief of the writer or the "submitRequestEntries"
> Added in FLIP
 
3)  I am not sure what customDimensions are or how are they going to be used
by the client, (that's why I thought the writer example should be helpful).
>Removed. This is no longer required; we had added it in our code to support a
>specific use case, and it is not needed for the Apache PR.
 
4)  Are we going to use the existing aws client providers to handle the
 authentication and async client creation similar to Kinesis/Firehose and
 DDB. I would strongly recommend we do.
> Yes
 
5) Given you suggest implementing this in "flink-connector-aws" repo, it
should probably follow the same versioning of AWS connectors, hence
 targeting 4.3.0. Also I am not sure why we are targeting support for 1.16
given that it is out of support and 4.2 supports 1.17+.
> Sorry, I was not aware of the versioning we should use here. I tested
> the SQS sink with Flink 1.16 and so proposed the same, but was not
> aware it is out of support. Updated now with 4.3.0 and 1.17+

6) Are we planning to support Table API & SQL as well? This should not be
 much of an effort IMO so I think we should.
> No, that is not in scope for the first iteration. We can take it as a fast
> follow-up.

7) It would be great if we also added an implementation of Element converter 
given that SQS message bodies are mainly Strings if I remember correctly. We 
can extend it to other types for MessageAttributeValue augmentation,this should 
be more valuable on table API as well to use it as default Element Converter.
> Updated in FLIP
 
8. Different connectors provide different types of fault
tolerant guarantees[1]. What type of fault tolerant sink guarantees
flink-connector-sqs will provide ?
Could you elaborate on the fault-tolerant capabilities that the
flink-connector-sqs will provide?
> at-least-once
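The at-least-once answer above can be sketched as follows (a hedged Python illustration, not the actual connector code: in the AsyncSink `submitRequestEntries` contract, entries that a batch call reports as failed are handed back to the sink's buffer for retry rather than dropped, so every entry is delivered at least once, possibly more than once):

```python
# Hedged sketch: at-least-once delivery via re-queuing failed batch entries.
def submit_request_entries(entries, send_batch, requeue):
    """send_batch returns the subset of entries that failed; those are
    handed back to the sink's buffer via requeue for a later attempt."""
    failed = send_batch(entries)
    if failed:
        requeue(failed)

# Toy harness: the first attempt fails for one entry, the retry succeeds.
pending = [["a", "b", "c"]]
delivered, attempts = [], [0]

def flaky_send(batch):
    attempts[0] += 1
    if attempts[0] == 1:                              # first call drops "b"
        delivered.extend(e for e in batch if e != "b")
        return ["b"]
    delivered.extend(batch)                           # retry succeeds
    return []

while pending:
    submit_request_entries(pending.pop(0), flaky_send, pending.append)

print(sorted(delivered))  # ['a', 'b', 'c']
```

Note that a retried entry may be delivered twice if the failure was reported after the message actually reached SQS — which is exactly why the guarantee is at-least-once rather than exactly-once.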
 
 
9) Can you help with what the minimal configuration required for
instantiating the sink ?
 > SQSSink.builder()
.setSqsUrl(sqsUrl)
.setSqsClientProperties(getSQSClientProperties())
.setSerializationSchema(serializationSchema)
.build();

 
10) Amazon SQS offers various data types [2]. Could you outline the types of 
SQS data the sink plans to support?
    •   SendMessageBatchRequestEntry

 
 
 
 
 
 
Hi Priya,
 
 
Thank you for the FLIP. sqs connector would be a great addition to the
flink connector aws.
 
 
+1 for all the queries raised by Ahmed.
 
 
Adding to Ahmed's queries, I have a few more:
 
 
1. Different connectors provide different types of fault
tolerant guarantees[1]. What type of fault tolerant sink guarantees
flink-connector-sqs will provide ?
Could you elaborate on the fault-tolerant capabilities that the
flink-connector-sqs will provide?
 
 
2. Can you help with what the minimal configuration required for
instantiating the sink ?
 
 
3. Amazon SQS offers various data types [2]. Could you outline the types of
SQS data the sink plans to support?
 
 
[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/guarantees/

[2]
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_Types.html

 
 
Bests,
Samrat
 
 
On Sat, Apr 6, 2024 at 6:27 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:
 
 
> Hi Dhingra, thanks for raising the FLIP
> I am in favour of this addition in general. I have a couple of
> comments/questions on the FLIP.
> 
> - I am not sure why we need to suppress warnings in the sink example in the
> FLIP?
> - You provided the sink example as it is the Public Interface, however the
> actual AsyncSink logic is mostly in the writer, so would be helpful to
> provide a brief of the writer or the "submitRequestEntries"
> - I am not sure what customDimensions are or how are they going to be used
> by the client, (that's why I thought the writer example should be helpful).
> - Are we going to use the existing aws client providers to handle the
> authentication and async client creation similar to Kinesis/Firehose and
> DDB. I would strongly recommend we do.
> - Given you suggest implementing this in "flink-connector-aws" repo, it
> should probably follow the same versioning of AWS connectors, hence
> targeting 4.3.0. Also I am not sure why we are 

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-04-09 Thread Jeyhun Karimov
Hi Lincoln,

I think I was misunderstood.
My approach was not to use MiniBatchLocalGroupAggFunction directly but use
the similar approach to it.
Currently, local and global aggregate functions are used together in query
plans.
In my quick PoC, I verified that my modified version of
MiniBatchLocalGroupAggFunction (used without timers) can be achieved
without global aggregation, just a different implementation of
MapBundleFunction.
So, we are not using global aggregate function (that depends on keyed
state) at all.

The very general idea is similar to the pre-shuffle pre-aggregation [1]. In
our case, we just utilize the pre-aggregation part, with no shuffle after it.

I am also ok with scoping the FLIP for batch scenarios first, and create PR
for streaming scenarios (since streaming implementation does not change
public interfaces) afterwards.
WDYT?

[1] https://github.com/TU-Berlin-DIMA/AdCom
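The bundle-style local pre-aggregation Jeyhun describes can be sketched like this (a hedged Python illustration, not Flink's MiniBatchLocalGroupAggFunction: per-key partial aggregates accumulate in a plain in-memory map — no keyed state, no shuffle — and are emitted downstream once the bundle is full):

```python
# Hedged sketch: local pre-aggregation over a pre-partitioned input.
# Partials are flushed per bundle; a consumer of the same partitioning
# can merge them without an exchange.
def bundle_aggregate(records, bundle_size, combine):
    """Yields partial per-key aggregates every `bundle_size` inputs."""
    bundle = {}
    count = 0
    for key, value in records:
        bundle[key] = combine(bundle[key], value) if key in bundle else value
        count += 1
        if count >= bundle_size:
            yield dict(bundle)   # flush the bundle downstream
            bundle.clear()
            count = 0
    if bundle:
        yield dict(bundle)       # flush any trailing partial bundle

records = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]
partials = list(bundle_aggregate(records, bundle_size=4,
                                 combine=lambda x, y: x + y))
print(partials)  # [{'a': 4, 'b': 6}, {'a': 5}]
```

Because the map is not bound to a key-group range, this works on pre-partitioned sources without the sanity check on keyed state that Lincoln mentions.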





On Tue, Apr 9, 2024 at 5:34 PM Lincoln Lee  wrote:

> Thanks Jeyhun for your reply!
>
> Unfortunately, MiniBatchLocalGroupAggFunction only works for local agg
> in two-phase aggregation, while global aggregation (which is actually
> handled
> by the KeyedMapBundleOperator) still relies on the KeyedStream, meaning
> that consistency of the partitioner and state key selector is still
> required.
>
> Best,
> Lincoln Lee
>
>
> Jeyhun Karimov  于2024年4月6日周六 05:11写道:
>
> > Hi Lincoln,
> >
> > I did a bit of analysis on small PoC.
> > Please find my comments below:
> >
> > - In general, the current design supports streaming workloads. However, as
> > you mentioned, it comes with some (implementation-related) difficulties.
> > One of them (as you also mentioned) is that most of the operators utilize
> > keyed functions (e.g., Aggregate functions).
> > As a result, we cannot directly utilize these operators (e.g.,
> > StreamPhysicalGroupbyAggregate) because they work on keyed inputs and
> their
> > tasks
> > utilize specific keyGroupRange.
> >
> > - As I mentioned above, my idea is to utilize a similar approach
> > to MiniBatchLocalGroupAggFunction that is not time based and also supports
> > retractions.
> > The existing implementation of this function already supports quite a big
> > part of the scope. With this implementation, we utilize MapBundleFunction
> > that is not bound to a specific key range.
> >
> > - As the next milestone, more generic optimization is required that
> > introduces 1) new streaming distribution type as KEEP_INPUT_AS_IS,
> > 2) utilization of a ForwardHashExchangeProcessor, 3) corresponding
> chaining
> > strategy
> >
> > Currently, the plan is to first support this FLIP for batch workloads
> > (e.g., files, pre-divided data and buckets). Next, support for streaming
> > workloads.
> >
> > I hope I have answered your question.
> >
> > Regards,
> > Jeyhun
> >
> > On Wed, Apr 3, 2024 at 4:33 PM Lincoln Lee 
> wrote:
> >
> > > Hi Jeyhun,
> > >
> > > Thanks for your quick response!
> > >
> > > In the streaming scenario, shuffle commonly occurs before the stateful
> > > operator, and there's a sanity check[1] when the stateful operator
> > > accesses the state. This implies the consistency requirement of the
> > > partitioner used for data shuffling and state key selector for state
> > > accessing(see KeyGroupStreamPartitioner for more details),
> > > otherwise, there may be state access errors. That is to say, in the
> > > streaming scenario, it is not only the strict requirement described in
> > > FlinkRelDistribution#requireStrict, but also the implied consistency of
> > > hash calculation.
> > >
> > > Also, if this FLIP targets both streaming and batch scenarios, it is
> > > recommended to do PoC validation for streaming as well.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-29430
> > >
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Leonard Xu  于2024年4月3日周三 14:25写道:
> > >
> > > > Hey, Jeyhun
> > > >
> > > > Thanks for kicking off this discussion. I have two questions about
> > > > streaming sources:
> > > >
> > > > (1) The FLIP motivation section says the Kafka broker is already
> > > > partitioned w.r.t. some key[s]. Is this the main use case in the Kafka
> > > > world? Partitioning by key fields is not the default behavior of
> > > > Kafka's default partitioner[1] IIUC.
> > > >
> > > > (2) Considering the FLIP’s optimization scope aims at both Batch and
> > > > Streaming pre-partitioned sources, could you add a Streaming Source
> > > > example to help me understand the FLIP better? I think Kafka Source is
> > > > a good candidate for a streaming source example; file source is a good
> > > > one for a batch source and it really helped me follow the FLIP.
> > > >
> > > > Best,
> > > > Leonard
> > > > [1]
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L31
> > > >
> > > >
> > > >
> > > > > 2024年4月3日 上午5:53,Jeyhun Karimov  写道:
> > > > >
> > > > > Hi Lincoln,
> > > > >

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-04-09 Thread Lincoln Lee
Thanks Jeyhun for your reply!

Unfortunately, MiniBatchLocalGroupAggFunction only works for local agg
in two-phase aggregation, while global aggregation (which is actually
handled
by the KeyedMapBundleOperator) still relies on the KeyedStream, meaning
that consistency of the partitioner and state key selector is still
required.

Best,
Lincoln Lee
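The consistency requirement discussed above — the upstream shuffle partitioner and the state backend must map a key to the same key group, or the sanity check fails — can be modeled in a few lines. The hash below is a simplified stand-in for the MurmurHash used by Flink's KeyGroupRangeAssignment/KeyGroupStreamPartitioner; only the subtask-index formula mirrors the real one, everything else is illustrative.

```java
public class KeyGroupSketch {
    // Stand-in for assignToKeyGroup (the real code murmur-hashes key.hashCode()).
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return Math.abs(key.hashCode() % maxParallelism);
    }

    // Mirrors computeOperatorIndexForKeyGroup: keyGroup * parallelism / maxParallelism.
    static int operatorIndexForKeyGroup(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128, parallelism = 4;
        String key = "user-42";
        // Partitioner side: pick the downstream subtask from the key group.
        int kg = assignToKeyGroup(key, maxParallelism);
        int subtask = operatorIndexForKeyGroup(kg, maxParallelism, parallelism);
        // State side: the chosen subtask re-derives the key group for state access.
        int kgAtStateAccess = assignToKeyGroup(key, maxParallelism);
        // If the two functions ever disagree, the record lands on a subtask
        // whose KeyGroupRange does not contain the key -> state access error.
        System.out.println(subtask >= 0 && kg == kgAtStateAccess); // prints true
    }
}
```

This is why a custom/pre-existing partitioning can only be trusted for stateful streaming operators if it reproduces exactly this mapping.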


Jeyhun Karimov  于2024年4月6日周六 05:11写道:

> Hi Lincoln,
>
> I did a bit of analysis on small PoC.
> Please find my comments below:
>
> - In general, the current design supports streaming workloads. However, as you
> mentioned, it comes with some (implementation-related) difficulties.
> One of them (as you also mentioned) is that most of the operators utilize
> keyed functions (e.g., Aggregate functions).
> As a result, we cannot directly utilize these operators (e.g.,
> StreamPhysicalGroupbyAggregate) because they work on keyed inputs and their
> tasks
> utilize specific keyGroupRange.
>
> - As I mentioned above, my idea is to utilize a similar approach
> to MiniBatchLocalGroupAggFunction that is not time based and also supports
> retractions.
> The existing implementation of this function already supports quite a big
> part of the scope. With this implementation, we utilize MapBundleFunction
> that is not bound to a specific key range.
>
> - As the next milestone, more generic optimization is required that
> introduces 1) new streaming distribution type as KEEP_INPUT_AS_IS,
> 2) utilization of a ForwardHashExchangeProcessor, 3) corresponding chaining
> strategy
>
> Currently, the plan is to first support this FLIP for batch workloads
> (e.g., files, pre-divided data and buckets). Next, support for streaming
> workloads.
>
> I hope I have answered your question.
>
> Regards,
> Jeyhun
>
> On Wed, Apr 3, 2024 at 4:33 PM Lincoln Lee  wrote:
>
> > Hi Jeyhun,
> >
> > Thanks for your quick response!
> >
> > In the streaming scenario, shuffle commonly occurs before the stateful
> > operator, and there's a sanity check[1] when the stateful operator
> > accesses the state. This implies the consistency requirement of the
> > partitioner used for data shuffling and state key selector for state
> > accessing(see KeyGroupStreamPartitioner for more details),
> > otherwise, there may be state access errors. That is to say, in the
> > streaming scenario, it is not only the strict requirement described in
> > FlinkRelDistribution#requireStrict, but also the implied consistency of
> > hash calculation.
> >
> > Also, if this FLIP targets both streaming and batch scenarios, it is
> > recommended to do PoC validation for streaming as well.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29430
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Leonard Xu  于2024年4月3日周三 14:25写道:
> >
> > > Hey, Jeyhun
> > >
> > > Thanks for kicking off this discussion. I have two questions about
> > > streaming sources:
> > >
> > > (1) The FLIP motivation section says the Kafka broker is already
> > > partitioned w.r.t. some key[s]. Is this the main use case in the Kafka
> > > world? Partitioning by key fields is not the default behavior of Kafka's
> > > default partitioner[1] IIUC.
> > >
> > > (2) Considering the FLIP’s optimization scope aims at both Batch and
> > > Streaming pre-partitioned sources, could you add a Streaming Source
> > > example to help me understand the FLIP better? I think Kafka Source is a
> > > good candidate for a streaming source example; file source is a good one
> > > for a batch source and it really helped me follow the FLIP.
> > >
> > > Best,
> > > Leonard
> > > [1]
> > >
> >
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L31
> > >
> > >
> > >
> > > > 2024年4月3日 上午5:53,Jeyhun Karimov  写道:
> > > >
> > > > Hi Lincoln,
> > > >
> > > > Thanks a lot for your comments. Please find my answers below.
> > > >
> > > >
> > > > 1. Is this flip targeted only at batch scenarios or does it include
> > > >> streaming?
> > > >> (The flip and the discussion did not explicitly mention this, but in
> > the
> > > >> draft pr, I only
> > > >> saw the implementation for batch scenarios
> > > >>
> > > >>
> > >
> >
> https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820214fa5d1f082fb30c8
> > > >> <
> > > >>
> > >
> >
> https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820214fa5d1f082fb30c8%EF%BC%89
> > > >>>
> > > >> )
> > > >> If we expect this also apply to streaming, then we need to consider
> > the
> > > >> stricter
> > > >> shuffle restrictions of streaming compared to batch (if support is
> > > >> considered,
> > > >> more discussion is needed here, let’s not expand for now). If it
> only
> > > >> applies to batch,
> > > >> it is recommended to clarify in the flip.
> > > >
> > > >
> > > > - The FLIP targets both streaming and batch scenarios.
> > > > Could you please elaborate more on what you mean by additional
> > 

Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Feng Jin
+1 (non-binding)

Best,
Feng

On Tue, Apr 9, 2024 at 5:56 PM gongzhongqiang 
wrote:

> +1 (non-binding)
>
> Best,
>
> Zhongqiang Gong
>
> wudi <676366...@qq.com.invalid> 于2024年4月9日周二 10:48写道:
>
> > Hi devs,
> >
> > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > contributing the Flink Doris Connector[2] to the Flink community.
> > Discussion thread [3].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > insufficient votes.
> >
> >
> > Thanks,
> > Di.Wu
> >
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > [2] https://github.com/apache/doris-flink-connector
> > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> >
> >
>


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Lincoln Lee
Thanks Ron and Timo for your proposal!

Here is my ranking:

1. Derived table -> extends the persistent semantics of derived tables in the
   SQL standard, has a strong association with the query, and has industry
   precedents such as Google Looker.

2. Live Table ->  an alternative for 'dynamic table'

3. Materialized Table -> combination of the Materialized View and Table, but
still a table which accepts data changes

4. Materialized View -> needs to extend the understanding of the view to accept
data changes

The reason for not adding 'Refresh Table' is that I don't want to tell users
to 'refresh a refresh table'.


Best,
Lincoln Lee


Ron liu  于2024年4月9日周二 20:11写道:

> Hi, Dev
>
> My rankings are:
>
> 1. Derived Table
> 2. Materialized Table
> 3. Live Table
> 4. Materialized View
>
> Best,
> Ron
>
>
>
> Ron liu  于2024年4月9日周二 20:07写道:
>
> > Hi, Dev
> >
> > After several rounds of discussion, there is currently no consensus on
> the
> > name of the new concept. Timo has proposed that we decide the name
> through
> > a vote. This is a good solution when there is no clear preference, so we
> > will adopt this approach.
> >
> > Regarding the name of the new concept, there are currently five
> candidates:
> > 1. Derived Table -> taken by SQL standard
> > 2. Materialized Table -> similar to SQL materialized view but a table
> > 3. Live Table -> similar to dynamic tables
> > 4. Refresh Table -> states what it does
> > 5. Materialized View -> needs to extend the standard to support modifying
> > data
> >
> > For the above five candidates, everyone can give their rankings based on
> > your preferences. You can choose up to five options or only choose some
> of
> > them.
> > We will use a scoring rule, where the first rank gets 5 points, second
> > rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points,
> > and fifth rank gets 1 point.
> > After the voting closes, I will score all the candidates based on
> > everyone's votes, and the candidate with the highest score will be chosen
> > as the name for the new concept.
> >
> > The voting will last up to 72 hours and is expected to close this Friday.
> > I look forward to everyone voting on the name in this thread. Of course,
> we
> > also welcome new input regarding the name.
> >
> > Best,
> > Ron
> >
> > Ron liu  于2024年4月9日周二 19:49写道:
> >
> >> Hi, Dev
> >>
> >> Sorry, my previous statement was not quite accurate. We will hold a
> >> vote for the name within this thread.
> >>
> >> Best,
> >> Ron
> >>
> >>
> >> Ron liu  于2024年4月9日周二 19:29写道:
> >>
> >>> Hi, Timo
> >>>
> >>> Thanks for your reply.
> >>>
> >>> I agree with you that naming is sometimes the hard part. When no one
> >>> has a clear preference, voting on the name is a good solution, so I'll
> send
> >>> a separate email for the vote, clarify the rules for the vote, then let
> >>> everyone vote.
> >>>
> >>> One other point to confirm, in your ranking there is an option for
> >>> Materialized View, does it stand for the UPDATING Materialized View
> that
> >>> you mentioned earlier in the discussion? If using Materialized View I
> think
> >>> it is needed to extend it.
> >>>
> >>> Best,
> >>> Ron
> >>>
> >>> Timo Walther  于2024年4月9日周二 17:20写道:
> >>>
>  Hi Ron,
> 
>  yes naming is hard. But it will have large impact on trainings,
>  presentations, and the mental model of users. Maybe the easiest is to
>  collect ranking by everyone with some short justification:
> 
> 
>  My ranking (from good to not so good):
> 
>  1. Refresh Table -> states what it does
>  2. Materialized Table -> similar to SQL materialized view but a table
>  3. Live Table -> nice buzzword, but maybe still too close to dynamic
>  tables?
>  4. Materialized View -> a bit broader than standard but still very
>  similar
>  5. Derived table -> taken by standard
> 
>  Regards,
>  Timo
> 
> 
> 
>  On 07.04.24 11:34, Ron liu wrote:
>  > Hi, Dev
>  >
>  > This is a summary letter. After several rounds of discussion, there
>  is a
>  > strong consensus about the FLIP proposal and the issues it aims to
>  address.
>  > The current point of disagreement is the naming of the new concept.
> I
>  have
>  > summarized the candidates as follows:
>  >
 > 1. Derived Table (Inspired by Google Looker)
 >  - Pros: Google Looker has introduced this concept, which is
>  designed
>  > for building Looker's automated modeling, aligning with our purpose
>  for the
>  > stream-batch automatic pipeline.
>  >
>  >  - Cons: The SQL standard uses derived table term extensively,
>  vendors
>  > adopt this for simply referring to a table within a subclause.
>  >
>  > 2. Materialized Table: It means materialize the query result to
> table,
>  > similar to Db2 MQT (Materialized Query Tables). In addition,
> Snowflake
>  > Dynamic Table's predecessor is also 

[jira] [Created] (FLINK-35070) remove unused duplicate code

2024-04-09 Thread niliushall (Jira)
niliushall created FLINK-35070:
--

 Summary: remove unused duplicate code
 Key: FLINK-35070
 URL: https://issues.apache.org/jira/browse/FLINK-35070
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.18.1, 1.17.2, 1.16.3
Reporter: niliushall
 Fix For: 1.16.3


Remove unused duplicate code that is used directly in a function.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35069) ContinuousProcessingTimeTrigger continuously registers timers in a loop at the end of the window

2024-04-09 Thread Jinzhong Li (Jira)
Jinzhong Li created FLINK-35069:
---

 Summary: ContinuousProcessingTimeTrigger continuously registers 
timers in a loop at the end of the window
 Key: FLINK-35069
 URL: https://issues.apache.org/jira/browse/FLINK-35069
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: Jinzhong Li
 Attachments: image-2024-04-09-20-23-54-415.png

In our production environment,  when TumblingEventTimeWindows and 
ContinuousProcessingTimeTrigger are used in combination within a 
WindowOperator, we observe a situation where the timers are continuously 
registered in a loop at the end of the window, leading to the job being 
perpetually stuck in timer processing.

!image-2024-04-09-20-23-54-415.png|width=516,height=205!
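A minimal model of the stuck-timer behavior described above: if the "next fire" timestamp is capped at the window end, then once a timer fires exactly at the window end the trigger re-registers a timer for the same instant, which fires again immediately — a loop. The capping formula below is a simplified stand-in for ContinuousProcessingTimeTrigger's logic, not its actual code.

```java
public class TriggerLoopSketch {
    // Next processing-time fire, capped at the window's max timestamp (the
    // problematic part: the cap creates a fixed point at the window end).
    static long nextFire(long firedAt, long interval, long windowMaxTimestamp) {
        return Math.min(firedAt + interval, windowMaxTimestamp);
    }

    public static void main(String[] args) {
        long windowMax = 100, interval = 30, timer = 30;
        for (int fires = 1; fires <= 10; fires++) { // bound standing in for "forever"
            long next = nextFire(timer, interval, windowMax);
            if (next == timer) { // same timestamp re-registered -> fires again immediately
                System.out.println("loop at t=" + timer + " after " + fires + " fires");
                return;
            }
            timer = next;
        }
    }
}
```

Running the sketch shows the timer sequence 30 -> 60 -> 90 -> 100 -> 100 -> ..., i.e. the job gets stuck re-processing the same timestamp once the window end is reached.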



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Ron liu
Hi, Dev

My rankings are:

1. Derived Table
2. Materialized Table
3. Live Table
4. Materialized View

Best,
Ron



Ron liu  于2024年4月9日周二 20:07写道:

> Hi, Dev
>
> After several rounds of discussion, there is currently no consensus on the
> name of the new concept. Timo has proposed that we decide the name through
> a vote. This is a good solution when there is no clear preference, so we
> will adopt this approach.
>
> Regarding the name of the new concept, there are currently five candidates:
> 1. Derived Table -> taken by SQL standard
> 2. Materialized Table -> similar to SQL materialized view but a table
> 3. Live Table -> similar to dynamic tables
> 4. Refresh Table -> states what it does
> 5. Materialized View -> needs to extend the standard to support modifying
> data
>
> For the above five candidates, everyone can give their rankings based on
> your preferences. You can choose up to five options or only choose some of
> them.
> We will use a scoring rule, where the first rank gets 5 points, second
> rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points,
> and fifth rank gets 1 point.
> After the voting closes, I will score all the candidates based on
> everyone's votes, and the candidate with the highest score will be chosen
> as the name for the new concept.
>
> The voting will last up to 72 hours and is expected to close this Friday.
> I look forward to everyone voting on the name in this thread. Of course, we
> also welcome new input regarding the name.
>
> Best,
> Ron
>
> Ron liu  于2024年4月9日周二 19:49写道:
>
>> Hi, Dev
>>
>> Sorry, my previous statement was not quite accurate. We will hold a
>> vote for the name within this thread.
>>
>> Best,
>> Ron
>>
>>
>> Ron liu  于2024年4月9日周二 19:29写道:
>>
>>> Hi, Timo
>>>
>>> Thanks for your reply.
>>>
>>> I agree with you that naming is sometimes the hard part. When no one
>>> has a clear preference, voting on the name is a good solution, so I'll send
>>> a separate email for the vote, clarify the rules for the vote, then let
>>> everyone vote.
>>>
>>> One other point to confirm, in your ranking there is an option for
>>> Materialized View, does it stand for the UPDATING Materialized View that
>>> you mentioned earlier in the discussion? If we use Materialized View, I think
>>> it needs to be extended.
>>>
>>> Best,
>>> Ron
>>>
>>> Timo Walther  于2024年4月9日周二 17:20写道:
>>>
 Hi Ron,

 yes naming is hard. But it will have large impact on trainings,
 presentations, and the mental model of users. Maybe the easiest is to
 collect ranking by everyone with some short justification:


 My ranking (from good to not so good):

 1. Refresh Table -> states what it does
 2. Materialized Table -> similar to SQL materialized view but a table
 3. Live Table -> nice buzzword, but maybe still too close to dynamic
 tables?
 4. Materialized View -> a bit broader than standard but still very
 similar
 5. Derived table -> taken by standard

 Regards,
 Timo



 On 07.04.24 11:34, Ron liu wrote:
 > Hi, Dev
 >
 > This is a summary letter. After several rounds of discussion, there
 is a
 > strong consensus about the FLIP proposal and the issues it aims to
 address.
 > The current point of disagreement is the naming of the new concept. I
 have
 > summarized the candidates as follows:
 >
 > 1. Derived Table (Inspired by Google Looker)
 >  - Pros: Google Looker has introduced this concept, which is
 designed
 > for building Looker's automated modeling, aligning with our purpose
 for the
 > stream-batch automatic pipeline.
 >
 >  - Cons: The SQL standard uses derived table term extensively,
 vendors
 > adopt this for simply referring to a table within a subclause.
 >
 > 2. Materialized Table: It means materialize the query result to table,
 > similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake
 > Dynamic Table's predecessor is also called Materialized Table.
 >
 > 3. Updating Table (From Timo)
 >
 > 4. Updating Materialized View (From Timo)
 >
 > 5. Refresh/Live Table (From Martijn)
 >
 > As Martijn said, naming is a headache, looking forward to more
 valuable
 > input from everyone.
 >
 > [1]
 >
 https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
 > [2]
 https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
 > [3]
 >
 https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
 >
 > Best,
 > Ron
 >
 > Ron liu  于2024年4月7日周日 15:55写道:
 >
 >> Hi, Lorenzo
 >>
 >> Thank you for your insightful input.
 >>
 > I think the 2 above twisted the materialized view concept to more
 than
 >> just an optimization for accessing 

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Ron liu
Hi, Dev

After several rounds of discussion, there is currently no consensus on the
name of the new concept. Timo has proposed that we decide the name through
a vote. This is a good solution when there is no clear preference, so we
will adopt this approach.

Regarding the name of the new concept, there are currently five candidates:
1. Derived Table -> taken by SQL standard
2. Materialized Table -> similar to SQL materialized view but a table
3. Live Table -> similar to dynamic tables
4. Refresh Table -> states what it does
5. Materialized View -> needs to extend the standard to support modifying
data

For the above five candidates, everyone can give their rankings based on
your preferences. You can choose up to five options or only choose some of
them.
We will use a scoring rule, where the first rank gets 5 points, second
rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points,
and fifth rank gets 1 point.
After the voting closes, I will score all the candidates based on
everyone's votes, and the candidate with the highest score will be chosen
as the name for the new concept.

The voting will last up to 72 hours and is expected to close this Friday. I
look forward to everyone voting on the name in this thread. Of course, we
also welcome new input regarding the name.

Best,
Ron

Ron liu  于2024年4月9日周二 19:49写道:

> Hi, Dev
>
> Sorry, my previous statement was not quite accurate. We will hold a
> vote for the name within this thread.
>
> Best,
> Ron
>
>
> Ron liu  于2024年4月9日周二 19:29写道:
>
>> Hi, Timo
>>
>> Thanks for your reply.
>>
>> I agree with you that naming is sometimes the hard part. When no one has
>> a clear preference, voting on the name is a good solution, so I'll send a
>> separate email for the vote, clarify the rules for the vote, then let
>> everyone vote.
>>
>> One other point to confirm, in your ranking there is an option for
>> Materialized View, does it stand for the UPDATING Materialized View that
>> you mentioned earlier in the discussion? If we use Materialized View, I think
>> it needs to be extended.
>>
>> Best,
>> Ron
>>
>> Timo Walther  于2024年4月9日周二 17:20写道:
>>
>>> Hi Ron,
>>>
>>> yes naming is hard. But it will have large impact on trainings,
>>> presentations, and the mental model of users. Maybe the easiest is to
>>> collect ranking by everyone with some short justification:
>>>
>>>
>>> My ranking (from good to not so good):
>>>
>>> 1. Refresh Table -> states what it does
>>> 2. Materialized Table -> similar to SQL materialized view but a table
>>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic
>>> tables?
>>> 4. Materialized View -> a bit broader than standard but still very
>>> similar
>>> 5. Derived table -> taken by standard
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 07.04.24 11:34, Ron liu wrote:
>>> > Hi, Dev
>>> >
>>> > This is a summary letter. After several rounds of discussion, there is
>>> a
>>> > strong consensus about the FLIP proposal and the issues it aims to
>>> address.
>>> > The current point of disagreement is the naming of the new concept. I
>>> have
>>> > summarized the candidates as follows:
>>> >
>>> > 1. Derived Table (Inspired by Google Looker)
>>> >  - Pros: Google Looker has introduced this concept, which is
>>> designed
>>> > for building Looker's automated modeling, aligning with our purpose
>>> for the
>>> > stream-batch automatic pipeline.
>>> >
>>> >  - Cons: The SQL standard uses derived table term extensively,
>>> vendors
>>> > adopt this for simply referring to a table within a subclause.
>>> >
>>> > 2. Materialized Table: It means materialize the query result to table,
>>> > similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake
>>> > Dynamic Table's predecessor is also called Materialized Table.
>>> >
>>> > 3. Updating Table (From Timo)
>>> >
>>> > 4. Updating Materialized View (From Timo)
>>> >
>>> > 5. Refresh/Live Table (From Martijn)
>>> >
>>> > As Martijn said, naming is a headache, looking forward to more valuable
>>> > input from everyone.
>>> >
>>> > [1]
>>> >
>>> https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
>>> > [2]
>>> https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
>>> > [3]
>>> >
>>> https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
>>> >
>>> > Best,
>>> > Ron
>>> >
>>> > Ron liu  于2024年4月7日周日 15:55写道:
>>> >
>>> >> Hi, Lorenzo
>>> >>
>>> >> Thank you for your insightful input.
>>> >>
>>> > I think the 2 above twisted the materialized view concept to more
>>> than
>>> >> just an optimization for accessing pre-computed aggregates/filters.
>>> >> I think that concept (at least in my mind) is now adherent to the
>>> >> semantics of the words themselves ("materialized" and "view") than on
>>> its
>>> >> implementations in DBMs, as just a view on raw data that, hopefully,
>>> is
>>> >> constantly updated with fresh results.
>>> 

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Ron liu
Hi, Dev

Sorry, my previous statement was not quite accurate. We will hold a vote
for the name within this thread.

Best,
Ron


Ron liu  于2024年4月9日周二 19:29写道:

> Hi, Timo
>
> Thanks for your reply.
>
> I agree with you that naming is sometimes the hard part. When no one has
> a clear preference, voting on the name is a good solution, so I'll send a
> separate email for the vote, clarify the rules for the vote, then let
> everyone vote.
>
> One other point to confirm, in your ranking there is an option for
> Materialized View, does it stand for the UPDATING Materialized View that
> you mentioned earlier in the discussion? If we use Materialized View, I think
> it needs to be extended.
>
> Best,
> Ron
>
> Timo Walther  于2024年4月9日周二 17:20写道:
>
>> Hi Ron,
>>
>> yes naming is hard. But it will have large impact on trainings,
>> presentations, and the mental model of users. Maybe the easiest is to
>> collect ranking by everyone with some short justification:
>>
>>
>> My ranking (from good to not so good):
>>
>> 1. Refresh Table -> states what it does
>> 2. Materialized Table -> similar to SQL materialized view but a table
>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic
>> tables?
>> 4. Materialized View -> a bit broader than standard but still very similar
>> 5. Derived table -> taken by standard
>>
>> Regards,
>> Timo
>>
>>
>>
>> On 07.04.24 11:34, Ron liu wrote:
>> > Hi, Dev
>> >
>> > This is a summary letter. After several rounds of discussion, there is a
>> > strong consensus about the FLIP proposal and the issues it aims to
>> address.
>> > The current point of disagreement is the naming of the new concept. I
>> have
>> > summarized the candidates as follows:
>> >
>> > 1. Derived Table (Inspired by Google Looker)
>> >  - Pros: Google Looker has introduced this concept, which is
>> designed
>> > for building Looker's automated modeling, aligning with our purpose for
>> the
>> > stream-batch automatic pipeline.
>> >
>> >  - Cons: The SQL standard uses derived table term extensively,
>> vendors
>> > adopt this for simply referring to a table within a subclause.
>> >
>> > 2. Materialized Table: It means materialize the query result to table,
>> > similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake
>> > Dynamic Table's predecessor is also called Materialized Table.
>> >
>> > 3. Updating Table (From Timo)
>> >
>> > 4. Updating Materialized View (From Timo)
>> >
>> > 5. Refresh/Live Table (From Martijn)
>> >
>> > As Martijn said, naming is a headache, looking forward to more valuable
>> > input from everyone.
>> >
>> > [1]
>> >
>> https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
>> > [2]
>> https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
>> > [3]
>> >
>> https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
>> >
>> > Best,
>> > Ron
>> >
>> > Ron liu  于2024年4月7日周日 15:55写道:
>> >
>> >> Hi, Lorenzo
>> >>
>> >> Thank you for your insightful input.
>> >>
>> > I think the 2 above twisted the materialized view concept to more
>> than
>> >> just an optimization for accessing pre-computed aggregates/filters.
>> >> I think that concept (at least in my mind) is now adherent to the
>> >> semantics of the words themselves ("materialized" and "view") than on
>> its
>> >> implementations in DBMs, as just a view on raw data that, hopefully, is
>> >> constantly updated with fresh results.
>> >> That's why I understand Timo's et al. objections.
>> >>
>> >> Your understanding of Materialized Views is correct. However, in our
>> >> scenario, an important feature is the support for Update & Delete
>> >> operations, which the current Materialized Views cannot fulfill. As we
>> >> discussed with Timo before, if Materialized Views needs to support data
>> >> modifications, it would require an extension of new keywords, such as
>> >> CREATING xxx (UPDATING) MATERIALIZED VIEW.
>> >>
>> > Still, I don't understand why we need another type of special table.
>> >> Could you dive deep into the reasons why not simply adding the
>> FRESHNESS
>> >> parameter to standard tables?
>> >>
>> >> Firstly, I need to emphasize that we cannot achieve the design goal of
>> >> FLIP through the CREATE TABLE syntax combined with a FRESHNESS
>> parameter.
>> >> The proposal of this FLIP is to use Dynamic Table + Continuous Query,
>> and
>> >> combine it with FRESHNESS to realize a streaming-batch unification.
>> >> However, CREATE TABLE is merely a metadata operation and cannot
>> >> automatically start a background refresh job. To achieve the design
>> goal of
>> >> FLIP with standard tables, it would require extending the CTAS[1]
>> syntax to
>> >> introduce the FRESHNESS keyword. We considered this design initially,
>> but
>> >> it has following problems:
>> >>
>> >> 1. Distinguishing a table created through CTAS as a standard table or
>> as a
>> >> "special" 

[jira] [Created] (FLINK-35068) Introduce built-in serialization support for Set

2024-04-09 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-35068:
-

 Summary: Introduce built-in serialization support for Set
 Key: FLINK-35068
 URL: https://issues.apache.org/jira/browse/FLINK-35068
 Project: Flink
  Issue Type: Sub-task
  Components: API / Type Serialization System
Affects Versions: 1.20.0
Reporter: Zhanghao Chen
 Fix For: 1.20.0


Introduce built-in serialization support for {{Set}}, another common Java
collection type. We'll need to add a new built-in serializer for it
({{MultiSetTypeInformation}} utilizes {{MapSerializer}} underneath, but a
dedicated serializer could be more efficient for a plain {{Set}}).
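To illustrate why a dedicated Set serializer can beat the map-based route: it only needs to write the size followed by the elements, whereas a MapSerializer-backed format writes a (key, value/null-flag) pair per entry. The byte layout and class names below are illustrative assumptions, not Flink's actual serializer implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetSerializerSketch {
    // Write the set as: element count, then each element (element serializer
    // stood in for by writeUTF) -- no per-entry value or null flag needed.
    static byte[] serialize(Set<String> set) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(set.size());
        for (String element : set) {
            out.writeUTF(element);
        }
        return bytes.toByteArray();
    }

    static Set<String> deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int size = in.readInt();
        Set<String> set = new LinkedHashSet<>();
        for (int i = 0; i < size; i++) {
            set.add(in.readUTF());
        }
        return set;
    }

    public static void main(String[] args) throws IOException {
        Set<String> original = new LinkedHashSet<>(List.of("a", "bb"));
        Set<String> roundTripped = deserialize(serialize(original));
        System.out.println(original.equals(roundTripped)); // prints true
    }
}
```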



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Ron liu
Hi, Timo

Thanks for your reply.

I agree with you that naming is sometimes the most difficult part. When no one
has a clear preference, voting on the name is a good solution, so I'll send a
separate email for the vote, clarify the voting rules, and then let everyone
vote.

One other point to confirm: in your ranking there is an option for
Materialized View. Does it stand for the UPDATING Materialized View that
you mentioned earlier in the discussion? If we go with Materialized View, I
think it would need to be extended.

Best,
Ron

Timo Walther  于2024年4月9日周二 17:20写道:

> Hi Ron,
>
> yes naming is hard. But it will have large impact on trainings,
> presentations, and the mental model of users. Maybe the easiest is to
> collect ranking by everyone with some short justification:
>
>
> My ranking (from good to not so good):
>
> 1. Refresh Table -> states what it does
> 2. Materialized Table -> similar to SQL materialized view but a table
> 3. Live Table -> nice buzzword, but maybe still too close to dynamic
> tables?
> 4. Materialized View -> a bit broader than standard but still very similar
> 5. Derived table -> taken by standard
>
> Regards,
> Timo
>
>
>
> On 07.04.24 11:34, Ron liu wrote:
> > Hi, Dev
> >
> > This is a summary letter. After several rounds of discussion, there is a
> > strong consensus about the FLIP proposal and the issues it aims to
> address.
> > The current point of disagreement is the naming of the new concept. I
> have
> > summarized the candidates as follows:
> >
> > 1. Derived Table (Inspired by Google Lookers)
> >  - Pros: Google Lookers has introduced this concept, which is
> designed
> > for building Looker's automated modeling, aligning with our purpose for
> the
> > stream-batch automatic pipeline.
> >
> >  - Cons: The SQL standard uses derived table term extensively,
> vendors
> > adopt this for simply referring to a table within a subclause.
> >
> > 2. Materialized Table: It means materialize the query result to table,
> > similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake
> > Dynamic Table's predecessor is also called Materialized Table.
> >
> > 3. Updating Table (From Timo)
> >
> > 4. Updating Materialized View (From Timo)
> >
> > 5. Refresh/Live Table (From Martijn)
> >
> > As Martijn said, naming is a headache, looking forward to more valuable
> > input from everyone.
> >
> > [1]
> >
> https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
> > [2] https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
> > [3]
> >
> https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
> >
> > Best,
> > Ron
> >
> > Ron liu  于2024年4月7日周日 15:55写道:
> >
> >> Hi, Lorenzo
> >>
> >> Thank you for your insightful input.
> >>
> > I think the 2 above twisted the materialized view concept to more
> than
> >> just an optimization for accessing pre-computed aggregates/filters.
> >> I think that concept (at least in my mind) is now adherent to the
> >> semantics of the words themselves ("materialized" and "view") than on
> its
> >> implementations in DBMs, as just a view on raw data that, hopefully, is
> >> constantly updated with fresh results.
> >> That's why I understand Timo's et al. objections.
> >>
> >> Your understanding of Materialized Views is correct. However, in our
> >> scenario, an important feature is the support for Update & Delete
> >> operations, which the current Materialized Views cannot fulfill. As we
> >> discussed with Timo before, if Materialized Views needs to support data
> >> modifications, it would require an extension of new keywords, such as
> >> CREATING xxx (UPDATING) MATERIALIZED VIEW.
> >>
> > Still, I don't understand why we need another type of special table.
> >> Could you dive deep into the reasons why not simply adding the FRESHNESS
> >> parameter to standard tables?
> >>
> >> Firstly, I need to emphasize that we cannot achieve the design goal of
> >> FLIP through the CREATE TABLE syntax combined with a FRESHNESS
> parameter.
> >> The proposal of this FLIP is to use Dynamic Table + Continuous Query,
> and
> >> combine it with FRESHNESS to realize a streaming-batch unification.
> >> However, CREATE TABLE is merely a metadata operation and cannot
> >> automatically start a background refresh job. To achieve the design
> goal of
> >> FLIP with standard tables, it would require extending the CTAS[1]
> syntax to
> >> introduce the FRESHNESS keyword. We considered this design initially,
> but
> >> it has following problems:
> >>
> >> 1. Distinguishing a table created through CTAS as a standard table or
> as a
> >> "special" standard table with an ongoing background refresh job using
> the
> >> FRESHNESS keyword is very obscure for users.
> >> 2. It intrudes on the semantics of the CTAS syntax. Currently, tables
> >> created using CTAS only add table metadata to the Catalog and do not
> record
> >> attributes such as query. There are 

Re: Two potential bugs in Flink ML

2024-04-09 Thread Yunfeng Zhou
Hi Komal,

Thanks for your example code! I found that Flink ML has a bug when it
comes to keyed two-input operators. I have submitted a PR to fix this
bug, and once it is approved you can build the Flink ML library for your
program according to its documentation.

The bugfix PR: https://github.com/apache/flink-ml/pull/260
The document to build Flink ML:
https://github.com/apache/flink-ml?tab=readme-ov-file#building-the-project

Best,
Yunfeng

On Mon, Apr 8, 2024 at 11:02 AM Komal M  wrote:
>
> Hi Yunfeng,
>
>
> Thank you so much for getting back!
>
> For the first bug, here is a sample code that should reproduce it. All it 
> does is subtract 1 from the feedback stream until the tuples reach 0.0. For 
> each subtraction it outputs a relevant message in the ‘finalOutput’ stream. 
> These messages are stored in the keyedState of KeyedCoProcessFunction and are 
> populated by a dataset stream called initialStates. For each key there are 
> different messages associated with it, hence the need for MapState.
>
>  For the second bug, let me compare my implementation to the references you 
> have provided and get back to you on that.
>
>
> import java.util.*;
> import org.apache.flink.api.common.state.MapState;
> import org.apache.flink.api.common.state.MapStateDescriptor;
> import org.apache.flink.api.common.state.ValueStateDescriptor;
> import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
> import org.apache.flink.api.common.typeinfo.TypeHint;
> import org.apache.flink.api.common.typeinfo.TypeInformation;
> import org.apache.flink.api.common.typeinfo.Types;
> import org.apache.flink.api.java.tuple.Tuple2;
> import org.apache.flink.api.java.tuple.Tuple3;
> import org.apache.flink.configuration.Configuration;
> import org.apache.flink.iteration.DataStreamList;
> import org.apache.flink.iteration.IterationBodyResult;
> import org.apache.flink.iteration.Iterations;
> import org.apache.flink.streaming.api.datastream.DataStream;
> import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
> import org.apache.flink.util.Collector;
> import org.apache.flink.util.OutputTag;
>
> import org.apache.flink.api.java.tuple.Tuple;
>
> public class Test {
>     public static void main(String[] args) throws Exception {
>         // Sets up the execution environment, which is the main entry point
>         StreamExecutionEnvironment env =
>                 StreamExecutionEnvironment.createLocalEnvironment();
>
>         // sample datastreams (they are assumed to be unbounded streams
>         // outside of this test environment)
>         List<Tuple2<String, Double>> feedbackinitializer = Arrays.asList(
>                 new Tuple2<>("A", 2.0),
>                 new Tuple2<>("B", 3.0),
>                 new Tuple2<>("C", 1.0),
>                 new Tuple2<>("D", 1.0)
>         );
>
>         List<Tuple3<String, Double, String>> initialStates = Arrays.asList(
>                 new Tuple3<>("A", 0.0, "Final Output A"),
>                 new Tuple3<>("A", 1.0, "Test 1A"),
>                 new Tuple3<>("B", 2.0, "Test 2B"),
>                 new Tuple3<>("B", 1.0, "Test 1B"),
>                 new Tuple3<>("B", 0.0, "Final Output B"),
>                 new Tuple3<>("C", 0.0, "No Change C"),
>                 new Tuple3<>("D", 0.0, "Final Output D")
>         );
>
>         DataStream<Tuple2<String, Double>> feedbackStream =
>                 env.fromCollection(feedbackinitializer);
>         DataStream<Tuple3<String, Double, String>> initialStateStream =
>                 env.fromCollection(initialStates);
>
>         // parallelize
>         DataStream<Tuple2<String, Double>> feedbackParallel =
>                 feedbackStream.keyBy(x -> x.f0)
>                         .map(i -> Tuple2.of(i.f0, i.f1))
>                         .returns(Types.TUPLE(Types.STRING, Types.DOUBLE));
>         DataStream<Tuple3<String, Double, String>> initialStateParallel =
>                 initialStateStream.keyBy(x -> x.f0)
>                         .map(i -> Tuple3.of(i.f0, i.f1, i.f2))
>                         .returns(Types.TUPLE(Types.STRING, Types.DOUBLE,
>                                 Types.STRING));
>
>         // iterate
>         DataStreamList result = Iterations.iterateUnboundedStreams(
>                 DataStreamList.of(feedbackParallel),
>                 DataStreamList.of(initialStateParallel),
>                 (variableStreams, dataStreams) -> {
>                     DataStream<Tuple2<String, Double>> modelUpdate =
>                             variableStreams.get(0);
>                     DataStream<Tuple3<String, Double, String>> stateStream =
>                             dataStreams.get(0);
>
>                     OutputTag<String> finalOutputTag =
>                             new OutputTag<String>("msgs") {
>                     };
>
>                     SingleOutputStreamOperator<Tuple2<String, Double>> newModelUpdate =
>                             stateStream.connect(modelUpdate).keyBy(0, 0).process(
>                                     new KeyedCoProcessFunction<Tuple,
>                                             Tuple3<String, Double, String>,
>                                             Tuple2<String, Double>,
>                                             Tuple2<String, Double>>() {
>                         private transient MapState<Double, String> state;
>
>                         @Override
>                         public void processElement1(
>                                 Tuple3<String, Double, String> stateUpdate,
>                                 Context context,
>                                 Collector<Tuple2<String, Double>> collector)
>                                 throws Exception {
>                             // load stateStream into mapState
>                             state.put(stateUpdate.f1, stateUpdate.f2);

[jira] [Created] (FLINK-35067) Support metadata 'op_type' virtual column for Postgres CDC Connector.

2024-04-09 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-35067:
-

 Summary:  Support metadata 'op_type' virtual column for Postgres 
CDC Connector. 
 Key: FLINK-35067
 URL: https://issues.apache.org/jira/browse/FLINK-35067
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Hongshun Wang


Like https://github.com/apache/flink-cdc/pull/2913, support the metadata 
'op_type' virtual column for the Postgres CDC Connector. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread gongzhongqiang
+1 (non-binding)

Best,

Zhongqiang Gong

wudi <676366...@qq.com.invalid> 于2024年4月9日周二 10:48写道:

> Hi devs,
>
> I would like to start a vote about FLIP-399 [1]. The FLIP is about
> contributing the Flink Doris Connector[2] to the Flink community.
> Discussion thread [3].
>
> The vote will be open for at least 72 hours unless there is an objection or
> insufficient votes.
>
>
> Thanks,
> Di.Wu
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> [2] https://github.com/apache/doris-flink-connector
> [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
>
>


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Jiabao Sun
+1 (binding)

Best,
Jiabao

Leonard Xu  于2024年4月9日周二 17:35写道:

> +1 (binding)
>
> Best,
> Leonard
>
> > 2024年4月9日 下午5:11,Muhammet Orazov  写道:
> >
> > Hey Wudi,
> >
> > Thanks for your efforts.
> >
> > +1 (non-binding)
> >
> > Best,
> > Muhammet
> >
> > On 2024-04-09 02:47, wudi wrote:
> >> Hi devs,
> >> I would like to start a vote about FLIP-399 [1]. The FLIP is about
> contributing the Flink Doris Connector[2] to the Flink community.
> Discussion thread [3].
> >> The vote will be open for at least 72 hours unless there is an
> objection or
> >> insufficient votes.
> >> Thanks,
> >> Di.Wu
> >> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> >> [2] https://github.com/apache/doris-flink-connector
> >> [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
>
>


[jira] [Created] (FLINK-35066) TwoInputOperator in IterationBody cannot use keyBy

2024-04-09 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-35066:


 Summary: TwoInputOperator in IterationBody cannot use keyBy
 Key: FLINK-35066
 URL: https://issues.apache.org/jira/browse/FLINK-35066
 Project: Flink
  Issue Type: Technical Debt
  Components: Library / Machine Learning
Affects Versions: ml-2.3.0
Reporter: Yunfeng Zhou


Implementing a UDF KeyedRichCoProcessFunction or CoFlatMapFunction inside 
IterationBody yields a “java.lang.ClassCastException: 
org.apache.flink.iteration.IterationRecord cannot be cast to class 
org.apache.flink.api.java.tuple.Tuple” error.
More details about this bug can be found at 
https://lists.apache.org/thread/bgkw1g2tdgnp1xy1clsqtcfs3h18pkd6
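The failure mode can be reduced to a Flink-independent sketch: inside the iteration, each element travels wrapped in a carrier record, so a downstream position-based cast to the raw tuple type fails. The classes below only mimic the real ones (`org.apache.flink.iteration.IterationRecord`, `org.apache.flink.api.java.tuple.Tuple2`) and are illustrative, not the actual Flink implementation.

```java
public class CastDemo {
    // Stand-in for org.apache.flink.api.java.tuple.Tuple2
    static class PayloadTuple {
        final Object f0, f1;
        PayloadTuple(Object f0, Object f1) { this.f0 = f0; this.f1 = f1; }
    }

    // Stand-in for org.apache.flink.iteration.IterationRecord
    static class IterationRecord {
        final Object value;
        IterationRecord(Object value) { this.value = value; }
    }

    public static void main(String[] args) {
        // What a position-based key selector receives inside the iteration
        // body: the wrapper, not the tuple it expects.
        Object element = new IterationRecord(new PayloadTuple("A", 2.0));
        try {
            PayloadTuple t = (PayloadTuple) element; // cast ignores the wrapper
            System.out.println("cast succeeded: " + t.f0);
        } catch (ClassCastException e) {
            // Mirrors: IterationRecord cannot be cast to ...tuple.Tuple
            System.out.println("ClassCastException: wrapper reached the cast");
        }
    }
}
```

The fix direction is to unwrap the carrier (or use a wrapper-aware key selector) before any cast to the payload type happens.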



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Leonard Xu
+1 (binding)

Best,
Leonard

> 2024年4月9日 下午5:11,Muhammet Orazov  写道:
> 
> Hey Wudi,
> 
> Thanks for your efforts.
> 
> +1 (non-binding)
> 
> Best,
> Muhammet
> 
> On 2024-04-09 02:47, wudi wrote:
>> Hi devs,
>> I would like to start a vote about FLIP-399 [1]. The FLIP is about 
>> contributing the Flink Doris Connector[2] to the Flink community. Discussion 
>> thread [3].
>> The vote will be open for at least 72 hours unless there is an objection or
>> insufficient votes.
>> Thanks,
>> Di.Wu
>> [1] 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
>> [2] https://github.com/apache/doris-flink-connector
>> [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh



Re: Inquiry Regarding Azure Pipelines

2024-04-09 Thread Muhammet Orazov

Hey Robert,

Thanks for fixing the flinkbot issue. Thanks for your efforts (including 
Mathias and Lorenzo)!


Best,
Muhammet

p.s. Mandatory XKCD: https://xkcd.com/2347/

On 2024-04-03 16:52, Robert Metzger wrote:

Hi Yisha,

flinkbot is currently not active, so new PRs are not triggering any AZP
builds. We hope to restore the service soon.

AZP is still the source of truth for CI builds.


On Wed, Apr 3, 2024 at 11:34 AM Yisha Zhou 


wrote:


Hi devs,

I hope this email finds you well. I am writing to seek clarification
regarding the status of Azure Pipelines within the Apache community 
and

seek assistance with a specific issue I encountered.

Today, I made some new commits to a pull request in one of the Apache
repositories. However, I noticed that even after approximately six 
hours,
there were no triggers initiated for the Azure Pipeline. I have a 
couple of

questions regarding this matter:

1. Is the Apache community still utilizing Azure Pipelines for CI/CD
purposes? I came across an issue discussing the migration from Azure 
to
GitHub Actions, but I am uncertain about the timeline for 
discontinuing the

use of Azure Pipelines.

2. If Azure Pipelines are still in use, where can I find information 
about

the position of my commits in the CI queue, awaiting execution?

I would greatly appreciate any insights or guidance you can provide
regarding these questions. Thank you for your time and attention.

My PR link is https://github.com/apache/flink/pull/24567 <
https://github.com/apache/flink/pull/24567>.

Best regards,
Yisha


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Timo Walther

Hi Ron,

yes naming is hard. But it will have large impact on trainings, 
presentations, and the mental model of users. Maybe the easiest is to 
collect ranking by everyone with some short justification:



My ranking (from good to not so good):

1. Refresh Table -> states what it does
2. Materialized Table -> similar to SQL materialized view but a table
3. Live Table -> nice buzzword, but maybe still too close to dynamic tables?
4. Materialized View -> a bit broader than standard but still very similar
5. Derived table -> taken by standard

Regards,
Timo



On 07.04.24 11:34, Ron liu wrote:

Hi, Dev

This is a summary letter. After several rounds of discussion, there is a
strong consensus about the FLIP proposal and the issues it aims to address.
The current point of disagreement is the naming of the new concept. I have
summarized the candidates as follows:

1. Derived Table (Inspired by Google Lookers)
 - Pros: Google Lookers has introduced this concept, which is designed
for building Looker's automated modeling, aligning with our purpose for the
stream-batch automatic pipeline.

 - Cons: The SQL standard uses the derived table term extensively; vendors
adopt it simply to refer to a table within a subquery clause.

2. Materialized Table: It means materialize the query result to table,
similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake
Dynamic Table's predecessor is also called Materialized Table.

3. Updating Table (From Timo)

4. Updating Materialized View (From Timo)

5. Refresh/Live Table (From Martijn)

As Martijn said, naming is a headache, looking forward to more valuable
input from everyone.

[1]
https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
[2] https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
[3]
https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables

Best,
Ron

Ron liu  于2024年4月7日周日 15:55写道:


Hi, Lorenzo

Thank you for your insightful input.


I think the 2 above twisted the materialized view concept to more than

just an optimization for accessing pre-computed aggregates/filters.
I think that concept (at least in my mind) is more adherent to the
semantics of the words themselves ("materialized" and "view") than to its
implementations in DBMSs, as just a view on raw data that, hopefully, is
constantly updated with fresh results.
That's why I understand Timo's et al. objections.

Your understanding of Materialized Views is correct. However, in our
scenario, an important feature is the support for Update & Delete
operations, which the current Materialized Views cannot fulfill. As we
discussed with Timo before, if Materialized Views needs to support data
modifications, it would require an extension of new keywords, such as
CREATING xxx (UPDATING) MATERIALIZED VIEW.


Still, I don't understand why we need another type of special table.

Could you dive deep into the reasons why not simply adding the FRESHNESS
parameter to standard tables?

Firstly, I need to emphasize that we cannot achieve the design goal of
FLIP through the CREATE TABLE syntax combined with a FRESHNESS parameter.
The proposal of this FLIP is to use Dynamic Table + Continuous Query, and
combine it with FRESHNESS to realize a streaming-batch unification.
However, CREATE TABLE is merely a metadata operation and cannot
automatically start a background refresh job. To achieve the design goal of
FLIP with standard tables, it would require extending the CTAS[1] syntax to
introduce the FRESHNESS keyword. We considered this design initially, but
it has following problems:

1. Distinguishing a table created through CTAS as a standard table or as a
"special" standard table with an ongoing background refresh job using the
FRESHNESS keyword is very obscure for users.
2. It intrudes on the semantics of the CTAS syntax. Currently, tables
created using CTAS only add table metadata to the Catalog and do not record
attributes such as query. There are also no ongoing background refresh
jobs, and the data writing operation happens only once at table creation.
3. For the framework, when performing some kind of ALTER TABLE operation, it
is unclear how to distinguish the behavior of a table created with FRESHNESS
from that of a table created without it, which will also cause confusion.

In terms of the design goal of combining Dynamic Table + Continuous Query,
the FLIP proposal cannot be realized by only extending the current standard
tables, so a new kind of dynamic table needs to be introduced as a
first-level concept.
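To make the shape of the proposal concrete, the statement under discussion could look roughly like the following. This is a purely hypothetical sketch: the DYNAMIC TABLE keyword and the option syntax are illustrative only, since the concept name is still being voted on.

```sql
-- Hypothetical sketch; keyword and option names are not final.
CREATE DYNAMIC TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
AS SELECT o.order_id, o.user_id, p.amount
FROM orders AS o
JOIN payments AS p ON o.order_id = p.order_id;
```

Unlike plain CTAS, such a statement would both register the metadata and start the background refresh job that keeps the result within the declared freshness.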

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#as-select_statement

Best,
Ron

 于2024年4月3日周三 22:25写道:


Hello everybody!
Thanks for the FLIP as it looks amazing (and I think the prove is this
deep discussion it is provoking :))

I have a couple of comments to add to this:

Even though I get the reason why you rejected 

Re: [External] Inquiry Regarding Azure Pipelines

2024-04-09 Thread Yisha Zhou
Thanks Robert. Manual trigger didn’t work. I’ll push a new commit to trigger 
it. 

> 2024年4月9日 16:33,Robert Metzger  写道:
> 
> I'm not 100% sure, but I think the manual trigger via a command will use
> the latest commit (e.g. your second commit) to trigger a build.
> Historically, manual triggering has not been very reliable. If the manual
> triggering isn't working, you can also just push a commit with a tiny
> change to trigger a new build.
> 
> 
> On Sun, Apr 7, 2024 at 5:42 AM Yisha Zhou  >
> wrote:
> 
>> Hi Robert,
>> 
>> Thank you for your prompt response to my previous email. I appreciate the
>> information provided. However, I still have a few remaining questions that
>> I hope you can assist me with.
>> 
>> I have noticed that other PRs are now able to trigger AZP builds
>> automatically. In my case, I have two commits associated with my PR. The
>> first commit has already triggered an Azure build successfully, while the
>> second commit was made during the service downtime. My question are :
>> 
>> 1. If I use the command "@flinkbot run azure" now, will it trigger the
>> build corresponding to my second commit, or will it rerun the already
>> successful build from the first commit?
>> 
>> 2. If the answer for question 1 is the former, how can I trigger AZP for
>> my second commit?
>> 
>> I would appreciate any clarification you can provide on this matter. Thank
>> you for your attention to this issue, and I look forward to your response.
>> 
>> Best regards,
>> Yisha
>> 
>> 
>> 
>>> 2024年4月4日 00:52,Robert Metzger  写道:
>>> 
>>> Hi Yisha,
>>> 
>>> flinkbot is currently not active, so new PRs are not triggering any AZP
>>> builds. We hope to restore the service soon.
>>> 
>>> AZP is still the source of truth for CI builds.
>>> 
>>> 
>>> On Wed, Apr 3, 2024 at 11:34 AM Yisha Zhou > > >>
>>> wrote:
>>> 
 Hi devs,
 
 I hope this email finds you well. I am writing to seek clarification
 regarding the status of Azure Pipelines within the Apache community and
 seek assistance with a specific issue I encountered.
 
 Today, I made some new commits to a pull request in one of the Apache
 repositories. However, I noticed that even after approximately six
>> hours,
 there were no triggers initiated for the Azure Pipeline. I have a
>> couple of
 questions regarding this matter:
 
 1. Is the Apache community still utilizing Azure Pipelines for CI/CD
 purposes? I came across an issue discussing the migration from Azure to
 GitHub Actions, but I am uncertain about the timeline for discontinuing
>> the
 use of Azure Pipelines.
 
 2. If Azure Pipelines are still in use, where can I find information
>> about
 the position of my commits in the CI queue, awaiting execution?
 
 I would greatly appreciate any insights or guidance you can provide
 regarding these questions. Thank you for your time and attention.
 
 My PR link is https://github.com/apache/flink/pull/24567 <
 https://github.com/apache/flink/pull/24567 <
>> https://github.com/apache/flink/pull/24567 
>> >>.
 
 Best regards,
 Yisha



Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Muhammet Orazov

Hey Wudi,

Thanks for your efforts.

+1 (non-binding)

Best,
Muhammet

On 2024-04-09 02:47, wudi wrote:

Hi devs,

I would like to start a vote about FLIP-399 [1]. The FLIP is about 
contributing the Flink Doris Connector[2] to the Flink community. 
Discussion thread [3].


The vote will be open for at least 72 hours unless there is an 
objection or

insufficient votes.


Thanks,
Di.Wu


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris

[2] https://github.com/apache/doris-flink-connector
[3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh


[jira] [Created] (FLINK-35065) Add numFiredTimers and numFiredTimersPerSecond metrics

2024-04-09 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-35065:
--

 Summary: Add numFiredTimers and numFiredTimersPerSecond metrics
 Key: FLINK-35065
 URL: https://issues.apache.org/jira/browse/FLINK-35065
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics, Runtime / Task
Affects Versions: 1.19.0
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.20.0


Currently there is no way of knowing how many timers are being fired by Flink, 
so it's impossible to distinguish, even using code profiling, whether an 
operator is firing only a couple of heavy timers per second using ~100% of the 
CPU time, or firing thousands of timers per second.

We could add the following metrics to address this issue:
* numFiredTimers - total number of fired timers per operator
* numFiredTimersPerSecond - per second rate of firing timers per operator
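The per-second metric can be derived from the total counter by periodic sampling. Below is a minimal, Flink-independent sketch of that derivation; the class and method names are illustrative (in Flink's metrics system a meter view over a counter plays this role).

```java
// Sketch: derive a per-second rate from a monotonically increasing
// total counter by sampling it at known timestamps.
public class RateGauge {
    private long lastCount;
    private long lastTimestampMs;

    public RateGauge(long nowMs) {
        this.lastTimestampMs = nowMs;
    }

    /** Returns fired timers per second since the previous sample. */
    public double update(long totalFiredTimers, long nowMs) {
        long deltaCount = totalFiredTimers - lastCount;
        long deltaMs = Math.max(1, nowMs - lastTimestampMs); // avoid div by 0
        lastCount = totalFiredTimers;
        lastTimestampMs = nowMs;
        return deltaCount * 1000.0 / deltaMs;
    }

    public static void main(String[] args) {
        RateGauge gauge = new RateGauge(0);
        // 500 timers fired over one second of wall time:
        System.out.println(gauge.update(500, 1000)); // prints "500.0"
    }
}
```

With both metrics exposed, the two scenarios from the description become distinguishable: a few heavy timers show a low rate with high CPU, while thousands of light timers show a high rate.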



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35064) Flink sql connector pulsar/hive com.fasterxml.jackson.annotation.JsonFormat$Value conflict

2024-04-09 Thread chendaping (Jira)
chendaping created FLINK-35064:
--

 Summary: Flink sql connector pulsar/hive 
com.fasterxml.jackson.annotation.JsonFormat$Value conflict
 Key: FLINK-35064
 URL: https://issues.apache.org/jira/browse/FLINK-35064
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Connectors / Pulsar
Affects Versions: 1.16.1
Reporter: chendaping


When I compile and package {{flink-sql-connector-pulsar}} & 
{{flink-sql-connector-hive}}, and then put these two jar files into the 
Flink lib directory, I execute the following SQL statement through 
{{bin/sql-client.sh}}:

 
{code:java}
// code placeholder
CREATE TABLE
pulsar_table (
content string,
proc_time AS PROCTIME ()
)
WITH
(
'connector' = 'pulsar',
'topics' = 'persistent://xxx',
'service-url' = 'pulsar://xxx',
'source.subscription-name' = 'xxx',
'source.start.message-id' = 'latest',
'format' = 'csv',
'pulsar.client.authPluginClassName' = 
'org.apache.pulsar.client.impl.auth.AuthenticationToken',
'pulsar.client.authParams' = 'token:xxx'
);
 
select * from pulsar_table; {code}
The task error exception stack is as follows:

 
{code:java}
Caused by: java.lang.NoSuchMethodError: 
com.fasterxml.jackson.annotation.JsonFormat$Value.empty()Lcom/fasterxml/jackson/annotation/JsonFormat$Value;
at 
org.apache.pulsar.shade.com.fasterxml.jackson.databind.cfg.MapperConfig.(MapperConfig.java:56)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.shade.com.fasterxml.jackson.databind.ObjectMapper.(ObjectMapper.java:660)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.shade.com.fasterxml.jackson.databind.ObjectMapper.(ObjectMapper.java:576)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.common.util.ObjectMapperFactory.createObjectMapperInstance(ObjectMapperFactory.java:151)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.common.util.ObjectMapperFactory.(ObjectMapperFactory.java:142)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.client.impl.conf.ConfigurationDataUtils.create(ConfigurationDataUtils.java:35)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.client.impl.conf.ConfigurationDataUtils.(ConfigurationDataUtils.java:43)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.pulsar.client.impl.ClientBuilderImpl.loadConf(ClientBuilderImpl.java:77)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.flink.connector.pulsar.common.config.PulsarClientFactory.createClient(PulsarClientFactory.java:105)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.flink.connector.pulsar.source.enumerator.PulsarSourceEnumerator.(PulsarSourceEnumerator.java:95)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.flink.connector.pulsar.source.enumerator.PulsarSourceEnumerator.(PulsarSourceEnumerator.java:76)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.flink.connector.pulsar.source.PulsarSource.createEnumerator(PulsarSource.java:144)
 ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]at 
org.apache.flink.runtime.source.coordinator.SourceCoordinator.start(SourceCoordinator.java:213)
 ~[flink-dist_2.12-1.16.1.jar:1.16.1]

{code}
 

The exception shows a conflict with 
{{com.fasterxml.jackson.annotation.JsonFormat$Value}}. I investigated and 
found that {{flink-sql-connector-pulsar}} and {{flink-sql-connector-hive}} 
depend on different versions, leading to this conflict.
{code:xml}
<!-- flink-sql-connector-pulsar pom.xml -->
<dependency>
    <groupId>com.fasterxml.jackson</groupId>
    <artifactId>jackson-bom</artifactId>
    <type>pom</type>
    <scope>import</scope>
    <version>2.13.4.20221013</version>
</dependency>

<!-- flink-sql-connector-hive pom.xml -->
<dependency>
    <groupId>com.fasterxml.jackson</groupId>
    <artifactId>jackson-bom</artifactId>
    <type>pom</type>
    <scope>import</scope>
    <version>2.15.3</version>
</dependency>
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35063) Flink sql connector pulsar/hive com.fasterxml.jackson.annotation.JsonFormat$Value conflict

2024-04-09 Thread chendaping (Jira)
chendaping created FLINK-35063:
--

 Summary: Flink sql connector pulsar/hive 
com.fasterxml.jackson.annotation.JsonFormat$Value conflict
 Key: FLINK-35063
 URL: https://issues.apache.org/jira/browse/FLINK-35063
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Connectors / Pulsar
Affects Versions: 1.16.1
Reporter: chendaping


When I compile and package {{flink-sql-connector-pulsar}} & 
{{flink-sql-connector-hive}}, and then put these two jar files into the 
Flink lib directory, I execute the following SQL statement through 
{{bin/sql-client.sh}}:

 
{code:java}
// code placeholder
CREATE TABLE
pulsar_table (
content string,
proc_time AS PROCTIME ()
)
WITH
(
'connector' = 'pulsar',
'topics' = 'persistent://xxx',
'service-url' = 'pulsar://xxx',
'source.subscription-name' = 'xxx',
'source.start.message-id' = 'latest',
'format' = 'csv',
'pulsar.client.authPluginClassName' = 
'org.apache.pulsar.client.impl.auth.AuthenticationToken',
'pulsar.client.authParams' = 'token:xxx'
);
 
select * from pulsar_table; {code}
The task error exception stack is as follows:

 
{code:java}
Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.annotation.JsonFormat$Value.empty()Lcom/fasterxml/jackson/annotation/JsonFormat$Value;
    at org.apache.pulsar.shade.com.fasterxml.jackson.databind.cfg.MapperConfig.<clinit>(MapperConfig.java:56) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.shade.com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:660) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.shade.com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:576) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.common.util.ObjectMapperFactory.createObjectMapperInstance(ObjectMapperFactory.java:151) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.common.util.ObjectMapperFactory.<clinit>(ObjectMapperFactory.java:142) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.client.impl.conf.ConfigurationDataUtils.create(ConfigurationDataUtils.java:35) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.client.impl.conf.ConfigurationDataUtils.<clinit>(ConfigurationDataUtils.java:43) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.pulsar.client.impl.ClientBuilderImpl.loadConf(ClientBuilderImpl.java:77) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.flink.connector.pulsar.common.config.PulsarClientFactory.createClient(PulsarClientFactory.java:105) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.flink.connector.pulsar.source.enumerator.PulsarSourceEnumerator.<init>(PulsarSourceEnumerator.java:95) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.flink.connector.pulsar.source.enumerator.PulsarSourceEnumerator.<init>(PulsarSourceEnumerator.java:76) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.flink.connector.pulsar.source.PulsarSource.createEnumerator(PulsarSource.java:144) ~[flink-sql-connector-pulsar-4.0-SNAPSHOT.jar:4.0-SNAPSHOT]
    at org.apache.flink.runtime.source.coordinator.SourceCoordinator.start(SourceCoordinator.java:213) ~[flink-dist_2.12-1.16.1.jar:1.16.1]

{code}
 

The exception indicates a conflict on 
{{com.fasterxml.jackson.annotation.JsonFormat$Value}}. I investigated and 
found that {{flink-sql-connector-pulsar}} and {{flink-sql-connector-hive}} 
depend on different Jackson versions, which leads to this conflict.
{code:xml}
// flink-sql-connector-pulsar pom.xml
<dependency>
    <groupId>com.fasterxml.jackson</groupId>
    <artifactId>jackson-bom</artifactId>
    <type>pom</type>
    <scope>import</scope>
    <version>2.13.4.20221013</version>
</dependency>

// flink-sql-connector-hive pom.xml
<dependency>
    <groupId>com.fasterxml.jackson</groupId>
    <artifactId>jackson-bom</artifactId>
    <type>pom</type>
    <scope>import</scope>
    <version>2.15.3</version>
</dependency>
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Robert Metzger
+1 (binding)

On Tue, Apr 9, 2024 at 10:33 AM Ahmed Hamdy  wrote:

> Hi Wudi,
>
> +1 (non-binding).
>
> Best Regards
> Ahmed Hamdy
>
>
> On Tue, 9 Apr 2024 at 09:21, Yuepeng Pan  wrote:
>
> > Hi, Di.
> >
> > Thank you for driving it !
> >
> > +1 (non-binding).
> >
> > Best,
> > Yuepeng Pan
> >
> > On 2024/04/09 02:47:55 wudi wrote:
> > > Hi devs,
> > >
> > > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > contributing the Flink Doris Connector[2] to the Flink community.
> > Discussion thread [3].
> > >
> > > The vote will be open for at least 72 hours unless there is an
> objection
> > or
> > > insufficient votes.
> > >
> > >
> > > Thanks,
> > > Di.Wu
> > >
> > >
> > > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > [2] https://github.com/apache/doris-flink-connector
> > > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> > >
> > >
> >
>


Re: Iceberg table maintenance

2024-04-09 Thread Ahmed Hamdy
Thanks Peter for forwarding this.
Best Regards
Ahmed Hamdy


On Tue, 9 Apr 2024 at 05:51, Péter Váry  wrote:

> Forwarding the invite for the discussion we plan to do with the Iceberg
> folks, as some of you might be interested in this.
>
> -- Forwarded message -
> From: Brian Olsen 
> Date: Mon, Apr 8, 2024, 18:29
> Subject: Re: Flink table maintenance
> To: 
>
>
> Hey Iceberg nation,
>
> I would like to share about the meeting this Wednesday to further discuss
> details of Péter's proposal on Flink Maintenance Tasks.
> Calendar Link: https://calendar.app.google/83HGYWXoQJ8zXuVCA
>
> List discussion:
> https://lists.apache.org/thread/10mdf9zo6pn0dfq791nf4w1m7jh9k3sl
>
> Design Doc: Flink table maintenance
> https://docs.google.com/document/d/16g3vR18mVBy8jbFaLjf2JwAANuYOmIwr15yDDxovdnA/edit?usp=sharing
>
>
>
> On Mon, Apr 1, 2024 at 8:52 PM Manu Zhang  wrote:
>
> > Hi Peter,
> >
> > Are you proposing to create a user facing locking feature in Iceberg, or
> >> just something for internal use?
> >>
> >
> > Since it's a general issue, I'm proposing to create a general user
> > interface first, while the implementation can be left to users. For
> > example, we use Airflow to schedule maintenance jobs and we can check
> > in-progress jobs with the Airflow API. Hive metastore lock might be
> another
> > option we can implement for users.
> >
> > Thanks,
> > Manu
> >
> > On Tue, Apr 2, 2024 at 5:26 AM Péter Váry 
> > wrote:
> >
> >> Hi Ajantha,
> >>
> >> I thought about enabling post commit topology based compaction for sinks
> >> using options, like we use for the parametrization of streaming reads
> [1].
> >> I think it will be hard to do it in a user-friendly way - because of the
> >> high number of parameters - but I think it is a possible solution with
> >> sensible defaults.
> >>
> >> There is a batch-like solution for data file compaction already
> available
> >> [2], but I do not see how we could extend Flink SQL to be able to call
> it.
> >>
> >> Writing to a branch using Flink SQL should be another thread, but by my
> >> first guess, it shouldn't be hard to implement using options, like:
> >> /*+ OPTIONS('branch'='b1') */
> >> Since writing to a branch is already working through the Java API [3].
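As a rough sketch, the option-based branch write suggested above could look like this in Flink SQL (hypothetical: the `branch` write option is only a proposal in this thread, and the table names are made up):

```sql
-- Hypothetical: route the insert to Iceberg branch 'b1' via a dynamic
-- table option, mirroring how streaming-read options are passed today.
INSERT INTO my_catalog.my_db.my_table /*+ OPTIONS('branch' = 'b1') */
SELECT id, name
FROM my_catalog.my_db.source_table;
```

The hint syntax itself is Flink's existing dynamic table options mechanism; only the `branch` key would be new.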
> >>
> >> Thanks, Peter
> >>
> >> 1 -
> >>
> https://iceberg.apache.org/docs/latest/flink-queries/#flink-streaming-read
> >> 2 -
> >>
> https://github.com/apache/iceberg/blob/820fc3ceda386149f42db8b54e6db9171d1a3a6d/flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/actions/RewriteDataFilesAction.java
> >> 3 -
> >> https://iceberg.apache.org/docs/latest/flink-writes/#branch-writes
> >>
> >> On Mon, Apr 1, 2024, 16:30 Ajantha Bhat  wrote:
> >>
> >>> Thanks for the proposal Peter.
> >>>
> >>> I just wanted to know do we have any plans for supporting SQL syntax
> for
> >>> table maintenance (like CALL procedure) for pure Flink SQL users?
> >>> I didn't see any custom SQL parser plugin support in Flink. I also saw
> >>> that Branch write doesn't have SQL support (only Branch reads use
> Option),
> >>> So I am not sure about the roadmap of Iceberg SQL support in Flink.
> >>> Was there any discussion before?
> >>>
> >>> - Ajantha
> >>>
> >>> On Mon, Apr 1, 2024 at 7:51 PM Péter Váry  >
> >>> wrote:
> >>>
>  Hi Manu,
> 
>  Just to clarify:
>  - Are you proposing to create a user facing locking feature in
> Iceberg,
>  or just something for internal use?
> 
>  I think we shouldn't add locking to Iceberg's user facing scope in
> this
>  stage. A fully featured locking system has many more features that we
> need
>  (priorities, fairness, timeouts etc). I could be tempted when we are
>  talking about the REST catalog, but I think that should be further
> down the
>  road, if ever...
> 
>  About using the tags:
>  - I whole-heartedly agree that using tags is not intuitive, and I see
>  your points in most of your arguments. OTOH, introducing a new
> requirement
>  (locking mechanism) seems like a wrong direction to me.
>  - We already defined a requirement (atomic changes on the table) for
>  the Catalog implementations which could be used to achieve our goal
> here.
>  - We also already store technical data in snapshot properties in Flink
>  jobs (JobId, OperatorId, CheckpointId). Maybe technical tags/table
>  properties is not a big stretch.
> 
>  Or we can look at these tags or metadata as 'technical data' which is
>  internal to Iceberg, and shouldn't expressed on the external API. My
>  concern is:
>  - Would it be used often enough to be worth the additional complexity?
> 
>  Knowing that Spark compaction is struggling with the same issue is a
> 

Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Ferenc Csaky
+1 (non-binding)

Best,
Ferenc




On Tuesday, April 9th, 2024 at 10:32, Ahmed Hamdy  wrote:

> 
> 
> Hi Wudi,
> 
> +1 (non-binding).
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Tue, 9 Apr 2024 at 09:21, Yuepeng Pan panyuep...@apache.org wrote:
> 
> > Hi, Di.
> > 
> > Thank you for driving it !
> > 
> > +1 (non-binding).
> > 
> > Best,
> > Yuepeng Pan
> > 
> > On 2024/04/09 02:47:55 wudi wrote:
> > 
> > > Hi devs,
> > > 
> > > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > > contributing the Flink Doris Connector[2] to the Flink community.
> > > Discussion thread [3].
> > > 
> > > The vote will be open for at least 72 hours unless there is an objection
> > > or
> > > insufficient votes.
> > > 
> > > Thanks,
> > > Di.Wu
> > > 
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > [2] https://github.com/apache/doris-flink-connector
> > > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Ahmed Hamdy
Hi Wudi,

+1 (non-binding).

Best Regards
Ahmed Hamdy


On Tue, 9 Apr 2024 at 09:21, Yuepeng Pan  wrote:

> Hi, Di.
>
> Thank you for driving it !
>
> +1 (non-binding).
>
> Best,
> Yuepeng Pan
>
> On 2024/04/09 02:47:55 wudi wrote:
> > Hi devs,
> >
> > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> contributing the Flink Doris Connector[2] to the Flink community.
> Discussion thread [3].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > insufficient votes.
> >
> >
> > Thanks,
> > Di.Wu
> >
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > [2] https://github.com/apache/doris-flink-connector
> > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> >
> >
>


Re: [External] Inquiry Regarding Azure Pipelines

2024-04-09 Thread Robert Metzger
I'm not 100% sure, but I think the manual trigger via a command will use
the latest commit (e.g. your second commit) to trigger a build.
Historically, manual triggering has not been very reliable. If the manual
triggering isn't working, you can also just push a commit with a tiny
change to trigger a new build.


On Sun, Apr 7, 2024 at 5:42 AM Yisha Zhou 
wrote:

> Hi Robert,
>
> Thank you for your prompt response to my previous email. I appreciate the
> information provided. However, I still have a few remaining questions that
> I hope you can assist me with.
>
> I have noticed that other PRs are now able to trigger AZP builds
> automatically. In my case, I have two commits associated with my PR. The
> first commit has already triggered an Azure build successfully, while the
> second commit was made during the service downtime. My questions are:
>
> 1. If I use the command "@flinkbot run azure" now, will it trigger the
> build corresponding to my second commit, or will it rerun the already
> successful build from the first commit?
>
> 2. If the answer for question 1 is the former, how can I trigger AZP for
> my second commit?
>
> I would appreciate any clarification you can provide on this matter. Thank
> you for your attention to this issue, and I look forward to your response.
>
> Best regards,
> Yisha
>
>
>
> > On 2024-04-04 at 00:52, Robert Metzger wrote:
> >
> > Hi Yisha,
> >
> > flinkbot is currently not active, so new PRs are not triggering any AZP
> > builds. We hope to restore the service soon.
> >
> > AZP is still the source of truth for CI builds.
> >
> >
> > On Wed, Apr 3, 2024 at 11:34 AM Yisha Zhou  >
> > wrote:
> >
> >> Hi devs,
> >>
> >> I hope this email finds you well. I am writing to seek clarification
> >> regarding the status of Azure Pipelines within the Apache community and
> >> seek assistance with a specific issue I encountered.
> >>
> >> Today, I made some new commits to a pull request in one of the Apache
> >> repositories. However, I noticed that even after approximately six
> hours,
> >> there were no triggers initiated for the Azure Pipeline. I have a
> couple of
> >> questions regarding this matter:
> >>
> >> 1. Is the Apache community still utilizing Azure Pipelines for CI/CD
> >> purposes? I came across an issue discussing the migration from Azure to
> >> GitHub Actions, but I am uncertain about the timeline for discontinuing
> the
> >> use of Azure Pipelines.
> >>
> >> 2. If Azure Pipelines are still in use, where can I find information
> about
> >> the position of my commits in the CI queue, awaiting execution?
> >>
> >> I would greatly appreciate any insights or guidance you can provide
> >> regarding these questions. Thank you for your time and attention.
> >>
> >> My PR link is https://github.com/apache/flink/pull/24567 <
> >> https://github.com/apache/flink/pull/24567 <
> https://github.com/apache/flink/pull/24567>>.
> >>
> >> Best regards,
> >> Yisha
>
>


RE: [DISCUSS] FLIP-437: Support ML Models in Flink SQL

2024-04-09 Thread David Radley
Hi Han,
Thanks for getting back to me.

I am curious about the valid characters in a model name – I assume any 
characters are valid, as it is a quoted string in SQL, so $ could be in the 
model name. I would think that the model version would be determined when the 
model is deployed (there could be other versions, associated with authoring or 
intermediate states of the model, that never get deployed) – rather than 
allocated by Flink if there is none.
I see https://github.com/onnx/onnx/blob/main/docs/Versioning.md supports 
numbers or semantic versioning and 3 different types of versioning.

It would be interesting to see how champion/challenger scenarios would play out 
– when you try a new version of the model that might perform better.
I suggest having a new optional model-version keyword, which would seem to be a 
cleaner way of specifying a model.



 Kind regards, David.

From: Hao Li 
Date: Wednesday, 3 April 2024 at 18:58
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-437: Support ML Models in Flink SQL
Cross post David Radley's comments here from voting thread:

> I don’t think this counts as an objection, I have some comments. I should
have put this on the discussion thread earlier but have just got to this.
> - I suggest we can put a model version in the model resource. Versions
are notoriously difficult to add later; I don’t think we want to
proliferate differently named models as a model mutates. We may want to
work with non-latest models.
> - I see that the model name is the unique identifier. I realise this
would move away from the Oracle syntax – so may not be feasible short term;
but I wonder if we can have:
> - a uuid as the main identifier and the model name as an attribute.
> or
>  - a namespace (or something like a system of origin)
> to help organise models with the same name.
> - does the model have an owner? I assume that Flink model resource is the
master of the model? I imagine in the future that a model that comes in via
a new connector could be kept up to date with the external model and would
not be allowed to be changed by anything other than the connector.

Thanks for the comments. I agree supporting the model version is important.
I think we could support versioning without changing the overall syntax by
appending version number/name to the model name. Catalog implementations
can handle the versions. For example,

CREATE MODEL `my-model$1`...

"$1" would imply it's version 1. If no version is provided, we can auto
increment the version if the model name exists already or create the first
version if the model name doesn't exist yet.
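A hedged sketch of how this naming convention could look (the CREATE MODEL shape follows the FLIP's proposal; the provider option key and all names here are illustrative only):

```sql
-- Hypothetical: create version 1 of the model explicitly via the '$1' suffix.
CREATE MODEL `my_cat`.`my_db`.`classifier_model$1`
INPUT (f1 DOUBLE, f2 DOUBLE)
OUTPUT (label STRING)
WITH ('provider' = 'openai');

-- Hypothetical: omitting the suffix targets the model by name; since version 1
-- already exists, the catalog would auto-increment and create version 2.
CREATE MODEL `my_cat`.`my_db`.`classifier_model`
INPUT (f1 DOUBLE, f2 DOUBLE)
OUTPUT (label STRING)
WITH ('provider' = 'openai');
```

How the `$` suffix is parsed and stored would be left to the catalog implementation.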

As for model ownership, I'm not entirely sure about the use case and how it
should be controlled. It could be controlled from the user side through
ACL/RBAC or some way in the catalog, I guess. Maybe we can follow up on this
as the requirement or use case becomes more clear.

Cross post David Moravek's comments from voting thread:

> My only suggestion would be to move Catalog changes into a separate
> interface to allow us to begin with lower stability guarantees. Existing
> Catalogs would be able to opt-in by implementing it. It's a minor thing
> though, overall the FLIP is solid and the direction is pretty exciting.

I think it's fine to move model related catalog changes to a separate
interface and let the current catalog interface extend it. As model support
will be built-in in Flink, the current catalog interface will need to
support model CRUD operations. For my own education, can you elaborate more
on how a separate interface will allow us to begin with lower stability
guarantees?

Thanks,
Hao


On Thu, Mar 28, 2024 at 10:14 AM Hao Li  wrote:

> Thanks Timo. I'll start a vote tomorrow if no further discussion.
>
> Thanks,
> Hao
>
> On Thu, Mar 28, 2024 at 9:33 AM Timo Walther  wrote:
>
>> Hi everyone,
>>
>> I updated the FLIP according to this discussion.
>>
>> @Hao Li: Let me know if I made a mistake somewhere. I added some
>> additional explaning comments about the new PTF syntax.
>>
>> There are no further objections from my side. If nobody objects, Hao
>> feel free to start the voting tomorrow.
>>
>> Regards,
>> Timo
>>
>>
>> On 28.03.24 16:30, Jark Wu wrote:
>> > Thanks, Hao,
>> >
>> > Sounds good to me.
>> >
>> > Best,
>> > Jark
>> >
>> > On Thu, 28 Mar 2024 at 01:02, Hao Li  wrote:
>> >
>> >> Hi Jark,
>> >>
>> >> I think we can start with supporting popular model providers such as
>> >> openai, azureml, sagemaker for remote models.
>> >>
>> >> Thanks,
>> >> Hao
>> >>
>> >> On Tue, Mar 26, 2024 at 8:15 PM Jark Wu  wrote:
>> >>
>> >>> Thanks for the PoC and updating,
>> >>>
>> >>> The final syntax looks good to me, at least it is a nice and concise
>> >> first
>> >>> step.
>> >>>
>> >>> SELECT f1, f2, label FROM
>> >>> ML_PREDICT(
>> >>>   input => `my_data`,
>> >>>   model => `my_cat`.`my_db`.`classifier_model`,
>> >>>   args => DESCRIPTOR(f1, f2));
>> >>>
>> >>> Besides, what built-in models 

[DISCUSSION] FLIP-442: General Improvement to Configuration for Flink 2.0

2024-04-09 Thread Xuannan Su
Hi all,

I'd like to start a discussion on FLIP-442: General Improvement to
Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we aim to
provide users with a better experience with the existing
configuration. This FLIP proposes several general improvements to the
current configuration.

Looking forward to everyone's feedback and suggestions. Thank you!

Best regards,
Xuannan

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-442%3A+General+Improvement+to+Configuration+for+Flink+2.0


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Yuepeng Pan
Hi, Di.

Thank you for driving it !

+1 (non-binding).

Best,
Yuepeng Pan

On 2024/04/09 02:47:55 wudi wrote:
> Hi devs,
> 
> I would like to start a vote about FLIP-399 [1]. The FLIP is about 
> contributing the Flink Doris Connector[2] to the Flink community. Discussion 
> thread [3].
> 
> The vote will be open for at least 72 hours unless there is an objection or
> insufficient votes.
> 
> 
> Thanks,
> Di.Wu
> 
> 
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> [2] https://github.com/apache/doris-flink-connector
> [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> 
> 


[jira] [Created] (FLINK-35062) Migrate RewriteMultiJoinConditionRule

2024-04-09 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35062:
-

 Summary: Migrate RewriteMultiJoinConditionRule
 Key: FLINK-35062
 URL: https://issues.apache.org/jira/browse/FLINK-35062
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-35061) Migrate TemporalJoinUtil

2024-04-09 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35061:
-

 Summary: Migrate TemporalJoinUtil
 Key: FLINK-35061
 URL: https://issues.apache.org/jira/browse/FLINK-35061
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-35060) Provide compatibility of old CheckpointMode for connector testing framework

2024-04-09 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-35060:
---

 Summary: Provide compatibility of old CheckpointMode for connector 
testing framework
 Key: FLINK-35060
 URL: https://issues.apache.org/jira/browse/FLINK-35060
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing, Tests
Reporter: Zakelly Lan
Assignee: Zakelly Lan


After FLINK-34516, {{org.apache.flink.streaming.api.CheckpointingMode}} has 
been moved to {{org.apache.flink.core.execution.CheckpointingMode}}. This 
unintentionally introduced a breaking change to the connector testing framework 
as well as to the externalized connector repos. This should be fixed.





[jira] [Created] (FLINK-35059) Bump org.postgresql:postgresql from 42.5.1 to 42.5.5 in flink-connector-jdbc

2024-04-09 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-35059:
---

 Summary: Bump org.postgresql:postgresql from 42.5.1 to 42.5.5 in 
flink-connector-jdbc
 Key: FLINK-35059
 URL: https://issues.apache.org/jira/browse/FLINK-35059
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / JDBC
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin





