Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-11 Thread Zhanghao Chen
Thanks for driving this, Weijie. Usually, the data distribution of the external 
system is closely related to the keys, e.g. computing the bucket index by key 
hashcode % bucket num, so I'm not sure about how much difference there are 
between partitioning by key and a custom partitioning strategy. Could you give 
a more concrete example in production on when a custom partitioning strategy 
will outperform partitioning by key? Since you've mentioned Paimon in doc, 
maybe an example on Paimon.

Best,
Zhanghao Chen

From: weijie guo 
Sent: Friday, June 7, 2024 9:59
To: dev 
Subject: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream 
of Lookup Join

Hi devs,


I'd like to start a discussion about FLIP-462[1]: Support Custom Data
Distribution for Input Stream of Lookup Join.


Lookup Join is an important feature in Flink, It is typically used to
enrich a table with data that is queried from an external system.
If we interact with the external systems for each incoming record, we
incur significant network IO and RPC overhead.

Therefore, most connectors introduce caching to reduce the per-record
level query overhead. However, because the data distribution of Lookup
Join's input stream is arbitrary, the cache hit rate is sometimes
unsatisfactory.


We want to introduce a mechanism for the connector to tell the Flink
planner its desired input stream data distribution or partitioning
strategy. This can significantly reduce the amount of cached data and
improve performance of Lookup Join.


You can find more details in this FLIP[1]. Looking forward to hearing
from you, thanks!


Best regards,

Weijie


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join


Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-11 Thread 王刚
Congratulations Rui

Best Regards,
Gang Wang

Jacky Lau  于2024年6月11日周二 13:04写道:

> Congratulations Rui, well deserved!
>
> Regards,
> Jacky Lau
>
> Jeyhun Karimov 于2024年6月11日 周二03:49写道:
>
> > Congratulations Rui, well deserved!
> >
> > Regards,
> > Jeyhun
> >
> > On Mon, Jun 10, 2024, 10:21 Ahmed Hamdy  wrote:
> >
> > > Congratulations Rui!
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Mon, 10 Jun 2024 at 09:10, David Radley 
> > > wrote:
> > >
> > > > Congratulations, Rui!
> > > >
> > > > From: Sergey Nuyanzin 
> > > > Date: Sunday, 9 June 2024 at 20:33
> > > > To: dev@flink.apache.org 
> > > > Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink PMC Member - Fan
> > Rui
> > > > Congratulations, Rui!
> > > >
> > > > On Fri, Jun 7, 2024 at 5:36 AM Xia Sun  wrote:
> > > >
> > > > > Congratulations, Rui!
> > > > >
> > > > > Best,
> > > > > Xia
> > > > >
> > > > > Paul Lam  于2024年6月6日周四 11:59写道:
> > > > >
> > > > > > Congrats, Rui!
> > > > > >
> > > > > > Best,
> > > > > > Paul Lam
> > > > > >
> > > > > > > 2024年6月6日 11:02,Junrui Lee  写道:
> > > > > > >
> > > > > > > Congratulations, Rui.
> > > > > > >
> > > > > > > Best,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Hang Ruan  于2024年6月6日周四 10:35写道:
> > > > > > >
> > > > > > >> Congratulations, Rui!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Hang
> > > > > > >>
> > > > > > >> Samrat Deb  于2024年6月6日周四 10:28写道:
> > > > > > >>
> > > > > > >>> Congratulations Rui
> > > > > > >>>
> > > > > > >>> Bests,
> > > > > > >>> Samrat
> > > > > > >>>
> > > > > > >>> On Thu, 6 Jun 2024 at 7:45 AM, Yuxin Tan <
> > tanyuxinw...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Yuxin
> > > > > > 
> > > > > > 
> > > > > >  Xuannan Su  于2024年6月6日周四 09:58写道:
> > > > > > 
> > > > > > > Congratulations!
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Xuannan
> > > > > > >
> > > > > > > On Thu, Jun 6, 2024 at 9:53 AM Hangxiang Yu <
> > > master...@gmail.com
> > > > >
> > > > > > >>> wrote:
> > > > > > >>
> > > > > > >> Congratulations, Rui !
> > > > > > >>
> > > > > > >> On Thu, Jun 6, 2024 at 9:18 AM Lincoln Lee <
> > > > > lincoln.8...@gmail.com
> > > > > > >>>
> > > > > > > wrote:
> > > > > > >>
> > > > > > >>> Congratulations, Rui!
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Lincoln Lee
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Lijie Wang  于2024年6月6日周四
> > 09:11写道:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Lijie
> > > > > > 
> > > > > >  Rodrigo Meneses  于2024年6月5日周三
> > 21:35写道:
> > > > > > 
> > > > > > > All the best
> > > > > > >
> > > > > > > On Wed, Jun 5, 2024 at 5:56 AM xiangyu feng <
> > > > > >  xiangyu...@gmail.com>
> > > > > >  wrote:
> > > > > > >
> > > > > > >> Congratulations, Rui!
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >> Xiangyu Feng
> > > > > > >>
> > > > > > >> Feng Jin  于2024年6月5日周三
> 20:42写道:
> > > > > > >>
> > > > > > >>> Congratulations, Rui!
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Feng Jin
> > > > > > >>>
> > > > > > >>> On Wed, Jun 5, 2024 at 8:23 PM Yanfei Lei <
> > > > > >  fredia...@gmail.com
> > > > > > >>
> > > > > >  wrote:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Yanfei
> > > > > > 
> > > > > >  Luke Chen  于2024年6月5日周三 20:08写道:
> > > > > > >
> > > > > > > Congrats, Rui!
> > > > > > >
> > > > > > > Luke
> > > > > > >
> > > > > > > On Wed, Jun 5, 2024 at 8:02 PM Jiabao Sun <
> > > > > > >>> jiabao...@apache.org>
> > > > > > >>> wrote:
> > > > > > >
> > > > > > >> Congrats, Rui. Well-deserved!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Jiabao
> > > > > > >>
> > > > > > >> Zhanghao Chen 
> > > > > > >>> 于2024年6月5日周三
> > > > > >  19:29写道:
> > > > > > >>
> > > > > > >>> Congrats, Rui!
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Zhanghao Chen
> > > > > > >>> 
> > > > > > >>> From: Piotr Nowojski 
> > > > > > >>> Sent: Wednesday, June 5, 2024 18:01
> > > > > > >>> To: dev ; rui fan <
> > > > > >  1996fan...@gmail.com>
> > > > > > >>> Subject: [ANNOUNCE] New Apache Flink PMC Member -
> > > > > > >>> Fan
> > > > > > > Rui

Re: [June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync 11/06/2024

2024-06-11 Thread weijie guo
Thanks Zhanghao for the feedback.

Please feel free to change the state of this one to `won't make it`.


Best regards,

Weijie


Zhanghao Chen  于2024年6月12日周三 13:18写道:

> Hi Rui,
>
> Thanks for the summary! A quick update here: FLIP-398 was decided not to
> go into 1.20, as it was just found that the effort to add dedicated
> serialization support for Maps, Sets and Lists, will break
> state-compatibility. I will revert the relevant changes soon.
>
> Best,
> Zhanghao Chen
> 
> From: Rui Fan <1996fan...@gmail.com>
> Sent: Wednesday, June 12, 2024 12:59
> To: dev 
> Subject: [June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync
> 11/06/2024
>
> Dear devs,
>
> This is the sixth meeting for Flink 1.20 release[1] cycle.
>
> I'd like to share the information synced in the meeting.
>
> - Feature Freeze
>
> It is worth noting that there are only 3 days left until the
> feature freeze time(June 15, 2024, 00:00 CEST(UTC+2)),
> and developers need to pay attention to the feature freeze time.
>
> After checked with all contributors of 1.20 FLIPs, we don't need
> to postpone the feature freeze time. Please reply to this email
> if other features are valuable and it's better to be merged in 1.20,
> thanks.
>
> - Features:
>
> So far we've had 16 flips/features:
> - 6 flips/features are done
> - 8 flips/features are doing and release managers checked with
> corresponding contributors
>   - 7 of these flips/features can be completed before June 15, 2024, 00:00
> CEST(UTC+2)
>   - We were unable to contact the contributor of FLIP-436
> - 2 flips/features won't make in 1.20
>
> - Blockers:
>
> We don't have any blocker right now, thanks to everyone who fixed blockers
> before.
>
> - Sync meeting[2]:
>
> The next meeting is 18/06/2024 10am (UTC+2) and 4pm (UTC+8), please
> feel free to join us.
>
> Lastly, we encourage attendees to fill out the topics to be discussed at
> the bottom of 1.20 wiki page[1] a day in advance, to make it easier for
> everyone to understand the background of the topics, thanks!
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release
> [2] https://meet.google.com/mtj-huez-apu
>
> Best,
> Robert, Weijie, Ufuk and Rui
>


Re: [June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync 11/06/2024

2024-06-11 Thread Zhanghao Chen
Hi Rui,

Thanks for the summary! A quick update here: FLIP-398 was decided not to go 
into 1.20, as it was just found that the effort to add dedicated serialization 
support for Maps, Sets and Lists, will break state-compatibility. I will revert 
the relevant changes soon.

Best,
Zhanghao Chen

From: Rui Fan <1996fan...@gmail.com>
Sent: Wednesday, June 12, 2024 12:59
To: dev 
Subject: [June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync 11/06/2024

Dear devs,

This is the sixth meeting for Flink 1.20 release[1] cycle.

I'd like to share the information synced in the meeting.

- Feature Freeze

It is worth noting that there are only 3 days left until the
feature freeze time(June 15, 2024, 00:00 CEST(UTC+2)),
and developers need to pay attention to the feature freeze time.

After checked with all contributors of 1.20 FLIPs, we don't need
to postpone the feature freeze time. Please reply to this email
if other features are valuable and it's better to be merged in 1.20, thanks.

- Features:

So far we've had 16 flips/features:
- 6 flips/features are done
- 8 flips/features are doing and release managers checked with
corresponding contributors
  - 7 of these flips/features can be completed before June 15, 2024, 00:00
CEST(UTC+2)
  - We were unable to contact the contributor of FLIP-436
- 2 flips/features won't make in 1.20

- Blockers:

We don't have any blocker right now, thanks to everyone who fixed blockers
before.

- Sync meeting[2]:

The next meeting is 18/06/2024 10am (UTC+2) and 4pm (UTC+8), please
feel free to join us.

Lastly, we encourage attendees to fill out the topics to be discussed at
the bottom of 1.20 wiki page[1] a day in advance, to make it easier for
everyone to understand the background of the topics, thanks!

[1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release
[2] https://meet.google.com/mtj-huez-apu

Best,
Robert, Weijie, Ufuk and Rui


Re: [June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync 11/06/2024

2024-06-11 Thread weijie guo
Thanks Rui for the summary!

Best regards,

Weijie


Rui Fan <1996fan...@gmail.com> 于2024年6月12日周三 13:00写道:

> Dear devs,
>
> This is the sixth meeting for Flink 1.20 release[1] cycle.
>
> I'd like to share the information synced in the meeting.
>
> - Feature Freeze
>
> It is worth noting that there are only 3 days left until the
> feature freeze time(June 15, 2024, 00:00 CEST(UTC+2)),
> and developers need to pay attention to the feature freeze time.
>
> After checked with all contributors of 1.20 FLIPs, we don't need
> to postpone the feature freeze time. Please reply to this email
> if other features are valuable and it's better to be merged in 1.20,
> thanks.
>
> - Features:
>
> So far we've had 16 flips/features:
> - 6 flips/features are done
> - 8 flips/features are doing and release managers checked with
> corresponding contributors
>   - 7 of these flips/features can be completed before June 15, 2024, 00:00
> CEST(UTC+2)
>   - We were unable to contact the contributor of FLIP-436
> - 2 flips/features won't make in 1.20
>
> - Blockers:
>
> We don't have any blocker right now, thanks to everyone who fixed blockers
> before.
>
> - Sync meeting[2]:
>
> The next meeting is 18/06/2024 10am (UTC+2) and 4pm (UTC+8), please
> feel free to join us.
>
> Lastly, we encourage attendees to fill out the topics to be discussed at
> the bottom of 1.20 wiki page[1] a day in advance, to make it easier for
> everyone to understand the background of the topics, thanks!
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release
> [2] https://meet.google.com/mtj-huez-apu
>
> Best,
> Robert, Weijie, Ufuk and Rui
>


[June 15 Feature Freeze][SUMMARY] Flink 1.20 Release Sync 11/06/2024

2024-06-11 Thread Rui Fan
Dear devs,

This is the sixth meeting for Flink 1.20 release[1] cycle.

I'd like to share the information synced in the meeting.

- Feature Freeze

It is worth noting that there are only 3 days left until the
feature freeze time(June 15, 2024, 00:00 CEST(UTC+2)),
and developers need to pay attention to the feature freeze time.

After checked with all contributors of 1.20 FLIPs, we don't need
to postpone the feature freeze time. Please reply to this email
if other features are valuable and it's better to be merged in 1.20, thanks.

- Features:

So far we've had 16 flips/features:
- 6 flips/features are done
- 8 flips/features are doing and release managers checked with
corresponding contributors
  - 7 of these flips/features can be completed before June 15, 2024, 00:00
CEST(UTC+2)
  - We were unable to contact the contributor of FLIP-436
- 2 flips/features won't make in 1.20

- Blockers:

We don't have any blocker right now, thanks to everyone who fixed blockers
before.

- Sync meeting[2]:

The next meeting is 18/06/2024 10am (UTC+2) and 4pm (UTC+8), please
feel free to join us.

Lastly, we encourage attendees to fill out the topics to be discussed at
the bottom of 1.20 wiki page[1] a day in advance, to make it easier for
everyone to understand the background of the topics, thanks!

[1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release
[2] https://meet.google.com/mtj-huez-apu

Best,
Robert, Weijie, Ufuk and Rui


[jira] [Created] (FLINK-35570) Consider PlaceholderStreamStateHandle in checkpoint file merging

2024-06-11 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-35570:
---

 Summary: Consider PlaceholderStreamStateHandle in checkpoint file 
merging
 Key: FLINK-35570
 URL: https://issues.apache.org/jira/browse/FLINK-35570
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Zakelly Lan
Assignee: Zakelly Lan


In checkpoint file merging, we should take {{PlaceholderStreamStateHandle}} 
into account during lifecycle, since it can be a file merged one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-11 Thread Zhu Zhu
+1 (binding)

Thanks,
Zhu

Yuepeng Pan  于2024年6月11日周二 17:04写道:

> +1 (non-binding)
>
> Best regards,
> Yuepeng Pan
>
> At 2024-06-11 16:34:12, "Rui Fan" <1996fan...@gmail.com> wrote:
> >+1(binding)
> >
> >Best,
> >Rui
> >
> >On Tue, Jun 11, 2024 at 4:14 PM Muhammet Orazov
> > wrote:
> >
> >> +1 (non-binding)
> >>
> >> Thanks Yuxin for driving this!
> >>
> >> Best,
> >> Muhammet
> >>
> >>
> >> On 2024-06-07 08:02, Yuxin Tan wrote:
> >> > Hi everyone,
> >> >
> >> > Thanks for all the feedback about the FLIP-459 Support Flink
> >> > hybrid shuffle integration with Apache Celeborn[1].
> >> > The discussion thread is here [2].
> >> >
> >> > I'd like to start a vote for it. The vote will be open for at least
> >> > 72 hours unless there is an objection or insufficient votes.
> >> >
> >> > [1]
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-459%3A+Support+Flink+hybrid+shuffle+integration+with+Apache+Celeborn
> >> > [2] https://lists.apache.org/thread/gy7sm7qrf7yrv1rl5f4vtk5fo463ts33
> >> >
> >> > Best,
> >> > Yuxin
> >>
>


[jira] [Created] (FLINK-35569) SnapshotFileMergingCompatibilityITCase#testSwitchFromEnablingToDisablingFileMerging failed

2024-06-11 Thread Jane Chan (Jira)
Jane Chan created FLINK-35569:
-

 Summary: 
SnapshotFileMergingCompatibilityITCase#testSwitchFromEnablingToDisablingFileMerging
 failed
 Key: FLINK-35569
 URL: https://issues.apache.org/jira/browse/FLINK-35569
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines, Build System / CI
Affects Versions: 1.20.0
Reporter: Jane Chan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-463: Schema Definition in CREATE TABLE AS Statement

2024-06-11 Thread Yanquan Lv
Hi Sergio, thanks for driving it, +1 for this.

I have some comments:
1. If we have a source table with primary keys and partition keys defined,
what is the default behavior if PARTITIONED and DISTRIBUTED not specified
in the CTAS statement, It should not be inherited by default?
2. I suggest providing a complete syntax that includes table_properties
like FLIP-218.


Sergio Pena  于2024年6月12日周三 03:54写道:

> I just noticed the CREATE TABLE LIKE statement allows the definition of new
> columns in the CREATE part. The difference
> with this CTAS proposal is that TABLE LIKE appends the new columns at the
> end of the schema instead of adding them
> at the beginning like this proposal and Mysql do.
>
> > create table t1(id int, name string);
> > > create table s1(a int, b string) like t1;
> > > describe s1;
>
> +-+---+--++
> > | Column Name | Data Type | Nullable | Extras |
> > +-+---+--++
> > | id  | INT   | NULL ||
> > | name| STRING| NULL ||
> > | a   | INT   | NULL ||
> > | b   | STRING| NULL ||
> > +-+---+--++
>
>
>
> The CREATE TABLE LIKE also does not let the definition of existing columns
> in the CREATE part. The statement fails
> that the column already exists.
>
> > create table t1(id int, name string);
>
> > create table s1(id double) like t1;
> > A column named 'id' already exists in the base table.
> >
>
> What do you guys think of making it similar to the CREATE TABLE LIKE? Seems
> the best approach in order to
> be compatible with it.
>
> - Sergio
>
> On Tue, Jun 11, 2024 at 2:10 PM Sergio Pena  wrote:
>
> > Thanks Timo for answering Jeyhun questions.
> >
> > To add info more about your questions Jeyhun. This proposal is not
> > handling NULL/NOT_NULL types. I noticed that
> > the current CTAS impl. (as Timo said) adds this constraint as part of the
> > resulting schema. And when defining
> > a primary key in the CREATE part, if the resulting schema does not have a
> > NOT NULL in the column then the CTAS
> > will fail. This is similar to the CREATE TABLE LIKE which expects the
> LIKE
> > table to have a NOT NULL column if
> > the user defines a primary key in the CREATE part.
> >
> > > In some cases, redefining the column types might be redundant,
> especially
> > > when users dont change the column type. A user just wants to change the
> > > column name from the SELECT clause. Should we also support this
> scenario,
> > > similar to postgres?
> >
> > I looked into Postgres too. Postgres matches the columns based on the
> > order defined in the create and select part.
> > If you want to rename a column in the create part, then that column
> > position must be in the same position as the query column.
> > I didn't like the Postgres approach because it does not let us add
> columns
> > that do not exist in the query schema.
> >
> > i.e. query has schema (a int, b string), now the `a` column is renamed to
> > `id` because both are in the same position 0
> > `create table s1(id int) as select a, b from t1`;
> > results in: [id int, b string]
> >
> > I think, if users want to rename then they can use a different alias in
> > the select part. They could also do explicit casting
> > for changing the data types, which now makes it redundant (as you said)
> to
> > allow redefining the query columns again. But
> > perhaps there are cases where explicit casting does not work and just
> > defining the column would? i.e. making a nullable
> > type to not null? I couldn't make `cast(c1 as int not null)` to work for
> > instance, but it may work in the create part?
> >
> > > Could you also mention the casting rules in the FLIP for this case?
> >
> > I mentioned they're the same as insert/select when doing implicit
> casting.
> > I will search for more info about the insert/select
> > and add the casting rules in the flip..
> >
> > - Sergio
> >
> >
> > On Tue, Jun 11, 2024 at 12:59 AM Timo Walther 
> wrote:
> >
> >> Hi Sergio,
> >>
> >> thanks for proposing this FLIP for finalizing the CTAS statement.
> >> Adopting the logic from MySQL for deriving and potentially overwriting
> >> parts of the schema should be easy to understand for everyone. So +1 for
> >> the FLIP in general.
> >>
> >>  > How do you handle CTAS statements with SELECT clauses that have
> >>  > (implicit or explicit) NULLABLE or NOT NULLABLE columns?
> >>
> >> @Jeyhun: I don't think there is anything special about this. The current
> >> CTAS implementation should already cover that. It takes the nullability
> >> of the column's data type as a constraint into derived schema. Keep in
> >> mind that nullability is part of the data type in Flink, not only a
> >> constraint on the schema. This decision was made due to Calcite
> internals.
> >>
> >>  > redefining the column types might be redundant, especially
> >>  > when users dont change the 

[jira] [Created] (FLINK-35568) Add imagePullSecrets for FlinkDeployment spec

2024-06-11 Thread Gang Huang (Jira)
Gang Huang created FLINK-35568:
--

 Summary: Add imagePullSecrets for FlinkDeployment spec
 Key: FLINK-35568
 URL: https://issues.apache.org/jira/browse/FLINK-35568
 Project: Flink
  Issue Type: Improvement
Reporter: Gang Huang


I am confused that how to configure imagePullSecrets for a private dockerhub 
website, since there maybe are no related parameters found in the official docs 
(https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/overview/)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Connector releases for Flink 1.19

2024-06-11 Thread Sergey Nuyanzin
Thanks a lot Danny!

On Tue, Jun 11, 2024 at 10:21 AM Danny Cranmer  wrote:
>
> Hey Sergey,
>
> I have completed the 3 tasks. Let me know if you need anything else.
>
> Thanks,
> Danny
>
> On Tue, Jun 11, 2024 at 9:11 AM Danny Cranmer 
> wrote:
>
> > Thanks for driving this Sergey, I will pick up the PMC tasks.
> >
> > Danny
> >
> > On Sun, Jun 9, 2024 at 11:09 PM Sergey Nuyanzin 
> > wrote:
> >
> >> Hi everyone,
> >>
> >> as you might noticed the voting threads for release of flink-opensearch
> >> connectors (v1, v2) received 3+ binding votes[1][2]
> >>
> >> Now I need PMC help to finish the release of these 2.
> >> I'm on the step "Deploy source and binary releases to dist.apache.org"[3]
> >> for both versions
> >> An attempt to execute this step says that access is forbidden, I assume
> >> that only PMC have access to do it
> >>
> >> I guess there are 3 things which should be done and where only PMC have
> >> access and where I need help
> >> 1. As mentioned deploying to dist.apache.org
> >> 2. Mark version as released on jira
> >> 3. Place a record on reporter.apache.org about the release
> >>
> >> Would be great if someone from PMC could help here
> >>
> >> [1] https://lists.apache.org/thread/kno142xdzqg592tszmnonk69nl14xszl
> >> [2] https://lists.apache.org/thread/by44cdpfv6p9394vwxhh1vzh3rfskzms
> >> [3]
> >> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+flink-connector+release#Creatingaflinkconnectorrelease-Deploysourceandbinaryreleasestodist.apache.org
> >>
> >> On Fri, May 31, 2024 at 7:53 AM Sergey Nuyanzin 
> >> wrote:
> >>
> >>> Hi Jing,
> >>> >Thanks for the hint wrt JDBC connector. Where could users know that it
> >>> > already supports 1.19?
> >>>
> >>> There is no released version supporting/tested against 1.19
> >>> However the support was added within [1] and currently
> >>> there is an active RC in voting stage containing this fix [2]
> >>>
> >>> [1]
> >>> https://github.com/apache/flink-connector-jdbc/commit/7025642d88ff661e486745b23569595e1813a1d0
> >>> [2] https://lists.apache.org/thread/b7xbjo4crt1527ldksw4nkwo8vs56csy
> >>>
> >>
> >>
> >> --
> >> Best regards,
> >> Sergey
> >>
> >



-- 
Best regards,
Sergey


[ANNOUNCE] Apache flink-connector-opensearch 2.0.0 released

2024-06-11 Thread Sergey Nuyanzin
The Apache Flink community is very happy to announce the release of
Apache flink-connector-opensearch 2.0.0.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data
streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354674

We would like to thank all contributors of the Apache Flink community
who made this release possible!

-- 
Best regards,
Sergey


[ANNOUNCE] Apache flink-connector-opensearch 1.2.0 released

2024-06-11 Thread Sergey Nuyanzin
The Apache Flink community is very happy to announce the release of Apache
flink-connector-opensearch 1.2.0 for Flink 1.18 and Flink 1.19.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353812

We would like to thank all contributors of the Apache Flink community who
made this release possible!


-- 
Best regards,
Sergey


Re: [DISCUSS] FLIP-463: Schema Definition in CREATE TABLE AS Statement

2024-06-11 Thread Sergio Pena
I just noticed the CREATE TABLE LIKE statement allows the definition of new
columns in the CREATE part. The difference
with this CTAS proposal is that TABLE LIKE appends the new columns at the
end of the schema instead of adding them
at the beginning like this proposal and Mysql do.

> create table t1(id int, name string);
> > create table s1(a int, b string) like t1;
> > describe s1;

+-+---+--++
> | Column Name | Data Type | Nullable | Extras |
> +-+---+--++
> | id  | INT   | NULL ||
> | name| STRING| NULL ||
> | a   | INT   | NULL ||
> | b   | STRING| NULL ||
> +-+---+--++



The CREATE TABLE LIKE also does not let the definition of existing columns
in the CREATE part. The statement fails
that the column already exists.

> create table t1(id int, name string);

> create table s1(id double) like t1;
> A column named 'id' already exists in the base table.
>

What do you guys think of making it similar to the CREATE TABLE LIKE? Seems
the best approach in order to
be compatible with it.

- Sergio

On Tue, Jun 11, 2024 at 2:10 PM Sergio Pena  wrote:

> Thanks Timo for answering Jeyhun questions.
>
> To add info more about your questions Jeyhun. This proposal is not
> handling NULL/NOT_NULL types. I noticed that
> the current CTAS impl. (as Timo said) adds this constraint as part of the
> resulting schema. And when defining
> a primary key in the CREATE part, if the resulting schema does not have a
> NOT NULL in the column then the CTAS
> will fail. This is similar to the CREATE TABLE LIKE which expects the LIKE
> table to have a NOT NULL column if
> the user defines a primary key in the CREATE part.
>
> > In some cases, redefining the column types might be redundant, especially
> > when users dont change the column type. A user just wants to change the
> > column name from the SELECT clause. Should we also support this scenario,
> > similar to postgres?
>
> I looked into Postgres too. Postgres matches the columns based on the
> order defined in the create and select part.
> If you want to rename a column in the create part, then that column
> position must be in the same position as the query column.
> I didn't like the Postgres approach because it does not let us add columns
> that do not exist in the query schema.
>
> i.e. query has schema (a int, b string), now the `a` column is renamed to
> `id` because both are in the same position 0
> `create table s1(id int) as select a, b from t1`;
> results in: [id int, b string]
>
> I think, if users want to rename then they can use a different alias in
> the select part. They could also do explicit casting
> for changing the data types, which now makes it redundant (as you said) to
> allow redefining the query columns again. But
> perhaps there are cases where explicit casting does not work and just
> defining the column would? i.e. making a nullable
> type to not null? I couldn't make `cast(c1 as int not null)` to work for
> instance, but it may work in the create part?
>
> > Could you also mention the casting rules in the FLIP for this case?
>
> I mentioned they're the same as insert/select when doing implicit casting.
> I will search for more info about the insert/select
> and add the casting rules in the flip..
>
> - Sergio
>
>
> On Tue, Jun 11, 2024 at 12:59 AM Timo Walther  wrote:
>
>> Hi Sergio,
>>
>> thanks for proposing this FLIP for finalizing the CTAS statement.
>> Adopting the logic from MySQL for deriving and potentially overwriting
>> parts of the schema should be easy to understand for everyone. So +1 for
>> the FLIP in general.
>>
>>  > How do you handle CTAS statements with SELECT clauses that have
>>  > (implicit or explicit) NULLABLE or NOT NULLABLE columns?
>>
>> @Jeyhun: I don't think there is anything special about this. The current
>> CTAS implementation should already cover that. It takes the nullability
>> of the column's data type as a constraint into derived schema. Keep in
>> mind that nullability is part of the data type in Flink, not only a
>> constraint on the schema. This decision was made due to Calcite internals.
>>
>>  > redefining the column types might be redundant, especially
>>  > when users dont change the column type
>>
>> This is indeed a good point. On one hand, I think we should avoid
>> further complicating the syntax. But looking at other vendors [1] this
>> seems indeed a valid use case. If it doesn't cause too many special
>> cases in the parser (and it's look-ahead), I'm fine with supporting a
>> list of column names as well. However, the most important use case will
>> be specifying a watermark, metadata columns, or other schema parts that
>> are not just columns names.
>>
>> Regards,
>> Timo
>>
>>
>> [1]
>>
>> 

Re: Flink Kubernetes Operator 1.9.0 release planning

2024-06-11 Thread Márton Balassi
+1 for cutting the release and Gyula as the release manager.

On Tue, Jun 11, 2024 at 10:41 AM David Radley 
wrote:

> I agree – thanks for driving this Gyula.
>
> From: Rui Fan <1996fan...@gmail.com>
> Date: Tuesday, 11 June 2024 at 02:52
> To: dev@flink.apache.org 
> Cc: Mate Czagany 
> Subject: [EXTERNAL] Re: Flink Kubernetes Operator 1.9.0 release planning
> Thanks Gyula for driving this release!
>
> > I suggest we cut the release branch this week after merging current
> > outstanding smaller PRs.
>
> It makes sense to me.
>
> Best,
> Rui
>
> On Mon, Jun 10, 2024 at 3:05 PM Gyula Fóra  wrote:
>
> > Hi all!
> >
> > I want to kick off the discussion / release process for the Flink
> > Kubernetes Operator 1.9.0 version.
> >
> > The last, 1.8.0, version was released in March and since then we have
> had a
> > number of important fixes. Furthermore there are some bigger pieces of
> > outstanding work in the form of open PRs such as the Savepoint CRD work
> > which should only be merged to 1.10.0 to gain more exposure/stability.
> >
> > I suggest we cut the release branch this week after merging current
> > outstanding smaller PRs.
> >
> > I volunteer as the release manager but if someone else would like to do
> it,
> > I would also be happy to assist.
> >
> > Please let me know what you think.
> >
> > Cheers,
> > Gyula
> >
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Thomas Weise
Thanks for bringing this discussion back.

When we decided to decouple the connectors, we already discussed that we
will only realize the full benefit when the connectors actually become
independent from the Flink minor releases. Until that happens we have a ton
of extra work but limited gain. Based on the assumption that getting to the
binary compatibility guarantee is our goal - not just for the connectors
managed within the Flink project but for the ecosystem as a whole - I don't
see the benefit of mono repo or similar approach that targets the symptom
rather than the cause.

In the final picture we would only need connector releases if/when a
specific connector changes and the repository per connector layout would
work well.

I also agree with Danny that we may not have to wait for Flink 2.0 for
that. How close are we to assume compatibility of the API surface that
affects connectors? It appears that practically there have been little to
no known issues in the last couple of releases? Would it be possible to
verify that by running e2e tests of connector binaries built against an
earlier Flink minor version against the latest Flink minor release
candidate as part of the release?

Thanks,
Thomas


On Tue, Jun 11, 2024 at 11:05 AM Chesnay Schepler 
wrote:

> On 10/06/2024 18:25, Danny Cranmer wrote:
> > This would
> > mean we would usually not need to release a new connector version per
> Flink
> > version, assuming there are no breaking changes.
> We technically can't do this because we don't provide binary
> compatibility across minor versions.
> That's the entire reason we did this coupling in the first place, and
> imo /we/ shouldn't take a shortcut but still have our users face that
> very problem.
> We knew this was gonna by annoying for us; that was intentional and
> meant to finally push us towards binary compatibility /guarantees/.


Re: [DISCUSS] FLIP-463: Schema Definition in CREATE TABLE AS Statement

2024-06-11 Thread Sergio Pena
Thanks Timo for answering Jeyhun questions.

To add info more about your questions Jeyhun. This proposal is not handling
NULL/NOT_NULL types. I noticed that
the current CTAS impl. (as Timo said) adds this constraint as part of the
resulting schema. And when defining
a primary key in the CREATE part, if the resulting schema does not have a
NOT NULL in the column then the CTAS
will fail. This is similar to the CREATE TABLE LIKE which expects the LIKE
table to have a NOT NULL column if
the user defines a primary key in the CREATE part.

> In some cases, redefining the column types might be redundant, especially
> when users dont change the column type. A user just wants to change the
> column name from the SELECT clause. Should we also support this scenario,
> similar to postgres?

I looked into Postgres too. Postgres matches the columns based on the order
defined in the create and select part.
If you want to rename a column in the create part, then that column
position must be in the same position as the query column.
I didn't like the Postgres approach because it does not let us add columns
that do not exist in the query schema.

i.e. query has schema (a int, b string), now the `a` column is renamed to
`id` because both are in the same position 0
`create table s1(id int) as select a, b from t1`;
results in: [id int, b string]

I think, if users want to rename then they can use a different alias in the
select part. They could also do explicit casting
for changing the data types, which now makes it redundant (as you said) to
allow redefining the query columns again. But
perhaps there are cases where explicit casting does not work and just
defining the column would? i.e. making a nullable
type to not null? I couldn't make `cast(c1 as int not null)` to work for
instance, but it may work in the create part?

> Could you also mention the casting rules in the FLIP for this case?

I mentioned they're the same as insert/select when doing implicit casting.
I will search for more info about the insert/select
and add the casting rules in the flip..

- Sergio


On Tue, Jun 11, 2024 at 12:59 AM Timo Walther  wrote:

> Hi Sergio,
>
> thanks for proposing this FLIP for finalizing the CTAS statement.
> Adopting the logic from MySQL for deriving and potentially overwriting
> parts of the schema should be easy to understand for everyone. So +1 for
> the FLIP in general.
>
>  > How do you handle CTAS statements with SELECT clauses that have
>  > (implicit or explicit) NULLABLE or NOT NULLABLE columns?
>
> @Jeyhun: I don't think there is anything special about this. The current
> CTAS implementation should already cover that. It takes the nullability
> of the column's data type as a constraint into derived schema. Keep in
> mind that nullability is part of the data type in Flink, not only a
> constraint on the schema. This decision was made due to Calcite internals.
>
>  > redefining the column types might be redundant, especially
>  > when users dont change the column type
>
> This is indeed a good point. On one hand, I think we should avoid
> further complicating the syntax. But looking at other vendors [1] this
> seems indeed a valid use case. If it doesn't cause too many special
> cases in the parser (and it's look-ahead), I'm fine with supporting a
> list of column names as well. However, the most important use case will
> be specifying a watermark, metadata columns, or other schema parts that
> are not just columns names.
>
> Regards,
> Timo
>
>
> [1]
>
> https://learn.microsoft.com/en-us/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse?view=azure-sqldw-latest
>
>
> On 10.06.24 21:37, Jeyhun Karimov wrote:
> > Hi Sergio,
> >
> > Thanks for driving this FLIP. +1 for it.
> > I have a few questions:
> >
> > - How do you handle CTAS statements with SELECT clauses that have
> (implicit
> > or explicit) NULLABLE or NOT NULLABLE columns? Could you also mention the
> > casting rules in the FLIP for this case?
> > - In some cases, redefining the column types might be redundant,
> especially
> > when users dont change the column type. For example, a user just wants to
> > change the column name from the SELECT clause.
> > Should we also support this scenario, similar to the postgres [1] ?
> >
> > Regards,
> > Jeyhun
> >
> >
> > [1] https://www.postgresql.org/docs/8.1/sql-createtableas.html
> >
> > On Mon, Jun 10, 2024 at 6:28 PM Sergio Pena  >
> > wrote:
> >
> >> Hi David,
> >>
> >> The CTAS feature is already part of Flink (proposed in FLIP-218 [1]).
> The
> >> new FLIP-463 is just to extend the CTAS syntax to allow for adding new
> >> columns to
> >> the created table that are not part of the generated schema. I think
> >> FLIP-218 [1] was discussed in the mail list somewhere, but I couldn't
> find
> >> the discussion thread.
> >> I was hoping it could contain the answers for your questions as that's
> >> where CTAS was implemented. There's a user doc [2] for it that may help
> >> too.
> >>
> >> But, in a 

Re: Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-11 Thread Jingsong Li
Hi all,

+1 to this FLIP, very thanks all for your proposal.

isDeterministic looks good to me too.

We can consider stating the following points:

1. How to enable custom data distribution? Is it a dynamic hint? Can
you provide an SQL example.

2. What impact will it have when the mainstream is changelog? Causing
disorder? This may need to be emphasized.

3. Does this feature work in batch mode too?

Best,
Jingsong

On Tue, Jun 11, 2024 at 8:22 PM Wencong Liu  wrote:
>
> Hi Lincoln,
>
>
> Thanks for your reply. Weijie and I discussed these two issues offline,
> and here are the results of our discussion:
> 1. When the user utilizes the hash lookup join hint introduced by FLIP-204[1],
> the `SupportsLookupCustomShuffle` interface should be ignored. This is because
> the hash lookup join hint is directly specified by the user through a SQL 
> HINT,
> which is more in line with user intuition. WDYT?
> 2. We agree with the introduction of the `isDeterministic` method. The
> `SupportsLookupCustomShuffle` interface introduces a custom shuffle, which
> can cause ADD/UPDATE_AFTER events (+I, +U) to appear
> after UPDATE_BEFORE/DELETE events (-D, -U), thus breaking the current
> limitations of the Flink Sink Operator[2]. If `isDeterministic` returns false 
> and the
> changelog event type is not insert-only, the Planner should not apply the 
> shuffle
> provided by `SupportsLookupCustomShuffle`.
>
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
> [2] 
> https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
>
>
> Best,
> Wencong
>
>
>
>
>
>
>
>
>
> At 2024-06-11 00:02:57, "Lincoln Lee"  wrote:
> >Hi Weijie,
> >
> >Thanks for your proposal, this will be a useful advanced optimization for
> >connector developers!
> >
> >I have two questions:
> >
> >1. FLIP-204[1] hash lookup join hint is mentioned in this FLIP, what's the
> >apply ordering of the two feature? For example, a connector that
> >implements the `SupportsLookupCustomShuffle` interface also has a
> >`SHUFFLE_HASH` lookup join hint specified by the user in sql, what's
> >the expected behavior?
> >
> >2. This FLIP considers the relationship with NDU processing, and I agree
> >with the current choice to prioritize NDU first. However, we should also
> >consider another issue: out-of-orderness of the changelog events in
> >streaming[2]. If the connector developer supplies a non-deterministic
> >partitioner, e.g., a random partitioner for anti-skew purpose, then it'll
> >break the assumption relied by current SQL operators in streaming: the
> >ADD/UDPATE_AFTER events (+I, +U) always occur before its related
> >UDPATE_BEFORE/DELETE events (-D, -U) and they are always
> >processed by the same task even if a data shuffle is involved. So a
> >straightforward approach would be to add method `isDeterministic` to
> >the `InputDataPartitioner` interface to explicitly tell the planner whether
> >the partitioner is deterministic or not(then the planner can reject the
> >non-deterministic custom partitioner for correctness requirements).
> >
> >[1]
> >https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
> >[2]
> >https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
> >
> >
> >Best,
> >Lincoln Lee
> >
> >
> >Xintong Song  于2024年6月7日周五 13:53写道:
> >
> >> +1 for this proposal.
> >>
> >> This FLIP will make it possible for each lookup join parallel task to only
> >> access and cache a subset of the data. This will significantly improve the
> >> performance and reduce the overhead when using Paimon for the dimension
> >> table. And it's general enough to also be leveraged by other connectors.
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Fri, Jun 7, 2024 at 10:01 AM weijie guo 
> >> wrote:
> >>
> >> > Hi devs,
> >> >
> >> >
> >> > I'd like to start a discussion about FLIP-462[1]: Support Custom Data
> >> > Distribution for Input Stream of Lookup Join.
> >> >
> >> >
> >> > Lookup Join is an important feature in Flink, It is typically used to
> >> > enrich a table with data that is queried from an external system.
> >> > If we interact with the external systems for each incoming record, we
> >> > incur significant network IO and RPC overhead.
> >> >
> >> > Therefore, most connectors introduce caching to reduce the per-record
> >> > level query overhead. However, because the data distribution of Lookup
> >> > Join's input stream is arbitrary, the cache hit rate is sometimes
> >> > unsatisfactory.
> >> >
> >> >
> >> > We want to introduce a mechanism for the connector to tell the Flink
> >> > planner its desired input stream data distribution or partitioning
> >> > strategy. This can significantly reduce the amount of cached data and
> >> > improve performance of Lookup Join.
> >> >
> >> >
> >> > You can find more details in this FLIP[1]. Looking forward to hearing
> >> > from you, 

Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Chesnay Schepler

On 10/06/2024 18:25, Danny Cranmer wrote:

This would
mean we would usually not need to release a new connector version per Flink
version, assuming there are no breaking changes.
We technically can't do this because we don't provide binary 
compatibility across minor versions.
That's the entire reason we did this coupling in the first place, and 
imo /we/ shouldn't take a shortcut but still have our users face that 
very problem.
We knew this was gonna by annoying for us; that was intentional and 
meant to finally push us towards binary compatibility /guarantees/.

RE: [VOTE] Release flink-connector-kafka v3.2.0, release candidate #1

2024-06-11 Thread David Radley
Hi Martjin,
Thanks for the confirmation,
Kind regards, David.

From: Martijn Visser 
Date: Tuesday, 11 June 2024 at 15:00
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-kafka v3.2.0, release 
candidate #1
Hi David,

That's a blocker for a Flink Kafka connector 4.0, not for 3.2.0. It's not
related to this release.

Best regards,

Martijn

On Tue, Jun 11, 2024 at 3:54 PM David Radley 
wrote:

> Hi,
> Sorry I am a bit late.
> I notice https://issues.apache.org/jira/browse/FLINK-35109 is open and a
> blocker. Can I confirm that we have mitigated the impacts of this issue in
> this release?
>   Kind regards, David.
>
> From: Danny Cranmer 
> Date: Friday, 7 June 2024 at 11:46
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-kafka v3.2.0,
> release candidate #1
> Thanks all. This vote is now closed, I will announce the results in a
> separate thread.
>
> On Fri, Jun 7, 2024 at 11:45 AM Danny Cranmer 
> wrote:
>
> > +1 (binding)
> >
> > - Release notes look good
> > - Source archive checksum and signature is correct
> > - Binary checksum and signature is correct
> > - Contents of Maven repo looks good
> > - Verified there are no binaries in the source archive
> > - Builds from source, tests pass using Java 8
> > - CI run passed [1]
> > - Tag exists in repo
> > - NOTICE and LICENSE files present and correct
> >
> > Thanks,
> > Danny
> >
> > [1]
> > https://github.com/apache/flink-connector-kafka/actions/runs/8785158288
> >
> >
> > On Fri, Jun 7, 2024 at 7:19 AM Yanquan Lv  wrote:
> >
> >> +1 (non-binding)
> >>
> >> - verified gpg signatures
> >> - verified sha512 hash
> >> - built from source code with java 8/11/17
> >> - checked Github release tag
> >> - checked the CI result
> >> - checked release notes
> >>
> >> Danny Cranmer  于2024年4月22日周一 21:56写道:
> >>
> >> > Hi everyone,
> >> >
> >> > Please review and vote on release candidate #1 for
> flink-connector-kafka
> >> > v3.2.0, as follows:
> >> > [ ] +1, Approve the release
> >> > [ ] -1, Do not approve the release (please provide specific comments)
> >> >
> >> > This release supports Flink 1.18 and 1.19.
> >> >
> >> > The complete staging area is available for your review, which
> includes:
> >> > * JIRA release notes [1],
> >> > * the official Apache source release to be deployed to
> dist.apache.org
> >> > [2],
> >> > which are signed with the key with fingerprint 125FD8DB [3],
> >> > * all artifacts to be deployed to the Maven Central Repository [4],
> >> > * source code tag v3.2.0-rc1 [5],
> >> > * website pull request listing the new release [6].
> >> > * CI build of the tag [7].
> >> >
> >> > The vote will be open for at least 72 hours. It is adopted by majority
> >> > approval, with at least 3 PMC affirmative votes.
> >> >
> >> > Thanks,
> >> > Danny
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354209
> >> > [2]
> >> >
> >> >
> >>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.2.0-rc1
> >> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> > [4]
> >> https://repository.apache.org/content/repositories/orgapacheflink-1723
> >> > [5]
> >> >
> https://github.com/apache/flink-connector-kafka/releases/tag/v3.2.0-rc1
> >> > [6] https://github.com/apache/flink-web/pull/738
> >> > [7] https://github.com/apache/flink-connector-kafka
> >> >
> >>
> >
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [VOTE] Release flink-connector-kafka v3.2.0, release candidate #1

2024-06-11 Thread Martijn Visser
Hi David,

That's a blocker for a Flink Kafka connector 4.0, not for 3.2.0. It's not
related to this release.

Best regards,

Martijn

On Tue, Jun 11, 2024 at 3:54 PM David Radley 
wrote:

> Hi,
> Sorry I am a bit late.
> I notice https://issues.apache.org/jira/browse/FLINK-35109 is open and a
> blocker. Can I confirm that we have mitigated the impacts of this issue in
> this release?
>   Kind regards, David.
>
> From: Danny Cranmer 
> Date: Friday, 7 June 2024 at 11:46
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-kafka v3.2.0,
> release candidate #1
> Thanks all. This vote is now closed, I will announce the results in a
> separate thread.
>
> On Fri, Jun 7, 2024 at 11:45 AM Danny Cranmer 
> wrote:
>
> > +1 (binding)
> >
> > - Release notes look good
> > - Source archive checksum and signature is correct
> > - Binary checksum and signature is correct
> > - Contents of Maven repo looks good
> > - Verified there are no binaries in the source archive
> > - Builds from source, tests pass using Java 8
> > - CI run passed [1]
> > - Tag exists in repo
> > - NOTICE and LICENSE files present and correct
> >
> > Thanks,
> > Danny
> >
> > [1]
> > https://github.com/apache/flink-connector-kafka/actions/runs/8785158288
> >
> >
> > On Fri, Jun 7, 2024 at 7:19 AM Yanquan Lv  wrote:
> >
> >> +1 (non-binding)
> >>
> >> - verified gpg signatures
> >> - verified sha512 hash
> >> - built from source code with java 8/11/17
> >> - checked Github release tag
> >> - checked the CI result
> >> - checked release notes
> >>
> >> Danny Cranmer  于2024年4月22日周一 21:56写道:
> >>
> >> > Hi everyone,
> >> >
> >> > Please review and vote on release candidate #1 for
> flink-connector-kafka
> >> > v3.2.0, as follows:
> >> > [ ] +1, Approve the release
> >> > [ ] -1, Do not approve the release (please provide specific comments)
> >> >
> >> > This release supports Flink 1.18 and 1.19.
> >> >
> >> > The complete staging area is available for your review, which
> includes:
> >> > * JIRA release notes [1],
> >> > * the official Apache source release to be deployed to
> dist.apache.org
> >> > [2],
> >> > which are signed with the key with fingerprint 125FD8DB [3],
> >> > * all artifacts to be deployed to the Maven Central Repository [4],
> >> > * source code tag v3.2.0-rc1 [5],
> >> > * website pull request listing the new release [6].
> >> > * CI build of the tag [7].
> >> >
> >> > The vote will be open for at least 72 hours. It is adopted by majority
> >> > approval, with at least 3 PMC affirmative votes.
> >> >
> >> > Thanks,
> >> > Danny
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354209
> >> > [2]
> >> >
> >> >
> >>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.2.0-rc1
> >> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> > [4]
> >> https://repository.apache.org/content/repositories/orgapacheflink-1723
> >> > [5]
> >> >
> https://github.com/apache/flink-connector-kafka/releases/tag/v3.2.0-rc1
> >> > [6] https://github.com/apache/flink-web/pull/738
> >> > [7] https://github.com/apache/flink-connector-kafka
> >> >
> >>
> >
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


RE: [VOTE] Release flink-connector-kafka v3.2.0, release candidate #1

2024-06-11 Thread David Radley
Hi,
Sorry I am a bit late.
I notice https://issues.apache.org/jira/browse/FLINK-35109 is open and a 
blocker. Can I confirm that we have mitigated the impacts of this issue in this 
release?
  Kind regards, David.

From: Danny Cranmer 
Date: Friday, 7 June 2024 at 11:46
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-kafka v3.2.0, release 
candidate #1
Thanks all. This vote is now closed, I will announce the results in a
separate thread.

On Fri, Jun 7, 2024 at 11:45 AM Danny Cranmer 
wrote:

> +1 (binding)
>
> - Release notes look good
> - Source archive checksum and signature is correct
> - Binary checksum and signature is correct
> - Contents of Maven repo looks good
> - Verified there are no binaries in the source archive
> - Builds from source, tests pass using Java 8
> - CI run passed [1]
> - Tag exists in repo
> - NOTICE and LICENSE files present and correct
>
> Thanks,
> Danny
>
> [1]
> https://github.com/apache/flink-connector-kafka/actions/runs/8785158288
>
>
> On Fri, Jun 7, 2024 at 7:19 AM Yanquan Lv  wrote:
>
>> +1 (non-binding)
>>
>> - verified gpg signatures
>> - verified sha512 hash
>> - built from source code with java 8/11/17
>> - checked Github release tag
>> - checked the CI result
>> - checked release notes
>>
>> Danny Cranmer  于2024年4月22日周一 21:56写道:
>>
>> > Hi everyone,
>> >
>> > Please review and vote on release candidate #1 for flink-connector-kafka
>> > v3.2.0, as follows:
>> > [ ] +1, Approve the release
>> > [ ] -1, Do not approve the release (please provide specific comments)
>> >
>> > This release supports Flink 1.18 and 1.19.
>> >
>> > The complete staging area is available for your review, which includes:
>> > * JIRA release notes [1],
>> > * the official Apache source release to be deployed to dist.apache.org
>> > [2],
>> > which are signed with the key with fingerprint 125FD8DB [3],
>> > * all artifacts to be deployed to the Maven Central Repository [4],
>> > * source code tag v3.2.0-rc1 [5],
>> > * website pull request listing the new release [6].
>> > * CI build of the tag [7].
>> >
>> > The vote will be open for at least 72 hours. It is adopted by majority
>> > approval, with at least 3 PMC affirmative votes.
>> >
>> > Thanks,
>> > Danny
>> >
>> > [1]
>> >
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354209
>> > [2]
>> >
>> >
>> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.2.0-rc1
>> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> > [4]
>> https://repository.apache.org/content/repositories/orgapacheflink-1723
>> > [5]
>> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.2.0-rc1
>> > [6] https://github.com/apache/flink-web/pull/738
>> > [7] https://github.com/apache/flink-connector-kafka
>> >
>>
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [DISCUSS] Proposing an LTS Release for the 1.x Line

2024-06-11 Thread Alexander Fedulov
Hi Matthias,

I think we can include this generic semantic into the writeup of the LTS
definition for the Flink website (last item in the Migration Plan).
Talking about 1.x and 2.x feels more natural than about N.x and N+1.x - I'd
prefer not to overcomplicate things here.
Should the gap before the next major release match this one (eight years),
it would be appropriate to reconsider the project stance and vote anew if
required.

Best,
Alex

On Mon, 27 May 2024 at 09:02, Matthias Pohl  wrote:

> Hi Alex,
> thanks for creating the FLIP. One question: Is it intentionally done that
> the FLIP only covers the 1.x LTS version? Why don't we make the FLIP
> generic to also apply to other major version upgrades?
>
> Best,
> Matthias
>
> On Sat, May 25, 2024 at 4:55 PM Xintong Song 
> wrote:
>
> > >
> > > I think our starting point should be "We don't backport features,
> unless
> > > discussed and agreed on the Dev mailing list".
> >
> >
> > This makes perfect sense. In general, I think we want to encourage users
> to
> > upgrade in order to use the new features. Alternatively, users can also
> > choose to maintain their own custom patches based on the LTS version. I
> > personally don't see why backporting new features to old releases is
> > necessary. In case of exceptions, a mailing list discussion is always a
> > good direction to go.
> >
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Fri, May 24, 2024 at 9:35 PM David Radley 
> > wrote:
> >
> > > Hi Martjin and Alex,
> > > I agree with your summaries, it will be interesting to see what
> requests
> > > there might be for back ports.
> > >  Kind regards, David.
> > >
> > >
> > >
> > >
> > > From: Alexander Fedulov 
> > > Date: Friday, 24 May 2024 at 14:07
> > > To: dev@flink.apache.org 
> > > Subject: [EXTERNAL] Re: [DISCUSS] Proposing an LTS Release for the 1.x
> > Line
> > > @David
> > > > I agree with Martijn that we only put features into version 2. Back
> > > porting to v1 should not be business as usual for features, only for
> > > security and stability changes.
> > > Yep, this choice is explicitly reflected in the FLIP [1]
> > >
> > > @Martijn
> > > >  I think our starting point should be "We don't backport features,
> > unless
> > > discussed and agreed on the Dev mailing list".
> > > I agree - the baseline is that we do not do that. Only if a very
> > compelling
> > > argument is made and the community reaches consensus, exceptions could
> > > potentially be made, but we should try to avoid them.
> > >
> > > [1] https://cwiki.apache.org/confluence/x/BApeEg
> > >
> > > Best,
> > > Alex
> > >
> > > On Fri, 24 May 2024 at 14:38, Martijn Visser  >
> > > wrote:
> > >
> > > > Hi David,
> > > >
> > > > > If there is a maintainer willing to merge backported features to
> v1,
> > as
> > > > it is important to some part of the community, this should be
> allowed,
> > as
> > > > different parts of the community have different priorities and
> > timelines,
> > > >
> > > > I don't think this is a good idea. Backporting a feature can cause
> > issues
> > > > in other components that might be outside the span of expertise of
> the
> > > > maintainer that backported said feature, causing the overall
> stability
> > to
> > > > be degraded. I think our starting point should be "We don't backport
> > > > features, unless discussed and agreed on the Dev mailing list". That
> > > still
> > > > opens up the ability to backport features but makes it clear where
> the
> > > bar
> > > > lies.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > On Fri, May 24, 2024 at 11:21 AM David Radley <
> david_rad...@uk.ibm.com
> > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > > I agree with Martijn that we only put features into version 2. Back
> > > > > porting to v1 should not be business as usual for features, only
> for
> > > > > security and stability changes.
> > > > >
> > > > > If there is a maintainer willing to merge backported features to
> v1,
> > as
> > > > it
> > > > > is important to some part of the community, this should be allowed,
> > as
> > > > > different parts of the community have different priorities and
> > > timelines,
> > > > >  Kind regards, David.
> > > > >
> > > > >
> > > > > From: Alexander Fedulov 
> > > > > Date: Thursday, 23 May 2024 at 18:50
> > > > > To: dev@flink.apache.org 
> > > > > Subject: [EXTERNAL] Re: [DISCUSS] Proposing an LTS Release for the
> > 1.x
> > > > Line
> > > > > Good point, Xintong, I incorporated this item into the FLIP.
> > > > >
> > > > > Best,
> > > > > Alex
> > > > >
> > > > > On Wed, 22 May 2024 at 10:37, Xintong Song 
> > > > wrote:
> > > > >
> > > > > > Thanks, Alex.
> > > > > >
> > > > > > I see one task that needs to be done once the FLIP is approved,
> > which
> > > > I'd
> > > > > > suggest to also mention in the: To explain the LTS policy to
> users
> > on
> > > > > > website / documentation (because FLIP is developer-facing)
> before /
> > > > upon
> > > > > > releasing 1.20.
> > > > > 

Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Aleksandr Pilipenko
Hi Danny,

Thank you for bringing this up.
I agree with points made by Ahmed, the split into different repositories
for connectors/connector groups adds flexibility to evolve connectors
without affecting other connectors.

I am also in favor of dropping the Flink version component, although this
will require maintaining a compatibility matrix.
As pointed out by Sergey, nightly/weekly CI builds in connector repos
already indicate compatibility issues in unreleased (SNAPSHOT) versions,
e.g. [1].

1 - https://issues.apache.org/jira/browse/FLINK-34260

Thanks,
Aleksandr

On Tue, 11 Jun 2024 at 13:54, Ahmed Hamdy  wrote:

> Hi Danny,
> Thanks for bringing this up, I might haven't driven a connector release
> myself but I echo the pain and delay in releases for adding Flink version
> support.
> I am not really with the mono-repo approach for the following reasons
> 1- We will lose the flexibility we currently have for connectors (I mean we
> even had a major release 4.X for AWS connectors IIRC).
> 2- We will be undoing the gains for CI decoupling, the frequency of
> contributions in connectors like kafka and AWS are unmatched for others
> like GCP and RabbitMQ, with new connector added and major feature work I
> see unnecessary cost and delays for CI due to monorepo.
>
> I believe the only benefit of this approach (over the other proposed )would
> be forcing adopting common connectors changes to all connectors at once
> instead of relying on divided efforts to port the change, however for the
> reasons mentioned above I still wouldn't vote for this approach.
>
> I am in favor of dropping the version though, I understand it might be
> confusing for users but as Sergey mentioned; many times the changes of a
> new Flink version don't introduce compatibility issues to connectors so I
> believe I believe it might be an easier task than It sound initially.
>
> A question would be what do you think the best approach to when we do
> introduce backward compatible changes to connectors API like in this PR[1],
> in this case existing connectors would still work with the newly released
> Flink version but would rather accumulate technical debt and removing it
> would be an Adhoc task for maintainers which I believe is an accepted
> tradeoff but would love to hear the feedback.
>
> 1-
>
> https://github.com/apache/flink/pull/24180/files#diff-2ffade463560e5941912b91b12a07c888313a4cc7e39ca8700398ed6975b8e90
>
> Best Regards
> Ahmed Hamdy
>
>
> On Tue, 11 Jun 2024 at 08:50, Sergey Nuyanzin  wrote:
>
> > Thanks for starting this discussion Danny
> >
> > I will put my 5 cents here
> >
> > From one side yes, support of new Flink release takes time as it was
> > mentioned above
> > However from another side most of the connectors (main/master branches)
> > supported Flink 1.19
> > even before it was released, same for 1.20 since they were testing
> against
> > master and supported version branches.
> > There are already nightly/weekly jobs (depending on connector)
> > running against the latest Flink SNAPSHOTs. And it has already helped to
> > catch some blocker issues like[1], [2].
> > In fact there are more, I need to spend time retrieving all of them.
> >
> > I would also not vote for connector mono-repo release since we recently
> > just splitted it.
> >
> > The thing I would suggest:
> > since we already have nightly/weekly jobs for connectors testing against
> > Flink main repo master branch
> > we could add a requirement before the release of Flink itself having
> these
> > job results also green.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-34941
> > [2] https://issues.apache.org/jira/browse/FLINK-32978#comment-17804459
> >
> > On Tue, Jun 11, 2024 at 8:24 AM Xintong Song 
> > wrote:
> >
> > > Thanks for bringing this up, Danny. This is indeed an important issue
> > that
> > > the community needs to improve on.
> > >
> > > Personally, I think a mono-repo might not be a bad idea, if we apply
> > > different rules for the connector releases. To be specific:
> > > - flink-connectors 1.19.x contains all connectors that are compatible
> > with
> > > Flink 1.19.x.
> > > - allow not only bug-fixes, but also new features for a third-digit
> > release
> > > (e.g., flink-connectors 1.19.1)
> > >
> > > This would allow us to immediately release flink-connectors 1.19.0
> right
> > > after flink 1.19.0 is out, excluding connectors that are no longer
> > > compatible with flink 1.19. Then we can have a couple of
> flink-connectors
> > > 1.19.x releases, gradually adding the missing connectors back. In the
> > worst
> > > case, this would result in as many releases as having separated
> connector
> > > repose. The benefit comes from 1) there are chances to combine
> releasing
> > of
> > > multiple connectors into one release of the mono repo (if they are
> ready
> > > around the same time), and 2) no need to maintain a compatibility
> matrix
> > > and worrying about it being out-of-sync with the code base.
> > >
> > > However, 

Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Muhammet Orazov

Hello Danny,

Thanks for the starting the discussion.

-1 for mono-repo, and -+1 for dropping Flink version.

I have mixed opinion with dropping the Flink version. Usually, large
production migrations happen on Flink versions and users want also
naturally update the connectors compatible for that Flink version.


which is a burden on the community.


Maybe this is another point we should address?

I agree with Sergey's point to have CI builds with SNAPSHOT versions,
which would make updating the versions easily. We could start updating
builds to include SNAPSHOT version if they are missing.

Another suggestion would be to have a dedicated owners (PMC/committers)
of set of connectors that are responsible for these regular update
tasks together with volunteers. Maybe this should be decided similar
to release managers before each planned release.

Best,
Muhammet


On 2024-06-10 16:25, Danny Cranmer wrote:

Hello Flink community,

It has been over 2 years [1] since we started externalizing the Flink
connectors to dedicated repositories from the main Flink code base. The
past discussions can be found here [2]. The community decided to
externalize the connectors to primarily 1/ improve stability and speed 
of
the CI, and 2/ decouple version and release lifecycle to allow the 
projects
to evolve independently. The outcome of this has resulted in each 
connector
requiring a dedicated release per Flink minor version, which is a 
burden on

the community. Flink 1.19.0 was released on 2024-03-18 [3], the first
supported connector followed roughly 2.5 months later on 2024-06-06 [4]
(MongoDB). There are still 5 connectors that do not support Flink 1.19 
[5].


Two decisions contribute to the high lag between releases. 1/ creating 
one
repository per connector instead of a single flink-connector mono-repo 
and

2/ coupling the Flink version to the connector version [6]. A single
connector repository would reduce the number of connector releases from 
N

to 1, but would couple the connector CI and reduce release flexibility.
Decoupling the connector versions from Flink would eliminate the need 
to
release each connector for each new Flink minor version, but we would 
need

a new compatibility mechanism.

I propose that from each next connector release we drop the coupling on 
the
Flink version. For example, instead of 3.4.0-1.20 (.) 
we

would release 3.4.0 (). We can model a compatibility matrix
within the Flink docs to help users pick the correct versions. This 
would
mean we would usually not need to release a new connector version per 
Flink
version, assuming there are no breaking changes. Worst case, if 
breaking

changes impact all connectors we would still need to release all
connectors. However, for Flink 1.17 and 1.18 there were only a handful 
of
issues (breaking changes), and mostly impacting tests. We could decide 
to

align this with Flink 2.0, however I see no compelling reason to do so.
This was discussed previously [2] as a long term goal once the 
connector
APIs are stable. But I think the current compatibility rules support 
this

change now.

I would prefer to not create a connector mono-repo. Separate repos 
gives

each connector more flexibility to evolve independently, and removing
unnecessary releases will significantly reduce the release effort.

I would like to hear opinions and ideas from the community. In 
particular,

are there any other issues you have observed that we should consider
addressing?

Thanks,
Danny.

[1]
https://github.com/apache/flink-connector-elasticsearch/commit/3ca2e625e3149e8864a4ad478773ab4a82720241
[2] https://lists.apache.org/thread/8k1xonqt7hn0xldbky1cxfx3fzh6sj7h
[3]
https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
[4] https://flink.apache.org/downloads/#apache-flink-connectors-1
[5] https://issues.apache.org/jira/browse/FLINK-35131
[6]
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Examples


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Ahmed Hamdy
Hi Danny,
Thanks for bringing this up, I might haven't driven a connector release
myself but I echo the pain and delay in releases for adding Flink version
support.
I am not really with the mono-repo approach for the following reasons
1- We will lose the flexibility we currently have for connectors (I mean we
even had a major release 4.X for AWS connectors IIRC).
2- We will be undoing the gains for CI decoupling, the frequency of
contributions in connectors like kafka and AWS are unmatched for others
like GCP and RabbitMQ, with new connector added and major feature work I
see unnecessary cost and delays for CI due to monorepo.

I believe the only benefit of this approach (over the other proposed )would
be forcing adopting common connectors changes to all connectors at once
instead of relying on divided efforts to port the change, however for the
reasons mentioned above I still wouldn't vote for this approach.

I am in favor of dropping the version though, I understand it might be
confusing for users but as Sergey mentioned; many times the changes of a
new Flink version don't introduce compatibility issues to connectors so I
believe I believe it might be an easier task than It sound initially.

A question would be what do you think the best approach to when we do
introduce backward compatible changes to connectors API like in this PR[1],
in this case existing connectors would still work with the newly released
Flink version but would rather accumulate technical debt and removing it
would be an Adhoc task for maintainers which I believe is an accepted
tradeoff but would love to hear the feedback.

1-
https://github.com/apache/flink/pull/24180/files#diff-2ffade463560e5941912b91b12a07c888313a4cc7e39ca8700398ed6975b8e90

Best Regards
Ahmed Hamdy


On Tue, 11 Jun 2024 at 08:50, Sergey Nuyanzin  wrote:

> Thanks for starting this discussion Danny
>
> I will put my 5 cents here
>
> From one side yes, support of new Flink release takes time as it was
> mentioned above
> However from another side most of the connectors (main/master branches)
> supported Flink 1.19
> even before it was released, same for 1.20 since they were testing against
> master and supported version branches.
> There are already nightly/weekly jobs (depending on connector)
> running against the latest Flink SNAPSHOTs. And it has already helped to
> catch some blocker issues like[1], [2].
> In fact there are more, I need to spend time retrieving all of them.
>
> I would also not vote for connector mono-repo release since we recently
> just splitted it.
>
> The thing I would suggest:
> since we already have nightly/weekly jobs for connectors testing against
> Flink main repo master branch
> we could add a requirement before the release of Flink itself having these
> job results also green.
>
> [1] https://issues.apache.org/jira/browse/FLINK-34941
> [2] https://issues.apache.org/jira/browse/FLINK-32978#comment-17804459
>
> On Tue, Jun 11, 2024 at 8:24 AM Xintong Song 
> wrote:
>
> > Thanks for bringing this up, Danny. This is indeed an important issue
> that
> > the community needs to improve on.
> >
> > Personally, I think a mono-repo might not be a bad idea, if we apply
> > different rules for the connector releases. To be specific:
> > - flink-connectors 1.19.x contains all connectors that are compatible
> with
> > Flink 1.19.x.
> > - allow not only bug-fixes, but also new features for a third-digit
> release
> > (e.g., flink-connectors 1.19.1)
> >
> > This would allow us to immediately release flink-connectors 1.19.0 right
> > after flink 1.19.0 is out, excluding connectors that are no longer
> > compatible with flink 1.19. Then we can have a couple of flink-connectors
> > 1.19.x releases, gradually adding the missing connectors back. In the
> worst
> > case, this would result in as many releases as having separated connector
> > repose. The benefit comes from 1) there are chances to combine releasing
> of
> > multiple connectors into one release of the mono repo (if they are ready
> > around the same time), and 2) no need to maintain a compatibility matrix
> > and worrying about it being out-of-sync with the code base.
> >
> > However, one thing I don't like about this approach is that it requires
> > combining all the repos we just separated from the main-repo to another
> > mono-repo. That back-and-forth is annoying. So I'm just speaking out my
> > ideas, but would not strongly insist on this.
> >
> > And big +1 for compatibility tools and ci checks.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jun 11, 2024 at 2:38 AM David Radley 
> > wrote:
> >
> > > Hi Danny,
> > > I think your proposal is a good one. This is the approach that we took
> > > with the Egeria project, firstly taking the connectors out of the main
> > > repo, then connectors having their own versions that incremented
> > > organically rather then tied to the core release.
> > >
> > > Blue sky thinking - I wonder if we could :
> > > - have a wizard / utility so the 

Re: [Discuss] Non-retriable (partial) errors in Elasticsearch 8 connector

2024-06-11 Thread Ahmed Hamdy
Hi Mingliang,
Yes sounds like a good solution, I am not very familiar with ElasticSearch
internals and APIs but will try to assist with the PR when ready.
Best Regards
Ahmed Hamdy


On Tue, 11 Jun 2024 at 07:07, Mingliang Liu  wrote:

> Thank you Ahmed for the explanation.
>
> The current Elasticsearch 8 connector already uses the
> FatalExeptionClassifier for fatal / non-retriable requests [1]. It's very
> similar to what you linked in the AWS connectors. Currently this is only
> used for fully failed requests. The main problem I was concerned about is
> the partial failures, when the Future of the client bulk request was not
> completed exceptionally, but instead some items failed according to the
> response. For those failed entries in the partially failed request, we
> retry infinitely though retrying will not always help.
>
> To avoid problems of "too many failed but non-retryable request entries in
> the buffer", I was thinking we can fail fast instead of infinitely
> retrying. Alternatively, we can limit the maximum number of retrying per
> rerecord. Like FLINK-35541 you shared for AWS connectors, I think a similar
> approach in Elasticsearch 8 connector would be useful. Given a
> non-retriable request entry, it will retry the request entries anyway but
> will eventually fail after exhausting the retries. Having both sound like a
> more comprehensive solution, as following sample:
>
> void handlePartiallyUnprocessedRequest(
> Response response, Consumer requestResult) {
> List requestsToRetry = new ArrayList<>();
>
> for (Request r : response.failedItems()) {
> if (!isRetryable(r.errorCode())// we don't have this
> check for ES 8, which could be 400 / 404
> || r.retryCount++ > maxRetry) {   // FLINK-35541 could
> help limit retries for all failed requests
> throw new FlinkRuntimeException();
> }
> requestsToRetry.add(r);
> }
>
> requestResult.accept(requestsToRetry);
> }
>
> [1]
>
> https://github.com/apache/flink-connector-elasticsearch/blob/da2ef1fa6d5edd3cf1328b11632929fd2c99f567/flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8AsyncWriter.java#L73-L82
>
>
> On Fri, Jun 7, 2024 at 3:42 AM Ahmed Hamdy  wrote:
>
> > Hi Mingliang,
> >
> > We already have a mechanism for detecting and propagating
> > Fatal/Non-retryable exceptions[1]. We can use that in ElasticSearch
> similar
> > to what we do for AWS connectors[2]. Also, you can check AWS connectors
> for
> > how to add a fail-fast mechanism to disable retrying all along.
> >
> > > FLIP-451 proposes timeout for retrying which helps with un-acknowledged
> > > requests, but not addressing the case when request gets processed and
> > > failed items keep failing no matter how many times we retry. Correct me
> > if
> > > I'm wrong
> > >
> > yes you are correct, this is mainly to mitigate the issues arising from
> > incorrect handling of requests in sink implementers.
> > The Failure handling itself has always been assumed to be the Sink
> > implementation responsibility, this is done in 3 levels
> > - Classifying Fatal exceptions as mentioned above
> > - Adding configuration to disable retries as mentioned above as well.
> > - Adding mechanism to limit retries as in the proposed ticket for AWS
> > connectors[3]
> >
> > In my opinion at least 1 and 3 are useful in this case for Elasticsearch,
> > Adding classifiers and retry mechanisms for elasticsearch.
> >
> > Or we can allow users to configure
> > > "drop/fail" behavior for non-retriable errors
> > >
> >
> > I am not sure I follow this proposal, but in general while "Dropping"
> > records seems to boost reliability, it breaks the at-least-once semantics
> > and if you don't have proper tracing and debugging mechanisms we will be
> > shooting ourselves in the foot.
> >
> >
> > 1-
> >
> >
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java
> > 2-
> >
> >
> https://github.com/apache/flink-connector-aws/blob/c688a8545ac1001c8450e8c9c5fe8bbafa13aeba/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/sink/DynamoDbSinkWriter.java#L227
> >
> > 3-https://issues.apache.org/jira/browse/FLINK-35541
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Thu, 6 Jun 2024 at 06:53, Mingliang Liu  wrote:
> >
> > > Hi all,
> > >
> > > Currently the Elasticsearch 8 connector retries all items if the
> request
> > > fails as a whole, and retries failed items if the request has partial
> > > failures [1]. I think this infinitely retries might be problematic in
> > some
> > > cases when retrying can never eventually succeed. For example, if the
> > > request is 400 (bad request) or 404 (not found), retries do not help.
> If
> > > there are too many failed items non-retriable, new 

Re:Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-11 Thread Wencong Liu
Hi Lincoln,


Thanks for your reply. Weijie and I discussed these two issues offline, 
and here are the results of our discussion:
1. When the user utilizes the hash lookup join hint introduced by FLIP-204[1],
the `SupportsLookupCustomShuffle` interface should be ignored. This is because
the hash lookup join hint is directly specified by the user through a SQL HINT, 
which is more in line with user intuition. WDYT?
2. We agree with the introduction of the `isDeterministic` method. The 
`SupportsLookupCustomShuffle` interface introduces a custom shuffle, which 
can cause ADD/UPDATE_AFTER events (+I, +U) to appear 
after UPDATE_BEFORE/DELETE events (-D, -U), thus breaking the current 
limitations of the Flink Sink Operator[2]. If `isDeterministic` returns false 
and the
changelog event type is not insert-only, the Planner should not apply the 
shuffle 
provided by `SupportsLookupCustomShuffle`.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
[2] 
https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness


Best,
Wencong









At 2024-06-11 00:02:57, "Lincoln Lee"  wrote:
>Hi Weijie,
>
>Thanks for your proposal, this will be a useful advanced optimization for
>connector developers!
>
>I have two questions:
>
>1. FLIP-204[1] hash lookup join hint is mentioned in this FLIP, what's the
>apply ordering of the two feature? For example, a connector that
>implements the `SupportsLookupCustomShuffle` interface also has a
>`SHUFFLE_HASH` lookup join hint specified by the user in sql, what's
>the expected behavior?
>
>2. This FLIP considers the relationship with NDU processing, and I agree
>with the current choice to prioritize NDU first. However, we should also
>consider another issue: out-of-orderness of the changelog events in
>streaming[2]. If the connector developer supplies a non-deterministic
>partitioner, e.g., a random partitioner for anti-skew purpose, then it'll
>break the assumption relied by current SQL operators in streaming: the
>ADD/UDPATE_AFTER events (+I, +U) always occur before its related
>UDPATE_BEFORE/DELETE events (-D, -U) and they are always
>processed by the same task even if a data shuffle is involved. So a
>straightforward approach would be to add method `isDeterministic` to
>the `InputDataPartitioner` interface to explicitly tell the planner whether
>the partitioner is deterministic or not(then the planner can reject the
>non-deterministic custom partitioner for correctness requirements).
>
>[1]
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
>[2]
>https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
>
>
>Best,
>Lincoln Lee
>
>
>Xintong Song  于2024年6月7日周五 13:53写道:
>
>> +1 for this proposal.
>>
>> This FLIP will make it possible for each lookup join parallel task to only
>> access and cache a subset of the data. This will significantly improve the
>> performance and reduce the overhead when using Paimon for the dimension
>> table. And it's general enough to also be leveraged by other connectors.
>>
>> Best,
>>
>> Xintong
>>
>>
>>
>> On Fri, Jun 7, 2024 at 10:01 AM weijie guo 
>> wrote:
>>
>> > Hi devs,
>> >
>> >
>> > I'd like to start a discussion about FLIP-462[1]: Support Custom Data
>> > Distribution for Input Stream of Lookup Join.
>> >
>> >
>> > Lookup Join is an important feature in Flink, It is typically used to
>> > enrich a table with data that is queried from an external system.
>> > If we interact with the external systems for each incoming record, we
>> > incur significant network IO and RPC overhead.
>> >
>> > Therefore, most connectors introduce caching to reduce the per-record
>> > level query overhead. However, because the data distribution of Lookup
>> > Join's input stream is arbitrary, the cache hit rate is sometimes
>> > unsatisfactory.
>> >
>> >
>> > We want to introduce a mechanism for the connector to tell the Flink
>> > planner its desired input stream data distribution or partitioning
>> > strategy. This can significantly reduce the amount of cached data and
>> > improve performance of Lookup Join.
>> >
>> >
>> > You can find more details in this FLIP[1]. Looking forward to hearing
>> > from you, thanks!
>> >
>> >
>> > Best regards,
>> >
>> > Weijie
>> >
>> >
>> > [1]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join
>> >
>>


Re: [DISCUSS] The performance of serializerHeavyString starts regress since April 3

2024-06-11 Thread Zakelly Lan
Thanks Sam for your investigation.

I revisited the logs and confirmed that the JDK has never changed.

'java -version' get:

> openjdk version "11.0.19" 2023-04-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS,
> mixed mode, sharing)


It is installed via yum, 'yum info' get:

> Name : java-11-openjdk
> Epoch: 1
> Version  : 11.0.19.0.7
> Release  : 4.0.3.al8
> Architecture : x86_64
> Size : 1.3 M
> Source   : java-11-openjdk-11.0.19.0.7-4.0.3.al8.src.rpm
> Repository   : @System
> From repo: alinux3-updates
> Summary  : OpenJDK 11 Runtime Environment
> URL  : http://openjdk.java.net/
> License  : ASL 1.1 and ASL 2.0 and BSD and BSD with advertising and
> GPL+ and
>  : GPLv2 and GPLv2 with exceptions and IJG and LGPLv2+ and MIT
> and
>  : MPLv2.0 and Public Domain and W3C and zlib and ISC and FTL
> and
>  : RSA

Description  : The OpenJDK 11 runtime environment.


The benchmark env is hosted on Aliyun, the OS and JVM are also released by
Aliyun.


And thanks for your PR, I will try to put it in our daily run soon.


Best,
Zakelly

On Mon, Jun 10, 2024 at 9:24 AM Sam Barker  wrote:

> After completing the side quest
> [1] of enabling async
> profiler when running the JMH benchmarks I've been unable to reproduce the
> performance change between the last known good run and the first run
> highlighted as a regression.
> Results from my fedora f40 workstation using
>
> # JMH version: 1.37
> # VM version: JDK 11.0.23, OpenJDK 64-Bit Server VM, 11.0.23+9
> # VM invoker: /home/sam/.sdkman/candidates/java/11.0.23-tem/bin/java
> # VM options: -Djava.rmi.server.hostname=127.0.0.1
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.ssl
> # Blackhole mode: full + dont-inline hint (auto-detected, use
> -Djmh.blackhole.autoDetect=false to disable)
>
> ───┬
>│ File: /tmp/profile-results/163b9cca6d2/jmh-result.csv
>
> ───┼
>1   │ "Benchmark","Mode","Threads","Samples","Score","Score Error
> (99.9%)","Unit"
>2   │
>
> "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString","thrpt",1,30,179.453066,5.725733,"ops/ms"
>3   │
>
> "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString:async","thrpt",1,1,NaN,NaN,"---"
>
> ───┴
>
> ───┬
>│ File: /tmp/profile-results/f38d8ca43f6/jmh-result.csv
>
> ───┼
>1   │ "Benchmark","Mode","Threads","Samples","Score","Score Error
> (99.9%)","Unit"
>2   │
>
> "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString","thrpt",1,30,178.861842,6.711582,"ops/ms"
>3   │
>
> "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString:async","thrpt",1,1,NaN,NaN,"---"
>
> ───┴──
>
> Where f38d8ca43f6 is the last known good run and 163b9cca6d2 is the first
> regression.
>
> One question I have from comparing my local results to those on flink-speed
> <
> https://flink-speed.xyz/timeline/#/?exe=6=serializerHeavyString=on=on=off=3=200
> >[2]
> is it possible the JDK version changed between the runs (I don't see the
> actual JDK build listed anywhere so I can't check versions or
> distributions)?
>
> I've also tried comparing building flink with the java11-target profile vs
> the default JDK 8 build and that does not change the performance.
>
> Sam
>
> [1] https://github.com/apache/flink-benchmarks/pull/90
> [2]
>
> https://flink-speed.xyz/timeline/#/?exe=6=serializerHeavyString=on=on=off=3=200
>
> On Wed, 29 May 2024 at 16:53, Sam Barker  wrote:
>
> > > I guess that improvement is a fluctuation. You can double check the
> > performance results[1] of the last few days. The performance isn't
> > recovered.
> >
> > Hmm yeah the improvement was a fluctuation and smaller than I remembered
> > seeing (maybe I had zoomed into the timeline 

[jira] [Created] (FLINK-35567) CDC BinaryWriter cast NullableSerializerWrapper error

2024-06-11 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-35567:
-

 Summary: CDC BinaryWriter cast NullableSerializerWrapper error 
 Key: FLINK-35567
 URL: https://issues.apache.org/jira/browse/FLINK-35567
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: Hongshun Wang
 Fix For: cdc-3.1.1


Current, we will generate data type serializers by 
org.apache.flink.cdc.runtime.typeutils.BinaryRecordDataGenerator#BinaryRecordDataGenerator(org.apache.flink.cdc.common.types.DataType[]),
 which will put into a 
NullableSerializerWrapper.
{code:java}
//代码占位符
public BinaryRecordDataGenerator(DataType[] dataTypes) {
this(
dataTypes,
Arrays.stream(dataTypes)
.map(InternalSerializers::create)
.map(NullableSerializerWrapper::new)
.toArray(TypeSerializer[]::new));
} {code}
However, when use in BinaryWriter#write, if type is ARRAY/MAP/ROW, will cast 
NullableSerializerWrapper to 
ArrayDataSerializer/TypeSerializer/TypeSerializer.
A exception will be thrown:
{code:java}
java.lang.ClassCastException: 
org.apache.flink.cdc.runtime.serializer.NullableSerializerWrapper cannot be 
cast to org.apache.flink.cdc.runtime.serializer.data.ArrayDataSerializer
at 
org.apache.flink.cdc.runtime.serializer.data.writer.BinaryWriter.write(BinaryWriter.java:134)
at 
org.apache.flink.cdc.runtime.typeutils.BinaryRecordDataGenerator.generate(BinaryRecordDataGenerator.java:89)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35566) Consider promoting TypeSerializer from PublicEvolving to Public

2024-06-11 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-35566:
--

 Summary: Consider promoting TypeSerializer from PublicEvolving to 
Public
 Key: FLINK-35566
 URL: https://issues.apache.org/jira/browse/FLINK-35566
 Project: Flink
  Issue Type: Technical Debt
  Components: API / Core
Reporter: Martijn Visser


While working on implementing FLINK-35378, I ran into the problem that 
TypeSerializer is still on PublicEvolving since Flink 1.0. We should consider 
annotating this as Public. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35565) Flink KafkaSource Batch Job Gets Into Infinite Loop after Resetting Offset

2024-06-11 Thread Naci Simsek (Jira)
Naci Simsek created FLINK-35565:
---

 Summary: Flink KafkaSource Batch Job Gets Into Infinite Loop after 
Resetting Offset
 Key: FLINK-35565
 URL: https://issues.apache.org/jira/browse/FLINK-35565
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-3.1.0
 Environment: This is reproduced on a *Flink 1.18.1* with the latest 
Kafka connector 3.1.0-1.18 on a session cluster.
Reporter: Naci Simsek
 Attachments: image-2024-06-11-11-19-09-889.png, 
taskmanager_localhost_54489-ac092a_log.txt

h2. Summary

Flink batch job gets into an infinite fetch loop and could not gracefully 
finish if the connected Kafka topic is empty and starting offset value in Flink 
job is lower than the current start/end offset of the related topic. See below 
for details:
h2. How to reproduce

Flink +*batch*+ job which works as a {*}KafkaSource{*}, will consume events 
from Kafka topic.

Related Kafka topic is empty, there are no events, and the offset value is as 
below: *15*

!image-2024-06-11-11-19-09-889.png|width=895,height=256!

 

Flink job uses a *specific starting offset* value, which is +*less*+ than the 
current offset of the topic/partition.

See below, it set as “4”

{{}}
{code:java}
package naci.grpId;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import 
org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.kafka.common.TopicPartition;

import java.util.HashMap;
import java.util.Map;

public class KafkaSource_Print {
public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);

// Define the specific offsets for the partitions
Map specificOffsets = new HashMap<>();
specificOffsets.put(new TopicPartition("topic_test", 0), 4L); // Start 
from offset 4 for partition 0

KafkaSource kafkaSource = KafkaSource
.builder()
.setBootstrapServers("localhost:9093")  // Make sure the port 
is correct
.setTopics("topic_test")
.setValueOnlyDeserializer(new SimpleStringSchema())
.setStartingOffsets(OffsetsInitializer.offsets(specificOffsets))
.setBounded(OffsetsInitializer.latest())
.build();

DataStream stream = env.fromSource(
kafkaSource,
WatermarkStrategy.noWatermarks(),
"Kafka Source"
);
stream.print();

env.execute("Flink KafkaSource test job");
}
}{code}
{{}}

 

Here are the initial logs printed related to the offset, as soon as the job 
gets submitted:

{{}}
{code:java}
2024-05-30 12:15:50,010 INFO  
org.apache.flink.connector.base.source.reader.SourceReaderBase [] - Adding 
split(s) to reader: [[Partition: topic_test-0, StartingOffset: 4, 
StoppingOffset: 15]]
2024-05-30 12:15:50,069 DEBUG 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher [] - Prepare 
to run AddSplitsTask: [[[Partition: topic_test-0, StartingOffset: 4, 
StoppingOffset: 15]]]
2024-05-30 12:15:50,074 TRACE 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader [] - 
Seeking starting offsets to specified offsets: {topic_test-0=4}
2024-05-30 12:15:50,074 INFO  org.apache.kafka.clients.consumer.KafkaConsumer   
   [] - [Consumer clientId=KafkaSource--2381765882724812354-0, 
groupId=null] Seeking to offset 4 for partition topic_test-0
2024-05-30 12:15:50,075 DEBUG 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader [] - 
SplitsChange handling result: [topic_test-0, start:4, stop: 15]
2024-05-30 12:15:50,075 DEBUG 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher [] - 
Finished running task AddSplitsTask: [[[Partition: topic_test-0, 
StartingOffset: 4, StoppingOffset: 15]]]
2024-05-30 12:15:50,075 DEBUG 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher [] - Prepare 
to run FetchTask{code}
{{}}

 

Since the starting offset {color:#FF}*4*{color} is *out of range* for the 
Kafka topic, KafkaConsumer initiates an {*}offset +reset+{*}, as seen on task 
manager logs:

{{}}
{code:java}
2024-05-30 12:15:50,193 INFO  
org.apache.kafka.clients.consumer.internals.Fetcher  [] - [Consumer 
clientId=KafkaSource--2381765882724812354-0, groupId=null] Fetch position 
FetchPosition{offset=4, offsetEpoch=Optional.empty, 

Re: [VOTE] FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-11 Thread Yuepeng Pan
+1 (non-binding)

Best regards,
Yuepeng Pan

At 2024-06-11 16:34:12, "Rui Fan" <1996fan...@gmail.com> wrote:
>+1(binding)
>
>Best,
>Rui
>
>On Tue, Jun 11, 2024 at 4:14 PM Muhammet Orazov
> wrote:
>
>> +1 (non-binding)
>>
>> Thanks Yuxin for driving this!
>>
>> Best,
>> Muhammet
>>
>>
>> On 2024-06-07 08:02, Yuxin Tan wrote:
>> > Hi everyone,
>> >
>> > Thanks for all the feedback about the FLIP-459 Support Flink
>> > hybrid shuffle integration with Apache Celeborn[1].
>> > The discussion thread is here [2].
>> >
>> > I'd like to start a vote for it. The vote will be open for at least
>> > 72 hours unless there is an objection or insufficient votes.
>> >
>> > [1]
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-459%3A+Support+Flink+hybrid+shuffle+integration+with+Apache+Celeborn
>> > [2] https://lists.apache.org/thread/gy7sm7qrf7yrv1rl5f4vtk5fo463ts33
>> >
>> > Best,
>> > Yuxin
>>


RE: Flink Kubernetes Operator 1.9.0 release planning

2024-06-11 Thread David Radley
I agree – thanks for driving this Gyula.

From: Rui Fan <1996fan...@gmail.com>
Date: Tuesday, 11 June 2024 at 02:52
To: dev@flink.apache.org 
Cc: Mate Czagany 
Subject: [EXTERNAL] Re: Flink Kubernetes Operator 1.9.0 release planning
Thanks Gyula for driving this release!

> I suggest we cut the release branch this week after merging current
> outstanding smaller PRs.

It makes sense to me.

Best,
Rui

On Mon, Jun 10, 2024 at 3:05 PM Gyula Fóra  wrote:

> Hi all!
>
> I want to kick off the discussion / release process for the Flink
> Kubernetes Operator 1.9.0 version.
>
> The last, 1.8.0, version was released in March and since then we have had a
> number of important fixes. Furthermore there are some bigger pieces of
> outstanding work in the form of open PRs such as the Savepoint CRD work
> which should only be merged to 1.10.0 to gain more exposure/stability.
>
> I suggest we cut the release branch this week after merging current
> outstanding smaller PRs.
>
> I volunteer as the release manager but if someone else would like to do it,
> I would also be happy to assist.
>
> Please let me know what you think.
>
> Cheers,
> Gyula
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-11 Thread Hang Ruan
+1(non-binding)

- Verified signatures
- Verified hashsums
- Checked Github release tag
- Source archives with no binary files
- Reviewed the flink-web PR
- Checked the jar build with jdk 1.8

Best,
Hang

gongzhongqiang  于2024年6月11日周二 15:53写道:

> +1(non-binding)
>
> - Verified signatures and sha512
> - Checked Github release tag exsit
> - Source archives with no binary files
> - Build the source with jdk8 on ubuntu 22.04 succeed
> - Reviewed the flink-web PR
>
> Best,
> Zhongqiang Gong
>
> Hong Liang  于2024年6月6日周四 23:39写道:
>
> > Hi everyone,
> > Please review and vote on the release candidate #1 for the flink v1.19.1,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint B78A5EA1 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.19.1-rc1" [5],
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Hong
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354399
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.1-rc1/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1736/
> > [5] https://github.com/apache/flink/releases/tag/release-1.19.1-rc1
> > [6] https://github.com/apache/flink-web/pull/745
> >
>


Re: [VOTE] FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-11 Thread Rui Fan
+1(binding)

Best,
Rui

On Tue, Jun 11, 2024 at 4:14 PM Muhammet Orazov
 wrote:

> +1 (non-binding)
>
> Thanks Yuxin for driving this!
>
> Best,
> Muhammet
>
>
> On 2024-06-07 08:02, Yuxin Tan wrote:
> > Hi everyone,
> >
> > Thanks for all the feedback about the FLIP-459 Support Flink
> > hybrid shuffle integration with Apache Celeborn[1].
> > The discussion thread is here [2].
> >
> > I'd like to start a vote for it. The vote will be open for at least
> > 72 hours unless there is an objection or insufficient votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-459%3A+Support+Flink+hybrid+shuffle+integration+with+Apache+Celeborn
> > [2] https://lists.apache.org/thread/gy7sm7qrf7yrv1rl5f4vtk5fo463ts33
> >
> > Best,
> > Yuxin
>


Re: [DISCUSS] Connector releases for Flink 1.19

2024-06-11 Thread Danny Cranmer
Hey Sergey,

I have completed the 3 tasks. Let me know if you need anything else.

Thanks,
Danny

On Tue, Jun 11, 2024 at 9:11 AM Danny Cranmer 
wrote:

> Thanks for driving this Sergey, I will pick up the PMC tasks.
>
> Danny
>
> On Sun, Jun 9, 2024 at 11:09 PM Sergey Nuyanzin 
> wrote:
>
>> Hi everyone,
>>
>> as you might noticed the voting threads for release of flink-opensearch
>> connectors (v1, v2) received 3+ binding votes[1][2]
>>
>> Now I need PMC help to finish the release of these 2.
>> I'm on the step "Deploy source and binary releases to dist.apache.org"[3]
>> for both versions
>> An attempt to execute this step says that access is forbidden, I assume
>> that only PMC have access to do it
>>
>> I guess there are 3 things which should be done and where only PMC have
>> access and where I need help
>> 1. As mentioned deploying to dist.apache.org
>> 2. Mark version as released on jira
>> 3. Place a record on reporter.apache.org about the release
>>
>> Would be great if someone from PMC could help here
>>
>> [1] https://lists.apache.org/thread/kno142xdzqg592tszmnonk69nl14xszl
>> [2] https://lists.apache.org/thread/by44cdpfv6p9394vwxhh1vzh3rfskzms
>> [3]
>> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+flink-connector+release#Creatingaflinkconnectorrelease-Deploysourceandbinaryreleasestodist.apache.org
>>
>> On Fri, May 31, 2024 at 7:53 AM Sergey Nuyanzin 
>> wrote:
>>
>>> Hi Jing,
>>> >Thanks for the hint wrt JDBC connector. Where could users know that it
>>> > already supports 1.19?
>>>
>>> There is no released version supporting/tested against 1.19
>>> However the support was added within [1] and currently
>>> there is an active RC in voting stage containing this fix [2]
>>>
>>> [1]
>>> https://github.com/apache/flink-connector-jdbc/commit/7025642d88ff661e486745b23569595e1813a1d0
>>> [2] https://lists.apache.org/thread/b7xbjo4crt1527ldksw4nkwo8vs56csy
>>>
>>
>>
>> --
>> Best regards,
>> Sergey
>>
>


Re: [VOTE] FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-11 Thread Muhammet Orazov

+1 (non-binding)

Thanks Yuxin for driving this!

Best,
Muhammet


On 2024-06-07 08:02, Yuxin Tan wrote:

Hi everyone,

Thanks for all the feedback about the FLIP-459 Support Flink
hybrid shuffle integration with Apache Celeborn[1].
The discussion thread is here [2].

I'd like to start a vote for it. The vote will be open for at least
72 hours unless there is an objection or insufficient votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-459%3A+Support+Flink+hybrid+shuffle+integration+with+Apache+Celeborn
[2] https://lists.apache.org/thread/gy7sm7qrf7yrv1rl5f4vtk5fo463ts33

Best,
Yuxin


Re: [DISCUSS] Connector releases for Flink 1.19

2024-06-11 Thread Danny Cranmer
Thanks for driving this Sergey, I will pick up the PMC tasks.

Danny

On Sun, Jun 9, 2024 at 11:09 PM Sergey Nuyanzin  wrote:

> Hi everyone,
>
> as you might noticed the voting threads for release of flink-opensearch
> connectors (v1, v2) received 3+ binding votes[1][2]
>
> Now I need PMC help to finish the release of these 2.
> I'm on the step "Deploy source and binary releases to dist.apache.org"[3]
> for both versions
> An attempt to execute this step says that access is forbidden, I assume
> that only PMC have access to do it
>
> I guess there are 3 things which should be done and where only PMC have
> access and where I need help
> 1. As mentioned deploying to dist.apache.org
> 2. Mark version as released on jira
> 3. Place a record on reporter.apache.org about the release
>
> Would be great if someone from PMC could help here
>
> [1] https://lists.apache.org/thread/kno142xdzqg592tszmnonk69nl14xszl
> [2] https://lists.apache.org/thread/by44cdpfv6p9394vwxhh1vzh3rfskzms
> [3]
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+flink-connector+release#Creatingaflinkconnectorrelease-Deploysourceandbinaryreleasestodist.apache.org
>
> On Fri, May 31, 2024 at 7:53 AM Sergey Nuyanzin 
> wrote:
>
>> Hi Jing,
>> >Thanks for the hint wrt JDBC connector. Where could users know that it
>> > already supports 1.19?
>>
>> There is no released version supporting/tested against 1.19
>> However the support was added within [1] and currently
>> there is an active RC in voting stage containing this fix [2]
>>
>> [1]
>> https://github.com/apache/flink-connector-jdbc/commit/7025642d88ff661e486745b23569595e1813a1d0
>> [2] https://lists.apache.org/thread/b7xbjo4crt1527ldksw4nkwo8vs56csy
>>
>
>
> --
> Best regards,
> Sergey
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-11 Thread Muhammet Orazov

Congratulations Rui, well deserved!

Best,
Muhammet

On 2024-06-05 10:01, Piotr Nowojski wrote:

Hi everyone,

On behalf of the PMC, I'm very happy to announce another new Apache 
Flink

PMC Member - Fan Rui.

Rui has been active in the community since August 2019. During this 
time he

has contributed a lot of new features. Among others:
  - Decoupling Autoscaler from Kubernetes Operator, and supporting
Standalone Autoscaler
  - Improvements to checkpointing, flamegraphs, restart strategies,
watermark alignment, network shuffles
  - Optimizing the memory and CPU usage of large operators, greatly
reducing the risk and probability of TaskManager OOM

He reviewed a significant amount of PRs and has been active both on the
mailing lists and in Jira helping to both maintain and grow Apache 
Flink's

community. He is also our current Flink 1.20 release manager.

In the last 12 months, Rui has been the most active contributor in the
Flink Kubernetes Operator project, while being the 2nd most active 
Flink

contributor at the same time.

Please join me in welcoming and congratulating Fan Rui!

Best,
Piotrek (on behalf of the Flink PMC)


Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-11 Thread gongzhongqiang
+1(non-binding)

- Verified signatures and sha512
- Checked Github release tag exsit
- Source archives with no binary files
- Build the source with jdk8 on ubuntu 22.04 succeed
- Reviewed the flink-web PR

Best,
Zhongqiang Gong

Hong Liang  于2024年6月6日周四 23:39写道:

> Hi everyone,
> Please review and vote on the release candidate #1 for the flink v1.19.1,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint B78A5EA1 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.19.1-rc1" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Hong
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354399
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.1-rc1/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1736/
> [5] https://github.com/apache/flink/releases/tag/release-1.19.1-rc1
> [6] https://github.com/apache/flink-web/pull/745
>


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Sergey Nuyanzin
Thanks for starting this discussion Danny

I will put my 5 cents here

>From one side yes, support of new Flink release takes time as it was
mentioned above
However from another side most of the connectors (main/master branches)
supported Flink 1.19
even before it was released, same for 1.20 since they were testing against
master and supported version branches.
There are already nightly/weekly jobs (depending on connector)
running against the latest Flink SNAPSHOTs. And it has already helped to
catch some blocker issues like[1], [2].
In fact there are more, I need to spend time retrieving all of them.

I would also not vote for connector mono-repo release since we recently
just splitted it.

The thing I would suggest:
since we already have nightly/weekly jobs for connectors testing against
Flink main repo master branch
we could add a requirement before the release of Flink itself having these
job results also green.

[1] https://issues.apache.org/jira/browse/FLINK-34941
[2] https://issues.apache.org/jira/browse/FLINK-32978#comment-17804459

On Tue, Jun 11, 2024 at 8:24 AM Xintong Song  wrote:

> Thanks for bringing this up, Danny. This is indeed an important issue that
> the community needs to improve on.
>
> Personally, I think a mono-repo might not be a bad idea, if we apply
> different rules for the connector releases. To be specific:
> - flink-connectors 1.19.x contains all connectors that are compatible with
> Flink 1.19.x.
> - allow not only bug-fixes, but also new features for a third-digit release
> (e.g., flink-connectors 1.19.1)
>
> This would allow us to immediately release flink-connectors 1.19.0 right
> after flink 1.19.0 is out, excluding connectors that are no longer
> compatible with flink 1.19. Then we can have a couple of flink-connectors
> 1.19.x releases, gradually adding the missing connectors back. In the worst
> case, this would result in as many releases as having separated connector
> repose. The benefit comes from 1) there are chances to combine releasing of
> multiple connectors into one release of the mono repo (if they are ready
> around the same time), and 2) no need to maintain a compatibility matrix
> and worrying about it being out-of-sync with the code base.
>
> However, one thing I don't like about this approach is that it requires
> combining all the repos we just separated from the main-repo to another
> mono-repo. That back-and-forth is annoying. So I'm just speaking out my
> ideas, but would not strongly insist on this.
>
> And big +1 for compatibility tools and ci checks.
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jun 11, 2024 at 2:38 AM David Radley 
> wrote:
>
> > Hi Danny,
> > I think your proposal is a good one. This is the approach that we took
> > with the Egeria project, firstly taking the connectors out of the main
> > repo, then connectors having their own versions that incremented
> > organically rather then tied to the core release.
> >
> > Blue sky thinking - I wonder if we could :
> > - have a wizard / utility so the user inputs which Flink level they want
> > and which connectors; the utility knows the compatibility matrix and
> > downloads the appropriate bundles.
> > - have the docs interrogate the core and connector repos to check the
> poms
> > for the Flink levels and the pr builds to have ?live? docs showing the
> > supported Flink levels. PyTorch does something like this for it?s docs.
> >
> > Kind regards, David.
> >
> >
> >
> > From: Danny Cranmer 
> > Date: Monday, 10 June 2024 at 17:26
> > To: dev 
> > Subject: [EXTERNAL] [DISCUSS] Connector Externalization Retrospective
> > Hello Flink community,
> >
> > It has been over 2 years [1] since we started externalizing the Flink
> > connectors to dedicated repositories from the main Flink code base. The
> > past discussions can be found here [2]. The community decided to
> > externalize the connectors to primarily 1/ improve stability and speed of
> > the CI, and 2/ decouple version and release lifecycle to allow the
> projects
> > to evolve independently. The outcome of this has resulted in each
> connector
> > requiring a dedicated release per Flink minor version, which is a burden
> on
> > the community. Flink 1.19.0 was released on 2024-03-18 [3], the first
> > supported connector followed roughly 2.5 months later on 2024-06-06 [4]
> > (MongoDB). There are still 5 connectors that do not support Flink 1.19
> [5].
> >
> > Two decisions contribute to the high lag between releases. 1/ creating
> one
> > repository per connector instead of a single flink-connector mono-repo
> and
> > 2/ coupling the Flink version to the connector version [6]. A single
> > connector repository would reduce the number of connector releases from N
> > to 1, but would couple the connector CI and reduce release flexibility.
> > Decoupling the connector versions from Flink would eliminate the need to
> > release each connector for each new Flink minor version, but we would
> need
> > a new compatibility mechanism.
> >
> > I 

Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-11 Thread Leonard Xu
+1 (binding)

- verified signatures
- verified hashsums
- checked Github release tag 
- checked release notes
- reviewed all Jira issues for 1.19.1 have been resolved
- reviewed the web PR 

Best,
Leonard

> 2024年6月11日 下午3:19,Sergey Nuyanzin  写道:
> 
> +1 (non-binding)
> 
> - Downloaded all the artifacts
> - Verified checksums and signatures
> - Verified that source archives do not contain any binaries
> - Built from source with jdk8
> - Ran a simple wordcount job on local standalone cluster
> 
> On Tue, Jun 11, 2024 at 8:36 AM Matthias Pohl  wrote:
> 
>> +1 (binding)
>> 
>> * Downloaded all artifacts
>> * Extracted sources and ran compilation on sources
>> * Diff of git tag checkout with downloaded sources
>> * Verified SHA512 & GPG checksums
>> * Checked that all POMs have the right expected version
>> * Generated diffs to compare pom file changes with NOTICE files
>> * Verified WordCount in batch mode and streaming mode with a standalone
>> session cluster to verify the logs: no suspicious behavior observed
>> 
>> Best,
>> Matthias
>> 
>> On Mon, Jun 10, 2024 at 12:54 PM Hong Liang  wrote:
>> 
>>> Thanks for testing the release candidate, everyone. Nice to see coverage
>> on
>>> different types of testing being done.
>>> 
>>> I've addressed the comments on the web PR - thanks Rui Fan for good
>>> comments, and for the reminder from Ahmed :)
>>> 
>>> We have <24 hours on the vote wait time, and still waiting on 1 more
>>> binding vote!
>>> 
>>> Regards,
>>> Hong
>>> 
>>> On Sat, Jun 8, 2024 at 11:33 PM Ahmed Hamdy 
>> wrote:
>>> 
 Hi Hong,
 Thanks for driving
 
 +1 (non-binding)
 
 - Verified signatures and hashes
 - Checked github release tag
 - Verified licenses
 - Checked that the source code does not contain binaries
 - Reviewed Web PR, nit: Could we address the comment of adding
>>> FLINK-34633
 in the release
 
 
 Best Regards
 Ahmed Hamdy
 
 
 On Sat, 8 Jun 2024 at 22:22, Jeyhun Karimov 
>>> wrote:
 
> Hi Hong,
> 
> Thanks for driving the release.
> +1 (non-binding)
> 
> - Verified gpg signature
> - Reviewed the PR
> - Verified sha512
> - Checked github release tag
> - Checked that the source code does not contain binaries
> 
> Regards,
> Jeyhun
> 
> On Sat, Jun 8, 2024 at 1:52 PM weijie guo >> 
> wrote:
> 
>> Thanks Hong!
>> 
>> +1(binding)
>> 
>> - Verified gpg signature
>> - Verified sha512 hash
>> - Checked gh release tag
>> - Checked all artifacts deployed to maven repo
>> - Ran a simple wordcount job on local standalone cluster
>> - Compiled from source code with JDK 1.8.0_291.
>> 
>> Best regards,
>> 
>> Weijie
>> 
>> 
>> Xiqian YU  于2024年6月7日周五 18:23写道:
>> 
>>> +1 (non-binding)
>>> 
>>> 
>>>  *   Checked download links & release tags
>>>  *   Verified that package checksums matched
>>>  *   Compiled Flink from source code with JDK 8 / 11
>>>  *   Ran E2e data integration test jobs on local cluster
>>> 
>>> Regards,
>>> yux
>>> 
>>> De : Rui Fan <1996fan...@gmail.com>
>>> Date : vendredi, 7 juin 2024 à 17:14
>>> À : dev@flink.apache.org 
>>> Objet : Re: [VOTE] Release 1.19.1, release candidate #1
>>> +1(binding)
>>> 
>>> - Reviewed the flink-web PR (Left some comments)
>>> - Checked Github release tag
>>> - Verified signatures
>>> - Verified sha512 (hashsums)
>>> - The source archives do not contain any binaries
>>> - Build the source with Maven 3 and java8 (Checked the license as
 well)
>>> - Start the cluster locally with jdk8, and run the
 StateMachineExample
>> job,
>>> it works fine.
>>> 
>>> Best,
>>> Rui
>>> 
>>> On Thu, Jun 6, 2024 at 11:39 PM Hong Liang 
>>> wrote:
>>> 
 Hi everyone,
 Please review and vote on the release candidate #1 for the
>> flink
>> v1.19.1,
 as follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific
 comments)
 
 
 The complete staging area is available for your review, which
> includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience
 releases
> to
>>> be
 deployed to dist.apache.org [2], which are signed with the key
 with
 fingerprint B78A5EA1 [3],
 * all artifacts to be deployed to the Maven Central Repository
>>> [4],
 * source code tag "release-1.19.1-rc1" [5],
 * website pull request listing the new release and adding
> announcement
>>> blog
 post [6].
 
 The vote will be open for at least 72 hours. It is adopted by
> majority
 approval, with at least 3 PMC affirmative votes.
 
 Thanks,
 Hong
 
 

[jira] [Created] (FLINK-35564) The topic cannot be distributed on subtask when calculatePartitionOwner returns -1

2024-06-11 Thread Jira
中国无锡周良 created FLINK-35564:
--

 Summary: The topic cannot be distributed on subtask when 
calculatePartitionOwner returns -1
 Key: FLINK-35564
 URL: https://issues.apache.org/jira/browse/FLINK-35564
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Pulsar
Affects Versions: 1.17.2
Reporter: 中国无锡周良


The topic cannot be distributed on subtask when calculatePartitionOwner returns 
-1
{code:java}
@VisibleForTesting
static int calculatePartitionOwner(String topic, int partitionId, int 
parallelism) {
int startIndex = ((topic.hashCode() * 31) & 0x7FFF) % parallelism;
/*
 * Here, the assumption is that the id of Pulsar partitions are always 
ascending starting from
 * 0. Therefore, can be used directly as the offset clockwise from the 
start index.
 */
return (startIndex + partitionId) % parallelism;
} {code}
Here startIndex is a non-negative number calculated based on topic.hashCode() 
and in the range [0, parallelism-1].

For non-partitioned topic. partitionId is NON_PARTITION_ID = -1;

but
{code:java}
@Override
public Optional> createAssignment(
List readers) {
if (pendingPartitionSplits.isEmpty() || readers.isEmpty()) {
return Optional.empty();
}

Map> assignMap =
new HashMap<>(pendingPartitionSplits.size());

for (Integer reader : readers) {
Set splits = 
pendingPartitionSplits.remove(reader);
if (splits != null && !splits.isEmpty()) {
assignMap.put(reader, new ArrayList<>(splits));
}
}

if (assignMap.isEmpty()) {
return Optional.empty();
} else {
return Optional.of(new SplitsAssignment<>(assignMap));
}
} {code}
pendingPartitionSplits can't possibly have a value of -1, right? The 
calculation method of the topic by the above return 1, pendingPartitionSplits. 
Remove (reader), forever is null; This topic will not be assigned to a subtask; 
And I simulated this topic locally and found that messages were indeed not 
processed;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-11 Thread Sergey Nuyanzin
+1 (non-binding)

- Downloaded all the artifacts
- Verified checksums and signatures
- Verified that source archives do not contain any binaries
- Built from source with jdk8
- Ran a simple wordcount job on local standalone cluster

On Tue, Jun 11, 2024 at 8:36 AM Matthias Pohl  wrote:

> +1 (binding)
>
> * Downloaded all artifacts
> * Extracted sources and ran compilation on sources
> * Diff of git tag checkout with downloaded sources
> * Verified SHA512 & GPG checksums
> * Checked that all POMs have the right expected version
> * Generated diffs to compare pom file changes with NOTICE files
> * Verified WordCount in batch mode and streaming mode with a standalone
> session cluster to verify the logs: no suspicious behavior observed
>
> Best,
> Matthias
>
> On Mon, Jun 10, 2024 at 12:54 PM Hong Liang  wrote:
>
> > Thanks for testing the release candidate, everyone. Nice to see coverage
> on
> > different types of testing being done.
> >
> > I've addressed the comments on the web PR - thanks Rui Fan for good
> > comments, and for the reminder from Ahmed :)
> >
> > We have <24 hours on the vote wait time, and still waiting on 1 more
> > binding vote!
> >
> > Regards,
> > Hong
> >
> > On Sat, Jun 8, 2024 at 11:33 PM Ahmed Hamdy 
> wrote:
> >
> > > Hi Hong,
> > > Thanks for driving
> > >
> > > +1 (non-binding)
> > >
> > > - Verified signatures and hashes
> > > - Checked github release tag
> > > - Verified licenses
> > > - Checked that the source code does not contain binaries
> > > - Reviewed Web PR, nit: Could we address the comment of adding
> > FLINK-34633
> > > in the release
> > >
> > >
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Sat, 8 Jun 2024 at 22:22, Jeyhun Karimov 
> > wrote:
> > >
> > > > Hi Hong,
> > > >
> > > > Thanks for driving the release.
> > > > +1 (non-binding)
> > > >
> > > > - Verified gpg signature
> > > > - Reviewed the PR
> > > > - Verified sha512
> > > > - Checked github release tag
> > > > - Checked that the source code does not contain binaries
> > > >
> > > > Regards,
> > > > Jeyhun
> > > >
> > > > On Sat, Jun 8, 2024 at 1:52 PM weijie guo  >
> > > > wrote:
> > > >
> > > > > Thanks Hong!
> > > > >
> > > > > +1(binding)
> > > > >
> > > > > - Verified gpg signature
> > > > > - Verified sha512 hash
> > > > > - Checked gh release tag
> > > > > - Checked all artifacts deployed to maven repo
> > > > > - Ran a simple wordcount job on local standalone cluster
> > > > > - Compiled from source code with JDK 1.8.0_291.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Weijie
> > > > >
> > > > >
> > > > > Xiqian YU  于2024年6月7日周五 18:23写道:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > >
> > > > > >   *   Checked download links & release tags
> > > > > >   *   Verified that package checksums matched
> > > > > >   *   Compiled Flink from source code with JDK 8 / 11
> > > > > >   *   Ran E2e data integration test jobs on local cluster
> > > > > >
> > > > > > Regards,
> > > > > > yux
> > > > > >
> > > > > > De : Rui Fan <1996fan...@gmail.com>
> > > > > > Date : vendredi, 7 juin 2024 à 17:14
> > > > > > À : dev@flink.apache.org 
> > > > > > Objet : Re: [VOTE] Release 1.19.1, release candidate #1
> > > > > > +1(binding)
> > > > > >
> > > > > > - Reviewed the flink-web PR (Left some comments)
> > > > > > - Checked Github release tag
> > > > > > - Verified signatures
> > > > > > - Verified sha512 (hashsums)
> > > > > > - The source archives do not contain any binaries
> > > > > > - Build the source with Maven 3 and java8 (Checked the license as
> > > well)
> > > > > > - Start the cluster locally with jdk8, and run the
> > > StateMachineExample
> > > > > job,
> > > > > > it works fine.
> > > > > >
> > > > > > Best,
> > > > > > Rui
> > > > > >
> > > > > > On Thu, Jun 6, 2024 at 11:39 PM Hong Liang 
> > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > > Please review and vote on the release candidate #1 for the
> flink
> > > > > v1.19.1,
> > > > > > > as follows:
> > > > > > > [ ] +1, Approve the release
> > > > > > > [ ] -1, Do not approve the release (please provide specific
> > > comments)
> > > > > > >
> > > > > > >
> > > > > > > The complete staging area is available for your review, which
> > > > includes:
> > > > > > > * JIRA release notes [1],
> > > > > > > * the official Apache source release and binary convenience
> > > releases
> > > > to
> > > > > > be
> > > > > > > deployed to dist.apache.org [2], which are signed with the key
> > > with
> > > > > > > fingerprint B78A5EA1 [3],
> > > > > > > * all artifacts to be deployed to the Maven Central Repository
> > [4],
> > > > > > > * source code tag "release-1.19.1-rc1" [5],
> > > > > > > * website pull request listing the new release and adding
> > > > announcement
> > > > > > blog
> > > > > > > post [6].
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours. It is adopted by
> > > > majority
> > > > > > > approval, with at least 3 PMC affirmative votes.
> > > > > > >
> > > > > > > 

Re: [VOTE] Release 1.19.1, release candidate #1

2024-06-11 Thread Matthias Pohl
+1 (binding)

* Downloaded all artifacts
* Extracted sources and ran compilation on sources
* Diff of git tag checkout with downloaded sources
* Verified SHA512 & GPG checksums
* Checked that all POMs have the right expected version
* Generated diffs to compare pom file changes with NOTICE files
* Verified WordCount in batch mode and streaming mode with a standalone
session cluster to verify the logs: no suspicious behavior observed

Best,
Matthias

On Mon, Jun 10, 2024 at 12:54 PM Hong Liang  wrote:

> Thanks for testing the release candidate, everyone. Nice to see coverage on
> different types of testing being done.
>
> I've addressed the comments on the web PR - thanks Rui Fan for good
> comments, and for the reminder from Ahmed :)
>
> We have <24 hours on the vote wait time, and still waiting on 1 more
> binding vote!
>
> Regards,
> Hong
>
> On Sat, Jun 8, 2024 at 11:33 PM Ahmed Hamdy  wrote:
>
> > Hi Hong,
> > Thanks for driving
> >
> > +1 (non-binding)
> >
> > - Verified signatures and hashes
> > - Checked github release tag
> > - Verified licenses
> > - Checked that the source code does not contain binaries
> > - Reviewed Web PR, nit: Could we address the comment of adding
> FLINK-34633
> > in the release
> >
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Sat, 8 Jun 2024 at 22:22, Jeyhun Karimov 
> wrote:
> >
> > > Hi Hong,
> > >
> > > Thanks for driving the release.
> > > +1 (non-binding)
> > >
> > > - Verified gpg signature
> > > - Reviewed the PR
> > > - Verified sha512
> > > - Checked github release tag
> > > - Checked that the source code does not contain binaries
> > >
> > > Regards,
> > > Jeyhun
> > >
> > > On Sat, Jun 8, 2024 at 1:52 PM weijie guo 
> > > wrote:
> > >
> > > > Thanks Hong!
> > > >
> > > > +1(binding)
> > > >
> > > > - Verified gpg signature
> > > > - Verified sha512 hash
> > > > - Checked gh release tag
> > > > - Checked all artifacts deployed to maven repo
> > > > - Ran a simple wordcount job on local standalone cluster
> > > > - Compiled from source code with JDK 1.8.0_291.
> > > >
> > > > Best regards,
> > > >
> > > > Weijie
> > > >
> > > >
> > > > Xiqian YU  于2024年6月7日周五 18:23写道:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > >
> > > > >   *   Checked download links & release tags
> > > > >   *   Verified that package checksums matched
> > > > >   *   Compiled Flink from source code with JDK 8 / 11
> > > > >   *   Ran E2e data integration test jobs on local cluster
> > > > >
> > > > > Regards,
> > > > > yux
> > > > >
> > > > > De : Rui Fan <1996fan...@gmail.com>
> > > > > Date : vendredi, 7 juin 2024 à 17:14
> > > > > À : dev@flink.apache.org 
> > > > > Objet : Re: [VOTE] Release 1.19.1, release candidate #1
> > > > > +1(binding)
> > > > >
> > > > > - Reviewed the flink-web PR (Left some comments)
> > > > > - Checked Github release tag
> > > > > - Verified signatures
> > > > > - Verified sha512 (hashsums)
> > > > > - The source archives do not contain any binaries
> > > > > - Build the source with Maven 3 and java8 (Checked the license as
> > well)
> > > > > - Start the cluster locally with jdk8, and run the
> > StateMachineExample
> > > > job,
> > > > > it works fine.
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Thu, Jun 6, 2024 at 11:39 PM Hong Liang 
> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > > Please review and vote on the release candidate #1 for the flink
> > > > v1.19.1,
> > > > > > as follows:
> > > > > > [ ] +1, Approve the release
> > > > > > [ ] -1, Do not approve the release (please provide specific
> > comments)
> > > > > >
> > > > > >
> > > > > > The complete staging area is available for your review, which
> > > includes:
> > > > > > * JIRA release notes [1],
> > > > > > * the official Apache source release and binary convenience
> > releases
> > > to
> > > > > be
> > > > > > deployed to dist.apache.org [2], which are signed with the key
> > with
> > > > > > fingerprint B78A5EA1 [3],
> > > > > > * all artifacts to be deployed to the Maven Central Repository
> [4],
> > > > > > * source code tag "release-1.19.1-rc1" [5],
> > > > > > * website pull request listing the new release and adding
> > > announcement
> > > > > blog
> > > > > > post [6].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. It is adopted by
> > > majority
> > > > > > approval, with at least 3 PMC affirmative votes.
> > > > > >
> > > > > > Thanks,
> > > > > > Hong
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354399
> > > > > > [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-1.19.1-rc1/
> > > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > > [4]
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1736/
> > > > > > [5]
> > https://github.com/apache/flink/releases/tag/release-1.19.1-rc1
> > > > > > [6] https://github.com/apache/flink-web/pull/745
> > > 

Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-11 Thread Jiangang Liu
Congratulations, Fan Rui!


Best,
Jiangang Liu

Jacky Lau  于2024年6月11日周二 13:04写道:

> Congratulations Rui, well deserved!
>
> Regards,
> Jacky Lau
>
> Jeyhun Karimov 于2024年6月11日 周二03:49写道:
>
> > Congratulations Rui, well deserved!
> >
> > Regards,
> > Jeyhun
> >
> > On Mon, Jun 10, 2024, 10:21 Ahmed Hamdy  wrote:
> >
> > > Congratulations Rui!
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Mon, 10 Jun 2024 at 09:10, David Radley 
> > > wrote:
> > >
> > > > Congratulations, Rui!
> > > >
> > > > From: Sergey Nuyanzin 
> > > > Date: Sunday, 9 June 2024 at 20:33
> > > > To: dev@flink.apache.org 
> > > > Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink PMC Member - Fan
> > Rui
> > > > Congratulations, Rui!
> > > >
> > > > On Fri, Jun 7, 2024 at 5:36 AM Xia Sun  wrote:
> > > >
> > > > > Congratulations, Rui!
> > > > >
> > > > > Best,
> > > > > Xia
> > > > >
> > > > > Paul Lam  于2024年6月6日周四 11:59写道:
> > > > >
> > > > > > Congrats, Rui!
> > > > > >
> > > > > > Best,
> > > > > > Paul Lam
> > > > > >
> > > > > > > 2024年6月6日 11:02,Junrui Lee  写道:
> > > > > > >
> > > > > > > Congratulations, Rui.
> > > > > > >
> > > > > > > Best,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Hang Ruan  于2024年6月6日周四 10:35写道:
> > > > > > >
> > > > > > >> Congratulations, Rui!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Hang
> > > > > > >>
> > > > > > >> Samrat Deb  于2024年6月6日周四 10:28写道:
> > > > > > >>
> > > > > > >>> Congratulations Rui
> > > > > > >>>
> > > > > > >>> Bests,
> > > > > > >>> Samrat
> > > > > > >>>
> > > > > > >>> On Thu, 6 Jun 2024 at 7:45 AM, Yuxin Tan <
> > tanyuxinw...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Yuxin
> > > > > > 
> > > > > > 
> > > > > >  Xuannan Su  于2024年6月6日周四 09:58写道:
> > > > > > 
> > > > > > > Congratulations!
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Xuannan
> > > > > > >
> > > > > > > On Thu, Jun 6, 2024 at 9:53 AM Hangxiang Yu <
> > > master...@gmail.com
> > > > >
> > > > > > >>> wrote:
> > > > > > >>
> > > > > > >> Congratulations, Rui !
> > > > > > >>
> > > > > > >> On Thu, Jun 6, 2024 at 9:18 AM Lincoln Lee <
> > > > > lincoln.8...@gmail.com
> > > > > > >>>
> > > > > > > wrote:
> > > > > > >>
> > > > > > >>> Congratulations, Rui!
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Lincoln Lee
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Lijie Wang  于2024年6月6日周四
> > 09:11写道:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Lijie
> > > > > > 
> > > > > >  Rodrigo Meneses  于2024年6月5日周三
> > 21:35写道:
> > > > > > 
> > > > > > > All the best
> > > > > > >
> > > > > > > On Wed, Jun 5, 2024 at 5:56 AM xiangyu feng <
> > > > > >  xiangyu...@gmail.com>
> > > > > >  wrote:
> > > > > > >
> > > > > > >> Congratulations, Rui!
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >> Xiangyu Feng
> > > > > > >>
> > > > > > >> Feng Jin  于2024年6月5日周三
> 20:42写道:
> > > > > > >>
> > > > > > >>> Congratulations, Rui!
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Feng Jin
> > > > > > >>>
> > > > > > >>> On Wed, Jun 5, 2024 at 8:23 PM Yanfei Lei <
> > > > > >  fredia...@gmail.com
> > > > > > >>
> > > > > >  wrote:
> > > > > > >>>
> > > > > >  Congratulations, Rui!
> > > > > > 
> > > > > >  Best,
> > > > > >  Yanfei
> > > > > > 
> > > > > >  Luke Chen  于2024年6月5日周三 20:08写道:
> > > > > > >
> > > > > > > Congrats, Rui!
> > > > > > >
> > > > > > > Luke
> > > > > > >
> > > > > > > On Wed, Jun 5, 2024 at 8:02 PM Jiabao Sun <
> > > > > > >>> jiabao...@apache.org>
> > > > > > >>> wrote:
> > > > > > >
> > > > > > >> Congrats, Rui. Well-deserved!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Jiabao
> > > > > > >>
> > > > > > >> Zhanghao Chen 
> > > > > > >>> 于2024年6月5日周三
> > > > > >  19:29写道:
> > > > > > >>
> > > > > > >>> Congrats, Rui!
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Zhanghao Chen
> > > > > > >>> 
> > > > > > >>> From: Piotr Nowojski 
> > > > > > >>> Sent: Wednesday, June 5, 2024 18:01
> > > > > > >>> To: dev ; rui fan <
> > > > > >  1996fan...@gmail.com>
> > > > > > >>> Subject: [ANNOUNCE] New Apache Flink PMC Member -
> > > > > > >>> Fan
> > > > > > > 

Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Xintong Song
Thanks for bringing this up, Danny. This is indeed an important issue that
the community needs to improve on.

Personally, I think a mono-repo might not be a bad idea, if we apply
different rules for the connector releases. To be specific:
- flink-connectors 1.19.x contains all connectors that are compatible with
Flink 1.19.x.
- allow not only bug-fixes, but also new features for a third-digit release
(e.g., flink-connectors 1.19.1)

This would allow us to immediately release flink-connectors 1.19.0 right
after flink 1.19.0 is out, excluding connectors that are no longer
compatible with flink 1.19. Then we can have a couple of flink-connectors
1.19.x releases, gradually adding the missing connectors back. In the worst
case, this would result in as many releases as having separated connector
repose. The benefit comes from 1) there are chances to combine releasing of
multiple connectors into one release of the mono repo (if they are ready
around the same time), and 2) no need to maintain a compatibility matrix
and worrying about it being out-of-sync with the code base.

However, one thing I don't like about this approach is that it requires
combining all the repos we just separated from the main-repo to another
mono-repo. That back-and-forth is annoying. So I'm just speaking out my
ideas, but would not strongly insist on this.

And big +1 for compatibility tools and ci checks.

Best,

Xintong



On Tue, Jun 11, 2024 at 2:38 AM David Radley 
wrote:

> Hi Danny,
> I think your proposal is a good one. This is the approach that we took
> with the Egeria project, firstly taking the connectors out of the main
> repo, then connectors having their own versions that incremented
> organically rather then tied to the core release.
>
> Blue sky thinking - I wonder if we could :
> - have a wizard / utility so the user inputs which Flink level they want
> and which connectors; the utility knows the compatibility matrix and
> downloads the appropriate bundles.
> - have the docs interrogate the core and connector repos to check the poms
> for the Flink levels and the pr builds to have ?live? docs showing the
> supported Flink levels. PyTorch does something like this for it?s docs.
>
> Kind regards, David.
>
>
>
> From: Danny Cranmer 
> Date: Monday, 10 June 2024 at 17:26
> To: dev 
> Subject: [EXTERNAL] [DISCUSS] Connector Externalization Retrospective
> Hello Flink community,
>
> It has been over 2 years [1] since we started externalizing the Flink
> connectors to dedicated repositories from the main Flink code base. The
> past discussions can be found here [2]. The community decided to
> externalize the connectors to primarily 1/ improve stability and speed of
> the CI, and 2/ decouple version and release lifecycle to allow the projects
> to evolve independently. The outcome of this has resulted in each connector
> requiring a dedicated release per Flink minor version, which is a burden on
> the community. Flink 1.19.0 was released on 2024-03-18 [3], the first
> supported connector followed roughly 2.5 months later on 2024-06-06 [4]
> (MongoDB). There are still 5 connectors that do not support Flink 1.19 [5].
>
> Two decisions contribute to the high lag between releases. 1/ creating one
> repository per connector instead of a single flink-connector mono-repo and
> 2/ coupling the Flink version to the connector version [6]. A single
> connector repository would reduce the number of connector releases from N
> to 1, but would couple the connector CI and reduce release flexibility.
> Decoupling the connector versions from Flink would eliminate the need to
> release each connector for each new Flink minor version, but we would need
> a new compatibility mechanism.
>
> I propose that from each next connector release we drop the coupling on the
> Flink version. For example, instead of 3.4.0-1.20 (.) we
> would release 3.4.0 (). We can model a compatibility matrix
> within the Flink docs to help users pick the correct versions. This would
> mean we would usually not need to release a new connector version per Flink
> version, assuming there are no breaking changes. Worst case, if breaking
> changes impact all connectors we would still need to release all
> connectors. However, for Flink 1.17 and 1.18 there were only a handful of
> issues (breaking changes), and mostly impacting tests. We could decide to
> align this with Flink 2.0, however I see no compelling reason to do so.
> This was discussed previously [2] as a long term goal once the connector
> APIs are stable. But I think the current compatibility rules support this
> change now.
>
> I would prefer to not create a connector mono-repo. Separate repos gives
> each connector more flexibility to evolve independently, and removing
> unnecessary releases will significantly reduce the release effort.
>
> I would like to hear opinions and ideas from the community. In particular,
> are there any other issues you have observed that we should consider
> addressing?
>
> Thanks,
> 

Re: [DISCUSSION] FLIP-456: CompiledPlan support for Batch Execution Mode

2024-06-11 Thread Timo Walther

Hi Alexey,

thanks for proposing this FLIP. It is a nice continuation of the vision 
we had for CompiledPlan when writing and implementing FLIP-190. The 
whole stack is prepared for serializing BatchExecNodes as well so it 
shouldn't be too hard to make this a reality.


> I think the FLIP should be clear on the backwards support strategy
> here. The strategy for streaming is "forever".  This may be the most
> interesting part of the FLIP to discuss.

I agree with Jim. We shouldn't put too much burden on us (the Flink 
community). BatchExecNodes can evolve quicker than StreamExecNodes as 
the state component isn't an issue. Backwards compatibility of 2-3 Flink 
versions and at least 1 year of time should be enough for batch 
infrastructure to update. Of course we should avoid breaking changes 
whenever possible. This should be written down in the FLIP.


Regards,
Timo




On 07.06.24 23:10, Jim Hughes wrote:

Hi Alexey,

Responses inline below:

On Mon, May 13, 2024 at 7:18 PM Alexey Leonov-Vendrovskiy <
vendrov...@gmail.com> wrote:


Thanks Jim.


1. For the testing, I'd call the tests "execution" tests rather than
"restore" tests.  For streaming execution, restore tests have the

compiled

plan and intermediate state; the tests verify that those can work

together

and continue processing.


Agree that we don't need to store and restore the intermediate state. So
the most critical part is that the CompiledPlan for batch can be executed.



On the FLIP, can you be more specific about what we are checking during
execution?  I'd suggest that `executeSql(_)` and
`executePlan(compilePlanSql(_))` should be compared.



2. The FLIP implicitly suggests "completeness tests" (to use FLIP-190's

words).  Do we need "change detection tests"?  I'm a little unsure if

that

is presently happening in an automatic way for streaming operators.



  We might need to elaborate more on this, but the idea is that  we need to
make sure that compiled plans created by an older version of SQL Planner
are executable on newer runtimes.

3.  Can we remove old versions of batch operators eventually?  Or do we

need to keep them forever like we would for streaming operators?



We could have deprecation paths for old operator nodes in some cases. It is
a matter of the time window: what could be practical the "time distance"
between query planner and flink runtime against which the query query can
be resubmitted.
Note, here we don't have continuous queries, so there is always an option
to "re-plan" the original SQL query text into a newer version of the
CompiledPlan.
With this in mind, a time window of 1yr+ would allow deprecation of older
batch exec nodes, though I don't see this as a frequent event.



As I read the JavaDocs for `TableEnvironment.loadPlan`, it looks like the
compiled plan ought to be sufficient to run a job at a later time.

I think the FLIP should be clear on the backwards support strategy here.
The strategy for streaming is "forever".  This may be the most interesting
part of the FLIP to discuss.

Can you let us know when you've updated the FLIP?

Cheers,

Jim



-Alexey



On Mon, May 13, 2024 at 1:52 PM Jim Hughes 
wrote:


Hi Alexey,

After some thought, I have a question about deprecations:

3.  Can we remove old versions of batch operators eventually?  Or do we
need to keep them forever like we would for streaming operators?

Cheers,

Jim

On Thu, May 9, 2024 at 11:29 AM Jim Hughes  wrote:


Hi Alexey,

Overall, the FLIP looks good and makes sense to me.

1. For the testing, I'd call the tests "execution" tests rather than
"restore" tests.  For streaming execution, restore tests have the

compiled

plan and intermediate state; the tests verify that those can work

together

and continue processing.

For batch execution, I think we just want that all existing compiled

plans

can be executed in future versions.

2. The FLIP implicitly suggests "completeness tests" (to use FLIP-190's
words).  Do we need "change detection tests"?  I'm a little unsure if

that

is presently happening in an automatic way for streaming operators.

In RestoreTestBase, generateTestSetupFiles is disabled and has to be

run

manually when tests are being written.

Cheers,

Jim

On Tue, May 7, 2024 at 5:11 AM Paul Lam  wrote:


Hi Alexey,

Thanks a lot for bringing up the discussion. I’m big +1 for the FLIP.

I suppose the goal doesn’t involve the interchangeability of json

plans

between batch mode and streaming mode, right?
In other words, a json plan compiled in a batch program can’t be run

in

streaming mode without a migration (which is not yet supported).

Best,
Paul Lam


2024年5月7日 14:38,Alexey Leonov-Vendrovskiy 

写道:


Hi everyone,

PTAL at the proposed FLIP-456: CompiledPlan support for Batch

Execution

Mode. It is pretty self-describing.

Any thoughts are welcome!

Thanks,
Alexey

[1]






https://cwiki.apache.org/confluence/display/FLINK/FLIP-456%3A+CompiledPlan+support+for+Batch+Execution+Mode

.













Re: [VOTE] FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-11 Thread Junrui Lee
+1 (non-binding)

Best,
Junrui

Venkatakrishnan Sowrirajan  于2024年6月10日周一 02:37写道:

> Thanks for adding this new support. +1 (non-binding)
>
> On Sat, Jun 8, 2024, 3:26 PM Ahmed Hamdy  wrote:
>
> > +1 (non-binding)
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Sat, 8 Jun 2024 at 22:26, Jeyhun Karimov 
> wrote:
> >
> > > Hi Yuxin,
> > >
> > > Thanks for driving this.
> > > +1 (non-binding)
> > >
> > > Regards,
> > > Jeyhun
> > >
> > > On Fri, Jun 7, 2024 at 6:05 PM Jim Hughes  >
> > > wrote:
> > >
> > > > HI all,
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Cheers,
> > > >
> > > > Jim
> > > >
> > > > On Fri, Jun 7, 2024 at 4:03 AM Yuxin Tan 
> > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thanks for all the feedback about the FLIP-459 Support Flink
> > > > > hybrid shuffle integration with Apache Celeborn[1].
> > > > > The discussion thread is here [2].
> > > > >
> > > > > I'd like to start a vote for it. The vote will be open for at least
> > > > > 72 hours unless there is an objection or insufficient votes.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-459*3A*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;JSsrKysrKysr!!IKRxdwAv5BmarQ!ckFoYTA_CEJtmHerB8mgy6Ch-q-fxi9kFARc6zxNNnRWcM7t8wBqzSK-1MCQQ7GaOUjSS618gftZ5GDYc3ynGz4$
> > > > > [2]
> >
> https://urldefense.com/v3/__https://lists.apache.org/thread/gy7sm7qrf7yrv1rl5f4vtk5fo463ts33__;!!IKRxdwAv5BmarQ!ckFoYTA_CEJtmHerB8mgy6Ch-q-fxi9kFARc6zxNNnRWcM7t8wBqzSK-1MCQQ7GaOUjSS618gftZ5GDY_O3QZuY$
> > > > >
> > > > > Best,
> > > > > Yuxin
> > > > >
> > > >
> > >
> >
>


Re: [Discuss] Non-retriable (partial) errors in Elasticsearch 8 connector

2024-06-11 Thread Mingliang Liu
Thank you Ahmed for the explanation.

The current Elasticsearch 8 connector already uses the
FatalExeptionClassifier for fatal / non-retriable requests [1]. It's very
similar to what you linked in the AWS connectors. Currently this is only
used for fully failed requests. The main problem I was concerned about is
the partial failures, when the Future of the client bulk request was not
completed exceptionally, but instead some items failed according to the
response. For those failed entries in the partially failed request, we
retry infinitely though retrying will not always help.

To avoid problems of "too many failed but non-retryable request entries in
the buffer", I was thinking we can fail fast instead of infinitely
retrying. Alternatively, we can limit the maximum number of retrying per
rerecord. Like FLINK-35541 you shared for AWS connectors, I think a similar
approach in Elasticsearch 8 connector would be useful. Given a
non-retriable request entry, it will retry the request entries anyway but
will eventually fail after exhausting the retries. Having both sound like a
more comprehensive solution, as following sample:

void handlePartiallyUnprocessedRequest(
Response response, Consumer requestResult) {
List requestsToRetry = new ArrayList<>();

for (Request r : response.failedItems()) {
if (!isRetryable(r.errorCode())// we don't have this
check for ES 8, which could be 400 / 404
|| r.retryCount++ > maxRetry) {   // FLINK-35541 could
help limit retries for all failed requests
throw new FlinkRuntimeException();
}
requestsToRetry.add(r);
}

requestResult.accept(requestsToRetry);
}

[1]
https://github.com/apache/flink-connector-elasticsearch/blob/da2ef1fa6d5edd3cf1328b11632929fd2c99f567/flink-connector-elasticsearch8/src/main/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8AsyncWriter.java#L73-L82


On Fri, Jun 7, 2024 at 3:42 AM Ahmed Hamdy  wrote:

> Hi Mingliang,
>
> We already have a mechanism for detecting and propagating
> Fatal/Non-retryable exceptions[1]. We can use that in ElasticSearch similar
> to what we do for AWS connectors[2]. Also, you can check AWS connectors for
> how to add a fail-fast mechanism to disable retrying all along.
>
> > FLIP-451 proposes timeout for retrying which helps with un-acknowledged
> > requests, but not addressing the case when request gets processed and
> > failed items keep failing no matter how many times we retry. Correct me
> if
> > I'm wrong
> >
> yes you are correct, this is mainly to mitigate the issues arising from
> incorrect handling of requests in sink implementers.
> The Failure handling itself has always been assumed to be the Sink
> implementation responsibility, this is done in 3 levels
> - Classifying Fatal exceptions as mentioned above
> - Adding configuration to disable retries as mentioned above as well.
> - Adding mechanism to limit retries as in the proposed ticket for AWS
> connectors[3]
>
> In my opinion at least 1 and 3 are useful in this case for Elasticsearch,
> Adding classifiers and retry mechanisms for elasticsearch.
>
> Or we can allow users to configure
> > "drop/fail" behavior for non-retriable errors
> >
>
> I am not sure I follow this proposal, but in general while "Dropping"
> records seems to boost reliability, it breaks the at-least-once semantics
> and if you don't have proper tracing and debugging mechanisms we will be
> shooting ourselves in the foot.
>
>
> 1-
>
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java
> 2-
>
> https://github.com/apache/flink-connector-aws/blob/c688a8545ac1001c8450e8c9c5fe8bbafa13aeba/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/sink/DynamoDbSinkWriter.java#L227
>
> 3-https://issues.apache.org/jira/browse/FLINK-35541
> Best Regards
> Ahmed Hamdy
>
>
> On Thu, 6 Jun 2024 at 06:53, Mingliang Liu  wrote:
>
> > Hi all,
> >
> > Currently the Elasticsearch 8 connector retries all items if the request
> > fails as a whole, and retries failed items if the request has partial
> > failures [1]. I think this infinitely retries might be problematic in
> some
> > cases when retrying can never eventually succeed. For example, if the
> > request is 400 (bad request) or 404 (not found), retries do not help. If
> > there are too many failed items non-retriable, new requests will get
> > processed less effectively. In extreme cases, it may stall the pipeline
> if
> > in-flight requests are occupied by those failed items.
> >
> > FLIP-451 proposes timeout for retrying which helps with un-acknowledged
> > requests, but not addressing the case when request gets processed and
> > failed items keep failing no matter how many times we retry. Correct me
> if
> > I'm wrong.
> >
> > One opinionated option is