Re: Re: [VOTE] FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-05-10 Thread Shengkai Fang
+1 (binding)

Best,
Shengkai

Ron Liu  于2024年5月10日周五 12:07写道:

> +1(binding)
>
> Best,
> Ron
>
> Jark Wu  于2024年5月10日周五 09:51写道:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Thu, 9 May 2024 at 21:27, Lincoln Lee  wrote:
> >
> > > +1 (binding)
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Feng Jin  于2024年5月9日周四 19:45写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > >
> > > > Best,
> > > > Feng
> > > >
> > > >
> > > > On Thu, May 9, 2024 at 7:37 PM Xuyang  wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best!
> > > > > Xuyang
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2024-05-09 13:57:07, "Ron Liu"  wrote:
> > > > > >Sorry for the re-post, just to format this email content.
> > > > > >
> > > > > >Hi Dev
> > > > > >
> > > > > >Thank you to everyone for the feedback on FLIP-448: Introduce
> > > Pluggable
> > > > > >Workflow Scheduler Interface for Materialized Table[1][2].
> > > > > >I'd like to start a vote for it. The vote will be open for at
> least
> > 72
> > > > > >hours unless there is an objection or not enough votes.
> > > > > >
> > > > > >[1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> > > > > >
> > > > > >[2]
> > https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> > > > > >
> > > > > >Best,
> > > > > >Ron
> > > > > >
> > > > > >Ron Liu  于2024年5月9日周四 13:52写道:
> > > > > >
> > > > > >> Hi Dev, Thank you to everyone for the feedback on FLIP-448:
> > > Introduce
> > > > > >> Pluggable Workflow Scheduler Interface for Materialized
> > Table[1][2].
> > > > I'd
> > > > > >> like to start a vote for it. The vote will be open for at least
> 72
> > > > hours
> > > > > >> unless there is an objection or not enough votes. [1]
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> > > > > >>
> > > > > >> [2]
> > > https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> > > > > >> Best, Ron
> > > > > >>
> > > > >
> > > >
> > >
> >
>


Re: flink-connector-kafka weekly CI job failing

2024-05-10 Thread Hang Ruan
Hi, all.

I see there is already an issue [1] about this problem.
We could copy the new class `TypeSerializerConditions` into the Kafka
connector, as was done in [2], which fixed the same failure for 1.18-SNAPSHOT.

I would like to help with this.

Best,
Hang

[1] https://issues.apache.org/jira/browse/FLINK-35109
[2] https://issues.apache.org/jira/browse/FLINK-32455

Hang Ruan  于2024年5月11日周六 09:44写道:

> Hi, all.
>
> The class `TypeSerializerMatchers` has been deleted in Flink version
> 1.20-SNAPSHOT.
> If we need to compile the Kafka connector against both 1.19 and 1.20, I
> think we have to copy `TypeSerializerMatchers` into the Kafka connector,
> but that is not a good idea.
> Besides this, I find that the flink-core test-jar does not contain classes
> like `TypeSerializer`; we would have to add flink-core with the provided
> scope.
>
> I am not sure what is the best way to fix this.
>
> Best,
> Hang
>
> Danny Cranmer  于2024年5月11日周六 04:30写道:
>
>> Hello,
>>
>> Is there a reason we cannot fix the code rather than disabling the test?
>> If
>> we skip the tests this will likely be missed and cause delays for 1.20
>> support down the road.
>>
>> Thanks,
>> Danny
>>
>> On Wed, 8 May 2024, 23:35 Robert Young,  wrote:
>>
>> > Hi,
>> >
>> > I noticed the flink-connector-kafka weekly CI job is failing:
>> >
>> > https://github.com/apache/flink-connector-kafka/actions/runs/8954222477
>> >
>> > Looks like flink-connector-kafka main has a compile error against flink
>> > 1.20-SNAPSHOT, I tried locally and get a different compile failure
>> >
>> > KafkaSerializerUpgradeTest.java:[23,45] cannot find symbol
>> > [ERROR]   symbol:   class TypeSerializerMatchers
>> > [ERROR]   location: package org.apache.flink.api.common.typeutils
>> >
>> > Should 1.20-SNAPSHOT be removed from the weekly tests for now?
>> >
>> > Thanks
>> > Rob
>> >
>>
>


Re: [DISCUSSION] FLIP-450: Improve Runtime Configuration for Flink 2.0

2024-05-10 Thread Rui Fan
Thanks Xuannan for the update!

LGTM, +1 for this proposal.

Best,
Rui

On Sat, May 11, 2024 at 10:20 AM Xuannan Su  wrote:

> Hi Rui,
>
> Thanks for the suggestion!
>
> I updated the description of
> taskmanager.network.memory.max-overdraft-buffers-per-gate and
> hard-coded it to 20.
>
> Best regards,
> Xuannan
>
> On Mon, May 6, 2024 at 11:28 AM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > Thanks Xuannan for driving this proposal!
> >
> > > taskmanager.network.memory.max-overdraft-buffers-per-gate will be
> removed
> > and hard-coded to either 10 or 20.
> >
> > Currently, it's a public option. Could we determine the value of
> > the overdraft buffer in the current FLIP?
> >
> > I vote 20 as the hard code value due to 2 reasons:
> > - Removing this option means users cannot change it, it might be better
> to
> > turn it up.
> > - Most of tasks don't use the overdraft buffer, so increasing it doesn't
> > introduce more risk.
> >
> > Best,
> > Rui
> >
> > On Mon, May 6, 2024 at 10:47 AM Yuxin Tan 
> wrote:
> >
> > > Thanks for the effort, Xuannan.
> > >
> > > +1 for the proposal.
> > >
> > > Best,
> > > Yuxin
> > >
> > >
> > > Xintong Song  于2024年4月29日周一 15:40写道:
> > >
> > > > Thanks for driving this effort, Xuannan.
> > > >
> > > > +1 for the proposed changes.
> > > >
> > > > Just one suggestion: Some of the proposed changes involve not solely
> > > > changing the configuration options, but are bound to changing /
> removal
> > > of
> > > > certain features. E.g., the removal of hash-blocking shuffle and
> legacy
> > > > hybrid shuffle mode, and the behavior change of overdraft network
> > > buffers.
> > > > Therefore, it might be nicer to provide an implementation plan with a
> > > list
> > > > of related tasks in the FLIP. This should not block the FLIP though.
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > >
> > > > On Thu, Apr 25, 2024 at 4:35 PM Xuannan Su 
> > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to start a discussion on FLIP-450: Improve Runtime
> > > > > Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we have
> > > > > revisited all runtime configurations and identified several
> > > > > improvements to enhance user-friendliness and maintainability. In
> this
> > > > > FLIP, we aim to refine the runtime configuration.
> > > > >
> > > > > Looking forward to everyone's feedback and suggestions. Thank you!
> > > > >
> > > > > Best regards,
> > > > > Xuannan
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-450%3A+Improve+Runtime+Configuration+for+Flink+2.0
> > > > >
> > > >
> > >
>


Re: [DISCUSSION] FLIP-450: Improve Runtime Configuration for Flink 2.0

2024-05-10 Thread Xuannan Su
Hi Rui,

Thanks for the suggestion!

I updated the description of
taskmanager.network.memory.max-overdraft-buffers-per-gate and
hard-coded it to 20.

Best regards,
Xuannan

On Mon, May 6, 2024 at 11:28 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Thanks Xuannan for driving this proposal!
>
> > taskmanager.network.memory.max-overdraft-buffers-per-gate will be removed
> and hard-coded to either 10 or 20.
>
> Currently, it's a public option. Could we determine the value of
> the overdraft buffer in the current FLIP?
>
> I vote 20 as the hard code value due to 2 reasons:
> - Removing this option means users cannot change it, it might be better to
> turn it up.
> - Most of tasks don't use the overdraft buffer, so increasing it doesn't
> introduce more risk.
>
> Best,
> Rui
>
> On Mon, May 6, 2024 at 10:47 AM Yuxin Tan  wrote:
>
> > Thanks for the effort, Xuannan.
> >
> > +1 for the proposal.
> >
> > Best,
> > Yuxin
> >
> >
> > Xintong Song  于2024年4月29日周一 15:40写道:
> >
> > > Thanks for driving this effort, Xuannan.
> > >
> > > +1 for the proposed changes.
> > >
> > > Just one suggestion: Some of the proposed changes involve not solely
> > > changing the configuration options, but are bound to changing / removal
> > of
> > > certain features. E.g., the removal of hash-blocking shuffle and legacy
> > > hybrid shuffle mode, and the behavior change of overdraft network
> > buffers.
> > > Therefore, it might be nicer to provide an implementation plan with a
> > list
> > > of related tasks in the FLIP. This should not block the FLIP though.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Thu, Apr 25, 2024 at 4:35 PM Xuannan Su 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to start a discussion on FLIP-450: Improve Runtime
> > > > Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we have
> > > > revisited all runtime configurations and identified several
> > > > improvements to enhance user-friendliness and maintainability. In this
> > > > FLIP, we aim to refine the runtime configuration.
> > > >
> > > > Looking forward to everyone's feedback and suggestions. Thank you!
> > > >
> > > > Best regards,
> > > > Xuannan
> > > >
> > > > [1]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-450%3A+Improve+Runtime+Configuration+for+Flink+2.0
> > > >
> > >
> >


Re: [DISCUSSION] FLIP-450: Improve Runtime Configuration for Flink 2.0

2024-05-10 Thread Junrui Lee
Thanks Xuannan for driving this! +1 for this proposal.

Best,
Junrui

Rui Fan <1996fan...@gmail.com> 于2024年5月6日周一 11:27写道:

> Thanks Xuannan for driving this proposal!
>
> > taskmanager.network.memory.max-overdraft-buffers-per-gate will be removed
> and hard-coded to either 10 or 20.
>
> Currently, it's a public option. Could we determine the value of
> the overdraft buffer in the current FLIP?
>
> I vote 20 as the hard code value due to 2 reasons:
> - Removing this option means users cannot change it, it might be better to
> turn it up.
> - Most of tasks don't use the overdraft buffer, so increasing it doesn't
> introduce more risk.
>
> Best,
> Rui
>
> On Mon, May 6, 2024 at 10:47 AM Yuxin Tan  wrote:
>
> > Thanks for the effort, Xuannan.
> >
> > +1 for the proposal.
> >
> > Best,
> > Yuxin
> >
> >
> > Xintong Song  于2024年4月29日周一 15:40写道:
> >
> > > Thanks for driving this effort, Xuannan.
> > >
> > > +1 for the proposed changes.
> > >
> > > Just one suggestion: Some of the proposed changes involve not solely
> > > changing the configuration options, but are bound to changing / removal
> > of
> > > certain features. E.g., the removal of hash-blocking shuffle and legacy
> > > hybrid shuffle mode, and the behavior change of overdraft network
> > buffers.
> > > Therefore, it might be nicer to provide an implementation plan with a
> > list
> > > of related tasks in the FLIP. This should not block the FLIP though.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Thu, Apr 25, 2024 at 4:35 PM Xuannan Su 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to start a discussion on FLIP-450: Improve Runtime
> > > > Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we have
> > > > revisited all runtime configurations and identified several
> > > > improvements to enhance user-friendliness and maintainability. In
> this
> > > > FLIP, we aim to refine the runtime configuration.
> > > >
> > > > Looking forward to everyone's feedback and suggestions. Thank you!
> > > >
> > > > Best regards,
> > > > Xuannan
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-450%3A+Improve+Runtime+Configuration+for+Flink+2.0
> > > >
> > >
> >
>


Re: [DISCUSSION] FLIP-450: Improve Runtime Configuration for Flink 2.0

2024-05-10 Thread Xuannan Su
Hi Xintong,

Thanks for the suggestion! I updated the FLIP to include a list of
related tasks.

Best regards,
Xuannan

On Mon, Apr 29, 2024 at 3:40 PM Xintong Song  wrote:
>
> Thanks for driving this effort, Xuannan.
>
> +1 for the proposed changes.
>
> Just one suggestion: Some of the proposed changes involve not solely
> changing the configuration options, but are bound to changing / removal of
> certain features. E.g., the removal of hash-blocking shuffle and legacy
> hybrid shuffle mode, and the behavior change of overdraft network buffers.
> Therefore, it might be nicer to provide an implementation plan with a list
> of related tasks in the FLIP. This should not block the FLIP though.
>
> Best,
>
> Xintong
>
>
>
> On Thu, Apr 25, 2024 at 4:35 PM Xuannan Su  wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion on FLIP-450: Improve Runtime
> > Configuration for Flink 2.0 [1]. As Flink moves toward 2.0, we have
> > revisited all runtime configurations and identified several
> > improvements to enhance user-friendliness and maintainability. In this
> > FLIP, we aim to refine the runtime configuration.
> >
> > Looking forward to everyone's feedback and suggestions. Thank you!
> >
> > Best regards,
> > Xuannan
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-450%3A+Improve+Runtime+Configuration+for+Flink+2.0
> >


[VOTE] Apache Flink CDC Release 3.1.0, release candidate #3

2024-05-10 Thread Qingsheng Ren
Hi everyone,

Please review and vote on the release candidate #3 for the version 3.1.0 of
Apache Flink CDC, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Release Overview**

As an overview, the release consists of the following:
a) Flink CDC source release to be deployed to dist.apache.org
b) Maven artifacts to be deployed to the Maven Central Repository

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as follows,
for your review:
* All artifacts for a) can be found in the corresponding dev repository at
dist.apache.org [1], which are signed with the key with fingerprint
A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
* All artifacts for b) can be found at the Apache Nexus Repository [3]

Other links for your review:
* JIRA release notes [4]
* Source code tag "release-3.1.0-rc3" with commit hash
5452f30b704942d0ede64ff3d4c8699d39c63863 [5]
* PR for release announcement blog post of Flink CDC 3.1.0 in flink-web [6]

**Vote Duration**

The voting time will run for at least 72 hours, adopted by majority
approval with at least 3 PMC affirmative votes.

Thanks,
Qingsheng Ren

[1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc3/
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3] https://repository.apache.org/content/repositories/orgapacheflink-1733
[4]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
[5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc3
[6] https://github.com/apache/flink-web/pull/739


Re: flink-connector-kafka weekly CI job failing

2024-05-10 Thread Hang Ruan
Hi, all.

The class `TypeSerializerMatchers` has been deleted in Flink version
1.20-SNAPSHOT.
If we need to compile the Kafka connector against both 1.19 and 1.20, I
think we have to copy `TypeSerializerMatchers` into the Kafka connector,
but that is not a good idea.
Besides this, I find that the flink-core test-jar does not contain classes
like `TypeSerializer`; we would have to add flink-core with the provided
scope.
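A sketch of what the dependency declarations might look like in the
connector's pom.xml, assuming the standard flink-core coordinates (the
`flink.version` property name is illustrative):

```xml
<!-- flink-core itself in provided scope, so classes such as
     TypeSerializer resolve at compile time without being bundled -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-core</artifactId>
  <version>${flink.version}</version>
  <scope>provided</scope>
</dependency>
<!-- the test-jar carrying the serializer test utilities -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-core</artifactId>
  <version>${flink.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
```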

I am not sure what is the best way to fix this.

Best,
Hang

Danny Cranmer  于2024年5月11日周六 04:30写道:

> Hello,
>
> Is there a reason we cannot fix the code rather than disabling the test? If
> we skip the tests this will likely be missed and cause delays for 1.20
> support down the road.
>
> Thanks,
> Danny
>
> On Wed, 8 May 2024, 23:35 Robert Young,  wrote:
>
> > Hi,
> >
> > I noticed the flink-connector-kafka weekly CI job is failing:
> >
> > https://github.com/apache/flink-connector-kafka/actions/runs/8954222477
> >
> > Looks like flink-connector-kafka main has a compile error against flink
> > 1.20-SNAPSHOT, I tried locally and get a different compile failure
> >
> > KafkaSerializerUpgradeTest.java:[23,45] cannot find symbol
> > [ERROR]   symbol:   class TypeSerializerMatchers
> > [ERROR]   location: package org.apache.flink.api.common.typeutils
> >
> > Should 1.20-SNAPSHOT be removed from the weekly tests for now?
> >
> > Thanks
> > Rob
> >
>


Re: [VOTE] Apache Flink CDC Release 3.1.0, release candidate #2

2024-05-10 Thread Qingsheng Ren
Thanks for your check! I mistakenly used Java 11 to build the binary. I'll
create another RC for validation and sorry for the confusion!

This RC is now cancelled.

Best,
Qingsheng

On Fri, May 10, 2024 at 7:12 PM Jiabao Sun  wrote:

> Thanks Qingsheng,
>
> I think we should build the dist jar with JDK 8 instead of JDK 11.
> WDYT?
>
> Best,
> Jiabao
>
>
> Muhammet Orazov  于2024年5月10日周五 19:02写道:
>
> > Hey Qingsheng,
> >
> > Thanks, I have run the following steps:
> >
> > - Checked sha512sum hash
> > - Checked GPG signature
> > - Built the source with JDK 11 & 8
> > - Checked that src doesn't contain binary files
> >
> > GitHub web PR is as before with suggestions requested.
> >
> > Best,
> > Muhammet
> >
> > On 2024-05-10 09:59, Qingsheng Ren wrote:
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #2 for the version
> > > 3.1.0 of
> > > Apache Flink CDC, as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > **Release Overview**
> > >
> > > As an overview, the release consists of the following:
> > > a) Flink CDC source release to be deployed to dist.apache.org
> > > b) Maven artifacts to be deployed to the Maven Central Repository
> > >
> > > **Staging Areas to Review**
> > >
> > > The staging areas containing the above mentioned artifacts are as
> > > follows,
> > > for your review:
> > > * All artifacts for a) can be found in the corresponding dev repository
> > > at
> > > dist.apache.org [1], which are signed with the key with fingerprint
> > > A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> > > * All artifacts for b) can be found at the Apache Nexus Repository [3]
> > >
> > > Other links for your review:
> > > * JIRA release notes [4]
> > > * Source code tag "release-3.1.0-rc2" with commit hash
> > > f3ce608903014d410eecc8ad62c7a9d9a27a6a19 [5]
> > > * PR for release announcement blog post of Flink CDC 3.1.0 in flink-web
> > > [6]
> > >
> > > **Vote Duration**
> > >
> > > The voting time will run for at least 72 hours, adopted by majority
> > > approval with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Qingsheng Ren
> > >
> > > [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc2/
> > > [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [3]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1732
> > > [4]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> > > [5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc2
> > > [6] https://github.com/apache/flink-web/pull/739
> >
>


Re: Alignment with FlinkSQL Standards | Feedback requested

2024-05-10 Thread Jeyhun Karimov
Dear Kanchi,

Thanks for your proposal.
I think a similar feature is available in the Flink Pulsar connector. So,
from the SQL semantics and philosophy standpoint, there should not be any
issues IMHO.
Could you please elaborate more on the motivation (e.g., user requests,
enabling new use cases for Kafka, etc.) behind introducing a similar
feature for the Kafka connector?

Regards,
Jeyhun

On Fri, May 10, 2024 at 8:20 AM Kanchi Masalia
 wrote:

> Dear Flink Community,
>
> I hope this message finds you well. I am considering the idea of extending
> FlinkSQL's capabilities to include direct queries from Kafka topics,
> specifically through the use of "SELECT * FROM kafka_topic". Currently, as
> we are aware, FlinkSQL permits queries like "SELECT * FROM
> flink_sql_table".
>
> I am writing to seek your valued opinions and insights on whether extending
> this capability to include direct Kafka topic queries would align with the
> guidelines and architectural philosophy of FlinkSQL. This potential feature
> could offer a more streamlined approach for users to access and analyze
> data directly from Kafka potentially simplifying workflow processes.
>
> Here are a few points I would like your input on:
>
>1. Does this idea align with the current and future usage philosophy of
>FlinkSQL?
>2. Are there potential challenges or conflicts this feature might pose
>against existing FlinkSQL guidelines?
>3. Would this enhancement align with how you currently use or plan to
>use Flink in your data processing tasks?
>
> Your feedback will be instrumental in guiding the direction of this
> potential feature.
>
> Thank you very much for considering my inquiry. I am looking forward to
> your insights and hope this can spark a constructive discussion within our
> community.
>
> Thanks,
> Kanchi Masalia
>


Re: [VOTE] FLIP-449: Reorganization of flink-connector-jdbc

2024-05-10 Thread Jeyhun Karimov
Thanks for driving this!

+1 (non-binding)

Regards,
Jeyhun

On Fri, May 10, 2024 at 12:50 PM Muhammet Orazov
 wrote:

> Thanks João for your efforts and driving this!
>
> +1 (non-binding)
>
> Best,
> Muhammet
>
> On 2024-05-09 12:01, Joao Boto wrote:
> > Hi everyone,
> >
> > Thanks for all the feedback, I'd like to start a vote on the FLIP-449:
> > Reorganization of flink-connector-jdbc [1].
> > The discussion thread is here [2].
> >
> > The vote will be open for at least 72 hours unless there is an
> > objection or
> > insufficient votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-449%3A+Reorganization+of+flink-connector-jdbc
> > [2] https://lists.apache.org/thread/jc1yvvo35xwqzlxl5mj77qw3hq6f5sgr
> >
> > Best
> > Joao Boto
>


Re: flink-connector-kafka weekly CI job failing

2024-05-10 Thread Danny Cranmer
Hello,

Is there a reason we cannot fix the code rather than disabling the test? If
we skip the tests this will likely be missed and cause delays for 1.20
support down the road.

Thanks,
Danny

On Wed, 8 May 2024, 23:35 Robert Young,  wrote:

> Hi,
>
> I noticed the flink-connector-kafka weekly CI job is failing:
>
> https://github.com/apache/flink-connector-kafka/actions/runs/8954222477
>
> Looks like flink-connector-kafka main has a compile error against flink
> 1.20-SNAPSHOT, I tried locally and get a different compile failure
>
> KafkaSerializerUpgradeTest.java:[23,45] cannot find symbol
> [ERROR]   symbol:   class TypeSerializerMatchers
> [ERROR]   location: package org.apache.flink.api.common.typeutils
>
> Should 1.20-SNAPSHOT be removed from the weekly tests for now?
>
> Thanks
> Rob
>


Re: [VOTE] Apache Flink CDC Release 3.1.0, release candidate #2

2024-05-10 Thread Jiabao Sun
Thanks Qingsheng,

I think we should build the dist jar with JDK 8 instead of JDK 11.
WDYT?
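For reference, one way to tell which JDK compiled the classes in a dist jar
is the class-file major version (52 = Java 8, 55 = Java 11), readable from
bytes 6-7 of any `.class` entry (e.g. `unzip -p the-dist.jar Some.class`;
the jar and class names would depend on the actual artifact). The sketch
below synthesizes a minimal class-file header instead, so it is
self-contained:

```shell
# Bytes 0-3: 0xCAFEBABE magic; bytes 4-5: minor version; bytes 6-7: major
# version, big-endian. Major 52 (0x34) means the class targets Java 8.
printf '\312\376\272\276\000\000\000\064' > Fake.class
major=$(od -An -j6 -N2 -tu1 Fake.class | awk '{print $1*256 + $2}')
echo "major=$major"   # prints "major=52" -> compiled for Java 8
```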

Best,
Jiabao


Muhammet Orazov  于2024年5月10日周五 19:02写道:

> Hey Qingsheng,
>
> Thanks, I have run the following steps:
>
> - Checked sha512sum hash
> - Checked GPG signature
> - Built the source with JDK 11 & 8
> - Checked that src doesn't contain binary files
>
> GitHub web PR is as before with suggestions requested.
>
> Best,
> Muhammet
>
> On 2024-05-10 09:59, Qingsheng Ren wrote:
> > Hi everyone,
> >
> > Please review and vote on the release candidate #2 for the version
> > 3.1.0 of
> > Apache Flink CDC, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Flink CDC source release to be deployed to dist.apache.org
> > b) Maven artifacts to be deployed to the Maven Central Repository
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as
> > follows,
> > for your review:
> > * All artifacts for a) can be found in the corresponding dev repository
> > at
> > dist.apache.org [1], which are signed with the key with fingerprint
> > A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> > * All artifacts for b) can be found at the Apache Nexus Repository [3]
> >
> > Other links for your review:
> > * JIRA release notes [4]
> > * Source code tag "release-3.1.0-rc2" with commit hash
> > f3ce608903014d410eecc8ad62c7a9d9a27a6a19 [5]
> > * PR for release announcement blog post of Flink CDC 3.1.0 in flink-web
> > [6]
> >
> > **Vote Duration**
> >
> > The voting time will run for at least 72 hours, adopted by majority
> > approval with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Qingsheng Ren
> >
> > [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc2/
> > [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [3]
> > https://repository.apache.org/content/repositories/orgapacheflink-1732
> > [4]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> > [5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc2
> > [6] https://github.com/apache/flink-web/pull/739
>


Re: [VOTE] Apache Flink CDC Release 3.1.0, release candidate #2

2024-05-10 Thread Muhammet Orazov

Hey Qingsheng,

Thanks, I have run the following steps:

- Checked sha512sum hash
- Checked GPG signature
- Built the source with JDK 11 & 8
- Checked that src doesn't contain binary files
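The checksum step of such a verification can be scripted; a minimal sketch,
demonstrated on a local stand-in file (for a real RC, the tarball and its
.sha512 would be downloaded from dist.apache.org, and the signature checked
separately with `gpg --verify`):

```shell
# Stand-in for the downloaded source release tarball; a real check would
# skip these two lines and use the published .sha512 file instead.
set -e
printf 'stand-in for the source release tarball\n' > artifact.tgz
sha512sum artifact.tgz > artifact.tgz.sha512
# Verify the hash listed in the .sha512 file against the artifact.
sha512sum -c artifact.tgz.sha512   # prints "artifact.tgz: OK"
```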

GitHub web PR is as before with suggestions requested.

Best,
Muhammet

On 2024-05-10 09:59, Qingsheng Ren wrote:

Hi everyone,

Please review and vote on the release candidate #2 for the version 
3.1.0 of

Apache Flink CDC, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Release Overview**

As an overview, the release consists of the following:
a) Flink CDC source release to be deployed to dist.apache.org
b) Maven artifacts to be deployed to the Maven Central Repository

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as 
follows,

for your review:
* All artifacts for a) can be found in the corresponding dev repository 
at

dist.apache.org [1], which are signed with the key with fingerprint
A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
* All artifacts for b) can be found at the Apache Nexus Repository [3]

Other links for your review:
* JIRA release notes [4]
* Source code tag "release-3.1.0-rc2" with commit hash
f3ce608903014d410eecc8ad62c7a9d9a27a6a19 [5]
* PR for release announcement blog post of Flink CDC 3.1.0 in flink-web 
[6]


**Vote Duration**

The voting time will run for at least 72 hours, adopted by majority
approval with at least 3 PMC affirmative votes.

Thanks,
Qingsheng Ren

[1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc2/
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3] 
https://repository.apache.org/content/repositories/orgapacheflink-1732

[4]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
[5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc2
[6] https://github.com/apache/flink-web/pull/739


Re: [VOTE] FLIP-449: Reorganization of flink-connector-jdbc

2024-05-10 Thread Muhammet Orazov

Thanks João for your efforts and driving this!

+1 (non-binding)

Best,
Muhammet

On 2024-05-09 12:01, Joao Boto wrote:

Hi everyone,

Thanks for all the feedback, I'd like to start a vote on the FLIP-449:
Reorganization of flink-connector-jdbc [1].
The discussion thread is here [2].

The vote will be open for at least 72 hours unless there is an 
objection or

insufficient votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-449%3A+Reorganization+of+flink-connector-jdbc
[2] https://lists.apache.org/thread/jc1yvvo35xwqzlxl5mj77qw3hq6f5sgr

Best
Joao Boto


Re: [VOTE] FLIP-449: Reorganization of flink-connector-jdbc

2024-05-10 Thread Aleksandr Pilipenko
Thanks for driving this!

+1 (non-binding)

Thanks,
Aleksandr

On Fri, 10 May 2024 at 10:58, Rui Fan <1996fan...@gmail.com> wrote:

> Thanks for driving this proposal!
>
> +1(binding)
>
> Best,
> Rui
>
> On Fri, May 10, 2024 at 9:50 AM Yuepeng Pan  wrote:
>
> > Hi, Boto.
> >
> > +1 (non-binding).
> >
> > Thanks for your driving it.
> >
> > Best regards,
> > Yuepeng Pan.
> >
> > On 2024/05/09 12:01:04 Joao Boto wrote:
> > > Hi everyone,
> > >
> > > Thanks for all the feedback, I'd like to start a vote on the FLIP-449:
> > > Reorganization of flink-connector-jdbc [1].
> > > The discussion thread is here [2].
> > >
> > > The vote will be open for at least 72 hours unless there is an
> objection
> > or
> > > insufficient votes.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-449%3A+Reorganization+of+flink-connector-jdbc
> > > [2] https://lists.apache.org/thread/jc1yvvo35xwqzlxl5mj77qw3hq6f5sgr
> > >
> > > Best
> > > Joao Boto
> > >
> >
>


[VOTE] Apache Flink CDC Release 3.1.0, release candidate #2

2024-05-10 Thread Qingsheng Ren
Hi everyone,

Please review and vote on the release candidate #2 for the version 3.1.0 of
Apache Flink CDC, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Release Overview**

As an overview, the release consists of the following:
a) Flink CDC source release to be deployed to dist.apache.org
b) Maven artifacts to be deployed to the Maven Central Repository

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as follows,
for your review:
* All artifacts for a) can be found in the corresponding dev repository at
dist.apache.org [1], which are signed with the key with fingerprint
A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
* All artifacts for b) can be found at the Apache Nexus Repository [3]

Other links for your review:
* JIRA release notes [4]
* Source code tag "release-3.1.0-rc2" with commit hash
f3ce608903014d410eecc8ad62c7a9d9a27a6a19 [5]
* PR for release announcement blog post of Flink CDC 3.1.0 in flink-web [6]

**Vote Duration**

The voting time will run for at least 72 hours, adopted by majority
approval with at least 3 PMC affirmative votes.

Thanks,
Qingsheng Ren

[1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc2/
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3] https://repository.apache.org/content/repositories/orgapacheflink-1732
[4]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
[5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc2
[6] https://github.com/apache/flink-web/pull/739


Re: [VOTE] FLIP-449: Reorganization of flink-connector-jdbc

2024-05-10 Thread Rui Fan
Thanks for driving this proposal!

+1(binding)

Best,
Rui

On Fri, May 10, 2024 at 9:50 AM Yuepeng Pan  wrote:

> Hi, Boto.
>
> +1 (non-binding).
>
> Thanks for your driving it.
>
> Best regards,
> Yuepeng Pan.
>
> On 2024/05/09 12:01:04 Joao Boto wrote:
> > Hi everyone,
> >
> > Thanks for all the feedback, I'd like to start a vote on the FLIP-449:
> > Reorganization of flink-connector-jdbc [1].
> > The discussion thread is here [2].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > insufficient votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-449%3A+Reorganization+of+flink-connector-jdbc
> > [2] https://lists.apache.org/thread/jc1yvvo35xwqzlxl5mj77qw3hq6f5sgr
> >
> > Best
> > Joao Boto
> >
>


[DISCUSSION] HybridSource multiway tree feature

2024-05-10 Thread Or Keren
Hello Flink devs!

My feature suggestion is to allow the HybridSource component to have a
multiway graph of sources, where each node may have multiple branches,
instead of a linear chain of sources.
At each step of the way, there can be multiple sources that run in
parallel.

For example:

[batch-source] -> [live-source1, live-source2]

[batch-source1, batch-source2] -> live-source

As a Flink user, a lot of my use cases required "state warm-up", which
allowed me to ingest all of the pre-existing state into Flink when booting
up my application without a checkpoint (for example, because of a
structural change in the graph or a change in the state structure that is
not backward compatible).
From my experience, the easiest way to implement such a state warm-up was
through the HybridSource component, which allowed me to connect a batch
source that ingested the state first and then, once it was done, let the
real-time streams start reading the messages.

The problem is that when the real-time stream is composed of multiple sources
from different places that have to be unioned afterwards, the HybridSource
component doesn't support that. It only supports putting a single live
source after the batch one.
Plus, on the warm-up side, there is no option to run multiple batch
sources in parallel.
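If it helps make the proposal concrete, here is a minimal Python model of the
staged semantics I have in mind (all names here are hypothetical, and a real
implementation would of course live inside Flink's HybridSource machinery):

```python
from itertools import chain

def run_staged_sources(stages):
    """Simulate the proposed multiway HybridSource: each stage is a list of
    sources (callables yielding records) that conceptually run in parallel;
    the next stage only starts once every source in the current stage is
    exhausted, mirroring HybridSource's batch-then-live switching."""
    out = []
    for stage in stages:
        # Union the records of all sources in this stage (order within a
        # stage would not be guaranteed in a real parallel runtime).
        out.extend(chain.from_iterable(src() for src in stage))
    return out

# Example: two parallel batch sources warming up state, then one live source.
batch1 = lambda: ["state-a", "state-b"]
batch2 = lambda: ["state-c"]
live = lambda: ["event-1", "event-2"]

print(run_staged_sources([[batch1, batch2], [live]]))
# ['state-a', 'state-b', 'state-c', 'event-1', 'event-2']
```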

I know that there's a way to do that with a batch application that creates
a savepoint beforehand, and then starting the live application from that
savepoint. That solution requires a lot of ops overhead for the developer,
who has to create a process with an outside orchestrator.

Any feedback would be welcome! Thank you so much


Re: [DISCUSS] FLIP-444: Native file copy support

2024-05-10 Thread Piotr Nowojski
Hi!

Thanks for your suggestions!

> I'd prefer a unified one interface

I have updated the FLIP to take that into account. In this case, I would
also propose to completely drop `DuplicatingFileSystem` in favour of a
basically renamed version of it `PathsCopyingFileSystem`.
`DuplicatingFileSystem` was not marked as PublicEvolving/Experimental
(probably by mistake), so technically we can do it. Even if not for that
mistake, I would still vote to replace it to simplify the code, as any
migration would be very easy. At the same time to the best of my knowledge,
no one has ever implemented it.

> The proposal mentions that s5cmd utilises 100% of CPU similar to Flink
> 1.18. However, this will be a native process outside of the JVM. Are there
> risk of large/long state download starving the TM of CPU cycle causing
> issues such as heartbeat or ask timeout?
>
> Do you know if there is a way to limit the CPU utilisation of s5cmd? I see
> worker and concurrency configuration but these do not map directly to cap
> in CPU usage. The experience for feature user in this case will be one of
> trial and error.

Those are good points. As a matter of fact, shortly after publishing this
FLIP, we started experimenting with using `cpulimit` to achieve just that.
If everything works out fine, we are planning to expose this as a
configuration option for the S3 file system. I've added that to the FLIP.

> 2. copyFiles is not an atomic operation, How could we handle the situation
> when some partial files fail ?
> Could we return the list of successful files then the caller could decide
> to retry or just know them ?

Do you have some suggestions on how that should be implemented in the
interface and how should it be used?

Note that for both recovery and checkpoints, there are no retrying
mechanisms. If any part of downloading or uploading fails, the job fails
over, so actually using such an interface extension would be out of scope
of this FLIP. In that case, maybe if this could be extended in the future
without breaking compatibility, we could leave it as a future improvement?

Best,
Piotrek


pt., 10 maj 2024 o 07:40 Hangxiang Yu  napisał(a):

> Hi Piotr.
> Thanks for your proposal.
>
> I have some comments, PTAL:
> 1. +1 about unifying the interface with DuplicatingFileSystem.
> IIUC, DuplicatingFileSystem also covers the logic from/to both local and
> remote paths.
> The implementations could define their own logic about how to fast
> copy/duplicate files, e.g. hard link or transfer manager.
>
> 2. copyFiles is not an atomic operation, How could we handle the situation
> when some partial files fail ?
> Could we return the list of successful files then the caller could decide
> to retry or just know them ?
>
> On Thu, May 9, 2024 at 3:46 PM Keith Lee 
> wrote:
>
> > Hi Piotr,
> >
> > Thank you for the proposal. Looks great.
> >
> > Along similar line of Aleks' question on memory usage.
> >
> > The proposal mentions that s5cmd utilises 100% of CPU similar to Flink
> > 1.18. However, this will be a native process outside of the JVM. Are
> there
> > risk of large/long state download starving the TM of CPU cycle causing
> > issues such as heartbeat or ask timeout?
> >
> > Do you know if there is a way to limit the CPU utilisation of s5cmd? I
> see
> > worker and concurrency configuration but these do not map directly to cap
> > in CPU usage. The experience for feature user in this case will be one of
> > trial and error.
> >
> > Thanks
> > Keith
> >
> > On Wed, May 8, 2024 at 12:47 PM Ahmed Hamdy 
> wrote:
> >
> > > Hi Piotr
> > > +1 for the proposal, it seems to have a lot of gains.
> > >
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Mon, 6 May 2024 at 12:06, Zakelly Lan 
> wrote:
> > >
> > > > Hi Piotrek,
> > > >
> > > > Thanks for your answers!
> > > >
> > > > Good question. The intention and use case behind
> > `DuplicatingFileSystem`
> > > is
> > > > > different. It marks if `FileSystem` can quickly copy/duplicate
> files
> > > > > in the remote `FileSystem`. For example an equivalent of a hard
> link
> > or
> > > > > bumping a reference count in the remote system. That's a bit
> > different
> > > > > to copy paths between remote and local file systems.
> > > > >
> > > > > However, it could arguably be unified under one interface where we
> > > would
> > > > > re-use or re-name `canFastDuplicate(Path, Path)` to
> > > > > `canFastCopy(Path, Path)` with the following use cases:
> > > > > - `canFastCopy(remoteA, remoteB)` returns true - current equivalent
> > of
> > > > > `DuplicatingFileSystem` - quickly duplicate/hard link remote path
> > > > > - `canFastCopy(local, remote)` returns true - FS can natively
> upload
> > > > local
> > > > > file to a remote location
> > > > > - `canFastCopy(remote, local)` returns true - FS can natively
> > download
> > > > > local file from a remote location
> > > > >
> > > > > Maybe indeed that's a better solution vs having two separate
> > interfaces
> > > > for
> > > > > 
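To make the quoted use cases concrete, here is a minimal Python model of the
unified check being discussed (the names and signatures are hypothetical; the
actual FLIP-444 interface is Java and may well differ):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    scheme: str  # "file" for local, "s3" for remote (illustrative only)
    path: str

class PathsCopyingFileSystem:
    """Hypothetical sketch of the unified interface: one canFastCopy check
    covering remote->remote duplication, native upload, and native download."""

    def can_fast_copy(self, src: Path, dst: Path) -> bool:
        # remote -> remote: fast duplicate (hard link / reference-count bump)
        # local  -> remote: native upload
        # remote -> local : native download
        # local  -> local : not covered by this file system
        return "s3" in (src.scheme, dst.scheme)

fs = PathsCopyingFileSystem()
assert fs.can_fast_copy(Path("s3", "a"), Path("s3", "b"))    # duplicate
assert fs.can_fast_copy(Path("file", "a"), Path("s3", "b"))  # upload
assert fs.can_fast_copy(Path("s3", "a"), Path("file", "b"))  # download
```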

[jira] [Created] (FLINK-35331) Download links for binary releases are displayed as source releases on website

2024-05-10 Thread Xintong Song (Jira)
Xintong Song created FLINK-35331:


 Summary: Download links for binary releases are displayed as 
source releases on website
 Key: FLINK-35331
 URL: https://issues.apache.org/jira/browse/FLINK-35331
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Xintong Song


Take Pre-bundled Hadoop as an example. The contents for download are binary 
releases, while the link is displayed as "Pre-bundled Hadoop 2.x.y Source 
Release (asc, sha512)". The problem is caused by misusing 
`source_release_[url|asc_url|sha512_url]` for binary contents in the 
corresponding [yaml 
file.|https://github.com/apache/flink-web/blob/asf-site/docs/data/additional_components.yml]

There are many similar cases on the website.

A related issue is that some source releases are displayed as "XXX 
Source Release Source Release", due to including "Source Release" in the `name` 
field of the corresponding yaml file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35330) Support for Vertica Connector

2024-05-10 Thread Vikas Garg (Jira)
Vikas Garg created FLINK-35330:
--

 Summary: Support for Vertica Connector
 Key: FLINK-35330
 URL: https://issues.apache.org/jira/browse/FLINK-35330
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: jdbc-3.1.2
Reporter: Vikas Garg
 Fix For: 1.8.4


I am from OpenText, and our product Vertica is one of the most widely used 
analytical databases.

We have developed a Flink-Vertica connector, which will be made open source.

 

We will be managing the forward development, enhancements and improvements of 
the Flink connector.

 

Looking to get it approved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


FLINK-33759

2024-05-10 Thread Frank Yin
Hi,

I opened a PR on GitHub to address FLINK-33759, which causes issues when
writing Parquet files with complex data schemas.
https://github.com/apache/flink/pull/24029

Can anyone help review this PR?

Thanks,
Frank


Alignment with FlinkSQL Standards | Feedback requested

2024-05-10 Thread Kanchi Masalia
Dear Flink Community,

I hope this message finds you well. I am considering the idea of extending
FlinkSQL's capabilities to include direct queries from Kafka topics,
specifically through the use of "SELECT * FROM kafka_topic". Currently, as
we are aware, FlinkSQL permits queries like "SELECT * FROM flink_sql_table".

I am writing to seek your valued opinions and insights on whether extending
this capability to include direct Kafka topic queries would align with the
guidelines and architectural philosophy of FlinkSQL. This potential feature
could offer a more streamlined approach for users to access and analyze
data directly from Kafka, potentially simplifying workflow processes.
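For context, the way FlinkSQL reaches a Kafka topic today is by first
declaring a table over it with the Kafka connector (the option names below
follow the standard Kafka SQL connector; the schema, topic, and broker values
are purely illustrative):

```sql
CREATE TABLE kafka_topic (
  user_id STRING,
  amount  DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafka_topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'scan.startup.mode' = 'earliest-offset'
);

SELECT * FROM kafka_topic;
```

The idea above would, in effect, remove the need for this explicit DDL step
when querying a topic directly.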

Here are a few points I would like your input on:

   1. Does this idea align with the current and future usage philosophy of
   FlinkSQL?
   2. Are there potential challenges or conflicts this feature might pose
   against existing FlinkSQL guidelines?
   3. Would this enhancement align with how you currently use or plan to
   use Flink in your data processing tasks?

Your feedback will be instrumental in guiding the direction of this
potential feature.

Thank you very much for considering my inquiry. I am looking forward to
your insights and hope this can spark a constructive discussion within our
community.

Thanks,
Kanchi Masalia