[off for 3 months]

2024-05-25 Thread Etienne Chauchot

Hi all,

I'll be off for 3 months backpacking in Asia, so I'll be unavailable 
during that time.


See you when I get back.

Best

Etienne



Re: [connectors] Notice to connector authors

2024-03-04 Thread Etienne Chauchot

Hi Lorenzo,

There is an umbrella ticket [1], but it was intended to track the CI 
feature implementation and is now closed.


I think it is best for each connector maintainer to create their own ticket. 
Here is the one I created for Cassandra (there is also an ongoing PR): [2]


Best

Etienne

[1] https://issues.apache.org/jira/browse/FLINK-34136

[2] https://issues.apache.org/jira/browse/FLINK-32353


On 04/03/2024 at 11:40, lorenzo.affe...@ververica.com wrote:

Thank you Etienne for the information.

Is there any Jira ticket about this?
Also an umbrella one?

Or will connector maintainers handle it on their own?

Thank you!
On Mar 4, 2024 at 10:23 +0100, Etienne Chauchot wrote:

Hi connector authors,

I'd like to share some information about the CI of connectors:

With the release of flink-connector-parent 1.1.0, you can now skip the
archunit tests for a particular CI build as described here [1].

Indeed, each connector needs to be tested against the last 2 minor versions
of Flink, but there is only a single archunit rule violation store.
Sometimes the rules change between these 2 Flink versions, and you need
to freeze the violation store to tackle the violations later on. This
change lets you freeze the violation store and execute the archunit
tests with the Flink version the connector was built against [2], while
disabling the archunit tests with the other Flink version [1].


Best

Etienne

[1]
https://github.com/apache/flink-connector-shared-utils/blob/b5ad097df25973ad05d0d6af0f988b65b0d8cd22/.github/workflows/_testing.yml#L39

[2]
https://github.com/apache/flink-connector-shared-utils/blob/b5ad097df25973ad05d0d6af0f988b65b0d8cd22/.github/workflows/_testing.yml#L46



[connectors] Notice to connector authors

2024-03-04 Thread Etienne Chauchot

Hi connector authors,

I'd like to share some information about the CI of connectors:

With the release of flink-connector-parent 1.1.0, you can now skip the 
archunit tests for a particular CI build as described here [1].


Indeed, each connector needs to be tested against the last 2 minor versions 
of Flink, but there is only a single archunit rule violation store. 
Sometimes the rules change between these 2 Flink versions, and you need 
to freeze the violation store to tackle the violations later on. This 
change lets you freeze the violation store and execute the archunit 
tests with the Flink version the connector was built against [2], while 
disabling the archunit tests with the other Flink version [1].
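For illustration, the resulting CI matrix could look roughly like the sketch below. The job and input names here are made up for the example; refer to the linked _testing.yml [1][2] for the real parameters.

```yaml
# Hypothetical sketch only: the job and input names are illustrative,
# not the actual ones used by the flink-connector-shared-utils workflows.
jobs:
  compile_and_test:
    strategy:
      matrix:
        include:
          - flink_version: 1.17.2   # version the connector was built against:
            skip_archunit: false    #   run archunit with the frozen violation store
          - flink_version: 1.18.1   # the other supported minor version:
            skip_archunit: true     #   rules may differ here, so skip archunit
```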



Best

Etienne

[1] 
https://github.com/apache/flink-connector-shared-utils/blob/b5ad097df25973ad05d0d6af0f988b65b0d8cd22/.github/workflows/_testing.yml#L39


[2] 
https://github.com/apache/flink-connector-shared-utils/blob/b5ad097df25973ad05d0d6af0f988b65b0d8cd22/.github/workflows/_testing.yml#L46




[ANNOUNCE] Apache flink-connector-parent 1.1.0 released

2024-02-29 Thread Etienne Chauchot
The Apache Flink community is very happy to announce the release of 
Apache flink-connector-parent 1.1.0.


Apache Flink® is an open-source stream processing framework for 
distributed, high-performing, always-available, and accurate data 
streaming applications.


The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442

We would like to thank all contributors of the Apache Flink community 
who made this release possible!


Regards,
Etienne Chauchot


[RESULT] [VOTE] flink-connector-parent 1.1.0, release candidate #2

2024-02-28 Thread Etienne Chauchot

Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 3 of which are binding:

 * Qingsheng Ren (binding)
 * Rui Fan (non-binding)
 * Leonard Xu (binding)
 * Hang Ruan (non-binding)
 * Sergey Nuyanzin (non-binding)
 * Zhongqiang Gong (non-binding)
 * Yanquan Lv (non-binding)
 * Chesnay Schepler (binding)


There are no disapproving votes.

Thanks everyone!

Best

Etienne


Re: [VOTE] Release flink-connector-parent 1.1.0 release candidate #2

2024-02-28 Thread Etienne Chauchot

Hi all,

The vote on flink-connector-parent 1.1.0 RC2 is now closed. The result 
will be announced in a separate email.


Best

Etienne

On 27/02/2024 at 12:23, Chesnay Schepler wrote:

+1
- pom contents
- source contents
- Website PR

On 19/02/2024 18:33, Etienne Chauchot wrote:

Hi everyone,
Please review and vote on the release candidate #2 for the version 
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to 
dist.apache.org [2], which is signed with the key with fingerprint 
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc2 [5],
* website pull request listing the new release [6].

* confluence wiki: connector parent upgrade to version 1.1.0 that 
will be validated after the artifact is released (there is no PR 
mechanism on the wiki) [7]



The vote will be open for at least 72 hours. It is adopted by 
majority approval, with at least 3 PMC affirmative votes.


Thanks,
Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442
[2] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc2

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1707
[5] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc2


[6] https://github.com/apache/flink-web/pull/717

[7] 
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development




Re: [VOTE] Release flink-connector-parent 1.1.0 release candidate #2

2024-02-22 Thread Etienne Chauchot

Thanks everyone for your vote.

So far we are one binding vote short for the release to pass.

Best

Etienne

On 20/02/2024 at 09:20, Sergey Nuyanzin wrote:

Thanks for driving this, Etienne!

+1 (non-binding)

- Verified checksum and signature
- Verified pom
- Built from source
- Verified no binaries
- Checked staging repo on Maven central
- Checked source code tag
- Reviewed web PR


One thing (probably minor) I noticed is that the artifacts (uploaded to Nexus)
were built with JDK 11, while usually they should be built with JDK 8.
Since there are no jars, I think it should be OK.
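For readers new to RC verification, the checksum step in the checklists above can be rehearsed locally on a dummy file. The file names below are placeholders; the real artifacts and their .sha512/.asc files live on dist.apache.org, and the GPG step additionally requires importing the KEYS file.

```shell
# Rehearse the checksum verification on a dummy file; names are placeholders
# for the real RC artifacts downloaded from dist.apache.org.
cd "$(mktemp -d)"
printf 'dummy source release' > flink-connector-parent-1.1.0-src.tgz
sha512sum flink-connector-parent-1.1.0-src.tgz > flink-connector-parent-1.1.0-src.tgz.sha512

# The same command verifies a downloaded RC against its published checksum:
sha512sum -c flink-connector-parent-1.1.0-src.tgz.sha512

# Signature check (needs the .asc file and the imported KEYS file):
# gpg --import KEYS
# gpg --verify flink-connector-parent-1.1.0-src.tgz.asc flink-connector-parent-1.1.0-src.tgz
```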

On Tue, Feb 20, 2024 at 9:19 AM Hang Ruan  wrote:


+1 (non-binding)

- verified checksum and signature
- checked Github release tag
- checked release notes
- verified no binaries in source
- reviewed the web PR

Best,
Hang

Leonard Xu wrote on Tue, Feb 20, 2024 at 14:26:


+1 (binding)

- verified signatures
- verified hashsums
- built from source code succeeded
- checked Github release tag
- checked release notes
- reviewed all Jira tickets have been resolved
- reviewed the web PR

Best,
Leonard



On Feb 20, 2024 at 11:14 AM, Rui Fan<1996fan...@gmail.com> wrote:

Thanks for driving this, Etienne!

+1 (non-binding)

- Verified checksum and signature
- Verified pom content
- Build source on my Mac with jdk8
- Verified no binaries in source
- Checked staging repo on Maven central
- Checked source code tag
- Reviewed web PR

Best,
Rui

On Tue, Feb 20, 2024 at 10:33 AM Qingsheng Ren wrote:

Thanks for driving this, Etienne!

+1 (binding)

- Checked release note
- Verified checksum and signature
- Verified pom content
- Verified no binaries in source
- Checked staging repo on Maven central
- Checked source code tag
- Reviewed web PR
- Built Kafka connector from source with parent pom in staging repo

Best,
Qingsheng

On Tue, Feb 20, 2024 at 1:34 AM Etienne Chauchot  wrote:


Hi everyone,
Please review and vote on the release candidate #2 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which is signed with the key with fingerprint
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc2 [5],
* website pull request listing the new release [6],
* confluence wiki: connector parent upgrade to version 1.1.0, which will
be validated after the artifact is released (there is no PR mechanism on
the wiki) [7]

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442

[2] https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc2

[3] https://dist.apache.org/repos/dist/release/flink/KEYS

[4] https://repository.apache.org/content/repositories/orgapacheflink-1707

[5] https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc2

[6] https://github.com/apache/flink-web/pull/717

[7] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development




[VOTE] Release flink-connector-parent 1.1.0 release candidate #2

2024-02-19 Thread Etienne Chauchot

Hi everyone,
Please review and vote on the release candidate #2 for the version 
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which is signed with the key with fingerprint 
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc2 [5],
* website pull request listing the new release [6].

* confluence wiki: connector parent upgrade to version 1.1.0 that will 
be validated after the artifact is released (there is no PR mechanism on 
the wiki) [7]



The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442
[2] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc2

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1707
[5] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc2


[6] https://github.com/apache/flink-web/pull/717

[7] 
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development


Re: [VOTE] Release flink-connector-parent, release candidate #1

2024-02-16 Thread Etienne Chauchot

Hi,

OK. I thought that keeping the artifact unchanged did not require a new RC, 
since the release doc says that "some changes" do not require RC invalidation 
(it is my first release :) ). Thanks.


I'll cancel this release vote and start a RC2 vote.

Regarding the empty tools directory: as mentioned below, all the 
connectors released so far (cassandra, kafka, pulsar, parent, ...) have 
this empty directory in their source release, which I also find confusing.


I'll take the opportunity of the connector-parent RC2 to exclude that 
directory from connector-parent, and I'll do another hotfix PR to change 
the release script to exclude the whole tools directory instead of only 
tools/releasing/shared as it does now. That way, with the next release, all 
the connectors will have their source release cleaned.
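The script change boils down to widening an exclude pattern when the source tarball is assembled. Below is a minimal, self-contained illustration; it is not the actual release script, and the directory names merely mirror the connector repo layout.

```shell
# Minimal illustration (not the actual release script): exclude the whole
# tools/ mount from the source tarball instead of only tools/releasing/shared.
cd "$(mktemp -d)"
mkdir -p src/tools/releasing/shared src/flink-connector-parent
touch src/flink-connector-parent/pom.xml

# Excluding only tools/releasing/shared would leave an empty tools/releasing
# directory in the archive; excluding tools entirely keeps the release clean:
tar czf release-src.tgz --exclude='src/tools' src
tar tzf release-src.tgz    # no tools/ entries remain
```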


Best

Etienne

On 15/02/2024 at 17:41, Chesnay Schepler wrote:
Martijn is correct in that if the source release got changed, we need 
to vet everything again.
If anything else had been changed instead, there'd be a bit of leeway, 
but the source release itself is the most important bit of a release, 
and we just can't allow changes to it after the fact and still count 
the votes.


As for the change itself, it's good that the shared directory is now 
excluded, but we still have a de-facto empty tools directory.
That's ultimately the (really minor) issue I was referring to that I'd 
like to see resolved /eventually/, because an empty directory in a 
source release is just a bit confusing ("Is the release missing 
something?").


On 15/02/2024 17:27, Martijn Visser wrote:

Hi Etienne,

I fixed the source release [1] as requested; it no longer contains the 
tools/release/shared directory.

I don't think that is the correct way: my understanding is that this
invalidates basically all the votes, because now the checked artifact
has changed. It was requested to file a ticket as a follow-up, not to
immediately change the binary. We can't have a lazy consensus on a
release topic, with a changed artifact.

Best regards,

Martijn

On Thu, Feb 15, 2024 at 2:32 PM Etienne Chauchot  wrote:

Hi,

Considering that the code and artifact have not changed since the last
vote (only the source release and tag have changed), and considering that
there were already 3 binding votes for this RC1, I'll do this on lazy
consensus. I'll release if no one objects by tomorrow, as it will then be
72h since the last change.

Best

Etienne

On 13/02/2024 at 13:24, Etienne Chauchot wrote:

Hi all,

I fixed the source release [1] as requested; it no longer contains the
tools/release/shared directory.

I found out why it contained that directory: the parent_pom branch was
referring to an incorrect sub-module mount point for the release_utils
branch (cf. FLINK-34364 [2]). Here is the fixing PR [3].

By the way, I noticed that all the connector source releases contain an
empty tools/releasing directory, because only tools/releasing/shared is
excluded in the source release script and not the whole tools/releasing
directory. It seems a bit messy to me, so I think we should fix that in
the release scripts later on for the next connector releases.

I also found out that the RC1 tag was pointing to my fork instead of the
main repo, so I remade the tag [4].

Apart from that, the code and artifact have not changed, so I did not
invalidate the RC1.

Please confirm that I can proceed to the release.

Best

Etienne

[1] https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1/

[2] https://issues.apache.org/jira/browse/FLINK-34364

[3] https://github.com/apache/flink-connector-shared-utils/pull/36

[4] https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1




On 05/02/2024 at 12:36, Etienne Chauchot wrote:

Hi,

I just got back from vacations. I'll close the vote thread and
proceed to the release later this week.

Here is the ticket: https://issues.apache.org/jira/browse/FLINK-34364

Best

Etienne

On 04/02/2024 at 05:06, Qingsheng Ren wrote:

+1 (binding)

- Verified checksum and signature
- Verified pom content
- Built flink-connector-kafka from source with the parent pom in 
staging


Best,
Qingsheng

On Thu, Feb 1, 2024 at 11:19 PM Chesnay Schepler wrote:



- checked source/maven pom contents

Please file a ticket to exclude tools/release from the source release.


+1 (binding)

On 29/01/2024 15:59, Maximilian Michels wrote:

- Inspected the source for licenses and corresponding headers
- Checksums and signature OK

+1 (binding)

On Tue, Jan 23, 2024 at 4:08 PM Etienne Chauchot wrote:

Hi everyone,

Please review and vote on the release candidate #1 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which are

Re: [VOTE] Release flink-connector-parent, release candidate #1

2024-02-15 Thread Etienne Chauchot

Hi,

Considering that the code and artifact have not changed since the last 
vote (only the source release and tag have changed), and considering that 
there were already 3 binding votes for this RC1, I'll do this on lazy 
consensus. I'll release if no one objects by tomorrow, as it will then be 
72h since the last change.


Best

Etienne

On 13/02/2024 at 13:24, Etienne Chauchot wrote:


Hi all,

I fixed the source release [1] as requested; it no longer contains the 
tools/release/shared directory.

I found out why it contained that directory: the parent_pom branch was 
referring to an incorrect sub-module mount point for the release_utils 
branch (cf. FLINK-34364 [2]). Here is the fixing PR [3].

By the way, I noticed that all the connector source releases contain an 
empty tools/releasing directory, because only tools/releasing/shared is 
excluded in the source release script and not the whole tools/releasing 
directory. It seems a bit messy to me, so I think we should fix that in 
the release scripts later on for the next connector releases.

I also found out that the RC1 tag was pointing to my fork instead of the 
main repo, so I remade the tag [4].

Apart from that, the code and artifact have not changed, so I did not 
invalidate the RC1.


Please confirm that I can proceed to the release.

Best

Etienne

[1] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1/


[2] https://issues.apache.org/jira/browse/FLINK-34364

[3] https://github.com/apache/flink-connector-shared-utils/pull/36

[4] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1



On 05/02/2024 at 12:36, Etienne Chauchot wrote:


Hi,

I just got back from vacations. I'll close the vote thread and 
proceed to the release later this week.


Here is the ticket: https://issues.apache.org/jira/browse/FLINK-34364

Best

Etienne

On 04/02/2024 at 05:06, Qingsheng Ren wrote:

+1 (binding)

- Verified checksum and signature
- Verified pom content
- Built flink-connector-kafka from source with the parent pom in staging

Best,
Qingsheng

On Thu, Feb 1, 2024 at 11:19 PM Chesnay Schepler  wrote:


- checked source/maven pom contents

Please file a ticket to exclude tools/release from the source release.

+1 (binding)

On 29/01/2024 15:59, Maximilian Michels wrote:

- Inspected the source for licenses and corresponding headers
- Checksums and signature OK

+1 (binding)

On Tue, Jan 23, 2024 at 4:08 PM Etienne Chauchot wrote:

Hi everyone,

Please review and vote on the release candidate #1 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which is signed with the key with fingerprint
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc1 [5],
* website pull request listing the new release [6]

* confluence wiki: connector parent upgrade to version 1.1.0 that will
be validated after the artifact is released (there is no PR mechanism on
the wiki) [7]

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,

Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442

[2] https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1

[3] https://dist.apache.org/repos/dist/release/flink/KEYS

[4] https://repository.apache.org/content/repositories/orgapacheflink-1698/

[5] https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1

[6] https://github.com/apache/flink-web/pull/717

[7] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development



Re: [VOTE] Release flink-connector-parent, release candidate #1

2024-02-13 Thread Etienne Chauchot

Hi all,

I fixed the source release [1] as requested; it no longer contains the 
tools/release/shared directory.


I found out why it contained that directory: the parent_pom branch was 
referring to an incorrect sub-module mount point for the release_utils 
branch (cf. FLINK-34364 [2]). Here is the fixing PR [3].
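For context, a sub-module mount point is recorded in the .gitmodules file of the branch. The entry below is purely illustrative (not the actual repository content); it only shows the kind of path/url/branch fields whose values can end up pointing at the wrong mount point.

```ini
; Illustrative .gitmodules entry only, not the actual repository content.
[submodule "tools/releasing/shared"]
	path = tools/releasing/shared
	url = https://github.com/apache/flink-connector-shared-utils
	branch = release_utils
```

A wrong path or branch here makes the release utils check out at the wrong location, which is the kind of mistake the fixing PR corrects.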


By the way, I noticed that all the connector source releases contain an 
empty tools/releasing directory, because only tools/releasing/shared is 
excluded in the source release script and not the whole tools/releasing 
directory. It seems a bit messy to me, so I think we should fix that in 
the release scripts later on for the next connector releases.


I also found out that the RC1 tag was pointing to my fork instead of the 
main repo, so I remade the tag [4].

Apart from that, the code and artifact have not changed, so I did not 
invalidate the RC1.


Please confirm that I can proceed to the release.

Best

Etienne

[1] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1/


[2] https://issues.apache.org/jira/browse/FLINK-34364

[3] https://github.com/apache/flink-connector-shared-utils/pull/36

[4] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1



On 05/02/2024 at 12:36, Etienne Chauchot wrote:


Hi,

I just got back from vacations. I'll close the vote thread and proceed 
to the release later this week.


Here is the ticket: https://issues.apache.org/jira/browse/FLINK-34364

Best

Etienne

On 04/02/2024 at 05:06, Qingsheng Ren wrote:

+1 (binding)

- Verified checksum and signature
- Verified pom content
- Built flink-connector-kafka from source with the parent pom in staging

Best,
Qingsheng

On Thu, Feb 1, 2024 at 11:19 PM Chesnay Schepler  wrote:


- checked source/maven pom contents

Please file a ticket to exclude tools/release from the source release.

+1 (binding)

On 29/01/2024 15:59, Maximilian Michels wrote:

- Inspected the source for licenses and corresponding headers
- Checksums and signature OK

+1 (binding)

On Tue, Jan 23, 2024 at 4:08 PM Etienne Chauchot wrote:

Hi everyone,

Please review and vote on the release candidate #1 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which is signed with the key with fingerprint
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc1 [5],
* website pull request listing the new release [6]

* confluence wiki: connector parent upgrade to version 1.1.0 that will
be validated after the artifact is released (there is no PR mechanism on
the wiki) [7]

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,

Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442

[2] https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1

[3] https://dist.apache.org/repos/dist/release/flink/KEYS

[4] https://repository.apache.org/content/repositories/orgapacheflink-1698/

[5] https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1

[6] https://github.com/apache/flink-web/pull/717

[7] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development



Re: [VOTE] Release flink-connector-parent, release candidate #1

2024-02-05 Thread Etienne Chauchot

Hi,

I just got back from vacations. I'll close the vote thread and proceed 
to the release later this week.


Here is the ticket: https://issues.apache.org/jira/browse/FLINK-34364

Best

Etienne

On 04/02/2024 at 05:06, Qingsheng Ren wrote:

+1 (binding)

- Verified checksum and signature
- Verified pom content
- Built flink-connector-kafka from source with the parent pom in staging

Best,
Qingsheng

On Thu, Feb 1, 2024 at 11:19 PM Chesnay Schepler  wrote:


- checked source/maven pom contents

Please file a ticket to exclude tools/release from the source release.

+1 (binding)

On 29/01/2024 15:59, Maximilian Michels wrote:

- Inspected the source for licenses and corresponding headers
- Checksums and signature OK

+1 (binding)

On Tue, Jan 23, 2024 at 4:08 PM Etienne Chauchot wrote:

Hi everyone,

Please review and vote on the release candidate #1 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which is signed with the key with fingerprint
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc1 [5],
* website pull request listing the new release [6]

* confluence wiki: connector parent upgrade to version 1.1.0 that will
be validated after the artifact is released (there is no PR mechanism on
the wiki) [7]

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,

Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442

[2] https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1

[3] https://dist.apache.org/repos/dist/release/flink/KEYS

[4] https://repository.apache.org/content/repositories/orgapacheflink-1698/

[5] https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1

[6] https://github.com/apache/flink-web/pull/717

[7] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development



[jira] [Created] (FLINK-34364) stage_source_release.sh should exclude tools/release directory from the source release

2024-02-05 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-34364:


 Summary: stage_source_release.sh should exclude tools/release 
directory from the source release
 Key: FLINK-34364
 URL: https://issues.apache.org/jira/browse/FLINK-34364
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Parent, Release System
Reporter: Etienne Chauchot


This directory is the mount point of the release utils repository and should be 
excluded from the source release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34363) Connectors release utils should allow to not specify flink version in stage_jars.sh

2024-02-05 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-34363:


 Summary: Connectors release utils should allow to not specify 
flink version in stage_jars.sh
 Key: FLINK-34363
 URL: https://issues.apache.org/jira/browse/FLINK-34363
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Parent, Release System
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


For the connector-parent release, the Flink version is not needed. The 
stage_jars.sh script should allow specifying only ${project_version} rather 
than ${project_version}-${flink_minor_version}.
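A bash sketch of what such an optional suffix could look like; the variable names follow the ticket text, but the snippet is illustrative rather than the actual stage_jars.sh logic.

```shell
# Illustrative only (not the actual stage_jars.sh): build the staged artifact
# name with an optional Flink version suffix.
project_version="1.1.0"
flink_minor_version=""   # left empty for flink-connector-parent

# ${var:+word} expands to word only when var is set and non-empty,
# so the suffix disappears when no Flink version is given:
suffix="${flink_minor_version:+-${flink_minor_version}}"
echo "flink-connector-parent-${project_version}${suffix}"
# → flink-connector-parent-1.1.0
```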





[VOTE] Release flink-connector-parent, release candidate #1

2024-01-23 Thread Etienne Chauchot

Hi everyone,

Please review and vote on the release candidate #1 for the version 
1.1.0, as follows:


[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which is signed with the key with fingerprint 
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc1 [5],
* website pull request listing the new release [6]

* confluence wiki: connector parent upgrade to version 1.1.0 that will 
be validated after the artifact is released (there is no PR mechanism on 
the wiki) [7]


The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,

Etienne

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442
[2] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1698/
[5] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1

[6] https://github.com/apache/flink-web/pull/717

[7] 
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development


[jira] [Created] (FLINK-34137) Update CI to test archunit configuration

2024-01-17 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-34137:


 Summary: Update CI to test archunit configuration
 Key: FLINK-34137
 URL: https://issues.apache.org/jira/browse/FLINK-34137
 Project: Flink
  Issue Type: Sub-task
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


Update CI to test skipping archunit tests on non-main Flink versions. Test on 
submodules both with and without archunit tests.







[jira] [Created] (FLINK-34136) Execute archunit tests only with Flink version that connectors were built against

2024-01-17 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-34136:


 Summary: Execute archunit tests only with Flink version that 
connectors were built against
 Key: FLINK-34136
 URL: https://issues.apache.org/jira/browse/FLINK-34136
 Project: Flink
  Issue Type: Improvement
  Components: Build System / CI
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


As part of [this 
discussion|https://lists.apache.org/thread/pr0g812olzpgz21d9oodhc46db9jpxo3], 
the need for connectors to specify the main Flink version that a connector 
supports has arisen.

This CI variable will allow configuring the build and tests differently 
depending on this version. This parameter would be optional.

The first use case is to run archunit tests only on the main supported version 
as discussed in the above thread.





Re: FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-04 Thread Etienne Chauchot

Congrats! Welcome onboard.

Best

Etienne

On 04/01/2024 at 03:14, Jane Chan wrote:

Congratulations, Alex!

Best,
Jane

On Thu, Jan 4, 2024 at 10:03 AM Junrui Lee  wrote:


Congratulations, Alex!

Best,
Junrui

weijie guo  于2024年1月4日周四 09:57写道:


Congratulations, Alex!

Best regards,

Weijie


Steven Wu  于2024年1月4日周四 02:07写道:


Congrats, Alex! Well deserved!

On Wed, Jan 3, 2024 at 2:31 AM David Radley wrote:


Sorry for my typo.

Many congratulations Alex!

From: David Radley
Date: Wednesday, 3 January 2024 at 10:23
To: David Anderson
Cc: dev@flink.apache.org
Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov
Many Congratulations David .

From: Maximilian Michels
Date: Tuesday, 2 January 2024 at 12:16
To: dev
Cc: Alexander Fedulov
Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
Fedulov
Happy New Year everyone,

I'd like to start the year off by announcing Alexander Fedulov as a
new Flink committer.

Alex has been active in the Flink community since 2019. He has
contributed more than 100 commits to Flink, its Kubernetes operator,
and various connectors [1][2].

Especially noteworthy are his contributions on deprecating and
migrating the old Source API functions and test harnesses, the
enhancement to flame graphs, the dynamic rescale time computation in
Flink Autoscaling, as well as all the small enhancements Alex has
contributed which make a huge difference.

Beyond code contributions, Alex has been an active community member
with his activity on the mailing lists [3][4], as well as various
talks and blog posts about Apache Flink [5][6].

Congratulations Alex! The Flink community is proud to have you.

Best,
The Flink PMC

[1]


https://github.com/search?type=commits=author%3Aafedulov+org%3Aapache

[2]


https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC

[3]

https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov

[4]

https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov

[5]


https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/

[6]


https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU

Re: [DISCUSS] Release flink-connector-parent v1.01

2023-12-22 Thread Etienne Chauchot

Thanks Max !

Etienne

On 21/12/2023 at 16:22, Maximilian Michels wrote:

Anyone for pushing my pub key to apache dist ?

Done.

On Thu, Dec 21, 2023 at 2:36 PM Etienne Chauchot  wrote:

Hello,

All the ongoing PRs on this repo were merged. But I'd like to leave a
few more days until the feature freeze in case someone has a feature ready
to integrate.

Let's set the feature freeze at 00:00:00 UTC on December 27th.

Best

Etienne

On 15/12/2023 at 16:41, Ryan Skraba wrote:

Hello!  I've been following this discussion (while looking and
building a lot of the connectors):

+1 (non-binding) to doing a 1.1.0 release adding the configurability
of surefire and jvm flags.

Thanks for driving this!

Ryan

On Fri, Dec 15, 2023 at 2:06 PM Etienne Chauchot   wrote:

Hi PMC members,

Version will be 1.1.0 and not 1.0.1 as one of the PMC members already
created this version tag in jira and tickets are targeted to this version.

Anyone for pushing my pub key to apache dist ?

Thanks

Etienne

On 14/12/2023 at 17:51, Etienne Chauchot wrote:

Hi all,

It has been 2 weeks since the start of this release discussion. For
now only Sergey agreed to release. On a lazy consensus basis, let's
say that we leave until Monday for people to express concerns about
releasing connector-parent.

In the meantime, I'm doing my environment setup and I lack the rights
to upload my GPG pub key to the Flink Apache dist repo. Can one of the
PMC members push it?

Attached to this email is the updated KEYS file with my pub key added.

Thanks

Best

Etienne

On 05/12/2023 at 16:30, Etienne Chauchot wrote:

Hi Péter,

My answers are inline


Best

Etienne


On 05/12/2023 at 05:27, Péter Váry wrote:

Hi Etienne,

Which branch would you cut the release from?

the parent_pom branch (consisting of a single maven pom file)

I find the flink-connector-parent branches confusing.

If I merge a PR to the ci_utils branch, would it immediately change the CI
workflow of all of the connectors?

The ci_utils branch is basically one ci.yml workflow. _testing.yml
and the maven test-project are both there to test the ci.yml workflow
and to show connector authors what it can do.

As the connectors' workflows reference ci.yml like this:
apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils,
merging changes to ci.yml will change the CI of all the connectors'
repos.
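For readers unfamiliar with reusable workflows: a connector repository's own workflow file delegates to the shared ci.yml with a `uses:` reference. A minimal sketch (the `flink_version` input is illustrative; the actual inputs are defined by ci.yml itself):

```yaml
# Minimal sketch of a connector repo workflow delegating to the shared
# ci.yml on the ci_utils branch of flink-connector-shared-utils.
name: CI
on: [push, pull_request]
jobs:
  ci:
    uses: apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils
    with:
      flink_version: 1.18.0  # illustrative input; see ci.yml for the real contract
```

Because every connector points at the `@ci_utils` branch ref rather than a pinned tag, a merge to that branch is picked up by all connector repos on their next run.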


If I merge something to the release_utils branch, would it immediately
change the release process of all of the connectors?

I don't know how release-utils scripts are integrated with the
connectors' code yet

I would like to add the possibility of creating Python packages for the
connectors [1]. This would consist of some common code, which should reside
in flink-connector-parent, like:
- scripts for running Python tests - test infra. I expect that this would
evolve over time
- ci workflow - this would be slower moving, but might change if the
infra is changing
- release scripts - this would be slow moving, but might change too.

I think we should have a release for all of the above components, so the
connectors could move forward at their own pace.

I think that is well out of scope for this release: here we are only
talking about releasing a parent Maven POM file for the connectors.


What do you think?

Thanks,
Péter

[1]https://issues.apache.org/jira/browse/FLINK-33528

On Thu, Nov 30, 2023, 16:55 Etienne Chauchot wrote:


Thanks Sergey for your vote. Indeed I have listed only the PRs merged
since last release but there are these 2 open PRs that could be worth
reviewing/merging before release.

https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing
however there is one more PR to enable custom jvm flags for connectors,
in a similar way to how it is done for modules in the main Flink repo.
It will simplify support for Java 17 a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of

flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.

Re: [DISCUSS] Release flink-connector-parent v1.01

2023-12-21 Thread Etienne Chauchot

Hello,

All the ongoing PRs on this repo were merged. But I'd like to leave a
few more days until the feature freeze in case someone has a feature ready
to integrate.


Let's set the feature freeze at 00:00:00 UTC on December 27th.

Best

Etienne

On 15/12/2023 at 16:41, Ryan Skraba wrote:

Hello!  I've been following this discussion (while looking and
building a lot of the connectors):

+1 (non-binding) to doing a 1.1.0 release adding the configurability
of surefire and jvm flags.

Thanks for driving this!

Ryan

On Fri, Dec 15, 2023 at 2:06 PM Etienne Chauchot  wrote:

Hi PMC members,

Version will be 1.1.0 and not 1.0.1 as one of the PMC members already
created this version tag in jira and tickets are targeted to this version.

Anyone for pushing my pub key to apache dist ?

Thanks

Etienne

On 14/12/2023 at 17:51, Etienne Chauchot wrote:

Hi all,

It has been 2 weeks since the start of this release discussion. For
now only Sergey agreed to release. On a lazy consensus basis, let's
say that we leave until Monday for people to express concerns about
releasing connector-parent.

In the meantime, I'm doing my environment setup and I lack the rights
to upload my GPG pub key to the Flink Apache dist repo. Can one of the
PMC members push it?

Attached to this email is the updated KEYS file with my pub key added.

Thanks

Best

Etienne

On 05/12/2023 at 16:30, Etienne Chauchot wrote:

Hi Péter,

My answers are inline


Best

Etienne


On 05/12/2023 at 05:27, Péter Váry wrote:

Hi Etienne,

Which branch would you cut the release from?

the parent_pom branch (consisting of a single maven pom file)

I find the flink-connector-parent branches confusing.

If I merge a PR to the ci_utils branch, would it immediately change the CI
workflow of all of the connectors?

The ci_utils branch is basically one ci.yml workflow. _testing.yml
and the maven test-project are both there to test the ci.yml workflow
and to show connector authors what it can do.

As the connectors' workflows reference ci.yml like this:
apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils,
merging changes to ci.yml will change the CI of all the connectors'
repos.


If I merge something to the release_utils branch, would it immediately
change the release process of all of the connectors?

I don't know how release-utils scripts are integrated with the
connectors' code yet

I would like to add the possibility of creating Python packages for the
connectors [1]. This would consist of some common code, which should reside
in flink-connector-parent, like:
- scripts for running Python tests - test infra. I expect that this would
evolve over time
- ci workflow - this would be slower moving, but might change if the
infra is changing
- release scripts - this would be slow moving, but might change too.

I think we should have a release for all of the above components, so the
connectors could move forward at their own pace.


I think that is well out of scope for this release: here we are only
talking about releasing a parent Maven POM file for the connectors.


What do you think?

Thanks,
Péter

[1]https://issues.apache.org/jira/browse/FLINK-33528

On Thu, Nov 30, 2023, 16:55 Etienne Chauchot   wrote:


Thanks Sergey for your vote. Indeed I have listed only the PRs merged
since last release but there are these 2 open PRs that could be worth
reviewing/merging before release.

https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing
however there is one more PR to enable custom jvm flags for connectors,
in a similar way to how it is done for modules in the main Flink repo.
It will simplify support for Java 17 a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of

flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.


Best

Etienne






Re: [DISCUSS] Release flink-connector-parent v1.01

2023-12-15 Thread Etienne Chauchot

Hi PMC members,

Version will be 1.1.0 and not 1.0.1 as one of the PMC members already 
created this version tag in jira and tickets are targeted to this version.


Anyone for pushing my pub key to apache dist ?

Thanks

Etienne

On 14/12/2023 at 17:51, Etienne Chauchot wrote:


Hi all,

It has been 2 weeks since the start of this release discussion. For 
now only Sergey agreed to release. On a lazy consensus basis, let's 
say that we leave until Monday for people to express concerns about 
releasing connector-parent.


In the meantime, I'm doing my environment setup and I lack the rights
to upload my GPG pub key to the Flink Apache dist repo. Can one of the
PMC members push it?

Attached to this email is the updated KEYS file with my pub key added.

Thanks

Best

Etienne

On 05/12/2023 at 16:30, Etienne Chauchot wrote:


Hi Péter,

My answers are inline


Best

Etienne


On 05/12/2023 at 05:27, Péter Váry wrote:

Hi Etienne,

Which branch would you cut the release from?

the parent_pom branch (consisting of a single maven pom file)

I find the flink-connector-parent branches confusing.

If I merge a PR to the ci_utils branch, would it immediately change the CI
workflow of all of the connectors?


The ci_utils branch is basically one ci.yml workflow. _testing.yml
and the maven test-project are both there to test the ci.yml workflow
and to show connector authors what it can do.

As the connectors' workflows reference ci.yml like this:
apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils,
merging changes to ci.yml will change the CI of all the connectors'
repos.



If I merge something to the release_utils branch, would it immediately
change the release process of all of the connectors?
I don't know how release-utils scripts are integrated with the 
connectors' code yet

I would like to add the possibility of creating Python packages for the
connectors [1]. This would consist of some common code, which should reside
in flink-connector-parent, like:
- scripts for running Python tests - test infra. I expect that this would
evolve over time
- ci workflow - this would be slower moving, but might change if the
infra is changing
- release scripts - this would be slow moving, but might change too.

I think we should have a release for all of the above components, so the
connectors could move forward at their own pace.



I think that is well out of scope for this release: here we are only
talking about releasing a parent Maven POM file for the connectors.



What do you think?

Thanks,
Péter

[1]https://issues.apache.org/jira/browse/FLINK-33528

On Thu, Nov 30, 2023, 16:55 Etienne Chauchot  wrote:


Thanks Sergey for your vote. Indeed I have listed only the PRs merged
since last release but there are these 2 open PRs that could be worth
reviewing/merging before release.

https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing
however there is one more PR to enable custom jvm flags for connectors,
in a similar way to how it is done for modules in the main Flink repo.
It will simplify support for Java 17 a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of

flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.


Best

Etienne






Re: [VOTE] FLIP-396: Trial to test GitHub Actions as an alternative for Flink's current Azure CI infrastructure

2023-12-13 Thread Etienne Chauchot

Thanks Matthias for your hard work !

+1 (binding)

Best

Etienne

On 12/12/2023 at 11:23, Lincoln Lee wrote:

+1 (binding)

Thanks for driving this!

Best,
Lincoln Lee


Yun Tang wrote on Tue, Dec 12, 2023 at 17:52:


Thanks for Matthias driving this work!

+1 (binding)

Best
Yun Tang

From: Yangze Guo
Sent: Tuesday, December 12, 2023 16:12
To:dev@flink.apache.org  
Subject: Re: [VOTE] FLIP-396: Trial to test GitHub Actions as an
alternative for Flink's current Azure CI infrastructure

+1 (binding)

Best,
Yangze Guo

On Tue, Dec 12, 2023 at 3:51 PM Yuxin Tan  wrote:

+1 (non binding)
Thanks for the effort.

Best,
Yuxin


Samrat Deb wrote on Tue, Dec 12, 2023 at 15:25:


+1 (non binding)
Thanks for driving

On Tue, 12 Dec 2023 at 11:59 AM, Sergey Nuyanzin wrote:


+1 (binding)

Thanks for driving this

On Tue, Dec 12, 2023, 07:22 Rui Fan<1996fan...@gmail.com>  wrote:


+1(binding)

Best,
Rui

On Tue, Dec 12, 2023 at 11:58 AM weijie guo <guoweijieres...@gmail.com> wrote:


Thanks Matthias for this efforts.

+1(binding)


Best regards,

Weijie


Matthias Pohl wrote on Mon, Dec 11, 2023 at 21:51:

Hi everyone,
I'd like to start a vote on FLIP-396 [1]. It covers enabling GitHub Actions
(GHA) in Apache Flink. This means that GHA workflows will run aside from
the usual Azure CI workflows in a trial phase (which ends earliest with the
release of Flink 1.19). Azure CI will still serve as the project's ground
of truth until the community decides in a final vote to switch to GHA or
stick to Azure CI.

The related discussion thread can be found in [2].

The vote will remain open for at least 72 hours and only concluded if there
are no objections and enough (i.e. at least 3) binding votes.

Matthias

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+to+test+GitHub+Actions+as+an+alternative+for+Flink%27s+current+Azure+CI+infrastructure
[2]
https://lists.apache.org/thread/h4cmv7l3y8mxx2t435dmq4ltco4sbrgb

--
*Matthias Pohl*
Opensource Software Engineer, *Aiven*
matthias.p...@aiven.io | +49 170 9869525
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [PROPOSAL] Contribute Flink CDC Connectors project to Apache Flink

2023-12-07 Thread Etienne Chauchot

Big +1, thanks this will be a very useful addition to Flink.

Best

Etienne

On 07/12/2023 at 09:26, Hang Ruan wrote:

+1 for contributing CDC Connectors  to Apache Flink.

Best,
Hang

Yuxin Tan wrote on Thu, Dec 7, 2023 at 16:05:


Cool, +1 for contributing CDC Connectors to Apache Flink.

Best,
Yuxin


Jing Ge wrote on Thu, Dec 7, 2023 at 15:43:


Awesome! +1

Best regards,
Jing

On Thu, Dec 7, 2023 at 8:34 AM Sergey Nuyanzin wrote:


thanks for working on this and driving it

+1

On Thu, Dec 7, 2023 at 7:26 AM Feng Jin  wrote:


This is incredibly exciting news, a big +1 for this.

Thank you for the fantastic work on Flink CDC. We have created thousands
of real-time integration jobs using Flink CDC connectors.


Best,
Feng

On Thu, Dec 7, 2023 at 1:45 PM gongzhongqiang <gongzhongqi...@apache.org> wrote:


It's very exciting to hear the news.
+1 for adding CDC Connectors  to Apache Flink !


Best,
Zhongqiang

Leonard Xu wrote on Thu, Dec 7, 2023 at 11:25:


Dear Flink devs,

As you may have heard, we at Alibaba (Ververica) are planning to donate CDC
Connectors for the Apache Flink project [1] to the Apache Flink community.

CDC Connectors for Apache Flink comprise a collection of source connectors
designed specifically for Apache Flink. These connectors [2] enable the
ingestion of changes from various databases using Change Data Capture (CDC);
most of these CDC connectors are powered by Debezium [3]. They support both
the DataStream API and the Table/SQL API, facilitating the reading of
database snapshots and continuous reading of transaction logs with
exactly-once processing, even in the event of failures.

Additionally, in the latest version 3.0, we have introduced many
long-awaited features. Starting from CDC version 3.0, we've built a
Streaming ELT Framework available for streaming data integration. This
framework allows users to write their data synchronization logic in a
simple YAML file, which will automatically be translated into a Flink
DataStreaming job. It emphasizes optimizing the task submission process and
offers advanced functionalities such as whole database synchronization,
merging sharded tables, and schema evolution [4].

I believe this initiative is a perfect match for both sides. For the Flink
community, it presents an opportunity to enhance Flink's competitive
advantage in streaming data integration, promoting the healthy growth and
prosperity of the Apache Flink ecosystem. For the CDC Connectors project,
becoming a sub-project of Apache Flink means being part of a neutral
open-source community, which can attract a more diverse pool of
contributors.

Please note that the aforementioned points represent only some of our
motivations and vision for this donation. Specific future operations need
to be further discussed in this thread. For example, the sub-project name
after the donation; we hope to name it Flink-CDC, aiming at streaming data
integration through Apache Flink, following the naming convention of
Flink-ML. And this project is managed by a total of 8 maintainers,
including 3 Flink PMC members and 1 Flink Committer. The remaining 4
maintainers are also highly active contributors to the Flink community;
donating this project to the Flink community implies that their permissions
might be reduced. Therefore, we may need to bring up this topic for further
discussion within the Flink PMC. Additionally, we need to discuss how to
migrate existing users and documents. We have a user group of nearly 10,000
people and a multi-version documentation site to migrate. We also need to
plan for the migration of CI/CD processes and other specifics.

While there are many intricate details that require implementation, we are
committed to progressing and finalizing this donation process.

Despite being Flink's most active ecological project (as evaluated by
GitHub metrics), it also boasts a significant user base. However, I believe
it's essential to commence discussions on future operations only after the
community reaches a consensus on whether they desire this donation.

Really looking forward to hearing what you think!

Best,
Leonard (on behalf of the Flink CDC Connectors project maintainers)

[1] https://github.com/ververica/flink-cdc-connectors
[2]
https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-connectors.html
[3] https://debezium.io
[4]
https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-pipeline.html


--
Best regards,
Sergey


Re: [DISCUSS] Release flink-connector-parent v1.01

2023-12-05 Thread Etienne Chauchot

Hi Péter,

My answers are inline


Best

Etienne


On 05/12/2023 at 05:27, Péter Váry wrote:

Hi Etienne,

Which branch would you cut the release from?

the parent_pom branch (consisting of a single maven pom file)


I find the flink-connector-parent branches confusing.

If I merge a PR to the ci_utils branch, would it immediately change the CI
workflow of all of the connectors?


The ci_utils branch is basically one ci.yml workflow. _testing.yml
and the maven test-project are both there to test the ci.yml workflow
and to show connector authors what it can do.

As the connectors' workflows reference ci.yml like this:
apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils,
merging changes to ci.yml will change the CI of all the connectors'
repos.




If I merge something to the release_utils branch, would it immediately
change the release process of all of the connectors?
I don't know how release-utils scripts are integrated with the 
connectors' code yet


I would like to add the possibility of creating Python packages for the
connectors [1]. This would consist of some common code, which should reside
in flink-connector-parent, like:
- scripts for running Python tests - test infra. I expect that this would
evolve over time
- ci workflow - this would be slower moving, but might change if the
infra is changing
- release scripts - this would be slow moving, but might change too.

I think we should have a release for all of the above components, so the
connectors could move forward at their own pace.



I think that is well out of scope for this release: here we are only
talking about releasing a parent Maven POM file for the connectors.




What do you think?

Thanks,
Péter

[1]https://issues.apache.org/jira/browse/FLINK-33528

On Thu, Nov 30, 2023, 16:55 Etienne Chauchot  wrote:


Thanks Sergey for your vote. Indeed I have listed only the PRs merged
since last release but there are these 2 open PRs that could be worth
reviewing/merging before release.

https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing
however there is one more PR to enable custom jvm flags for connectors,
in a similar way to how it is done for modules in the main Flink repo.
It will simplify support for Java 17 a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of

flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.


Best

Etienne






Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-12-04 Thread Etienne Chauchot

Correct,

I forgot that in the bylaws, a committer vote is binding for FLIPs.
Thanks for the reminder.


Best

Etienne

On 30/11/2023 at 10:43, Leonard Xu wrote:

+1(binding)

Btw, @Etienne, IIRC, your vote should be a binding one.


Best,
Leonard


On Nov 30, 2023 at 5:03 PM, Etienne Chauchot wrote:

+1 (non-binding)

Etienne

On 30/11/2023 at 09:13, Rui Fan wrote:

+1(binding)

Best,
Rui

On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang   wrote:


+1 (binding)

Best,
Lijie

Zhu Zhu wrote on Thu, Nov 30, 2023 at 13:13:


+1

Thanks,
Zhu

Xia Sun wrote on Thu, Nov 30, 2023 at 11:41:


Hi everyone,

I'd like to start a vote on FLIP-379: Dynamic source parallelism inference
for batch jobs [1] which has been discussed in this thread [2].

The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
[2] https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8

Best Regards,
Xia

Re: [DISCUSS] Release flink-connector-parent v1.01

2023-11-30 Thread Etienne Chauchot
Thanks Sergey for your vote. Indeed I have listed only the PRs merged 
since last release but there are these 2 open PRs that could be worth 
reviewing/merging before release.


https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing
however there is one more PR to enable custom jvm flags for connectors,
in a similar way to how it is done for modules in the main Flink repo.
It will simplify support for Java 17 a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.


Best

Etienne






Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-11-30 Thread Etienne Chauchot

+1 (non-binding)

Etienne

On 30/11/2023 at 09:13, Rui Fan wrote:

+1(binding)

Best,
Rui

On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang  wrote:


+1 (binding)

Best,
Lijie

Zhu Zhu wrote on Thu, Nov 30, 2023 at 13:13:


+1

Thanks,
Zhu

Xia Sun wrote on Thu, Nov 30, 2023 at 11:41:


Hi everyone,

I'd like to start a vote on FLIP-379: Dynamic source parallelism inference
for batch jobs [1] which has been discussed in this thread [2].

The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
[2] https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8

Best Regards,
Xia


[DISCUSS] Release flink-connector-parent v1.01

2023-11-29 Thread Etienne Chauchot

Hi all,

I would like to discuss making a v1.0.1 release of flink-connector-parent.

Since last release, there were only 2 changes:

- https://github.com/apache/flink-connector-shared-utils/pull/19 
(spotless addition)


- https://github.com/apache/flink-connector-shared-utils/pull/26 
(surefire configuration)


The new release would bring the ability to skip some tests in the
connectors, among them the archunit tests. It is important for
connectors to skip the archunit tests when tested against a Flink
version that changes the archunit rules, as that would change the
violation store. Since there is only one violation store and each
connector needs to be tested against the last 2 minor Flink versions,
only the version the connector was built against should run the
archunit tests and have them reflected in the violation store.
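To make the violation-store constraint concrete, here is a hedged sketch of what a connector's CI could look like. The `flink_version` and `skip_archunit` input names are hypothetical illustrations rather than the actual ci.yml contract, and the Flink versions are placeholders:

```yaml
# Illustrative only: one CI job per supported Flink minor version.
# Only the version the connector was built against runs the archunit
# tests, so the single violation store stays consistent.
jobs:
  test_built_against:
    uses: apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils
    with:
      flink_version: 1.17.2   # version from the connector's pom
  test_other_minor:
    uses: apache/flink-connector-shared-utils/.github/workflows/ci.yml@ci_utils
    with:
      flink_version: 1.18.0
      skip_archunit: true     # hypothetical input name
```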



I volunteer to make the release. As it would be my first ASF release, I 
might require the guidance of one of the PMC members.



Best

Etienne






Re: [DISCUSSION] flink-connector-shared-utils release process

2023-11-29 Thread Etienne Chauchot

Thanks Leonard for the review.

Best,

Etienne

On 29/11/2023 at 08:25, Leonard Xu wrote:

Thanks Etienne for updating the wiki,

I just checked the change by comparing it with the previous version. LGTM


Best,
Leonard


On Nov 28, 2023 at 10:43 PM, Etienne Chauchot wrote:

Hi all,

I just updated the doc:
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development

Best

Etienne

On 27/11/2023 at 12:43, Etienne Chauchot wrote:

Sure!

On 23/11/2023 at 02:57, Leonard Xu wrote:

Thanks Etienne for driving this.


- a flink-connector-shared-utils-*test* clone repo and a 
*io.github.user.flink*:flink-connector-parent custom artifact to be able to 
directly commit and install the artifact in the CI
- a custom ci script that does the cloning and mvn install in the ci.yml github 
action script for testing with the new flink-connector-parent artifact
If people agree on the process and location

+1 for the process and location, could we also add a short paragraph and the 
location link as well in [1] to remind connector developers?

Best,
Leonard

[1]https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development



Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Etienne Chauchot

Hi all,

FYI, the ASF Infra roundtable is happening soon. One of the subjects for 
this session is GitHub Actions. It could be worth passing by:


December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.

For information about the roundtables, and about how to join, 
see: https://infra.apache.org/roundtable.html 



Best

Etienne

On 24/11/2023 at 14:16, Maximilian Michels wrote:

Thanks for reviving the efforts here Matthias! +1 for the transition
to GitHub Actions.

As for ASF Infra Jenkins, it works fine. Jenkins is extremely
feature-rich. Not sure about the spare capacity though. I know that
for Apache Beam, Google donated a bunch of servers to get additional
build capacity.

-Max


On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
  wrote:

Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
curious whether somebody has experience with Apache Infra's Jenkins
deployment. The discussion I found about Jenkins [1] is quite outdated
(2014). I haven't worked with it myself but could imagine that there are
some features provided through plugins which are missing in GitHub Actions.

[1]https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4

On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl
wrote:


That's a valid point. I updated the FLIP accordingly:


Currently, the secrets (e.g. for S3 access tokens) are maintained by
certain PMC members with access to the corresponding configuration in the
Azure CI project. This responsibility will be moved to Apache Infra. They
are in charge of handling secrets in the Apache organization. As a
consequence, updating secrets becomes a bit more complicated. This can
still be considered an improvement from a legal standpoint because the
responsibility is transferred from an individual company (i.e. Ververica
who's the maintainer of the Azure CI project) to the Apache Foundation.
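For illustration, consuming such an Infra-managed secret in a workflow would look roughly like this (the secret name and script path are hypothetical placeholders; the `${{ secrets.* }}` syntax is standard GitHub Actions):

```yaml
# Sketch of a nightly job reading an org-level secret managed by Apache Infra.
jobs:
  s3-e2e-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run S3 end-to-end tests
        env:
          # secret name is an illustrative placeholder
          S3_ACCESS_TOKEN: ${{ secrets.S3_ACCESS_TOKEN }}
        run: ./tools/ci/run-s3-tests.sh
```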


On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser
wrote:


Hi Matthias,

Thanks for the write-up and for the efforts on this. I really hope
that we can move away from Azure towards GHA for a better integration
as well (directly seeing if a PR can be merged due to CI passing for
example).

The one thing I'm missing in the FLIP is how we would set up the
secrets for the nightly runs (for the S3 tests, potential tests with
external services etc). My guess is we need to provide the secret to
ASF Infra and then we would be able to refer to them in a pipeline?

Best regards,

Martijn

On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
  wrote:

I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
switched to FLIP-396 [2] for the sake of consistency. 8)

[1]https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
[2]https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions

On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl
wrote:


Hi everyone,

The Flink community discussed migrating from Azure CI to GitHub Actions
quite some time ago [1]. The efforts around that stalled due to limitations
around self-hosted runner support from Apache Infra's side. There were some
recent developments on that topic. Apache Infra is experimenting with
ephemeral runners now which might enable us to move ahead with GitHub
Actions.

The goal is to join the trial phase for ephemeral runners and experiment
with our CI workflows in terms of stability and performance. At the end we
can decide whether we want to abandon Azure CI and move to GitHub Actions
or stick to the former one.

Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
picked up the work they did and continued experimenting with it in my own
fork XComp/flink [2] the past few weeks. The workflows are in a state where
I think that we can start moving the relevant code into Flink's repository.
Example runs for the basic workflow [3] and the extended (nightly) workflow
[4] are provided.

This will bring a few more changes to the Flink contributors. That is why
I wanted to bring this discussion to the mailing list first. I did a
write-up on (hopefully) all related topics in FLIP-395 [5].

I'm looking forward to your feedback.

Matthias

[1]https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k

[2]https://github.com/XComp/flink/actions

[3]https://github.com/XComp/flink/actions/runs/6926309782

[4]https://github.com/XComp/flink/actions/runs/6927443941

[5]https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions


--

*Matthias Pohl*
Opensource Software Engineer, *Aiven*
matthias.p...@aiven.io  |  +49 170 9869525
aiven.io

*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [DISCUSSION] flink-connector-shared-utils release process

2023-11-28 Thread Etienne Chauchot

Hi all,

I just updated the doc: 
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development


Best

Etienne

On 27/11/2023 at 12:43, Etienne Chauchot wrote:


Sure!

On 23/11/2023 at 02:57, Leonard Xu wrote:

Thanks Etienne for driving this.


- a flink-connector-shared-utils-*test* clone repo and a 
*io.github.user.flink*:flink-connector-parent custom artifact to be able to 
directly commit and install the artifact in the CI
- a custom ci script that does the cloning and mvn install in the ci.yml github 
action script for testing with the new flink-connector-parent artifact
If people agree on the process and location

+1 for the process and location, could we also add a short paragraph and the 
location link as well in [1] to remind connector developers?

Best,
Leonard

[1]https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development




Re: [DISCUSS] Contribute Flink Doris Connector to the Flink community

2023-11-27 Thread Etienne Chauchot

+1 as well

Best

Etienne

On 27/11/2023 at 06:22, Jing Ge wrote:

That sounds great! +1

Best regards
Jing

On Mon, Nov 27, 2023 at 3:38 AM Leonard Xu  wrote:


Thanks wudi for kicking off the discussion,

+1 for the idea from my side.

A FLIP like Yun posted is required if no other objections.

Best,
Leonard


On 26/11/2023 at 18:22, wudi <676366...@qq.com.INVALID> wrote:

Hi all,

At present, Flink connectors and Flink's repository have been
decoupled [1].

At the same time, the Flink-Doris-Connector [3] has been maintained in
the Apache Doris [2] community.

I think the Flink Doris Connector can be migrated to the Flink community

because it is part of Flink Connectors and can also expand the ecosystem
of Flink Connectors.

I volunteer to move this forward if I can.

[1]

https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development

[2]https://doris.apache.org/
[3]https://github.com/apache/doris-flink-connector

--

Brs,
di.wu


Re: Sending a hi!

2023-11-27 Thread Etienne Chauchot

Welcome to the community Pranav !

Best

Etienne

On 25/11/2023 at 17:59, Pranav Sharma wrote:

Hi everyone,

I am Pranav, and I am getting started contributing to the Apache Flink
project. I have previously been contributing to another Apache project,
Allura.

I came across Flink during my day job as a data engineer, and I hope to
contribute to the codebase as I learn more about the internal workings of
the framework.

Regards,
Pranav S


Re: [DISCUSSION] flink-connector-shared-utils release process

2023-11-27 Thread Etienne Chauchot

Sure!

On 23/11/2023 at 02:57, Leonard Xu wrote:

Thanks Etienne for driving this.


- a flink-connector-shared-utils-*test* clone repo and a 
*io.github.user.flink*:flink-connector-parent custom artifact to be able to 
directly commit and install the artifact in the CI
- a custom ci script that does the cloning and mvn install in the ci.yml github 
action script for testing with the new flink-connector-parent artifact
If people agree on the process and location

+1 for the process and location, could we also add a short paragraph and the 
location link as well in [1] to remind connector developers?

Best,
Leonard

[1]https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development




Re: [DISCUSSION] flink-connector-shared-utils release process

2023-11-22 Thread Etienne Chauchot
I agree, it would make sense to add this process to 
https://cwiki.apache.org/confluence/display/FLINK/Continuous+Integration


If people agree on the process and location, I can do the addition to 
the doc.


Best

Etienne

On 22/11/2023 at 14:29, Sergey Nuyanzin wrote:

yes, that's the way how I tested my changes
thanks for confirming that it's ok

I wonder whether it should be documented somewhere, since it is not very
obvious?

On Wed, Nov 22, 2023 at 10:35 AM Etienne Chauchot
wrote:


Hi Sergey,

The other alternative that allows for autonomous testing of ci changes
(thanks @Chesnay for the suggestion) is to create your own repo:

- a flink-connector-shared-utils-*test* clone repo and a
*io.github.user.flink*:flink-connector-parent custom artifact to be able
to directly commit and install the artifact in the CI
- a custom ci script that does the cloning and mvn install in the ci.yml
github action script for testing with the new flink-connector-parent
artifact

seehttps://github.com/echauchot/flink-connector-shared-utils-test
FLINK-* branches for details.

I hope it helps,

Best

Etienne


On 22/11/2023 at 00:52, Sergey Nuyanzin wrote:

Hi Etienne

thanks for starting this discussion
+1 from my side for snapshots, since I also faced the same issue



On Thu, Nov 9, 2023 at 11:21 AM Etienne Chauchot
wrote:


Hi all,

flink-connector-shared-utils contains utilities for connectors (parent
pom, CI scripts, template test connector project, etc.). It is divided
into 2 main branches:

- parent_pom (1) containing just a pom.xml

- ci_utils (2) containing the test project using the parent pom and the
ci scripts.


The problem is that when we want to test changes to the parent pom in the
test project, we need to release org.apache.flink:flink-connector-parent
so that the test project in the other branch can use the updated
parent. It seems wrong to trigger a release just for testing.

So I would like to propose setting up a snapshot for
org.apache.flink:flink-connector-parent with regular deployment to
https://repository.apache.org/content/repositories/snapshots like we do
for flink.

An alternative could be to keep using
io.github.zentol.flink:flink-connector-parent for testing and release
this special artifact when needed by ci_utils test project.

WDYT ?


[1]

https://github.com/apache/flink-connector-shared-utils/tree/parent_pom

[2]https://github.com/apache/flink-connector-shared-utils/tree/ci_utils

Best

Etienne





Re: [DISCUSSION] flink-connector-shared-utils release process

2023-11-22 Thread Etienne Chauchot

Hi Sergey,

The other alternative that allows for autonomous testing of ci changes 
(thanks @Chesnay for the suggestion) is to create your own repo: 


- a flink-connector-shared-utils-*test* clone repo and a 
*io.github.user.flink*:flink-connector-parent custom artifact to be able 
to directly commit and install the artifact in the CI
- a custom ci script that does the cloning and mvn install in the ci.yml 
github action script for testing with the new flink-connector-parent 
artifact


see https://github.com/echauchot/flink-connector-shared-utils-test 
FLINK-* branches for details.


I hope it helps,

Best

Etienne


On 22/11/2023 at 00:52, Sergey Nuyanzin wrote:

Hi Etienne

thanks for starting this discussion
+1 from my side for snapshots, since I also faced the same issue



On Thu, Nov 9, 2023 at 11:21 AM Etienne Chauchot
wrote:


Hi all,

flink-connector-shared-utils contains utilities for connectors (parent
pom, CI scripts, template test connector project, etc.). It is divided
into 2 main branches:

- parent_pom (1) containing just a pom.xml

- ci_utils (2) containing the test project using the parent pom and the
ci scripts.


The problem is that when we want to test changes to the parent pom in the
test project, we need to release org.apache.flink:flink-connector-parent
so that the test project in the other branch can use the updated
parent. It seems wrong to trigger a release just for testing.

So I would like to propose setting up a snapshot for
org.apache.flink:flink-connector-parent with regular deployment to
https://repository.apache.org/content/repositories/snapshots like we do
for flink.

An alternative could be to keep using
io.github.zentol.flink:flink-connector-parent for testing and release
this special artifact when needed by ci_utils test project.

WDYT ?


[1]https://github.com/apache/flink-connector-shared-utils/tree/parent_pom

[2]https://github.com/apache/flink-connector-shared-utils/tree/ci_utils

Best

Etienne



[DISCUSSION] flink-connector-shared-utils release process

2023-11-09 Thread Etienne Chauchot

Hi all,

flink-connector-shared-utils contains utilities for connectors (parent 
pom, CI scripts, template test connector project, etc.). It is divided 
into 2 main branches:


- parent_pom (1) containing just a pom.xml

- ci_utils (2) containing the test project using the parent pom and the 
ci scripts.



The problem is that when we want to test changes to the parent pom in the 
test project, we need to release org.apache.flink:flink-connector-parent 
so that the test project in the other branch can use the updated 
parent. It seems wrong to trigger a release just for testing.


So I would like to propose setting up a snapshot for 
org.apache.flink:flink-connector-parent with regular deployment to 
https://repository.apache.org/content/repositories/snapshots like we do 
for flink.
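For reference, deploying snapshots to that repository is standard Maven distribution management. Projects inheriting from the `org.apache:apache` parent pom should already get a configuration along these lines; it is shown here only as a sketch of where the snapshots would go:

```xml
<!-- Sketch: snapshot deployment target, normally inherited from the ASF parent pom. -->
<distributionManagement>
  <snapshotRepository>
    <id>apache.snapshots.https</id>
    <url>https://repository.apache.org/content/repositories/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```

With a `-SNAPSHOT` version set in the pom, `mvn deploy` would then publish the artifact without a formal release.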


An alternative could be to keep using 
io.github.zentol.flink:flink-connector-parent for testing and release 
this special artifact when needed by ci_utils test project.


WDYT ?


[1] https://github.com/apache/flink-connector-shared-utils/tree/parent_pom

[2] https://github.com/apache/flink-connector-shared-utils/tree/ci_utils

Best

Etienne


Re: [DISCUSS] Connector releases for Flink 1.18

2023-11-08 Thread Etienne Chauchot

Hi,

Thanks for starting the discussion, I already started testing the Cassandra 
connector on Flink 1.18. The error in the nightly build linked here is a 
known issue tracked by FLINK-32353. Fixing it requires changes in 
connector-parent and in the CI utils, tracked by FLINK-32563.


Best

Etienne

On 07/11/2023 at 02:20, mystic lama wrote:

Hi,

I looked into both connectors; they are both failing at archunit. I tried a
few things but with my limited experience I couldn't make much progress.
I don't want to block the release. If anyone knows what needs to be done,
please move forward with the fixes.

Hopefully, shall be able to help more in the future.

Thanks
Ash


On Fri, 3 Nov 2023 at 08:35, mystic lama  wrote:


Hi,

I can look into the Pulsar and Cassandra fixes. I agree it's an archunit issue
based on the build logs.
I don't have much experience with it, but I was able to reproduce it
locally.

I can send out PRs over the weekend. Do we have JIRAs for this? If not, I
can create them to track this.

Thanks
@sh

On Fri, 3 Nov 2023 at 02:20, Martijn Visser
wrote:


Hi Danny,

Thanks a lot for starting the discussion on this topic! I know that
Pulsar is failing because of Archunit, which I expect the same issue
to be for Cassandra (I know that Etienne was working on this). Happy
to help.

Best regards,

Martijn

On Thu, Nov 2, 2023 at 9:08 PM Danny Cranmer
wrote:

Hey all.

Now Flink 1.18 is released we need to do some connector releases for
integration parity. We can use this thread to start the discussions for
each connector release and spawn separate vote threads. Kafka is done [1]
(thanks Gordon) and AWS connectors are in process [2], I appreciate help
with votes on that one.

Opensearch: Flink 1.18 nightly build passing [3]. I volunteer to be release
manager for this one. I will consolidate 1.0.2 [4] and 1.1.0 [5] into a
single release as 1.1.0.
MongoDB: Flink 1.18 nightly build passing [6]. I volunteer to be release
manager for this one. I will work with Jiabao to get FLINK-33257 merged
into 1.1.0 and release that [7].
GCP Pub Sub: Flink 1.18 nightly build passing [8]. I volunteer to be
release manager for this one. Looks like 3.0.2 is ready to go [9], we
should proceed with this.

ElasticSearch: Flink 1.18 nightly build passing [10]. There are a good
stack of changes ready for 3.1.0 [11], suggest we release that.
JDBC: Flink 1.18 nightly build passing [12]. There are a good stack of
changes ready for 3.2.0 [13], suggest we release that.
RabbitMQ: Flink 1.18 nightly build passing [14]. There are no changes ready
for 3.0.2 [15], recommend we do a minimal 3.0.1-1.18.

Pulsar: The nightly CI is failing [16], needs a deeper look
Cassandra: The nightly CI is failing [17], needs a deeper look

Once I have completed Opensearch/MongoDB/GCP I will pick up others, but
hope others can help out.

Thanks,
Danny

[1]https://lists.apache.org/thread/0lvrm9hl3hnn1fpr74k68lsm22my8xh7
[2]https://lists.apache.org/thread/ko6nrtfsykkz9c9k9392jfj4l9f7qg11
[3]https://github.com/apache/flink-connector-opensearch/actions
[4]https://issues.apache.org/jira/projects/FLINK/versions/12353142
[5]https://issues.apache.org/jira/projects/FLINK/versions/12353141
[6]https://github.com/apache/flink-connector-mongodb/actions
[7]https://issues.apache.org/jira/projects/FLINK/versions/12353483
[8]https://github.com/apache/flink-connector-gcp-pubsub/actions
[9]https://issues.apache.org/jira/projects/FLINK/versions/12353144
[10]https://github.com/apache/flink-connector-elasticsearch/actions
[11]https://issues.apache.org/jira/projects/FLINK/versions/12352520
[12]https://github.com/apache/flink-connector-jdbc/actions
[13]https://issues.apache.org/jira/projects/FLINK/versions/12353143
[14]https://github.com/apache/flink-connector-rabbitmq/actions
[15]https://issues.apache.org/jira/projects/FLINK/versions/12353145
[16]https://github.com/apache/flink-connector-pulsar/actions
[17]https://github.com/apache/flink-connector-cassandra/actions

Review request

2023-11-06 Thread Etienne Chauchot

Hi everyone,

Is there anyone available to review this PR [1] that I opened 1.5 months 
ago? The people I've pinged seem to be unavailable at the moment.


Thanks

[1] https://github.com/apache/flink/pull/23443

Best

Etienne



Re: off for a week

2023-10-27 Thread Etienne Chauchot

Thanks Max !

On 26/10/2023 at 15:44, Maximilian Michels wrote:

Have a great time off, Etienne!

On Thu, Oct 26, 2023 at 3:38 PM Etienne Chauchot  wrote:

Hi,

FYI, I'll be off and unresponsive for a week starting tomorrow evening.
For ongoing work, please ping me before tomorrow evening or within a week

Best

Etienne

off for a week

2023-10-26 Thread Etienne Chauchot

Hi,

FYI, I'll be off and unresponsive for a week starting tomorrow evening. 
For ongoing work, please ping me before tomorrow evening or within a week


Best

Etienne


Re: [ANNOUNCE] The Flink Speed Center and benchmark daily run are back online

2023-10-25 Thread Etienne Chauchot

Nice !

Thank you and everyone involved for the hard work.

Etienne

On 19/10/2023 at 10:24, Zakelly Lan wrote:

Hi everyone,

Flink benchmarks [1] generate daily performance reports in the Apache
Flink slack channel (#flink-dev-benchmarks) to detect performance
regression [2]. Those benchmarks previously were running on several
machines donated and maintained by Ververica. Unfortunately, those
machines were gone due to account issues [3] and the benchmark daily
runs stopped since August 24th, delaying the release of Flink 1.18 a
bit [4].

Ververica donated several new machines! After several weeks of work, I
have successfully re-established the codespeed panel and benchmark
daily run pipelines on them. At this time, we are pleased to announce
that the Flink Speed Center and benchmark pipelines are back online.
These new machines have a more formal management to ensure that
previous accidents will not occur in the future.

What's more, I successfully recovered historical data backed up by
Yanfei Lei [5]. So with the old domain [6] redirected to the new
machines, the old links that existed in previous records will still be
valid. Besides the benchmarks with Java8 and Java11, I also added a
pipeline for Java17 running daily.

How to use it:
We also registered a new domain name 'flink-speed.xyz' for the Flink
Speed Center [7]. It is recommended to use the new domain in the
future. Currently, the self-service method of triggering benchmarks is
unavailable considering the lack of resources and potential
vulnerabilities of Jenkins. Please contact one of Apache Flink PMCs to
submit a benchmark. More info is updated on the wiki[8].

Daily Monitoring:
The performance daily monitoring on the Apache Flink slack channel [2]
is still unavailable as the benchmark results need more time to
stabilize in the new environment. Once the baseline results become
available for regression detection, I will enable the daily
monitoring.

Please feel free to reach out to me if you have any suggestions or
questions. Thanks Ververica again for donating machines!


Best,
Zakelly

[1]https://github.com/apache/flink-benchmarks
[2]https://lists.apache.org/thread/zok62sx4m50c79htfp18ymq5vmtgbgxj
[3]https://issues.apache.org/jira/browse/FLINK-33052
[4]https://lists.apache.org//thread/5x28rp3zct4p603hm4zdwx6kfr101w38
[5]https://issues.apache.org/jira/browse/FLINK-30890
[6]http://codespeed.dak8s.net:8000
[7]http://flink-speed.xyz
[8]https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=115511847

Re: [DISCUSSION] test connectors against Flink master in PRs

2023-10-11 Thread Etienne Chauchot

Hi,

Cross-posting in that thread too:

FYI, build issues for connectors related to architecture tests adding or 
removing violations are being fixed and tracked here [1]


[1] https://issues.apache.org/jira/browse/FLINK-32563

Best

Etienne


On 03/07/2023 at 10:05, Etienne Chauchot wrote:


Hi all,

I wanted to post here the result of a discussion I had in private with 
Chesnay related to this subject. The question was regarding archunit 
with connectors:


"How to deal with different archunit violations between 2 versions of 
Flink ?  If a violation is normal and should be added to the violation 
store but the related rule has changed in a recent Flink version, how 
to have different set of violations between 2 flink versions for one 
single violation store?"


We concluded by saying that even if a connector should support (and 
therefore be tested against) the last 2 versions of Flink, the 
archunit tests should run only on the main supported Flink version 
(usually the most recent one).
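For context, ArchUnit's frozen-rule mechanism is driven by an `archunit.properties` file. A minimal sketch (property values are illustrative) of how a single violation store is configured and refrozen when rules change:

```properties
# archunit.properties (illustrative values)
# where the single violation store lives in the connector repo
freeze.store.default.path=archunit-violations
# temporarily set to true to re-record violations after a rule change,
# then revert to false so new violations fail the build
freeze.refreeze=true
```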


As a consequence, I'll configure that in the Cassandra connector and 
update the connectors migration wiki doc to serve as an example for 
such cases.


Best

Etienne


On 29/06/2023 at 15:57, Etienne Chauchot wrote:


Hi Martijn,

Thanks for your feedback. It makes total sense to me.

I'll enable it for Cassandra.

Best

Etienne

On 29/06/2023 at 10:54, Martijn Visser wrote:


Hi Etienne,

I think it all depends on the actual maintainers of the connector to
make a decision on that: if their unreleased version of the connector
should be compatible with a new Flink version, then they should test
against it. For example, that's already done at Elasticsearch [1] and
JDBC [2].

Choosing which versions to support is a decision by the maintainers in
the community, and it always requires an action by a maintainer to
update the CI config to set the correct versions whenever a new Flink
version is released.

Best regards,

Martijn

[1]https://github.com/apache/flink-connector-elasticsearch/blob/main/.github/workflows/push_pr.yml
[2]https://github.com/apache/flink-connector-jdbc/blob/main/.github/workflows/push_pr.yml

On Wed, Jun 28, 2023 at 6:09 PM Etienne Chauchot  wrote:

Hi all,

Connectors are external to flink. As such, they need to be tested
against stable (released) versions of Flink.

But I was wondering if it would make sense to also test connectors in PRs
against the latest Flink master snapshot, to discover failures
before merging the PRs, **while the author is still available**,
rather than discovering them in nightly tests (that test against
snapshot) after the merge. That would allow the author to anticipate
potential failures and write more future-proof code (even if master is
subject to change before the connector release).

Of course, if a breaking change was introduced in master, such tests
will fail. But they should be considered as a preview of how the code
will behave against the current snapshot of the next flink version.

WDYT ?


Best

Etienne

Re: [ANNOUNCE] Release 1.18.0, release candidate #1

2023-10-11 Thread Etienne Chauchot

Hi all,

FYI, build issues for connectors related to architecture tests adding or 
removing violations are being fixed and tracked here [1]


Of course, it is indeed not a blocker as connector development is 
decoupled from Flink development.


[1] https://issues.apache.org/jira/browse/FLINK-32563

Best

Etienne

On 09/10/2023 at 09:07, Jing Ge wrote:

Hi Sergey and devs,

Thanks for bringing this to our attention. I am open to discuss that. I
have the following thoughts:

1. Like I already mentioned in many other threads, build issues in
downstream repos should not block upstream release. I understand the
concern that developers want to have stable connectors. But it violates the
intention of connector externalization.
2. It is expensive to download the Flink jar (roughly 500M) from S3 for each PR
and nightly build of each connector. Does it make sense to leverage [1]?
Many Flink docs have been using it.
3. I will check internally at Ververica to see if we could make the file
publicly accessible to temporarily solve this issue.

Looking forward to your feedback.

Best regards,
Jing

[1]https://nightlies.apache.org/flink/

On Fri, Oct 6, 2023 at 2:03 PM Konstantin Knauf  wrote:


Hi everyone,

I've just opened a PR for the release announcement [1] and I am looking
forward to reviews and feedback.

Cheers,

Konstantin

[1]https://github.com/apache/flink-web/pull/680

On Fri, 6 Oct 2023 at 11:03, Sergey Nuyanzin <
snuyan...@gmail.com> wrote:


Sorry for not mentioning it in the previous mail

based on the reason above I'm
-1 (non-binding)

also there is one more issue [1]
which blocks all the externalised connectors from testing against the most
recent commits in the corresponding branches
[1]https://issues.apache.org/jira/browse/FLINK-33175


On Thu, Oct 5, 2023 at 11:19 PM Sergey Nuyanzin
wrote:


Thanks for creating RC1

* Downloaded artifacts
* Built from sources
* Verified checksums and gpg signatures
* Verified versions in pom files
* Checked NOTICE, LICENSE files

The strange thing I faced is that
CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished
fails on AZP [1],

which looks like it is related to [2], [3], fixed in 1.18.0 (not 100%
sure).


[1]https://issues.apache.org/jira/browse/FLINK-33186
[2]https://issues.apache.org/jira/browse/FLINK-32996
[3]https://issues.apache.org/jira/browse/FLINK-32907

On Tue, Oct 3, 2023 at 2:53 PM Ferenc Csaky wrote:
Thanks everyone for the efforts!

Checked the following:

- Downloaded artifacts
- Built Flink from source
- Verified checksums/signatures
- Verified NOTICE, LICENSE files
- Deployed dummy SELECT job via SQL gateway on standalone cluster,

things

seemed fine according to the log files

+1 (non-binding)

Best,
Ferenc


--- Original Message ---
On Friday, September 29th, 2023 at 22:12, Gabor Somogyi <
gabor.g.somo...@gmail.com> wrote:




Thanks for the efforts!

+1 (non-binding)

* Verified versions in the poms
* Built from source
* Verified checksums and signatures
* Started basic workloads with kubernetes operator
* Verified NOTICE and LICENSE files

G

On Fri, Sep 29, 2023, 18:16 Matthias Pohl
 wrote:


Thanks for creating RC1. I did the following checks:

* Downloaded artifacts
* Built Flink from sources
* Verified SHA512 checksums and GPG signatures
* Compared checkout with provided sources
* Verified pom file versions
* Went over NOTICE file/pom files changes without finding anything
suspicious
* Deployed standalone session cluster and ran WordCount example in batch
and streaming: Nothing suspicious in log files found

+1 (binding)

On Fri, Sep 29, 2023 at 10:34 AM Etienne Chauchot
 wrote:


Hi all,

Thanks to the team for this RC.

I did a quick check of this RC against user pipelines [1] coded with
DataSet (even if deprecated and soon removed), DataStream and SQL APIs.

Based on the small scope of this test, LGTM

+1 (non-binding)

[1]https://github.com/echauchot/tpcds-benchmark-flink

Best
Etienne

On 28/09/2023 at 19:35, Jing Ge wrote:


Hi everyone,

The RC1 for Apache Flink 1.18.0 has been created. The related voting
process will be triggered once the announcement is ready. The RC1 has all
the artifacts that we would typically have for a release, except for the
release note and the website pull request for the release announcement.

The following contents are available for your review:

- Confirmation of no benchmarks regression at the thread [1].
- The preview source release and binary convenience releases [2], which
are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 [3].
- all artifacts that would normally be deployed to the Maven
Central Repository [4].
- source code tag "release-1.18.0-rc1" [5]

Your help testing the release will be greatly appreciated! And we'll
create the rc1 release and the voting thread as soon as all the efforts
are finished.

[1]https://lists.apache.org/thread/yxyphglwwvq57wcqlfrnk3

Re: [ANNOUNCE] Release 1.18.0, release candidate #1

2023-09-29 Thread Etienne Chauchot

Hi all,

Thanks to the team for this RC.

I did a quick check of this RC against user pipelines [1] coded with 
DataSet (even if deprecated and soon removed), DataStream and SQL APIs


Based on the small scope of this test, LGTM

+1 (non-binding)

[1] https://github.com/echauchot/tpcds-benchmark-flink

Best
Etienne

On 28/09/2023 at 19:35, Jing Ge wrote:

Hi everyone,

The RC1 for Apache Flink 1.18.0 has been created. The related voting
process will be triggered once the announcement is ready. The RC1 has all
the artifacts that we would typically have for a release, except for the
release note and the website pull request for the release announcement.

The following contents are available for your review:

- Confirmation of no benchmarks regression at the thread[1].
- The preview source release and binary convenience releases [2], which
are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 [3].
- all artifacts that would normally be deployed to the Maven
Central Repository [4].
- source code tag "release-1.18.0-rc1" [5]

Your help testing the release will be greatly appreciated! And we'll
create the rc1 release and the voting thread as soon as all the efforts are
finished.

[1]https://lists.apache.org/thread/yxyphglwwvq57wcqlfrnk3qo9t3sr2ro
[2]https://dist.apache.org/repos/dist/dev/flink/flink-1.18.0-rc1/
[3]https://dist.apache.org/repos/dist/release/flink/KEYS
[4]https://repository.apache.org/content/repositories/orgapacheflink-1657
[5]https://github.com/apache/flink/releases/tag/release-1.18.0-rc1

Best regards,
Qingsheng, Sergei, Konstantin and Jing


[jira] [Created] (FLINK-33059) Support transparent compression for file-connector for all file input formats

2023-09-07 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-33059:


 Summary: Support transparent compression for file-connector for 
all file input formats
 Key: FLINK-33059
 URL: https://issues.apache.org/jira/browse/FLINK-33059
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / FileSystem
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


Delimited file input formats (contrary to binary input formats, etc.) do not 
support compression via the existing decorator because the split length is 
determined by the compressed file length, leading to 
[this|https://issues.apache.org/jira/browse/FLINK-30314] bug. We should force 
reading the whole file split (as is done for binary input formats) on 
compressed files. Parallelism is still done at the file level (as now).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Updates to Flink's external connector CI workflows

2023-08-16 Thread Etienne Chauchot

This is great! Thanks for working on this.

Best

Etienne

On 15/06/2023 at 13:19, Martijn Visser wrote:

Hi all,

I would like to inform you of two changes that have been made to the shared
CI workflow that's used for Flink's externalized connectors.

1. Up until now, weekly builds were running to validate that connector code
(still) works with Flink. However, these builds were only running for code
on the "main" branch of the connector, and not for the branches of the
connector (like v3.0 for Elasticsearch, v1.0 for Opensearch, etc.). This was
tracked under https://issues.apache.org/jira/browse/FLINK-31923.

That issue has now been fixed, with the Github Action workflow now
accepting a map with arrays, which can contain a combination of Flink
versions to test for and the connector branch it should test. See
https://github.com/apache/flink-connector-jdbc/blob/main/.github/workflows/weekly.yml#L28-L47
for an example on the Flink JDBC connector

This change has already been applied on the externalized connectors GCP
PubSub, RabbitMQ, JDBC, Pulsar, MongoDB, Opensearch, Cassandra,
Elasticsearch. AWS is pending the merging of the PR. For Kafka, Hive and
HBase, since they haven't finished externalization, this isn't applicable
to them yet.

2. When working on the debugging of a problem with the JDBC connector, one
of the things that was needed to debug that problem was the ability to see
the JVM thread dump. With https://issues.apache.org/jira/browse/FLINK-32331
now completed, every failed CI run will have a JVM thread dump. You can see
the implementation for that in
https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml#L161-L195

Best regards,

Martijn


[RESULT][VOTE] FLIP-322 Cooldown period for adaptive scheduler

2023-08-10 Thread Etienne Chauchot

Hi,
The vote has been closed. Thank you for your votes. Here is the result:

Binding votes (3 in total):

- Martijn Visser
- Rui Fan
- Maximilian Michels

Non-binding votes (1 in total):

- Prabhu Joseph


According to the Flink Bylaws [1], consensus + three committer votes are
required to accept a FLIP. This requirement has been reached via 3 binding
votes (PMC + committer).

Thanks,
Etienne

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws#FlinkBylaws-Actions


Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-08-10 Thread Etienne Chauchot

Thanks Max!

Etienne

On 09/08/2023 at 12:53, Maximilian Michels wrote:

+1 (binding)

-Max

On Tue, Aug 8, 2023 at 10:56 AM Etienne Chauchot  wrote:

Hi all,

As per the Flink bylaws, binding votes for FLIP changes are active
committer votes.

Up to now, we have only 2 binding votes. Can one of the committers/PMC
members vote on this FLIP?

Thanks

Etienne


On 08/08/2023 at 10:19, Etienne Chauchot wrote:

Hi Joseph,

Thanks for the detailed review!

Best

Etienne

On 14/07/2023 at 11:57, Prabhu Joseph wrote:

*+1 (non-binding)*

Thanks for working on this. We have seen good improvement during the cool
down period with this feature.
Below are details on the test results from one of our clusters:

On a scale-out operation, 8 new nodes were added one by one with a gap of
~30 seconds. There were 8 restarts within 4 minutes with the default
behaviour,
whereas only one with this feature (cooldown period of 4 minutes).

The number of records processed by the job with this feature during the
restart window is higher (2909764), whereas it is only 1323960 with the
default
behaviour due to multiple restarts, where it spends most of the time
recovering, and also whatever work progressed by the tasks after the last
successful completed checkpoint is lost.

Metrics             | Default Adaptive Scheduler                         | With Cooldown Period
NumRecordsProcessed | 1323960                                            | 2909764
Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
NumRestarts         | 8                                                  | 1

Remarks:
1. The NumRecordsProcessed metric indicates the difference the cooldown
period brings in. When the job is doing multiple restarts, the task spends
most of its time recovering, and the progress the task made is lost during
each restart.
2. There is only one restart with the cooldown period, which happened when
the 8th node was added back.








On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot
wrote:


Hi all,

I'm going on vacation tonight for 3 weeks.

Even if the vote is not finished, as the implementation is rather quick
and the design discussion had settled, I preferred implementing
FLIP-322 [1] to allow people to take a look while I'm off.

[1]https://github.com/apache/flink/pull/22985

Best

Etienne

On 12/07/2023 at 09:56, Etienne Chauchot wrote:

Hi all,

Would you mind casting your vote on this second vote thread (opened
after new discussions) so that the subject can move forward?

@David, @Chesnay, @Robert, you took part in the discussions; can you
please send your vote?

Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:

Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-08-08 Thread Etienne Chauchot

Hi all,

As per the Flink bylaws, binding votes for FLIP changes are active 
committer votes.


Up to now, we have only 2 binding votes. Can one of the committers/PMC 
members vote on this FLIP?


Thanks

Etienne


On 08/08/2023 at 10:19, Etienne Chauchot wrote:


Hi Joseph,

Thanks for the detailed review!

Best

Etienne

On 14/07/2023 at 11:57, Prabhu Joseph wrote:

*+1 (non-binding)*

Thanks for working on this. We have seen good improvement during the cool
down period with this feature.
Below are details on the test results from one of our clusters:

On a scale-out operation, 8 new nodes were added one by one with a gap of
~30 seconds. There were 8 restarts within 4 minutes with the default
behaviour,
whereas only one with this feature (cooldown period of 4 minutes).

The number of records processed by the job with this feature during the
restart window is higher (2909764), whereas it is only 1323960 with the
default
behaviour due to multiple restarts, where it spends most of the time
recovering, and also whatever work progressed by the tasks after the last
successful completed checkpoint is lost.

Metrics             | Default Adaptive Scheduler                         | With Cooldown Period
NumRecordsProcessed | 1323960                                            | 2909764
Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
NumRestarts         | 8                                                  | 1

Remarks:
1. The NumRecordsProcessed metric indicates the difference the cooldown
period brings in. When the job is doing multiple restarts, the task spends
most of its time recovering, and the progress the task made is lost during
each restart.
2. There is only one restart with the cooldown period, which happened when
the 8th node was added back.








On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot
wrote:


Hi all,

I'm going on vacation tonight for 3 weeks.

Even if the vote is not finished, as the implementation is rather quick
and the design discussion had settled, I preferred implementing
FLIP-322 [1] to allow people to take a look while I'm off.

[1]https://github.com/apache/flink/pull/22985

Best

Etienne

On 12/07/2023 at 09:56, Etienne Chauchot wrote:

Hi all,

Would you mind casting your vote on this second vote thread (opened
after new discussions) so that the subject can move forward?

@David, @Chesnay, @Robert, you took part in the discussions; can you
please send your vote?

Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:

Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-08-08 Thread Etienne Chauchot

Hi Joseph,

Thanks for the detailed review!

Best

Etienne

On 14/07/2023 at 11:57, Prabhu Joseph wrote:

*+1 (non-binding)*

Thanks for working on this. We have seen good improvement during the cool
down period with this feature.
Below are details on the test results from one of our clusters:

On a scale-out operation, 8 new nodes were added one by one with a gap of
~30 seconds. There were 8 restarts within 4 minutes with the default
behaviour,
whereas only one with this feature (cooldown period of 4 minutes).

The number of records processed by the job with this feature during the
restart window is higher (2909764), whereas it is only 1323960 with the
default
behaviour due to multiple restarts, where it spends most of the time
recovering, and also whatever work progressed by the tasks after the last
successful completed checkpoint is lost.

Metrics             | Default Adaptive Scheduler                         | With Cooldown Period
NumRecordsProcessed | 1323960                                            | 2909764
Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
NumRestarts         | 8                                                  | 1

Remarks:
1. The NumRecordsProcessed metric indicates the difference the cooldown
period brings in. When the job is doing multiple restarts, the task spends
most of its time recovering, and the progress the task made is lost during
each restart.
2. There is only one restart with the cooldown period, which happened when
the 8th node was added back.








On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot
wrote:


Hi all,

I'm going on vacation tonight for 3 weeks.

Even if the vote is not finished, as the implementation is rather quick
and the design discussion had settled, I preferred implementing
FLIP-322 [1] to allow people to take a look while I'm off.

[1]https://github.com/apache/flink/pull/22985

Best

Etienne

On 12/07/2023 at 09:56, Etienne Chauchot wrote:

Hi all,

Would you mind casting your vote on this second vote thread (opened
after new discussions) so that the subject can move forward?

@David, @Chesnay, @Robert, you took part in the discussions; can you
please send your vote?

Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:

Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-07-12 Thread Etienne Chauchot

Hi all,

I'm going on vacation tonight for 3 weeks.

Even if the vote is not finished, as the implementation is rather quick 
and the design discussion had settled, I preferred implementing 
FLIP-322 [1] to allow people to take a look while I'm off.


[1] https://github.com/apache/flink/pull/22985

Best

Etienne

On 12/07/2023 at 09:56, Etienne Chauchot wrote:


Hi all,

Would you mind casting your vote on this second vote thread (opened
after new discussions) so that the subject can move forward?


@David, @Chesnay, @Robert, you took part in the discussions; can you
please send your vote?


Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:


Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for 
adaptive scheduler [1].


This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne 

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-07-12 Thread Etienne Chauchot

Hi all,

Would you mind casting your vote on this second vote thread (opened
after new discussions) so that the subject can move forward?


@David, @Chesnay, @Robert, you took part in the discussions; can you
please send your vote?


Thank you very much

Best

Etienne

On 06/07/2023 at 13:02, Etienne Chauchot wrote:


Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for 
adaptive scheduler [1].


This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne 

[jira] [Created] (FLINK-32563) Allow connectors CI to specify the main supported Flink version

2023-07-07 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-32563:


 Summary: Allow connectors CI to specify the main supported Flink 
version
 Key: FLINK-32563
 URL: https://issues.apache.org/jira/browse/FLINK-32563
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System / CI
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


As part of [this 
discussion|https://lists.apache.org/thread/pr0g812olzpgz21d9oodhc46db9jpxo3], 
the need for connectors to specify the main Flink version that a connector 
supports has arisen.

This CI variable will allow configuring the build and tests differently 
depending on this version. The parameter would be optional.

The first use case is to run the archunit tests only on the main supported 
version, as discussed in the above thread.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-07-06 Thread Etienne Chauchot

Hi all,

Thanks for your feedback about the FLIP-322: Cooldown period for 
adaptive scheduler [1].


This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 9th 15:00 GMT) unless there is an objection or
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-06 Thread Etienne Chauchot

Hi,

I think we have reached a consensus here. I have updated the FLIP to 
reflect recent suggestions. I will start a new vote.


Best

Etienne

On 05/07/2023 at 14:42, Etienne Chauchot wrote:


Hi all,

Thanks David for your suggestions. Comments inline.

On 04/07/2023 at 13:35, David Morávek wrote:

waiting 2 min between 2 requirements push seems ok to me

This depends on the workload. Would you care if the cost of rescaling were
close to zero (which is for most out-of-the-box workloads)? In that case,
it would be desirable to rescale more frequently, for example, if TMs join
incrementally.

Creating a value that covers everything is impossible unless it's
self-tuning, so I'd prefer having a smooth experience for people trying
things out (just imagine doing a demo at the conference) and having them
opt-in for longer cooldowns.

The users still have the ability to lower the cooldown period for high 
workloads, but we could definitely set the default value to a lower 
number. I agree to favour lower numbers (for a smooth rescale 
experience) and consider higher numbers (for high workloads) as 
exceptions. But we still need to agree on a suitable default for most 
cases: 30s?

One idea to keep the timeouts lower while getting more balance would be
restarting the cooldown period when new resources or requirements are
received. This would also bring the cooldown's behavior closer to the
resource-stabilization timeout. Would that make sense?



You mean, if slots are received during the cooldown period, then instead 
of the proposed behavior (A), do behavior (B)?


A. schedule a rescale at lastRescale + cooldown point in time

B. schedule a rescale at ** now ** + cooldown point in time

It looks fine to me. It is even better because it avoids having 2 
rescales scheduled at the same time if 2 slot changes arrive during 
the same cooldown period.



Etienne



Depends on how you implement it. If you ignore all of shouldRescale, yes,

but you shouldn't do that in the first place.



I agree, this is not what I planned to implement.



That sounds great; let's go ahead and outline this in the FLIP.

Best,
D.


On Tue, Jul 4, 2023 at 12:30 PM Etienne Chauchot
wrote:


Hi all,

Thanks David for your feedback. My comments are inline

On 04/07/2023 at 09:16, David Morávek wrote:

They will struggle if they add new resources and nothing happens for 5
minutes.

The same applies if they start playing with FLIP-291 APIs. I'm wondering
if the cooldown makes sense there since it was the user's deliberate
choice to push new requirements.

Sure, but remember that the initial rescale is always done immediately.
Only the time between 2 rescales is controlled by the cooldown period. I
don't see a user adding resources every 10s (your proposed default
value); even with, let's say, 2 min, waiting 2 min between two
requirements pushes seems OK to me.



Best,
D.

On Tue, Jul 4, 2023 at 9:11 AM David Morávek   wrote:


The FLIP reads sane to me. I'm unsure about the default values, though; 5
minutes of wait time between rescales feels rather strict, and we should
rethink it to provide a better out-of-the-box experience.

I'd focus on newcomers trying AS / Reactive Mode out. They will struggle
if they add new resources and nothing happens for 5 minutes. I'd suggest
defaulting to
*jobmanager.adaptive-scheduler.resource-stabilization-timeout* (which
defaults to 10s).

If users add resources, the re-scale will happen right away. It is only
for subsequent additions that they will have to wait for the cooldown
period to end.

But anyway, we could lower the default value, I just took what Robert
suggested in the ticket.



I'm still struggling to grasp max interval (force rescale). Ignoring

`AdaptiveScheduler#shouldRescale()`

condition seems rather dangerous. Wouldn't a simple case where you add a
new TM and remove it before the max interval is reached (so there is
nothing to do) result in an unnecessary job restart?

With the current behavior (on master): adding the TM will result in a
restart if the number of slots added leads to a job parallelism
increase of more than 2. Then removing it can have 2 consequences:
either it is removed before the resource-stabilisation timeout and there
will be no restart. Or it is removed after this timeout (the job is in
Running state) and it will entail another restart and parallelism decrease.

With the proposed behavior: what the scaling-interval.max will change is
only on the resource addition part: when the TM is added, if the time
since last rescale > scaling-interval.max, then a restart and
parallelism increase will be done even if it leads to a parallelism
increase < 2. The rest regarding TM removal does not change.

=> So, the real difference with the current behavior is ** if the slots
addition was too little **: in the current behavior nothing happens. In
the new behavior nothing happens unless the addition arrives after
scaling-interval.max.

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-05 Thread Etienne Chauchot

Hi all,

Thanks David for your suggestions. Comments inline.

On 04/07/2023 at 13:35, David Morávek wrote:

waiting 2 min between 2 requirements push seems ok to me

This depends on the workload. Would you care if the cost of rescaling were
close to zero (which is for most out-of-the-box workloads)? In that case,
it would be desirable to rescale more frequently, for example, if TMs join
incrementally.

Creating a value that covers everything is impossible unless it's
self-tuning, so I'd prefer having a smooth experience for people trying
things out (just imagine doing a demo at the conference) and having them
opt-in for longer cooldowns.

The users still have the ability to lower the cooldown period for high 
workloads, but we could definitely set the default value to a lower 
number. I agree to favour lower numbers (for a smooth rescale experience) 
and consider higher numbers (for high workloads) as exceptions. But we 
still need to agree on a suitable default for most cases: 30s?

One idea to keep the timeouts lower while getting more balance would be
restarting the cooldown period when new resources or requirements are
received. This would also bring the cooldown's behavior closer to the
resource-stabilization timeout. Would that make sense?



You mean, if slots are received during the cooldown period, then instead 
of the proposed behavior (A), do behavior (B)?


A. schedule a rescale at lastRescale + cooldown point in time

B. schedule a rescale at ** now ** + cooldown point in time

It looks fine to me. It is even better because it avoids having 2 
rescales scheduled at the same time if 2 slot changes arrive during the 
same cooldown period.
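The two scheduling alternatives discussed above can be modeled in a few lines of Python (an illustrative sketch with invented names; this is not the actual AdaptiveScheduler code):

```python
def next_rescale_time(now: float, last_rescale: float, cooldown: float,
                      restart_cooldown_on_change: bool) -> float:
    """Return when a slot change received at `now` may trigger a rescale.

    Behavior A: schedule at the fixed point last_rescale + cooldown.
    Behavior B: restart the cooldown, i.e. schedule at now + cooldown.
    A rescale is never scheduled in the past.
    """
    if restart_cooldown_on_change:
        return now + cooldown  # behavior B
    return max(now, last_rescale + cooldown)  # behavior A
```

With behavior B, a second slot change arriving inside the same cooldown period simply pushes the single scheduled rescale further out, instead of leaving two rescales scheduled at once.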



Etienne





Depends on how you implement it. If you ignore all of shouldRescale, yes,

but you shouldn't do that in the first place.



I agree, this is not what I planned to implement.




That sounds great; let's go ahead and outline this in the FLIP.

Best,
D.


On Tue, Jul 4, 2023 at 12:30 PM Etienne Chauchot
wrote:


Hi all,

Thanks David for your feedback. My comments are inline

On 04/07/2023 at 09:16, David Morávek wrote:

They will struggle if they add new resources and nothing happens for 5
minutes.

The same applies if they start playing with FLIP-291 APIs. I'm wondering
if the cooldown makes sense there since it was the user's deliberate
choice to push new requirements.


Sure, but remember that the initial rescale is always done immediately.
Only the time between 2 rescales is controlled by the cooldown period. I
don't see a user adding resources every 10s (your proposed default
value); even with, let's say, 2 min, waiting 2 min between two
requirements pushes seems OK to me.



Best,
D.

On Tue, Jul 4, 2023 at 9:11 AM David Morávek   wrote:


The FLIP reads sane to me. I'm unsure about the default values, though; 5
minutes of wait time between rescales feels rather strict, and we should
rethink it to provide a better out-of-the-box experience.

I'd focus on newcomers trying AS / Reactive Mode out. They will struggle
if they add new resources and nothing happens for 5 minutes. I'd suggest
defaulting to
*jobmanager.adaptive-scheduler.resource-stabilization-timeout* (which
defaults to 10s).


If users add resources, the re-scale will happen right away. It is only
for subsequent additions that they will have to wait for the cooldown
period to end.

But anyway, we could lower the default value, I just took what Robert
suggested in the ticket.



I'm still struggling to grasp max interval (force rescale). Ignoring

`AdaptiveScheduler#shouldRescale()`

condition seems rather dangerous. Wouldn't a simple case where you add a
new TM and remove it before the max interval is reached (so there is
nothing to do) result in an unnecessary job restart?

With the current behavior (on master): adding the TM will result in a
restart if the number of slots added leads to a job parallelism
increase of more than 2. Then removing it can have 2 consequences:
either it is removed before the resource-stabilisation timeout and there
will be no restart. Or it is removed after this timeout (the job is in
Running state) and it will entail another restart and parallelism decrease.

With the proposed behavior: what the scaling-interval.max will change is
only on the resource addition part: when the TM is added, if the time
since last rescale > scaling-interval.max, then a restart and
parallelism increase will be done even if it leads to a parallelism
increase < 2. The rest regarding TM removal does not change.

=> So, the real difference with the current behavior is ** if the slots
addition was too little **: in the current behavior nothing happens. In
the new behavior nothing happens unless the addition arrives after
scaling-interval.max.


Best

Etienne


Best,
D.

On Thu, Jun 29, 2023 at 3:43 PM Etienne Chauchot
wrote:


Thanks Chesnay for your feedback. I have updated the FLIP. I'll start a
vote thread.

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread Etienne Chauchot

Hi all,

Thanks David for your feedback. My comments are inline

On 04/07/2023 at 09:16, David Morávek wrote:

They will struggle if they add new resources and nothing happens for 5
minutes.

The same applies if they start playing with FLIP-291 APIs. I'm wondering if
the cooldown makes sense there since it was the user's deliberate choice to
push new requirements.



Sure, but remember that the initial rescale is always done immediately. 
Only the time between 2 rescales is controlled by the cooldown period. I 
don't see a user adding resources every 10s (your proposed default 
value); even with, let's say, 2 min, waiting 2 min between two 
requirements pushes seems OK to me.





Best,
D.

On Tue, Jul 4, 2023 at 9:11 AM David Morávek  wrote:


The FLIP reads sane to me. I'm unsure about the default values, though; 5
minutes of wait time between rescales feels rather strict, and we should
rethink it to provide a better out-of-the-box experience.

I'd focus on newcomers trying AS / Reactive Mode out. They will struggle
if they add new resources and nothing happens for 5 minutes. I'd suggest
defaulting to
*jobmanager.adaptive-scheduler.resource-stabilization-timeout* (which
defaults to 10s).



If users add resources, the re-scale will happen right away. It is only 
for subsequent additions that they will have to wait for the cooldown 
period to end.


But anyway, we could lower the default value, I just took what Robert 
suggested in the ticket.





I'm still struggling to grasp max interval (force rescale). Ignoring 
`AdaptiveScheduler#shouldRescale()`
condition seems rather dangerous. Wouldn't a simple case where you add a
new TM and remove it before the max interval is reached (so there is
nothing to do) result in an unnecessary job restart?


With the current behavior (on master): adding the TM will result in a 
restart if the number of slots added leads to a job parallelism 
increase of more than 2. Then removing it can have 2 consequences: 
either it is removed before the resource-stabilisation timeout and there 
will be no restart. Or it is removed after this timeout (the job is in 
Running state) and it will entail another restart and parallelism decrease.


With the proposed behavior: what the scaling-interval.max will change is 
only on the resource addition part: when the TM is added, if the time 
since last rescale > scaling-interval.max, then a restart and 
parallelism increase will be done even if it leads to a parallelism 
increase < 2. The rest regarding TM removal does not change.


=> So, the real difference with the current behavior is ** if the slots 
addition was too little **: in the current behavior nothing happens. In 
the new behavior nothing happens unless the addition arrives after 
scaling-interval.max.
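The addition-side decision described above can be sketched as follows (illustrative Python with invented names and default values; not the real AdaptiveScheduler#shouldRescale implementation):

```python
def should_rescale_on_addition(current_parallelism: int,
                               new_parallelism: int,
                               seconds_since_last_rescale: float,
                               min_parallelism_increase: int = 2,
                               scaling_interval_max: float = 3600.0) -> bool:
    """Decide whether a slot addition should trigger a restart/rescale.

    A large enough parallelism increase always triggers; a small one is
    forced only once scaling-interval.max has elapsed since the last
    rescale (the TM-removal path is unchanged and not modeled here).
    """
    increase = new_parallelism - current_parallelism
    if increase <= 0:
        return False  # nothing to gain from restarting
    if increase >= min_parallelism_increase:
        return True
    # small increase: force the rescale only after the max scaling interval
    return seconds_since_last_rescale > scaling_interval_max
```

Under this model, a "too little" addition is silently ignored until the max scaling interval has passed, at which point even a +1 parallelism increase is acted upon.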



Best

Etienne



Best,
D.

On Thu, Jun 29, 2023 at 3:43 PM Etienne Chauchot
wrote:


Thanks Chesnay for your feedback. I have updated the FLIP. I'll start a
vote thread.

Best

Etienne

On 28/06/2023 at 11:49, Chesnay Schepler wrote:

we should schedule a check that will rescale if

min-parallelism-increase is met. Then, what is the use of the
scaling-interval.max timeout in that context?

To force a rescale if min-parallelism-increase is not met (but we
could still run above the current parallelism).

min-parallelism-increase is a trade-off between the cost of rescaling
vs the performance benefit of the parallelism increase. Over time the
balance tips more and more in favor of the parallelism increase, hence
we should eventually rescale anyway even if the minimum isn't met, or
at least give users the option to do so.


I meant the opposite: not having only the cooldown but having only

the stabilization time. I must have missed something because what I
wonder is: if every rescale entails a restart of the pipeline and
every restart entails passing in waiting for resources state, then why
introduce a cooldown when there is already at each rescale a stable
resource timeout ?

It is technically correct that the stable resource timeout can be used
to limit the number of rescale operations per interval, however during
that time the job isn't running, in contrast to the cooldown.

Having both just gives you a lot more flexibility.
"I want at most 1 rescale operation per hour, and wait at most 1
minute for resource to stabilize when a rescale happens".
You can't express this with only one of the options.
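The combination Chesnay describes ("at most 1 rescale operation per hour, and wait at most 1 minute for resources to stabilize") can be modeled as two independent conditions (an illustrative sketch; the names and defaults are invented, not Flink configuration keys):

```python
def may_rescale_now(now: float, last_rescale: float,
                    last_resource_change: float,
                    cooldown: float = 3600.0,
                    stabilization: float = 60.0) -> bool:
    """Both conditions must hold before acting on new resources:

    - cooldown: at least `cooldown` seconds since the last rescale
      (limits rescale frequency while the job keeps running)
    - stabilization: at least `stabilization` seconds of quiet since the
      last resource change (lets arriving TMs settle before restarting)
    """
    return (now - last_rescale >= cooldown
            and now - last_resource_change >= stabilization)
```

Either knob alone cannot express the combined policy: the cooldown says nothing about resource churn, and the stabilization timeout says nothing about how often rescales may happen.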

On 20/06/2023 14:41, Etienne Chauchot wrote:

Hi Chesnay,

Thanks for your feedback. Comments inline

On 16/06/2023 at 17:24, Chesnay Schepler wrote:

1) Options specific to the adaptive scheduler should start with
"jobmanager.adaptive-scheduler".


ok



2)
There isn't /really /a notion of a "scaling event". The scheduler is
informed about new/lost slots and job failures, and reacts
accordingly by maybe rescaling the job.
(sure, you can think of these as events, but you can think of
practically everything as events)

There shouldn't be a queue fo

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread Etienne Chauchot

Hi all,

@David, I just saw your comments on the [DISCUSSION] thread. I'll close 
this vote for now and open a new voting thread when the discussion 
reaches a consensus.


Best

Etienne

On 04/07/2023 at 10:49, Etienne Chauchot wrote:


Hi David,

Indeed, this thread was a proper separate [VOTE] thread 

You withdrew your -1 vote but did not cast a new vote. Can you do so?

Thanks

Etienne

On 04/07/2023 08:52, David Morávek wrote:

Hmm, sorry for the confusion; it seems that Google is playing games with me
(I see this chained under the old [DISCUSS] thread), but it seems correct
in the mail archive [1] :/

Just ignore the -1 above.

[1]https://lists.apache.org/thread/22fovrkmzcvzblcohhtsp5t96vd64obj

On Tue, Jul 4, 2023 at 8:49 AM David Morávek  wrote:


The vote closes within 6 hours and, as of now, there has been no vote. This
is a very short FLIP that takes a few minutes to read.


Maybe because there should have been a dedicated voting thread (marked as
[VOTE]), this one was hidden and hard to notice.

We should restart the vote with proper mechanics to allow everyone to
participate.

Soft -1 from my side until there is a proper voting thread.

Best,
D.

On Mon, Jul 3, 2023 at 4:40 PM ConradJam  wrote:


+1 (no-binding)

Etienne Chauchot wrote on Mon, Jul 3, 2023 at 15:57:


Hi all,

The vote closes within 6 hours and, as of now, there has been no vote. This
is a very short FLIP that takes a few minutes to read.

Please cast your vote so that development can start.

Thanks.

Best

Etienne

On 29/06/2023 15:51, Etienne Chauchot wrote:

Hi all,

Thanks for all the feedback about the FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 3rd 14:00 GMT) unless there is an objection or
insufficient votes.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne


Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler

2023-07-04 Thread Etienne Chauchot

Hi David,

Indeed, this thread was a proper separate [VOTE] thread 

You withdrew your -1 vote but did not cast a new vote. Can you do so?

Thanks

Etienne

On 04/07/2023 08:52, David Morávek wrote:

Hmm, sorry for the confusion; it seems that Google is playing games with me
(I see this chained under the old [DISCUSS] thread), but it seems correct
in the mail archive [1] :/

Just ignore the -1 above.

[1]https://lists.apache.org/thread/22fovrkmzcvzblcohhtsp5t96vd64obj

On Tue, Jul 4, 2023 at 8:49 AM David Morávek  wrote:


The vote closes within 6 hours and, as of now, there has been no vote. This
is a very short FLIP that takes a few minutes to read.


Maybe because there should have been a dedicated voting thread (marked as
[VOTE]), this one was hidden and hard to notice.

We should restart the vote with proper mechanics to allow everyone to
participate.

Soft -1 from my side until there is a proper voting thread.

Best,
D.

On Mon, Jul 3, 2023 at 4:40 PM ConradJam  wrote:


+1 (no-binding)

Etienne Chauchot wrote on Mon, Jul 3, 2023 at 15:57:


Hi all,

The vote closes within 6 hours and, as of now, there has been no vote. This
is a very short FLIP that takes a few minutes to read.

Please cast your vote so that development can start.

Thanks.

Best

Etienne

On 29/06/2023 15:51, Etienne Chauchot wrote:

Hi all,

Thanks for all the feedback about the FLIP-322: Cooldown period for
adaptive scheduler [1].

This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 3rd 14:00 GMT) unless there is an objection or
insufficient votes.

[1]


https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne


Re: [DISCUSSION] test connectors against Flink master in PRs

2023-07-03 Thread Etienne Chauchot

Hi all,

I wanted to post here the result of a discussion I had in private with 
Chesnay related to this subject. The question was regarding archunit 
with connectors:


"How to deal with different archunit violations between 2 versions of 
Flink? If a violation is normal and should be added to the violation 
store, but the related rule has changed in a recent Flink version, how can 
there be different sets of violations between 2 Flink versions with one single 
violation store?"


We concluded by saying that even if a connector should support (and 
therefore be tested against) the last 2 versions of Flink, the archunit 
tests should run only on the main supported Flink version (usually the 
most recent one).


As a consequence, I'll configure that in the Cassandra connector and update 
the connectors migration wiki doc to serve as an example for such cases.


Best

Etienne


On 29/06/2023 15:57, Etienne Chauchot wrote:


Hi Martijn,

Thanks for your feedback. It makes total sense to me.

I'll enable it for Cassandra.

Best

Etienne

On 29/06/2023 10:54, Martijn Visser wrote:


Hi Etienne,

I think it all depends on the actual maintainers of the connector to
make a decision on that: if their unreleased version of the connector
should be compatible with a new Flink version, then they should test
against it. For example, that's already done at Elasticsearch [1] and
JDBC [2].

Choosing which versions to support is a decision by the maintainers in
the community, and it always requires an action by a maintainer to
update the CI config to set the correct versions whenever a new Flink
version is released.

Best regards,

Martijn

[1]https://github.com/apache/flink-connector-elasticsearch/blob/main/.github/workflows/push_pr.yml
[2]https://github.com/apache/flink-connector-jdbc/blob/main/.github/workflows/push_pr.yml

On Wed, Jun 28, 2023 at 6:09 PM Etienne Chauchot  wrote:

Hi all,

Connectors are external to flink. As such, they need to be tested
against stable (released) versions of Flink.

But I was wondering if it would make sense to test connectors in PRs
also against the latest flink master snapshot, to allow discovering failures
before merging the PRs, ** while the author is still available **,
rather than discovering them in nightly tests (that test against
the snapshot) after the merge. That would let the author anticipate
potential failures and write more future-proof code (even if master is
subject to change before the connector release).

Of course, if a breaking change was introduced in master, such tests
will fail. But they should be considered as a preview of how the code
will behave against the current snapshot of the next flink version.

WDYT ?


Best

Etienne

Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler

2023-07-03 Thread Etienne Chauchot

Hi all,

The vote closes within 6 hours and, as of now, there has been no vote. This 
is a very short FLIP that takes a few minutes to read.


Please cast your vote so that development can start.

Thanks.

Best

Etienne

On 29/06/2023 15:51, Etienne Chauchot wrote:


Hi all,

Thanks for all the feedback about the FLIP-322: Cooldown period for 
adaptive scheduler [1].


This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 3rd 14:00 GMT) unless there is an objection or
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne


Re: [DISCUSSION] test connectors against Flink master in PRs

2023-06-29 Thread Etienne Chauchot

Hi Martijn,

Thanks for your feedback. It makes total sense to me.

I'll enable it for Cassandra.

Best

Etienne

On 29/06/2023 10:54, Martijn Visser wrote:


Hi Etienne,

I think it all depends on the actual maintainers of the connector to
make a decision on that: if their unreleased version of the connector
should be compatible with a new Flink version, then they should test
against it. For example, that's already done at Elasticsearch [1] and
JDBC [2].

Choosing which versions to support is a decision by the maintainers in
the community, and it always requires an action by a maintainer to
update the CI config to set the correct versions whenever a new Flink
version is released.

Best regards,

Martijn

[1]https://github.com/apache/flink-connector-elasticsearch/blob/main/.github/workflows/push_pr.yml
[2]https://github.com/apache/flink-connector-jdbc/blob/main/.github/workflows/push_pr.yml

On Wed, Jun 28, 2023 at 6:09 PM Etienne Chauchot  wrote:

Hi all,

Connectors are external to flink. As such, they need to be tested
against stable (released) versions of Flink.

But I was wondering if it would make sense to test connectors in PRs
also against the latest flink master snapshot, to allow discovering failures
before merging the PRs, ** while the author is still available **,
rather than discovering them in nightly tests (that test against
the snapshot) after the merge. That would let the author anticipate
potential failures and write more future-proof code (even if master is
subject to change before the connector release).

Of course, if a breaking change was introduced in master, such tests
will fail. But they should be considered as a preview of how the code
will behave against the current snapshot of the next flink version.

WDYT ?


Best

Etienne

[VOTE] FLIP-322 Cooldown period for adaptive scheduler

2023-06-29 Thread Etienne Chauchot

Hi all,

Thanks for all the feedback about the FLIP-322: Cooldown period for 
adaptive scheduler [1].


This FLIP was discussed in [2].

I'd like to start a vote for it. The vote will be open for at least 72
hours (until July 3rd 14:00 GMT) unless there is an objection or
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler

[2] https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6

Best,

Etienne


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-29 Thread Etienne Chauchot
Thanks Chesnay for your feedback. I have updated the FLIP. I'll start a 
vote thread.


Best

Etienne

On 28/06/2023 11:49, Chesnay Schepler wrote:
> we should schedule a check that will rescale if 
min-parallelism-increase is met. Then, what is the use of 
the scaling-interval.max timeout in that context?


To force a rescale if min-parallelism-increase is not met (but we 
could still run above the current parallelism).


min-parallelism-increase is a trade-off between the cost of rescaling 
vs the performance benefit of the parallelism increase. Over time the 
balance tips more and more in favor of the parallelism increase, hence 
we should eventually rescale anyway even if the minimum isn't met, or 
at least give users the option to do so.


> I meant the opposite: not having only the cooldown but having only 
the stabilization time. I must have missed something, because what I 
wonder is: if every rescale entails a restart of the pipeline, and 
every restart entails passing through the Waiting for Resources state, then why 
introduce a cooldown when there is already a stable 
resource timeout at each rescale?


It is technically correct that the stable resource timeout can be used 
to limit the number of rescale operations per interval; however, during 
that time the job isn't running, in contrast to the cooldown.


Having both just gives you a lot more flexibility.
"I want at most 1 rescale operation per hour, and wait at most 1 
minute for resources to stabilize when a rescale happens".

You can't express this with only one of the options.
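To make that interplay concrete, here is a minimal Python sketch of how the two knobs compose (the function and parameter names are hypothetical, not Flink configuration keys): the cooldown only delays the *start* of the next rescale while the job keeps running, and the stabilization window bounds how long resources are given to settle once a rescale begins.

```python
def next_allowed_rescale(last_rescale_ts, cooldown_s):
    # Earliest instant another rescale may start; the job keeps RUNNING until then.
    return last_rescale_ts + cooldown_s

def rescale_window(now, last_rescale_ts, cooldown_s, stabilization_s):
    # Once a rescale is permitted, wait up to `stabilization_s` for resources
    # to settle; only during this (short) window is the job not running.
    start = max(now, next_allowed_rescale(last_rescale_ts, cooldown_s))
    return start, start + stabilization_s

# "at most 1 rescale operation per hour, wait at most 1 minute to stabilize":
start, deadline = rescale_window(now=600, last_rescale_ts=0,
                                 cooldown_s=3600, stabilization_s=60)
assert (start, deadline) == (3600, 3660)
```

With a single option you could cap either the rescale frequency or the downtime per rescale, but not both; the sketch shows they are independent parameters.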

On 20/06/2023 14:41, Etienne Chauchot wrote:

Hi Chesnay,

Thanks for your feedback. Comments inline

On 16/06/2023 17:24, Chesnay Schepler wrote:
1) Options specific to the adaptive scheduler should start with 
"jobmanager.adaptive-scheduler".



ok



2)
There isn't /really /a notion of a "scaling event". The scheduler is 
informed about new/lost slots and job failures, and reacts 
accordingly by maybe rescaling the job.
(sure, you can think of these as events, but you can think of 
practically everything as events)


There shouldn't be a queue for events. All the scheduler should have 
to know is that the next rescale check is scheduled for time T, 
which in practice boils down to a flag and a scheduled action that 
runs Executing#maybeRescale.



Makes total sense, it's very simple like this. Thanks for the 
clarification and the pointer. After the related FLIPs, I'll look at the code 
now.



With that in mind, we also have to look at how we keep this state 
around. Presumably it is scoped to the current state, such that the 
cooldown is reset if a job fails.
Maybe we should add a separate ExecutingWithCooldown state; not sure 
yet.



Yes, losing the cooldown state and resetting the cooldown upon failure is what I 
suggested in point 3 of the previous email. Not sure either about a new 
state; I'll figure it out after experimenting with the code. I'll 
update the FLIP then.





It would be good to clarify whether this FLIP only attempts to cover 
scale up operations, or also scale downs in case of slot losses.



When there is slot loss, most of the time it is due to a TM loss, so 
several slots should be lost at the same time but (hopefully) 
only once. There should not be many scale-downs in a row (though 
cascading failures can happen). I think we should just protect 
against scale-ups immediately following. For that, I think we 
could just keep the current behavior of transitioning to the Restarting 
state and then back to the Waiting for Resources state. This state will 
protect us against scale-ups immediately following a failure/restart.





We should also think about how it relates to the externalized 
declarative resource management. Should we always rescale 
immediately? Should we wait until the cooldown is over?



It relates to point 2, no? We should rescale immediately only if the 
last rescale was done more than scaling-interval.min ago; otherwise, 
schedule a rescale at last-rescale + scaling-interval.min.



Related to this, there's the min-parallelism-increase option, that 
if for example set to "2" restricts rescale operations to only occur 
if the parallelism increases by at least 2.



yes I saw that in the code



Ideally however there would be a max timeout for this.

As such we could maybe think about this a bit differently:
Add 2 new options instead of 1:
jobmanager.adaptive-scheduler.scaling-interval.min: The minimum time 
the scheduler will wait for the next effective rescale operations.
jobmanager.adaptive-scheduler.scaling-interval.max: The maximum time 
the scheduler will wait for the next effective rescale operations.



At point 2, we said that when slots change (requirements change or 
new slots available), if the last rescale check (call to maybeRescale) 
was done less than scaling-interval.min ago, we should schedule a 
check that will rescale if min-parallelism-increase is met. Then, 
what is the use o

[DISCUSSION] test connectors against Flink master in PRs

2023-06-28 Thread Etienne Chauchot

Hi all,

Connectors are external to flink. As such, they need to be tested 
against stable (released) versions of Flink.


But I was wondering if it would make sense to test connectors in PRs 
also against the latest flink master snapshot, to allow discovering failures 
before merging the PRs, ** while the author is still available **, 
rather than discovering them in nightly tests (that test against 
the snapshot) after the merge. That would let the author anticipate 
potential failures and write more future-proof code (even if master is 
subject to change before the connector release).


Of course, if a breaking change was introduced in master, such tests 
will fail. But they should be considered as a preview of how the code 
will behave against the current snapshot of the next flink version.


WDYT ?


Best

Etienne


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-20 Thread Etienne Chauchot

Hi Chesnay,

Thanks for your feedback. Comments inline

On 16/06/2023 17:24, Chesnay Schepler wrote:
1) Options specific to the adaptive scheduler should start with 
"jobmanager.adaptive-scheduler".



ok



2)
There isn't /really /a notion of a "scaling event". The scheduler is 
informed about new/lost slots and job failures, and reacts accordingly 
by maybe rescaling the job.
(sure, you can think of these as events, but you can think of 
practically everything as events)


There shouldn't be a queue for events. All the scheduler should have 
to know is that the next rescale check is scheduled for time T, which 
in practice boils down to a flag and a scheduled action that runs 
Executing#maybeRescale.



Makes total sense, it's very simple like this. Thanks for the clarification 
and the pointer. After the related FLIPs, I'll look at the code now.
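The "flag plus one scheduled action" idea discussed above can be sketched as follows (illustrative Python, not Flink code; names like on_new_slots and maybe_rescale are hypothetical stand-ins): any number of slot notifications collapse into a single pending check that later runs the rescale logic.

```python
class Executing:
    """Sketch of a scheduler state that keeps no event queue, only the
    time of the next pending rescale check."""
    def __init__(self):
        self.check_scheduled_at = None  # "a flag and a scheduled action"
        self.rescales = 0

    def on_new_slots(self, now, delay_s):
        # Collapse any number of slot notifications into one pending check.
        if self.check_scheduled_at is None:
            self.check_scheduled_at = now + delay_s

    def tick(self, now):
        # In the real scheduler this would be driven by a timer service.
        if self.check_scheduled_at is not None and now >= self.check_scheduled_at:
            self.check_scheduled_at = None
            self.maybe_rescale()

    def maybe_rescale(self):
        self.rescales += 1

e = Executing()
e.on_new_slots(now=0, delay_s=10)
e.on_new_slots(now=3, delay_s=10)   # no second check is queued
e.tick(now=5)
assert e.rescales == 0              # check not due yet
e.tick(now=10)
assert e.rescales == 1              # one check, despite two notifications
```

The point of the design is that repeated notifications cost nothing: the state to persist is a single timestamp, not a queue of events.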



With that in mind, we also have to look at how we keep this state 
around. Presumably it is scoped to the current state, such that the 
cooldown is reset if a job fails.

Maybe we should add a separate ExecutingWithCooldown state; not sure yet.



Yes, losing the cooldown state and resetting the cooldown upon failure is what I 
suggested in point 3 of the previous email. Not sure either about a new state; 
I'll figure it out after experimenting with the code. I'll update the 
FLIP then.





It would be good to clarify whether this FLIP only attempts to cover 
scale up operations, or also scale downs in case of slot losses.



When there is slot loss, most of the time it is due to a TM loss, so 
several slots should be lost at the same time but (hopefully) only 
once. There should not be many scale-downs in a row (though cascading 
failures can happen). I think we should just protect against 
scale-ups immediately following. For that, I think we could just keep 
the current behavior of transitioning to the Restarting state and then back 
to the Waiting for Resources state. This state will protect us against 
scale-ups immediately following a failure/restart.





We should also think about how it relates to the externalized 
declarative resource management. Should we always rescale immediately? 
Should we wait until the cooldown is over?



It relates to point 2, no? We should rescale immediately only if the last 
rescale was done more than scaling-interval.min ago; otherwise, schedule a 
rescale at last-rescale + scaling-interval.min.



Related to this, there's the min-parallelism-increase option, that if 
for example set to "2" restricts rescale operations to only occur if 
the parallelism increases by at least 2.



yes I saw that in the code



Ideally however there would be a max timeout for this.

As such we could maybe think about this a bit differently:
Add 2 new options instead of 1:
jobmanager.adaptive-scheduler.scaling-interval.min: The minimum time 
the scheduler will wait for the next effective rescale operations.
jobmanager.adaptive-scheduler.scaling-interval.max: The maximum time 
the scheduler will wait for the next effective rescale operations.



At point 2, we said that when slots change (requirements change or new 
slots available), if the last rescale check (call to maybeRescale) was done 
less than scaling-interval.min ago, we should schedule a check that will 
rescale if min-parallelism-increase is met. Then, what is the use of 
the scaling-interval.max timeout in that context?
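The interplay of the two proposed intervals with min-parallelism-increase could look like this (an illustrative Python sketch of the decision rule under discussion, not the actual scheduler code; the function and parameter names are hypothetical): the min interval always blocks rescaling, a large enough gain rescales as soon as the min interval has passed, and the max interval eventually forces even small gains through.

```python
def should_rescale(elapsed_s, parallelism_gain,
                   interval_min_s, interval_max_s, min_gain):
    # Decision sketch for one rescale check.
    if parallelism_gain <= 0:
        return False
    if elapsed_s < interval_min_s:
        return False                      # inside the cooldown: never rescale
    if parallelism_gain >= min_gain:
        return True                       # gain worth the cost of a restart
    return elapsed_s >= interval_max_s    # force small gains through eventually

assert not should_rescale(30, 5, 60, 3600, 2)    # cooldown still running
assert should_rescale(120, 2, 60, 3600, 2)       # gain meets the minimum
assert not should_rescale(120, 1, 60, 3600, 2)   # gain too small, max not reached
assert should_rescale(4000, 1, 60, 3600, 2)      # max elapsed: rescale anyway
```

This is one reading of Chesnay's "over time the balance tips in favor of the parallelism increase": the max interval is the point at which the trade-off flips.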





3) It sounds fine that we lose the cooldown state, because imo we want 
to reset the cooldown anyway on job failures (because a job failure 
inherently implies a potential rescaling).



exactly.




4) The stabilization time isn't really redundant and serves a 
different use-case. The idea behind is that if a users adds multiple 
TMs at once then we don't want to rescale immediately at the first 
received slot. Without the stabilization time the cooldown would 
actually cause bad behavior here, because not only would we rescale 
immediately upon receiving the minimum required slots to scale up, but 
we also wouldn't use the remaining slots just because the cooldown 
says so.



I meant the opposite: not having only the cooldown but having only the 
stabilization time. I must have missed something, because what I wonder 
is: if every rescale entails a restart of the pipeline, and every restart 
entails passing through the Waiting for Resources state, then why introduce a 
cooldown when there is already a stable resource timeout at each rescale?



Best

Etienne





On 16/06/2023 15:47, Etienne Chauchot wrote:

Hi Robert,

Thanks for your feedback. I don't know the scheduler part well enough 
yet and I'm taking this ticket as a learning workshop.


Regarding your comments:

1. Taking a look at the AdaptiveScheduler class, which takes all its 
configuration from the JobManagerOptions, and to be consistent 
with other parameter names, I'd suggest 
/jobmanager.scheduler-scaling-cooldown-period/


2. I thought scaling events existed already and the schedul

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-16 Thread Etienne Chauchot

Hi Robert,

Thanks for your feedback. I don't know the scheduler part well enough 
yet and I'm taking this ticket as a learning workshop.


Regarding your comments:

1. Taking a look at the AdaptiveScheduler class, which takes all its 
configuration from the JobManagerOptions, and to be consistent with 
other parameter names, I'd suggest 
/jobmanager.scheduler-scaling-cooldown-period/


2. I thought scaling events existed already and the scheduler received 
them as mentioned in FLIP-160 (cf "Whenever the scheduler is in the 
Executing state and receives new slots") or in FLIP-138 (cf "Whenever 
new slots are available the SlotPool notifies the Scheduler"). If it is 
not the case (it is the scheduler who asks for slots), then there is no 
need for storing scaling requests indeed.


=> I need a confirmation here

3. If we lose the JobManager, we lose both the AdaptiveScheduler state 
and the CoolDownTimer state. So, upon recovery, it would be as if there 
were no ongoing cooldown period. So, a first re-scale could happen right 
away and it would start a cooldown period. A second re-scale would have 
to wait for the end of this period.
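Point 3 can be illustrated with a small sketch (hypothetical Python, not Flink code): because the cooldown state lives only in the JobManager's scheduler, a recovered JobManager behaves exactly like a scheduler with no ongoing cooldown.

```python
class CooldownState:
    """Sketch: cooldown state scoped to the scheduler instance, so losing
    the JobManager implicitly resets it."""
    def __init__(self, cooldown_s):
        self.cooldown_s = cooldown_s
        self.last_rescale = None  # None = no ongoing cooldown

    def on_rescale(self, now):
        self.last_rescale = now

    def may_rescale(self, now):
        return self.last_rescale is None or now - self.last_rescale >= self.cooldown_s

s = CooldownState(cooldown_s=300)
assert s.may_rescale(now=0)        # fresh (or just-recovered) scheduler: no cooldown
s.on_rescale(now=0)
assert not s.may_rescale(now=100)  # a second rescale must wait
assert s.may_rescale(now=300)      # cooldown elapsed
```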


4. When a pipeline is re-scaled, it is restarted. Upon restart, the 
AdaptiveScheduler passes again through the "waiting for resources" state, as 
FLIP-160 suggests. If so, then it seems that the cooldown period is kind 
of redundant with the resource-stabilization-timeout. I guess it is not 
the case, otherwise the FLINK-21883 ticket would not have been created.


=> I need a confirmation here also.


Thanks for your views on point 2 and 4.


Best

Etienne

On 15/06/2023 13:35, Robert Metzger wrote:

Thanks for the FLIP.

Some comments:
1. Can you specify the full proposed configuration name? "
scaling-cooldown-period" is probably not the full config name?
2. Why is the concept of scaling events and a scaling queue needed? If I
remember correctly, the adaptive scheduler will just check how many
TaskManagers are available and then adjust the execution graph accordingly.
There's no need to store a number of scaling events. We just need to
determine the time to trigger an adjustment of the execution graph.
3. What's the behavior wrt JobManager failures (e.g. we lose the state
of the Adaptive Scheduler?). My proposal would be to just reset the
cooldown period, so after recovery of a JobManager, we have to wait at
least for the cooldown period until further scaling operations are done.
4. What's the relationship to the
"jobmanager.adaptive-scheduler.resource-stabilization-timeout"
configuration?

Thanks a lot for working on this!

Best,
Robert

On Wed, Jun 14, 2023 at 3:38 PM Etienne Chauchot
wrote:


Hi all,

@Yuxia, I updated the FLIP to include the aggregation of the stacked
operations that we discussed below. PTAL.

Best

Etienne


On 13/06/2023 16:31, Etienne Chauchot wrote:

Hi Yuxia,

Thanks for your feedback. The number of potentially stacked operations
depends on the configured length of the cooldown period.



The proposition in the FLIP is to add a minimum delay between 2 scaling
operations. But, indeed, an optimization could be to still stack the
operations (that arrive during a cooldown period) but maybe not take
only the last operation but rather aggregate them in order to end up
with a single aggregated operation when the cooldown period ends. For
example, let's say 3 taskManagers come up and 1 comes down during the
cooldown period, we could generate a single operation of scale up +2
when the period ends.
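The aggregation idea can be sketched as follows (illustrative Python; the function name is hypothetical, not Flink code): instead of replaying each stacked operation, net out all slot deltas observed during one cooldown window into a single rescale operation.

```python
def aggregate_slot_deltas(deltas):
    # Collapse all slot changes seen during one cooldown window into a
    # single net operation instead of executing them one by one.
    net = sum(deltas)
    if net > 0:
        return ("scale-up", net)
    if net < 0:
        return ("scale-down", -net)
    return ("no-op", 0)

# 3 TaskManagers come up (+1 slot each) and 1 goes down during the cooldown:
assert aggregate_slot_deltas([+1, +1, +1, -1]) == ("scale-up", 2)
```

This also answers the "stacked events take a long time" concern: however many events arrive, only one aggregated operation executes when the period ends.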

As a side note regarding your comment on "it'll take a long time to
finish all", please keep in mind that the reactive mode (at least for
now) is only available for streaming pipelines, which are in essence
infinite processing.

Another side note: when you mention "every taskManagers connecting",
if you are referring to the start of the pipeline, please keep in mind
that the adaptive scheduler has a "waiting for resources" timeout
period before starting the pipeline, during which all TaskManagers connect
and the parallelism is decided.

Best

Etienne

On 13/06/2023 03:58, yuxia wrote:

Hi, Etienne. Thanks for driving it. I have one question about the
mechanism of the cooldown timeout.

 From the Proposed Changes part, if a scaling event is received and
it falls during the cooldown period, it'll be stacked to be executed
after the period ends. Also, from the description of FLINK-21883[1],
the cooldown timeout is meant to avoid rescaling the job very frequently,
because TaskManagers are not all connecting at the same time.

So, is it possible that every TaskManager connecting will produce a
scaling event, and that many scale-up events will be stacked, which
causes it to take a long time to finish them all? Can we just take the
last event?

[1]:https://issues.apache.org/jira/browse/FLINK-21883

Best regards, Yuxia

----- Original Message ----- From: "Etienne Chauchot"
To:

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-14 Thread Etienne Chauchot

Hi all,

@Yuxia, I updated the FLIP to include the aggregation of the stacked 
operations that we discussed below. PTAL.


Best

Etienne


On 13/06/2023 16:31, Etienne Chauchot wrote:

Hi Yuxia,

Thanks for your feedback. The number of potentially stacked operations 
depends on the configured length of the cooldown period.




The proposition in the FLIP is to add a minimum delay between 2 scaling
operations. But, indeed, an optimization could be to still stack the
operations (that arrive during a cooldown period) but maybe not take
only the last operation but rather aggregate them in order to end up
with a single aggregated operation when the cooldown period ends. For
example, let's say 3 taskManagers come up and 1 comes down during the
cooldown period, we could generate a single operation of scale up +2
when the period ends.

As a side note regarding your comment on "it'll take a long time to 
finish all", please keep in mind that the reactive mode (at least for 
now) is only available for streaming pipelines, which are in essence 
infinite processing.


Another side note: when you mention "every taskManagers connecting", 
if you are referring to the start of the pipeline, please keep in mind 
that the adaptive scheduler has a "waiting for resources" timeout 
period before starting the pipeline, during which all TaskManagers connect 
and the parallelism is decided.


Best

Etienne

On 13/06/2023 03:58, yuxia wrote:

Hi, Etienne. Thanks for driving it. I have one question about the
mechanism of the cooldown timeout.

From the Proposed Changes part, if a scaling event is received and
it falls during the cooldown period, it'll be stacked to be executed
after the period ends. Also, from the description of FLINK-21883[1],
the cooldown timeout is meant to avoid rescaling the job very frequently,
because TaskManagers are not all connecting at the same time.

So, is it possible that every TaskManager connecting will produce a
scaling event, and that many scale-up events will be stacked, which
causes it to take a long time to finish them all? Can we just take the
last event?

[1]: https://issues.apache.org/jira/browse/FLINK-21883

Best regards, Yuxia

----- Original Message ----- From: "Etienne Chauchot" To:
"dev" , "Robert Metzger" 
Sent: Monday, June 12, 2023, 11:34:25 PM Subject: [DISCUSS] FLIP-322 Cooldown
period for adaptive scheduler

Hi,

I’d like to start a discussion about FLIP-322 [1] which introduces a 
cooldown period for the adaptive scheduler.


I'd like to get your feedback especially @Robert as you opened the 
related ticket and worked on the reactive mode a lot.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler





Best


Etienne

Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-13 Thread Etienne Chauchot

Hi Yuxia,

Thanks for your feedback. The number of potentially stacked operations 
depends on the configured length of the cooldown period.




The proposition in the FLIP is to add a minimum delay between 2 scaling
operations. But, indeed, an optimization could be to still stack the
operations (that arrive during a cooldown period) but maybe not take
only the last operation but rather aggregate them in order to end up
with a single aggregated operation when the cooldown period ends. For
example, let's say 3 taskManagers come up and 1 comes down during the
cooldown period, we could generate a single operation of scale up +2
when the period ends.

As a side note regarding your comment on "it'll take a long time to 
finish all", please keep in mind that the reactive mode (at least for 
now) is only available for streaming pipelines, which are in essence 
infinite processing.


Another side note: when you mention "every taskManagers connecting", if 
you are referring to the start of the pipeline, please keep in mind that 
the adaptive scheduler has a "waiting for resources" timeout period 
before starting the pipeline, during which all TaskManagers connect and the 
parallelism is decided.


Best

Etienne

On 13/06/2023 03:58, yuxia wrote:

Hi, Etienne. Thanks for driving it. I have one question about the
mechanism of the cooldown timeout.

From the Proposed Changes part, if a scaling event is received and
it falls during the cooldown period, it'll be stacked to be executed
after the period ends. Also, from the description of FLINK-21883[1],
the cooldown timeout is meant to avoid rescaling the job very frequently,
because TaskManagers are not all connecting at the same time.

So, is it possible that every TaskManager connecting will produce a
scaling event, and that many scale-up events will be stacked, which
causes it to take a long time to finish them all? Can we just take the
last event?

[1]: https://issues.apache.org/jira/browse/FLINK-21883

Best regards, Yuxia

----- Original Message ----- From: "Etienne Chauchot" To:
"dev" , "Robert Metzger" 
Sent: Monday, June 12, 2023, 11:34:25 PM Subject: [DISCUSS] FLIP-322 Cooldown
period for adaptive scheduler

Hi,

I’d like to start a discussion about FLIP-322 [1] which introduces a 
cooldown period for the adaptive scheduler.


I'd like to get your feedback especially @Robert as you opened the 
related ticket and worked on the reactive mode a lot.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler





Best


Etienne


[DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-12 Thread Etienne Chauchot

Hi,

I’d like to start a discussion about FLIP-322 [1] which introduces a 
cooldown period for the adaptive scheduler.


I'd like to get your feedback especially @Robert as you opened the 
related ticket and worked on the reactive mode a lot.


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler


Best

Etienne



[DISCUSS] Make Preconditions/VisibleForTesting public

2023-06-05 Thread Etienne Chauchot

Hi all,

As part of fixing the architecture-tests-production Connector rule in 
this PR [1], it appeared that (external) connectors depending on the 
non-public Preconditions/VisibleForTesting classes was a violation to 
the "connectors should not depend on non-public classes that are not in 
the connectors packages". I'd like to discuss making these tools public.
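
For readers unfamiliar with these utilities: connectors use them as guard clauses on constructor and builder input. A stand-in sketch, illustrative only — the real class is Flink's Preconditions, whose behavior this only approximates:

```java
/** Toy stand-in for the kind of guard-clause utility under discussion.
 *  Flink's own Preconditions class offers equivalent static checks;
 *  this sketch only shows why connectors reach for such a helper. */
final class Checks {
    private Checks() {}

    /** Fails fast with a descriptive message instead of a late, obscure NPE. */
    static <T> T checkNotNull(T reference, String message) {
        if (reference == null) {
            throw new NullPointerException(message);
        }
        return reference;
    }

    /** Validates a caller-supplied argument, e.g. a port range or batch size. */
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }
}
```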


[1] https://github.com/apache/flink/pull/22667


WDYT?

Best

Etienne



[jira] [Created] (FLINK-32222) Cassandra Source uses DataInputDeserializer and DataOutputSerializer non public apis

2023-05-31 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-32222:


 Summary: Cassandra Source uses DataInputDeserializer and 
DataOutputSerializer non public apis
 Key: FLINK-32222
 URL: https://issues.apache.org/jira/browse/FLINK-32222
 Project: Flink
  Issue Type: Technical Debt
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


In class CassandraSplitSerializer, these non-public API usages violate

ConnectorRules#CONNECTOR_CLASSES_ONLY_DEPEND_ON_PUBLIC_API
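
One possible fix, sketched here under assumptions (the field layout below is a made-up token-range pair, not the actual CassandraSplit), is to serialize the split with public JDK streams instead of the internal DataInputDeserializer/DataOutputSerializer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hedged sketch: split (de)serialization using only public JDK streams,
 *  avoiding any dependency on Flink-internal serializer classes.
 *  The two-long layout is illustrative, not the real split format. */
final class SplitBytes {
    static byte[] serialize(long ringRangeStart, long ringRangeEnd) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(ringRangeStart);
            out.writeLong(ringRangeEnd);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static long[] deserialize(byte[] serialized) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(serialized))) {
            return new long[] {in.readLong(), in.readLong()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```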



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-cassandra 3.1.0, release candidate #2

2023-05-09 Thread Etienne Chauchot

Hi everyone,

+1 (non-binding)

I checked:

- release notes

- tag

- tested the prod artifact with https://github.com/echauchot/flink-samples

Best

Etienne

On 05/05/2023 at 11:39, Danny Cranmer wrote:

Hi everyone,
Please review and vote on the release candidate #2 for the version 3.1.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.1.0-rc2 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353030
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.1.0-rc2
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1631
[5] https://github.com/apache/flink-connector-cassandra/tree/v3.1.0-rc2
[6] https://github.com/apache/flink-web/pull/642



[jira] [Created] (FLINK-32014) Cassandra source documentation is missing

2023-05-05 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-32014:


 Summary: Cassandra source documentation is missing
 Key: FLINK-32014
 URL: https://issues.apache.org/jira/browse/FLINK-32014
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Cassandra, Documentation
Reporter: Etienne Chauchot






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-cassandra v3.1.0, release candidate #1

2023-05-04 Thread Etienne Chauchot

@Danny, the fix is merged and the ticket is closed.

You can open a RC2 whenever you want.

Best

Etienne

On 04/05/2023 at 06:17, Leonard Xu wrote:

-1 (binding)  as the blocker issue FLINK-31927[1] found

- Reviewed the PR [2] and left one comment that the linked issue is incorrect.


Best,
Leonard
[1] https://issues.apache.org/jira/browse/FLINK-31927
[2] https://github.com/apache/flink-connector-cassandra/pull/13




On May 3, 2023 at 5:26 PM, Etienne Chauchot  wrote:

Hi all,

@Danny, I just submitted the fix PR to unblock the release: 
https://github.com/apache/flink-connector-cassandra/pull/13

Best

Etienne

On 02/05/2023 at 14:52, Danny Cranmer wrote:

Thanks for reporting this issue Etienne. Why was it not detected by the
unit/integration tests? Can we cover this on the CI?

This VOTE is closed, I will open RC2 once the issue has been resolved. In
the meantime we could consider reopening 3.0.1 [1] for Flink 1.17 support.
I will reopen if there is a demand for it.

Thanks,
Danny


[1] https://lists.apache.org/thread/30c3yhd561o57x0prt7jqt055r4xd6lf

On Mon, Apr 24, 2023 at 7:28 PM Etienne Chauchot 
wrote:


Hi,

Thanks Danny for driving this new release. It now contains the new
source, thanks.

I'm off but I wanted to test this release still. I made a very quick job
(1) to read from a Cassandra cluster with the new source.

I found an issue: the source raises a "java.lang.NoClassDefFoundError:
com/codahale/metrics/Gauge" when trying to connect to the cluster on
Flink 1.16.0.

As I'm on vacation right now, I don't have time to solve this now but
I'll do within a week.

vote: -1 (non-binding)

here is the blocker ticket:
https://issues.apache.org/jira/browse/FLINK-31927

[1]

https://github.com/echauchot/flink-samples/blob/edf4ad1624b2ad02af380efa6b5caa26bb7a274a/src/main/java/org/example/CassandraPojoSource.java

Best

Etienne

On 19/04/2023 at 21:07, Martijn Visser wrote:

+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Fri, Apr 14, 2023 at 2:42 PM Elphas Toringepi 
wrote:


Thanks Danny

+1 (non-binding)

* Checked release notes
* Validated signature and checksum
* Apache source builds with JDK 11
* Approved website PR

Kind regards,
Elphas


On Fri, Apr 14, 2023 at 1:14 PM Danny Cranmer 
wrote:


Hi everyone,
Please review and vote on the release candidate #1 for the version

3.1.0,

as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This version supports both Flink 1.16.x and Flink 1.17.x

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2],
which are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.1.0-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]



https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353030

[2]



https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.1.0-rc1/

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4]

https://repository.apache.org/content/repositories/orgapacheflink-1627

[5]


https://github.com/apache/flink-connector-cassandra/releases/tag/v3.1.0-rc1

[6] https://github.com/apache/flink-web/pull/642



Re: [VOTE] Release flink-connector-cassandra v3.1.0, release candidate #1

2023-05-03 Thread Etienne Chauchot

Hi all,

@Danny, I just submitted the fix PR to unblock the release: 
https://github.com/apache/flink-connector-cassandra/pull/13


Best

Etienne

On 02/05/2023 at 14:52, Danny Cranmer wrote:

Thanks for reporting this issue Etienne. Why was it not detected by the
unit/integration tests? Can we cover this on the CI?

This VOTE is closed, I will open RC2 once the issue has been resolved. In
the meantime we could consider reopening 3.0.1 [1] for Flink 1.17 support.
I will reopen if there is a demand for it.

Thanks,
Danny


[1] https://lists.apache.org/thread/30c3yhd561o57x0prt7jqt055r4xd6lf

On Mon, Apr 24, 2023 at 7:28 PM Etienne Chauchot 
wrote:


Hi,

Thanks Danny for driving this new release. It now contains the new
source, thanks.

I'm off but I wanted to test this release still. I made a very quick job
(1) to read from a Cassandra cluster with the new source.

I found an issue: the source raises a "java.lang.NoClassDefFoundError:
com/codahale/metrics/Gauge" when trying to connect to the cluster on
Flink 1.16.0.

As I'm on vacation right now, I don't have time to solve this now but
I'll do within a week.

vote: -1 (non-binding)

here is the blocker ticket:
https://issues.apache.org/jira/browse/FLINK-31927

[1]

https://github.com/echauchot/flink-samples/blob/edf4ad1624b2ad02af380efa6b5caa26bb7a274a/src/main/java/org/example/CassandraPojoSource.java

Best

Etienne

On 19/04/2023 at 21:07, Martijn Visser wrote:

+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Fri, Apr 14, 2023 at 2:42 PM Elphas Toringepi 
wrote:


Thanks Danny

+1 (non-binding)

* Checked release notes
* Validated signature and checksum
* Apache source builds with JDK 11
* Approved website PR

Kind regards,
Elphas


On Fri, Apr 14, 2023 at 1:14 PM Danny Cranmer 
wrote:


Hi everyone,
Please review and vote on the release candidate #1 for the version

3.1.0,

as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This version supports both Flink 1.16.x and Flink 1.17.x

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2],
which are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.1.0-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]



https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353030

[2]



https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.1.0-rc1/

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4]

https://repository.apache.org/content/repositories/orgapacheflink-1627

[5]


https://github.com/apache/flink-connector-cassandra/releases/tag/v3.1.0-rc1

[6] https://github.com/apache/flink-web/pull/642



Re: [VOTE] Release flink-connector-cassandra v3.1.0, release candidate #1

2023-04-24 Thread Etienne Chauchot

Hi,

Thanks Danny for driving this new release. It now contains the new 
source, thanks.


I'm off but I wanted to test this release still. I made a very quick job 
(1) to read from a Cassandra cluster with the new source.


I found an issue: the source raises a "java.lang.NoClassDefFoundError: 
com/codahale/metrics/Gauge" when trying to connect to the cluster on 
Flink 1.16.0.


As I'm on vacation right now, I don't have time to solve this now but 
I'll do within a week.


vote: -1 (non-binding)

here is the blocker ticket: 
https://issues.apache.org/jira/browse/FLINK-31927


[1] 
https://github.com/echauchot/flink-samples/blob/edf4ad1624b2ad02af380efa6b5caa26bb7a274a/src/main/java/org/example/CassandraPojoSource.java


Best

Etienne

On 19/04/2023 at 21:07, Martijn Visser wrote:

+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Fri, Apr 14, 2023 at 2:42 PM Elphas Toringepi 
wrote:


Thanks Danny

+1 (non-binding)

* Checked release notes
* Validated signature and checksum
* Apache source builds with JDK 11
* Approved website PR

Kind regards,
Elphas


On Fri, Apr 14, 2023 at 1:14 PM Danny Cranmer 
wrote:


Hi everyone,
Please review and vote on the release candidate #1 for the version 3.1.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This version supports both Flink 1.16.x and Flink 1.17.x

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2],
which are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.1.0-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]



https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353030

[2]



https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.1.0-rc1/

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4]

https://repository.apache.org/content/repositories/orgapacheflink-1627

[5]


https://github.com/apache/flink-connector-cassandra/releases/tag/v3.1.0-rc1

[6] https://github.com/apache/flink-web/pull/642



[jira] [Created] (FLINK-31927) Cassandra source raises an exception on Flink 1.16.0 on a real Cassandra cluster

2023-04-24 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-31927:


 Summary: Cassandra source raises an exception on Flink 1.16.0 on a 
real Cassandra cluster
 Key: FLINK-31927
 URL: https://issues.apache.org/jira/browse/FLINK-31927
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Cassandra
Affects Versions: 1.16.0, cassandra-3.1.0
Reporter: Etienne Chauchot


CassandraSplitEnumerator#prepareSplits() raises 
java.lang.NoClassDefFoundError: com/codahale/metrics/Gauge when calling 
cluster.getMetadata(), leading to an NPE in the CassandraSplitEnumerator#start() async 
callback. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Off for a week

2023-04-21 Thread Etienne Chauchot

Hi all,

Just to let you know, I'll be off and unresponsive for a week starting 
tonight.


Best

Etienne



[jira] [Created] (FLINK-31870) Cassandra input/output formats documentation is missing

2023-04-20 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-31870:


 Summary:  Cassandra input/output formats documentation is missing
 Key: FLINK-31870
 URL: https://issues.apache.org/jira/browse/FLINK-31870
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation
Reporter: Etienne Chauchot






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-cassandra v3.0.1, release candidate #1

2023-04-13 Thread Etienne Chauchot

Hi,

Thanks for driving this. Why is this release not including the latest 
changes, mainly the new source connector merged on March 22nd?


Unless I'm missing something here, I would vote -1 (not binding) on the 
current state.


Best

Etienne

On 13/04/2023 at 13:31, Danny Cranmer wrote:

Hi everyone,
Please review and vote on the release candidate #1 for the version 3.0.1,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This version supports both Flink 1.16.x and Flink 1.17.x

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.0.1-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352911
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.0.1-rc1
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1611/
[5]
https://github.com/apache/flink-connector-cassandra/releases/tag/v3.0.1-rc1
[6] https://github.com/apache/flink-web/pull/635



Re: [blog article] Howto create a batch source with the new Source framework

2023-04-03 Thread Etienne Chauchot

Hi all,

I just published the last article of the series about creating a batch 
source with the new Source framework. This one is about testing the source.


Can you tell me if you think it would make sense to publish both 
articles to the official Flink blog as they could serve as a detailed 
documentation ?


Thanks

Etienne

[1] 
https://echauchot.blogspot.com/2023/04/flink-howto-test-batch-source-with-new.html


On 03/04/2023 at 03:37, yuxia wrote:

Thanks Etienne for detail explanation.

Best regards,
Yuxia

- Original Message -
From: "Etienne Chauchot" 
To: "dev" 
Sent: Friday, March 31, 2023 at 9:08:36 PM
Subject: Re: [blog article] Howto create a batch source with the new Source framework

Hi Yuxia,

Thanks for your feedback.

Comments inline


On 31/03/2023 at 04:21, yuxia wrote:

Hi, Etienne.

Thanks for Etienne for sharing this article. I really like it and learn much 
from it.

=> Glad it was useful, that was precisely the point :)

I'd like to raise some questions about implementing batch source. Welcome devs 
to share insights about them.

The first question is how to generate splits:
As the article mentioned:
"Whenever possible, it is preferable to generate the splits lazily, meaning that 
each time a reader asks the enumerator for a split, the enumerator generates one on 
demand and assigns it to the reader."
I think it may not hold for all cases. In some cases, generating a split may be time 
consuming, so it may be better to generate a batch of splits on demand to 
amortize the expense.
But that raises another question: how many splits should be generated in a 
batch? Too many may well cause OOM, too few may not make good use of batch 
split generation.
To solve it, I think maybe we can provide a configuration letting users 
configure how many splits should be generated in a batch.
What's your opinion on it? Have you ever encountered this problem in your 
implementation?

=> I agree, lazy splits is not the only way. I've mentioned in the
article that batch generation is another in case of high split
generation cost, thanks for the suggestion. During the implementation I
didn't have this problem as generating a split was not costly, the only
costly processing was the splits preparation. It was run asynchronously
and only once, then each split generation was straightforward. That
being said, during development, I had OOM risks in the size and number
of splits. For the number of splits, lazy generation solved it as no
list of splits was stored in the enumerator apart from the splits to
reassign. For the size of splits I used a user-provided max split memory
size similar to what you suggest here. In the batch generation case, we
could allow the user to set a max memory size for the batch : number of
splits in batch looks more dangerous to me if we don't know the size of
a split but if we are talking about storing the split objects and not
their content then that is ok. IMO, memory size is more clear for the
user as it is linked to the memory of a task manager.
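
The lazy strategy discussed above can be sketched as follows. The names are illustrative and do not match Flink's actual SplitEnumerator interface; the point is that no full split list is ever materialized, and splits returned by failed readers are re-served before new ones are generated:

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Optional;
import java.util.Queue;

/** Illustrative sketch of lazy split generation (names are made up):
 *  splits are produced one at a time when a reader asks, so the
 *  enumerator never holds the full split list in memory. */
final class LazySplitEnumerator<S> {
    private final Iterator<S> splitGenerator; // produces splits on demand
    private final Queue<S> splitsToReassign = new ArrayDeque<>(); // from failed readers

    LazySplitEnumerator(Iterator<S> splitGenerator) {
        this.splitGenerator = splitGenerator;
    }

    /** Reader requests work: prefer returned splits, else generate one lazily. */
    Optional<S> handleSplitRequest() {
        if (!splitsToReassign.isEmpty()) {
            return Optional.of(splitsToReassign.poll());
        }
        return splitGenerator.hasNext()
                ? Optional.of(splitGenerator.next())
                : Optional.empty();
    }

    /** Splits checkpointed but unfinished by a failed reader come back here. */
    void addSplitsBack(Iterable<S> splits) {
        splits.forEach(splitsToReassign::add);
    }
}
```

A batched variant would simply have the generator produce a list of splits per call, capped by the memory budget discussed above rather than by a raw split count.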



The second question is how to assign splits:
What's your split assign stratgy?

=> the naïve one: a reader asks for a split, the enumerator receives the
request, generates a split and assigns it to the demanding reader.

In flink, we provide `LocalityAwareSplitAssigner` to make use of locality to 
assign split to reader.

=> well, it is of interest only when the data-backend cluster nodes can be
co-located with Flink task managers, right? That would rarely be the
case as clusters seem to be separated most of the time to use the
maximum available CPU (at least for CPU-bound workloads), no?

But it may not be perfect for the case of failover

=> Agree: it would require costly shuffle to keep the co-location after
restoration and this cost would not be balanced by the gain raised by
co-locality (mainly avoiding network use) I think.

for which we intend to introduce another split assign strategy[1].
But I do think it should be configurable to enable advanced users to decide
which assignment strategy to use.

=> when you say the "user" I guess you mean the user of the source, not the user
of the dev framework (the implementor of the source). I think that it should
be configurable indeed, as the user is the one knowing the distribution of
the partitions of the backend data.
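
For illustration, a locality-aware assignment of the kind mentioned could be sketched like this (a toy stand-in, not Flink's LocalityAwareSplitAssigner): splits are bucketed by preferred host and a reader drains its own bucket before stealing remote splits.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;

/** Toy locality-aware assigner (illustrative only): a requesting reader
 *  first gets splits whose preferred host matches its own, then falls
 *  back to any remaining split so no reader sits idle. */
final class LocalityAssigner {
    private final Map<String, Queue<String>> splitsByHost = new HashMap<>();

    void addSplit(String splitId, String preferredHost) {
        splitsByHost.computeIfAbsent(preferredHost, h -> new ArrayDeque<>()).add(splitId);
    }

    Optional<String> getNext(String readerHost) {
        Queue<String> local = splitsByHost.get(readerHost);
        if (local != null && !local.isEmpty()) {
            return Optional.of(local.poll());
        }
        // Fall back to stealing a remote split to keep the reader busy.
        return splitsByHost.values().stream()
                .filter(q -> !q.isEmpty())
                .findFirst()
                .map(Queue::poll);
    }
}
```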

Best

Etienne



Welcome other devs to share opinion.

[1]: https://issues.apache.org/jira/browse/FLINK-31065





Also as for split assigner .


Best regards,
Yuxia

- Original Message -
From: "Etienne Chauchot" 
To: "dev" 
Cc: "Chesnay Schepler" 
Sent: Thursday, March 30, 2023 at 10:36:39 PM
Subject: [blog article] Howto create a batch source with the new Source framework

Hi all,

After creating the Cassandra source connector (thanks Chesnay for the
review!), I wrote a blog article about how to create a batch source with
the new Source framework [1]. It gives field feedback on how to
implement the different components.

Re: [blog article] Howto create a batch source with the new Source framework

2023-03-31 Thread Etienne Chauchot

Hi Yuxia,

Thanks for your feedback.

Comments inline


On 31/03/2023 at 04:21, yuxia wrote:

Hi, Etienne.

Thanks for Etienne for sharing this article. I really like it and learn much 
from it.

=> Glad it was useful, that was precisely the point :)


I'd like to raise some questions about implementing batch source. Welcome devs 
to share insights about them.

The first question is how to generate splits:
As the article mentioned:
"Whenever possible, it is preferable to generate the splits lazily, meaning that 
each time a reader asks the enumerator for a split, the enumerator generates one on 
demand and assigns it to the reader."
I think it may not hold for all cases. In some cases, generating a split may be time 
consuming, so it may be better to generate a batch of splits on demand to 
amortize the expense.
But that raises another question: how many splits should be generated in a 
batch? Too many may well cause OOM, too few may not make good use of batch 
split generation.
To solve it, I think maybe we can provide a configuration letting users 
configure how many splits should be generated in a batch.
What's your opinion on it? Have you ever encountered this problem in your 
implementation?


=> I agree, lazy splits is not the only way. I've mentioned in the 
article that batch generation is another in case of high split 
generation cost, thanks for the suggestion. During the implementation I 
didn't have this problem as generating a split was not costly, the only 
costly processing was the splits preparation. It was run asynchronously 
and only once, then each split generation was straightforward. That 
being said, during development, I had OOM risks in the size and number 
of splits. For the number of splits, lazy generation solved it as no 
list of splits was stored in the enumerator apart from the splits to 
reassign. For the size of splits I used a user-provided max split memory 
size similar to what you suggest here. In the batch generation case, we 
could allow the user to set a max memory size for the batch : number of 
splits in batch looks more dangerous to me if we don't know the size of 
a split but if we are talking about storing the split objects and not 
their content then that is ok. IMO, memory size is more clear for the 
user as it is linked to the memory of a task manager.





The second question is how to assign splits:
What's your split assignment strategy?
=> the naïve one: a reader asks for a split, the enumerator receives the 
request, generates a split and assigns it to the demanding reader.

In flink, we provide `LocalityAwareSplitAssigner` to make use of locality to 
assign split to reader.
=> well, it is of interest only when the data-backend cluster nodes can be 
co-located with Flink task managers, right? That would rarely be the 
case as clusters seem to be separated most of the time to use the 
maximum available CPU (at least for CPU-bound workloads), no?

But it may not be perfect for the case of failover
=> Agree: it would require costly shuffle to keep the co-location after 
restoration and this cost would not be balanced by the gain raised by 
co-locality (mainly avoiding network use) I think.

for which we intend to introduce another split assign strategy[1].
But I do think it should be configurable to enable advanced users to decide 
which assignment strategy to use.


=> when you say the "user" I guess you mean the user of the source, not the user 
of the dev framework (the implementor of the source). I think that it should 
be configurable indeed, as the user is the one knowing the distribution of 
the partitions of the backend data.


Best

Etienne




Welcome other devs to share opinion.

[1]: https://issues.apache.org/jira/browse/FLINK-31065





Also as for split assigner .


Best regards,
Yuxia

- Original Message -
From: "Etienne Chauchot" 
To: "dev" 
Cc: "Chesnay Schepler" 
Sent: Thursday, March 30, 2023 at 10:36:39 PM
Subject: [blog article] Howto create a batch source with the new Source framework

Hi all,

After creating the Cassandra source connector (thanks Chesnay for the
review!), I wrote a blog article about how to create a batch source with
the new Source framework [1]. It gives field feedback on how to
implement the different components.

I felt it could be useful to people interested in contributing or
migrating connectors.

=> Can you give me your opinion ?

=> I think it could be useful to post the article to Flink official blog
also if you agree.

=> Same remark on my previous article [2]: what about publishing it to
Flink official blog ?


[1]https://echauchot.blogspot.com/2023/03/flink-howto-create-batch-source-with.html

[2]https://echauchot.blogspot.com/2022/11/flink-howto-migrate-real-life-batch.html


Best

Etienne


[blog article] Howto create a batch source with the new Source framework

2023-03-30 Thread Etienne Chauchot

Hi all,

After creating the Cassandra source connector (thanks Chesnay for the 
review!), I wrote a blog article about how to create a batch source with 
the new Source framework [1]. It gives field feedback on how to 
implement the different components.


I felt it could be useful to people interested in contributing or 
migrating connectors.


=> Can you give me your opinion ?

=> I think it could be useful to post the article to Flink official blog 
also if you agree.


=> Same remark on my previous article [2]: what about publishing it to 
Flink official blog ?



[1]https://echauchot.blogspot.com/2023/03/flink-howto-create-batch-source-with.html

[2]https://echauchot.blogspot.com/2022/11/flink-howto-migrate-real-life-batch.html 



Best

Etienne



Re: [ANNOUNCE] Apache Flink 1.17.0 released

2023-03-24 Thread Etienne Chauchot

Congrats to all the people involved!

Best

Etienne

On 23/03/2023 at 10:19, Leonard Xu wrote:

The Apache Flink community is very happy to announce the release of Apache 
Flink 1.17.0, which is the first release for the Apache Flink 1.17 series.

Apache Flink® is an open-source unified stream and batch data processing 
framework for distributed, high-performing, always-available, and accurate data 
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements for 
this release:
https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585

We would like to thank all contributors of the Apache Flink community who made 
this release possible!

Best regards,
Qingsheng, Martijn, Matthias and Leonard


Re: [ANNOUNCE[ flink-connector-parent 1.0.0 released

2023-03-22 Thread Etienne Chauchot

Hi Chesnay,

Thanks for the release and for upgrading the Cassandra connector parent.

Best

Etienne

On 22/03/2023 at 10:21, Chesnay Schepler wrote:

The release of flink-connector-parent 1.0.0 is complete.

Update your connectors everyone. Let's get rid of the strange "zentol" 
thing.




Re: [VOTE] Release 1.17.0, release candidate #3

2023-03-22 Thread Etienne Chauchot

Hi Qingsheng again,

I just did an in-depth review of the release notes as promised. I have 
not opened the jira tickets though.


PTAL

Best

Etienne

On 22/03/2023 at 08:43, Qingsheng Ren wrote:

Hi Jakub,

Thanks for reviewing the release note and welcome to the Flink community!

We will change the fix version of all open issues to 1.17.1 in batch before
the final release, then these issues will disappear from the release note
auto-generated by JIRA.

Please feel free to raise any doubts or discussions in the mailing list :-)

Best,
Qingsheng

On Wed, Mar 22, 2023 at 3:24 PM Jakub Partyka 
wrote:


Hi,
  I was looking through the release notes [1], and found on the list some
tasks that have status "Open". For example: FLINK-28538 [2], FLINK-19722
[3], FLINK-18873 [4]. Is this situation expected, or should these tasks be
removed from the release notes?

This is my first message on this mailing list, so forgive me, if this issue
is trivial or redundant.

Best Regards,
Jakub

[1]

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585
[2] https://issues.apache.org/jira/browse/FLINK-28538
[3] https://issues.apache.org/jira/browse/FLINK-19722
[4] https://issues.apache.org/jira/browse/FLINK-18873


On 2023/03/17 14:01:36 Qingsheng Ren wrote:

Hi everyone,

Please review and vote on the release candidate #3 for the version

1.17.0,

as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* JIRA release notes [1], and the pull request adding release note for
users [2]
* the official Apache source release and binary convenience releases to

be

deployed to dist.apache.org [3], which are signed with the key with
fingerprint A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [4],
* all artifacts to be deployed to the Maven Central Repository [5],
* source code tag "release-1.17.0-rc3" [6],
* website pull request listing the new release and adding announcement

blog

post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

[1]


https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585

[2] https://github.com/apache/flink/pull/22146
[3] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.0-rc3/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5]

https://repository.apache.org/content/repositories/orgapacheflink-1600

[6] https://github.com/apache/flink/releases/tag/release-1.17.0-rc3
[7] https://github.com/apache/flink-web/pull/618

Thanks,
Martijn and Matthias, Leonard and Qingsheng



Re: [VOTE] Release 1.17.0, release candidate #3

2023-03-22 Thread Etienne Chauchot

Hi Qingsheng,

Yes indeed I was referring to the jira release notes. I know they are 
generated, I was wondering if we could configure the generation process. 
Anyway, the human readable release notes are clear enough. I'll review 
the related PR.


Best

Etienne

On 22/03/2023 at 09:14, Qingsheng Ren wrote:

Hi Etienne,

Thanks for the review!

I guess you are referring to the JIRA release note [1]. I didn't find 
a place to configure the format of it, as it is auto-generated by 
JIRA. Actually there will be a more "human-readable" release note [1] 
for users in the document, which highlights breaking changes in this 
version.


[1] https://github.com/apache/flink/pull/22146

Best,
Qingsheng

On Tue, Mar 21, 2023 at 6:15 PM Etienne Chauchot 
 wrote:


Hi all,

- I read the release notes: I'd suggest, if possible, that we group the
subtasks by main task and show the main tasks to give the reader a better
understanding.

- tested RC3 on a standalone cluster with this user code:
https://github.com/echauchot/tpcds-benchmark-flink

+1 (non-binding)

Best Etienne

On 17/03/2023 at 15:01, Qingsheng Ren wrote:
> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 1.17.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
>
> * JIRA release notes [1], and the pull request adding release note for
> users [2]
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [3], which are signed with the key with
> fingerprint A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [4],
> * all artifacts to be deployed to the Maven Central Repository [5],
> * source code tag "release-1.17.0-rc3" [6],
> * website pull request listing the new release and adding announcement blog
> post [7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585
> [2] https://github.com/apache/flink/pull/22146
> [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.0-rc3/
> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> [5] https://repository.apache.org/content/repositories/orgapacheflink-1600
> [6] https://github.com/apache/flink/releases/tag/release-1.17.0-rc3
> [7] https://github.com/apache/flink-web/pull/618
>
> Thanks,
> Martijn and Matthias, Leonard and Qingsheng
>


Re: [VOTE] Release 1.17.0, release candidate #3

2023-03-21 Thread Etienne Chauchot

Hi all,

- I read the release notes: I'd suggest, if possible, that we group the 
subtasks by main task and show the main tasks, to give the reader a better 
understanding.


- tested RC3 on a standalone cluster with this user code: 
https://github.com/echauchot/tpcds-benchmark-flink


+1 (non-binding)

Best Etienne

On 17/03/2023 at 15:01, Qingsheng Ren wrote:

Hi everyone,

Please review and vote on the release candidate #3 for the version 1.17.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* JIRA release notes [1], and the pull request adding release note for
users [2]
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [3], which are signed with the key with
fingerprint A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [4],
* all artifacts to be deployed to the Maven Central Repository [5],
* source code tag "release-1.17.0-rc3" [6],
* website pull request listing the new release and adding announcement blog
post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585
[2] https://github.com/apache/flink/pull/22146
[3] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.0-rc3/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] https://repository.apache.org/content/repositories/orgapacheflink-1600
[6] https://github.com/apache/flink/releases/tag/release-1.17.0-rc3
[7] https://github.com/apache/flink-web/pull/618

Thanks,
Martijn and Matthias, Leonard and Qingsheng



Re: [VOTE] Release flink-connector-parent 1.0.0, release candidate #1

2023-03-16 Thread Etienne Chauchot

Hi all,

- checked the release notes

- tested the connector-parent with the under-development Cassandra connector.

+1 (non-binding)

Best

Etienne

On 15/03/2023 at 16:40, Chesnay Schepler wrote:

Hi everyone,
Please review and vote on the release candidate, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This is the first release of the flink-connector-parent pom by the 
Flink project. This subsumes the previous release that I made myself.


A few minor changes have been made; see the release notes.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which are signed with the key with fingerprint 11D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Chesnay

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352762
[2] 
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.0.0-rc1/

[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1597
[5] 
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.0.0-rc1

[6] https://github.com/apache/flink-web/pull/620




Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-03-15 Thread Etienne Chauchot

Hi,

Why not track this in a FLIP and a ticket, and link this discussion thread?

My 2 cents

Etienne

On 15/03/2023 at 10:01, Hongshun Wang wrote:

Hi devs,

I'd like to join this discussion. CC: Qingsheng

As discussed above, partitions found after the first discovery should be
consumed from the EARLIEST offset.

However, when the KafkaSourceEnumerator restarts after a job failure, it
cannot tell whether the unassigned partitions were part of the first
discovery or are new, because the snapshot state currently only contains
the assignedPartitions collection (the partitions already assigned). We can
solve this by adding an unassignedInitialPartitions collection to the
snapshot state, representing the first-discovered partitions that have not
yet been assigned. Alternatively, we could combine the two collections into
one by adding a status to each item.

Besides, there is a problem, which often occurs in pattern mode, of
distinguishing between the following two cases:

1. Case 1: the first partition discovery is too slow; a checkpoint
completes before it finishes and the job is then restarted. The restored
unassignedInitialPartitions is an empty collection, which means no
discovery has happened yet, so the next discovery will be treated as the
first discovery.
2. Case 2: the first discovery returns no partitions, and new partitions
are only found after several discovery rounds. If a restart occurs in
between, the restored unassignedInitialPartitions is also an empty
collection, which here means an empty first discovery. However, the next
discovery should be treated as a new discovery.

We can solve this problem by adding a boolean value (*firstDiscoveryDone*)
to the snapshot state, which represents whether the first discovery has
been done.

Two rejected alternatives:

1. Change the KafkaSourceEnumerator's snapshotState method to a blocking
one, which resumes only after the first-discovered partitions have been
successfully assigned to the KafkaSourceReaders. The advantage of this
approach is that the snapshot state's variables need not change. However,
if the first-discovered partitions are not assigned before checkpointing,
the SourceCoordinator's event-loop thread will block, while partition
assignment also requires the event-loop thread to execute, causing the
thread to deadlock on itself.
2. An alternative to the *firstDiscoveryDone* variable: if we make the
first discovery a synchronous call, we can ensure that Case 1 never
happens, because when the event-loop thread starts it first adds a
discovery event to the blocking queue, so by the time it executes the
checkpoint event the partitions have already been discovered. However, if
partition discovery is a heavily time-consuming operation, the
SourceCoordinator cannot process other events during the wait, such as
reader registrations, which is wasteful.
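Sketched as self-contained code, the proposed state layout and its offset semantics could look roughly like the following. This is an illustrative simplification, not the actual KafkaSourceEnumerator code; the class name `EnumState` is made up here, while the field names mirror the proposal in this thread (`unassignedInitialPartitions`, `firstDiscoveryDone`):

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the enumerator snapshot state proposed above.
// Not the real KafkaSourceEnumerator implementation.
class EnumState {
    // Partitions already assigned to readers.
    final Set<String> assignedPartitions = new HashSet<>();
    // First-discovered partitions not yet assigned to any reader.
    final Set<String> unassignedInitialPartitions = new HashSet<>();
    // Distinguishes "first discovery not done" (Case 1) from
    // "first discovery done but empty" (Case 2) after a restore.
    boolean firstDiscoveryDone = false;

    /**
     * Records a discovery round and returns the partitions that must start
     * from EARLIEST (i.e. partitions found after the first discovery).
     */
    Set<String> onDiscovery(Set<String> discovered) {
        Set<String> startFromEarliest = new HashSet<>();
        for (String p : discovered) {
            if (assignedPartitions.contains(p) || unassignedInitialPartitions.contains(p)) {
                continue; // already known
            }
            if (firstDiscoveryDone) {
                startFromEarliest.add(p); // new partition: consume from EARLIEST
            } else {
                unassignedInitialPartitions.add(p); // initial partition: user-configured offset
            }
        }
        firstDiscoveryDone = true;
        return startFromEarliest;
    }

    public static void main(String[] args) {
        EnumState state = new EnumState();
        // First discovery: both partitions use the configured initial offsets.
        System.out.println(state.onDiscovery(Set.of("topic-0", "topic-1"))); // []
        // Later discovery: topic-2 is new and starts from EARLIEST.
        System.out.println(state.onDiscovery(Set.of("topic-0", "topic-1", "topic-2")));
    }
}
```

Restoring `firstDiscoveryDone = false` with an empty `unassignedInitialPartitions` reproduces Case 1, while `firstDiscoveryDone = true` reproduces Case 2, which is exactly the distinction the boolean adds.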

Best regards,
Hongshun

On 2023/01/13 03:31:20 Qingsheng Ren wrote:

Hi devs,

I’d like to start a discussion about enabling the dynamic partition
discovery feature by default in Kafka source. Dynamic partition discovery
[1] is a useful feature in Kafka source especially under the scenario when
the consuming Kafka topic scales out, or the source subscribes to multiple
Kafka topics with a pattern. Users don’t have to restart the Flink job to
consume messages in the new partition with this feature enabled.

Currently, dynamic partition discovery is disabled by default and users have to
explicitly specify the interval of discovery in order to turn it on.

# Breaking changes

For Kafka table source:

- “scan.topic-partition-discovery.interval” will be set to 30 seconds by
default.
- As we need to provide a way for users to disable the feature,
“scan.topic-partition-discovery.interval” = “0” will be used to turn off
the discovery. Before this proposal, “0” means to enable partition
discovery with interval = 0, which is a bit senseless in practice.
Unfortunately we can't use negative values as the type of this option is
Duration.

For KafkaSource (DataStream API)

- Dynamic partition discovery in Kafka source will be enabled by default,
with discovery interval set to 30 seconds.
- To align with table source, only a positive value for option
“partition.discovery.interval.ms” could be used to specify the discovery
interval. Both negative and zero will be interpreted as disabling the
feature.
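The enable/disable rule described above (a strictly positive interval enables discovery; zero and negative values both disable it) boils down to a one-line check. This is an illustrative helper, not Flink's actual option parsing; the class and method names here are made up for the example:

```java
import java.time.Duration;

// Illustrative check of the proposed semantics for
// "partition.discovery.interval.ms" / "scan.topic-partition-discovery.interval".
class DiscoveryInterval {
    // Proposed default: 30 seconds.
    static final Duration DEFAULT = Duration.ofSeconds(30);

    // Only a strictly positive interval enables dynamic partition discovery;
    // zero and negative values both disable it.
    static boolean isDiscoveryEnabled(Duration interval) {
        return interval != null && !interval.isZero() && !interval.isNegative();
    }

    public static void main(String[] args) {
        System.out.println(isDiscoveryEnabled(DEFAULT));               // true
        System.out.println(isDiscoveryEnabled(Duration.ZERO));         // false
        System.out.println(isDiscoveryEnabled(Duration.ofMillis(-1))); // false
    }
}
```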

# Overhead of partition discovery

Partition discovery is made on KafkaSourceEnumerator, which asynchronously
fetches topic metadata from Kafka cluster and checks if there’s any new
topic and partition. This shouldn’t introduce performance issues on the
Flink side.

On the Kafka broker side, partition discovery makes MetadataRequest to
Kafka broker for fetching topic infos. Considering Kafka broker has its
metadata cache and the default request frequency is relatively low (per 30
seconds), this 

Re: [VOTE] Release 1.17.0, release candidate #2

2023-03-14 Thread Etienne Chauchot

Hi all,

As promised, I ran the same tests on 1.17.0 RC2. I also verified the 
release notes.


Based on the scope of these tests : +1 (non-binding)

Etienne

On 14/03/2023 at 07:44, Qingsheng Ren wrote:

Hi everyone,

Please review and vote on the release candidate #2 for the version
1.17.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* JIRA release notes [1], and the pull request adding release note for
users [2]
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [3], which are signed with the key with
fingerprint A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [4],
* all artifacts to be deployed to the Maven Central Repository [5],
* source code tag "release-1.17.0-rc2" [6],
* website pull request listing the new release and adding announcement blog
post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585
[2] https://github.com/apache/flink/pull/22146
[3] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.0-rc2/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] https://repository.apache.org/content/repositories/orgapacheflink-1595
[6] https://github.com/apache/flink/releases/tag/release-1.17.0-rc2
[7] https://github.com/apache/flink-web/pull/618

Thanks,
Martijn and Matthias, Leonard and Qingsheng



Re: [ANNOUNCE] Release 1.17.0, release candidate #1

2023-03-13 Thread Etienne Chauchot

Hi,

Thanks Martijn and release managers for this RC.

I tested it in a standalone cluster on this [1] benchmark / user code

[1] https://github.com/echauchot/tpcds-benchmark-flink

The RC works on this tested scope. I'll test again on RC2.

Best

Etienne

On 09/03/2023 at 20:31, Martijn Visser wrote:

Hi Yingjie,

Thanks for the test and identifying the issue, this is super helpful!

To all others, please continue your testing on this RC so that if there are
more blockers to be found, we can fix them with the next RC and have
(hopefully) a successful vote on it.

Best regards,

Martijn

On Thu, Mar 9, 2023 at 4:54 PM Yingjie Cao  wrote:


Hi community and release managers:

When testing release candidate #1 in a batch scenario, I found a
potential deadlock issue in blocking shuffle. I have created a ticket [1]
for it and marked it as a blocker. I will fix it no later than tomorrow.

[1] https://issues.apache.org/jira/browse/FLINK-31386

Best regards,
Yingjie

On Thursday, March 9, 2023 at 13:51, Qingsheng Ren wrote:


Hi everyone,

The RC1 for Apache Flink 1.17.0 has been created. This RC currently is for
preview only to facilitate the integrated testing since the release
announcement is still under review. The voting process will be triggered
once the announcement is ready. It has all the artifacts that we would
typically have for a release, except for the release note and the website
pull request for the release announcement.

The following contents are available for your review:

- The preview source release and binary convenience releases [1], which
are signed with the key with fingerprint A1BD477F79D036D2C30C [2].
- all artifacts that would normally be deployed to the Maven
Central Repository [3].
- source code tag "release-1.17.0-rc1" [4]

Your help testing the release will be greatly appreciated! And we'll
create the voting thread as soon as all the efforts are finished.

[1] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.0-rc1
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3]
https://repository.apache.org/content/repositories/orgapacheflink-1591
[4] https://github.com/apache/flink/releases/tag/release-1.17.0-rc1

Best regards,
Qingsheng, Leonard, Matthias and Martijn



Re: [Vote] FLIP-298: Unifying the Implementation of SlotManager

2023-03-13 Thread Etienne Chauchot

+1 (non-binding)

Etienne

On 11/03/2023 at 07:37, Yangze Guo wrote:

+1 (binding)

On Friday, March 10, 2023 at 5:07 PM, Zhanghao Chen wrote:


Thanks Weihua. +1 (non-binding)

Best,
Zhanghao Chen

From: Weihua Hu 
Sent: Thursday, March 9, 2023 13:27
To: dev 
Subject: [Vote] FLIP-298: Unifying the Implementation of SlotManager

Hi Everyone,

I would like to start the vote on FLIP-298: Unifying the Implementation
of SlotManager [1]. The FLIP was discussed in this thread [2].

This FLIP aims to unify the implementation of SlotManager in
order to reduce maintenance costs.

The vote will last for at least 72 hours (03/14, 15:00 UTC+8)
unless there is an objection or insufficient votes. Thank you all.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-298%3A+Unifying+the+Implementation+of+SlotManager
[2] https://lists.apache.org/thread/ocssfxglpc8z7cto3k8p44mrjxwr67r9

Best,
Weihua



Re: [DISCUSS] FLIP-299 Pub/Sub Lite Connector

2023-03-13 Thread Etienne Chauchot

Hi all,

I agree with Konstantin: mentoring is important, especially on this new 
connector framework. Long-term maintenance is even more important.


I could not mentor you on this topic because I'm not a committer on the 
Flink project and because I don't know the Pub/Sub tech. That being said, I 
have a blog post in the works sharing what I learned while authoring the 
Cassandra connector with the new source framework. I think it could be 
useful as a first learning step and to avoid some caveats.


Regarding the FLIP, as you already developed the connector inside Google, 
I understand why you included the whole code in the FLIP (there is no 
better doc than code), but I think that presenting the main architectural 
components and decisions would help the discussion/vote.


Also, I did not review the code, but I took a quick look at the coupling 
to Google technologies:

- you need to replace the Google headers with the ASF v2 ones

- when possible, try to use the equivalent Flink / JDK / ASF libs 
instead of the Google ones (futures, collections, safe-guard 
annotations, AutoValue, etc.)


Finally, as a hint, I think you could take a look at the first commits 
of the Apache Beam project when DataFlow SDK was donated to the ASF and 
see what was done there to make the code ASF friendly.


Best

Etienne

On 09/03/2023 at 09:45, Konstantin Knauf wrote:

Hi Daniel,

I think, it would be great to have a PubSub Lite Connector in Flink. Before
you put this proposal up for a vote, though, we need feedback from a
Committer who would review and help maintain it going forward. Ideally,
this Committer would guide one or more contributors from Google to
Committership so that Google could step up and maintain Flink's PubSub and
PubSub Lite Connector in the future. For this, it would be good to
understand how you envision the involvement of the PubSub Lite team at
Google.

I am specifically sensitive on this topic, because the PubSub connector has
lacked attention and maintenance for a long time. There was also a very
short-lived interested by Google in the past to contribute a Google PubSub
Connector [1].

Best,

Konstantin

[1] https://issues.apache.org/jira/browse/FLINK-22380

On Wed, Mar 8, 2023 at 14:45, Etienne Chauchot <echauc...@apache.org> wrote:


Hi,

I agree with Ryan, even if clients might be totally different the
backend technologies are the same so hosting them in the same repo makes
sense. Similar thinking made us put all the Cassandra related connectors
in the same cassandra repo.

Etienne

On 02/03/2023 at 14:43, Daniel Collins wrote:

Hello Ryan,

Unfortunately there's not much shared logic between the two: the clients
have to look fundamentally different since the Pub/Sub Lite client exposes
partitions to the split level for repeatable reads.

I have no objection to this living in the same repo as the Pub/Sub
connector; if this is an easier way forward than setting up a new repo,
sounds good to me. The Pub/Sub team is organizationally close to us, and is
looking into providing more support for the flink connector in the near
future.

-Daniel

On Thu, Mar 2, 2023 at 3:26 AM Ryan Skraba wrote:

Hello Daniel!  Quite a while ago, I started porting the Pub/Sub connector
(from an existing PR) to the new source API in the new
flink-connector-gcp-pubsub repository [PR2].  As Martijn mentioned, there
hasn't been a lot of attention on this connector; any community involvement
would be appreciated!

Instead of considering this a new connector, is there an opportunity here
to offer the two variants (Pub/Sub and Pub/Sub Lite) as different artifacts
in that same repo?  Is there much common logic that can be shared between
the two?  I'm not as familiar as I should be with Lite, but I do recall
that they share many concepts and _some_ dependencies.

All my best, Ryan


On Wed, Mar 1, 2023 at 11:21 PM Daniel Collins wrote:

Hello all,

I'd like to start an official discuss thread for adding a Pub/Sub Lite
Connector to Flink. We've had requests from our users to add flink support,
and are willing to maintain and support this connector long term from the
product team.

The proposal is https://cwiki.apache.org/confluence/x/P51bDg, what would be
people's thoughts on adding this connector?

-Daniel





Re: [DISCUSS] FLIP-299 Pub/Sub Lite Connector

2023-03-08 Thread Etienne Chauchot

Hi,

I agree with Ryan, even if clients might be totally different the 
backend technologies are the same so hosting them in the same repo makes 
sense. Similar thinking made us put all the Cassandra related connectors 
in the same cassandra repo.


Etienne

On 02/03/2023 at 14:43, Daniel Collins wrote:

Hello Ryan,

Unfortunately there's not much shared logic between the two- the clients
have to look fundamentally different since the Pub/Sub Lite client exposes
partitions to the split level for repeatable reads.

I have no objection to this living in the same repo as the Pub/Sub
connector, if this is an easier way forward than setting up a new repo,
sounds good to me. The Pub/Sub team is organizationally close to us, and is
looking into providing more support for the flink connector in the near
future.

-Daniel

On Thu, Mar 2, 2023 at 3:26 AM Ryan Skraba wrote:


Hello Daniel!  Quite a while ago, I started porting the Pub/Sub connector
(from an existing PR) to the new source API in the new
flink-connector-gcp-pubsub repository [PR2].  As Martijn mentioned, there
hasn't been a lot of attention on this connector; any community involvement
would be appreciated!

Instead of considering this a new connector, is there an opportunity here
to offer the two variants (Pub/Sub and Pub/Sub Lite) as different artifacts
in that same repo?  Is there much common logic that can be shared between
the two?  I'm not as familiar as I should be with Lite, but I do recall
that they share many concepts and _some_ dependencies.

All my best, Ryan


On Wed, Mar 1, 2023 at 11:21 PM Daniel Collins wrote:


Hello all,

I'd like to start an official discuss thread for adding a Pub/Sub Lite
Connector to Flink. We've had requests from our users to add flink support,
and are willing to maintain and support this connector long term from the
product team.

The proposal is https://cwiki.apache.org/confluence/x/P51bDg, what would be
people's thoughts on adding this connector?

-Daniel



Re: [ANNOUNCE] Apache Flink 1.16.1 released

2023-02-06 Thread Etienne Chauchot

Hi,

Thanks to everyone involved.

Best

Etienne

On 02/02/2023 at 03:55, weijie guo wrote:

Thank Martin for managing the release and all the people involved.


Best regards,

Weijie


On Thursday, February 2, 2023 at 06:40, Konstantin Knauf wrote:


Great. Thanks, Martijn for managing the release.

On Wed, Feb 1, 2023 at 20:26, Martijn Visser <martijnvis...@apache.org> wrote:

The Apache Flink community is very happy to announce the release of Apache
Flink 1.16.1, which is the first bugfix release for the Apache Flink 1.16
series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements
for this bugfix release:
https://flink.apache.org/news/2023/01/30/release-1.16.1.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352344

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Feel free to reach out to the release managers (or respond to this thread)
with feedback on the release process. Our goal is to constantly improve the
release process. Feedback on what could be improved or things that didn't
go so well are appreciated.

Best regards,

Martijn Visser



--
https://twitter.com/snntrable
https://github.com/knaufk



[jira] [Created] (FLINK-30805) SplitEnumerator#handleSplitRequest() should be called automatically

2023-01-27 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-30805:


 Summary: SplitEnumerator#handleSplitRequest() should be called 
automatically
 Key: FLINK-30805
 URL: https://issues.apache.org/jira/browse/FLINK-30805
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Reporter: Etienne Chauchot


SplitEnumerator#handleSplitRequest() is not called automatically by the new 
source framework, which could be surprising to a source author. Right now a 
source author has to call it themselves when a split is finished, or early, 
when the reader gets created. 
IMHO it would be good if we could find a way for the framework to call it 
automatically when a split is finished.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30802) Improve SplitReader#fetch() documentation

2023-01-27 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-30802:


 Summary: Improve SplitReader#fetch() documentation
 Key: FLINK-30802
 URL: https://issues.apache.org/jira/browse/FLINK-30802
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common, Documentation
Reporter: Etienne Chauchot


[SplitReader#fetch()|https://nightlies.apache.org/flink/flink-docs-master/api/java/]
 lacks details on the fact that source authors can decide to interrupt it (for 
example, for performance reasons) so that it can be resumed later based on its 
state.
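The interruptible-and-resumable behavior this ticket wants documented can be sketched with a simplified reader. This does not use the real SplitReader interface; ResumableReader, its wakeUp flag, and the nextOffset field are illustrative stand-ins for a connector's own fetch state:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for a split reader whose fetch() may be interrupted
// and resumed later from a position recorded in its own state.
class ResumableReader {
    private final List<String> split;          // the split's records
    private int nextOffset = 0;                // resume position (reader state)
    final AtomicBoolean wakeUp = new AtomicBoolean(false);

    ResumableReader(List<String> split) { this.split = split; }

    // Returns up to maxBatch records; stops early when woken up and keeps
    // nextOffset so a later call resumes where this one left off.
    List<String> fetch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        while (nextOffset < split.size() && batch.size() < maxBatch) {
            if (wakeUp.getAndSet(false)) {
                break; // interrupted: return a partial batch, state preserved
            }
            batch.add(split.get(nextOffset++));
        }
        return batch;
    }

    public static void main(String[] args) {
        ResumableReader reader = new ResumableReader(List.of("a", "b", "c", "d"));
        System.out.println(reader.fetch(2)); // [a, b]
        System.out.println(reader.fetch(9)); // [c, d] -- resumed from offset 2
    }
}
```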




