Re: [VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2024-01-02 Thread Tzu-Li (Gordon) Tai
+1 (binding), looks good to me overall.

Thank you for revising the FLIP and continuing to drive the decision, Peter!

On Wed, Dec 27, 2023 at 7:16 AM Martijn Visser 
wrote:

> Hi Peter,
>
> It would be good if Gordon can take a look, but overall this looks good to
> me +1
>
> Best regards,
>
> Martijn
>
> On Fri, Dec 22, 2023 at 8:25 AM Péter Váry 
> wrote:
> >
> > We have enough votes for the decision, but given that this is an important
> > change, and for many of us it is a holiday season, I plan to keep this vote
> > open until the 3rd of January. This way, if anyone else has comments and
> > suggestions, they have time to raise them.
> >
> > Thanks everyone for the votes, and Leonard for the useful suggestions!
> >
> > Happy holidays everyone!
> >
> > Peter
> >
> > On Thu, Dec 21, 2023, 11:23 Leonard Xu  wrote:
> >
> > > Thanks Peter for quick response and update.
> > >
> > > I’ve no more comments on the updated FLIP, +1.
> > >
> > > For the PR process, you could also use a draft PR [1] to leverage the
> > > testing infra during the POC phase. We usually create the FLIP umbrella
> > > issue and subtask issues after the FLIP is accepted.
> > >
> > >
> > > Best,
> > > Leonard
> > > [1]https://github.com/apache/flink/pulls?q=is%3Apr+is%3Aopen+draft
> > >
> > >
> > >
> > >
> > > >>> On Dec 21, 2023, at 11:47 AM, Jiabao Sun wrote:
> > > >>>
> > > >>> Thanks Peter for driving this.
> > > >>>
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> Best,
> > > >>> Jiabao
> > > >>>
> > > >>>
> > > >>> On 2023/12/18 12:06:05 Gyula Fóra wrote:
> > >  +1 (binding)
> > > 
> > >  Gyula
> > > 
> > >  On Mon, 18 Dec 2023 at 13:04, Márton Balassi 
> > >  wrote:
> > > 
> > > > +1 (binding)
> > > >
> > > > On Mon 18. Dec 2023 at 09:34, Péter Váry 
> > > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Since there were no further comments on the discussion thread [1], I
> > > >> would like to start the vote for FLIP-372 [2].
> > > >>
> > > >> The FLIP started as a small new feature, but in the discussion thread
> > > >> and in a similar parallel thread [3] we opted for a somewhat bigger
> > > >> change in the Sink V2 API.
> > > >>
> > > >> Please read the FLIP and cast your vote.
> > > >>
> > > >> The vote will remain open for at least 72 hours and only concluded if
> > > >> there are no objections and enough (i.e. at least 3) binding votes.
> > > >>
> > > >> Thanks,
> > > >> Peter
> > > >>
> > > >> [1] - https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd
> > > >> [2] - https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
> > > >> [3] - https://lists.apache.org/thread/h6nkgth838dlh5s90sd95zd6hlsxwx57
>


[ANNOUNCE] Apache Flink Kafka Connectors 3.0.2 released

2023-12-01 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache
Flink Kafka Connectors 3.0.2. This release is compatible with the Apache
Flink 1.17.x and 1.18.x release series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353768

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Gordon


Re: [RESULT][VOTE] Apache Flink Kafka Connectors v3.0.2, RC #1

2023-11-30 Thread Tzu-Li (Gordon) Tai
I'm happy to announce that we have unanimously approved this release.

There are 6 approving votes, 3 of which are binding:
* Gordon Tai (binding)
* Rui Fan
* Jing Ge
* Leonard Xu (binding)
* Martijn Visser (binding)
* Sergey Nuyanzin

There are no disapproving votes.

Thanks everyone! I'll now release the artifacts, and separately announce
once everything is ready.

Best,
Gordon


On Thu, Nov 30, 2023 at 3:05 AM Sergey Nuyanzin  wrote:

> +1(non-binding)
>
> - Downloaded all the resources
> - Verified signatures
> - Validated hashsums
> - Built from source code
> - Checked Github release tag
> - Reviewed the web PR
>
> On Mon, Nov 27, 2023 at 9:05 AM Martijn Visser 
> wrote:
>
> > +1 (binding)
> >
> > - Validated hashes
> > - Verified signature
> > - Verified that no binaries exist in the source archive
> > - Built the source with Maven
> > - Verified licenses
> > - Verified web PRs
> >
> > On Mon, Nov 27, 2023 at 4:17 AM Leonard Xu  wrote:
> > >
> > > +1 (binding)
> > >
> > > - checked the flink-connector-base dependency scope has been changed to
> > provided
> > > - built from source code succeeded
> > > - verified signatures
> > > - verified hashsums
> > > - checked the contents contains jar and pom files in apache repo
> > > - checked Github release tag
> > > - checked release notes
> > > - reviewed the web PR
> > >
> > > Best,
> > > Leonard
> > >
> > >
> > > > > On Nov 26, 2023, at 4:40 PM, Jing Ge wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - verified signature and hash
> > > > - checked repo
> > > > - checked tag, BTW, the tag link at [5] should be
> > > >
> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.2-rc1
> > > > - verified source archives do not contain any binaries
> > > > - built from source with Maven 3.8.6 and JDK 11
> > > > - verified web PR
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Sat, Nov 25, 2023 at 6:44 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> > > >
> > > >> +1 (non-binding)
> > > >>
> > > >> - Validated checksum hash
> > > >> - Verified signature
> > > >> - Verified that no binaries exist in the source archive
> > > >> - Built the source with Maven and JDK 8
> > > >> - Verified licenses
> > > >> - Verified web PRs
> > > >>
> > > >> Best,
> > > >> Rui
> > > >>
> > > >> On Sat, Nov 25, 2023 at 2:05 AM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>
> > > >> wrote:
> > > >>
> > > >>> +1 (binding)
> > > >>>
> > > >>> - Verified signature and hashes
> > > >>> - Verified mvn dependency:tree for a typical user job jar [1]. When
> > using
> > > >>> Flink 1.18.0, flink-connector-base is no longer getting bundled,
> and
> > all
> > > >>> Flink dependencies resolve as 1.18.0 / provided.
> > > >>> - Submitting user job jar to local Flink 1.18.0 cluster works and
> job
> > > >> runs
> > > >>>
> > > >>> note: If running in the IDE, the flink-connector-base dependency is
> > > >>> explicitly required when using KafkaSource. Otherwise, if
> submitting
> > an
> > > >>> uber jar, the flink-connector-base dependency should not be bundled
> > as
> > > >>> it'll be provided by the Flink distribution and will already be on
> > the
> > > >>> classpath.
> > > >>>
> > > >>> [1] mvn dependency:tree output
> > > >>> ```
> > > >>> [INFO] com.tzulitai:testing-kafka:jar:1.0-SNAPSHOT
> > > >>> [INFO] +- org.apache.flink:flink-streaming-java:jar:1.18.0:provided
> > > >>> [INFO] |  +- org.apache.flink:flink-core:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-annotations:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-metrics-core:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-shaded-asm-9:jar:9.5-17.0:provided
> > > >>> [INFO] |  |  +-
> > > >>> org.apache.flink:flink-shaded-jackson:jar:2.14.2-17.0:provided

Re: [VOTE] Apache Flink Kafka Connectors v3.0.2, RC #1

2023-11-24 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Verified signature and hashes
- Verified mvn dependency:tree for a typical user job jar [1]. When using
Flink 1.18.0, flink-connector-base is no longer getting bundled, and all
Flink dependencies resolve as 1.18.0 / provided.
- Submitting user job jar to local Flink 1.18.0 cluster works and job runs

note: If running in the IDE, the flink-connector-base dependency is
explicitly required when using KafkaSource. Otherwise, if submitting an
uber jar, the flink-connector-base dependency should not be bundled as
it'll be provided by the Flink distribution and will already be on the
classpath.

[1] mvn dependency:tree output
```
[INFO] com.tzulitai:testing-kafka:jar:1.0-SNAPSHOT
[INFO] +- org.apache.flink:flink-streaming-java:jar:1.18.0:provided
[INFO] |  +- org.apache.flink:flink-core:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-annotations:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-metrics-core:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-shaded-asm-9:jar:9.5-17.0:provided
[INFO] |  |  +-
org.apache.flink:flink-shaded-jackson:jar:2.14.2-17.0:provided
[INFO] |  |  +- org.apache.commons:commons-lang3:jar:3.12.0:provided
[INFO] |  |  +- org.apache.commons:commons-text:jar:1.10.0:provided
[INFO] |  |  +- com.esotericsoftware.kryo:kryo:jar:2.24.0:provided
[INFO] |  |  |  +- com.esotericsoftware.minlog:minlog:jar:1.2:provided
[INFO] |  |  |  \- org.objenesis:objenesis:jar:2.1:provided
[INFO] |  |  +- commons-collections:commons-collections:jar:3.2.2:provided
[INFO] |  |  \- org.apache.commons:commons-compress:jar:1.21:provided
[INFO] |  +- org.apache.flink:flink-file-sink-common:jar:1.18.0:provided
[INFO] |  +- org.apache.flink:flink-runtime:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-rpc-core:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-rpc-akka-loader:jar:1.18.0:provided
[INFO] |  |  +-
org.apache.flink:flink-queryable-state-client-java:jar:1.18.0:provided
[INFO] |  |  +- org.apache.flink:flink-hadoop-fs:jar:1.18.0:provided
[INFO] |  |  +- commons-io:commons-io:jar:2.11.0:provided
[INFO] |  |  +-
org.apache.flink:flink-shaded-netty:jar:4.1.91.Final-17.0:provided
[INFO] |  |  +-
org.apache.flink:flink-shaded-zookeeper-3:jar:3.7.1-17.0:provided
[INFO] |  |  +- org.javassist:javassist:jar:3.24.0-GA:provided
[INFO] |  |  +- org.xerial.snappy:snappy-java:jar:1.1.10.4:runtime
[INFO] |  |  \- org.lz4:lz4-java:jar:1.8.0:runtime
[INFO] |  +- org.apache.flink:flink-java:jar:1.18.0:provided
[INFO] |  |  \- com.twitter:chill-java:jar:0.7.6:provided
[INFO] |  +- org.apache.flink:flink-shaded-guava:jar:31.1-jre-17.0:provided
[INFO] |  +- org.apache.commons:commons-math3:jar:3.6.1:provided
[INFO] |  +- org.slf4j:slf4j-api:jar:1.7.36:runtime
[INFO] |  \- com.google.code.findbugs:jsr305:jar:1.3.9:provided
[INFO] +- org.apache.flink:flink-clients:jar:1.18.0:provided
[INFO] |  +- org.apache.flink:flink-optimizer:jar:1.18.0:provided
[INFO] |  \- commons-cli:commons-cli:jar:1.5.0:provided
[INFO] +- org.apache.flink:flink-connector-kafka:jar:3.0.2-1.18:compile
[INFO] |  +- org.apache.kafka:kafka-clients:jar:3.2.3:compile
[INFO] |  |  \- com.github.luben:zstd-jni:jar:1.5.2-1:runtime
[INFO] |  +- com.fasterxml.jackson.core:jackson-core:jar:2.15.2:compile
[INFO] |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.15.2:compile
[INFO] |  |  \-
com.fasterxml.jackson.core:jackson-annotations:jar:2.15.2:compile
[INFO] |  +-
com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.15.2:compile
[INFO] |  \-
com.fasterxml.jackson.datatype:jackson-datatype-jdk8:jar:2.15.2:compile
[INFO] +- org.apache.logging.log4j:log4j-slf4j-impl:jar:2.17.1:runtime
[INFO] +- org.apache.logging.log4j:log4j-api:jar:2.17.1:runtime
[INFO] \- org.apache.logging.log4j:log4j-core:jar:2.17.1:runtime
```
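The bundling check described in the note above can be automated by scanning `mvn dependency:tree` output. The sketch below is a hedged illustration: the sample input is fabricated to stand in for real output, and is not from the actual release verification.

```shell
# Hedged sketch: grep dependency:tree output to confirm flink-connector-base
# is not pulled in at compile scope. The sample below is an illustrative
# stand-in, written locally instead of running Maven.
cat > /tmp/deptree.txt <<'EOF'
[INFO] +- org.apache.flink:flink-streaming-java:jar:1.18.0:provided
[INFO] +- org.apache.flink:flink-connector-kafka:jar:3.0.2-1.18:compile
[INFO] |  +- org.apache.kafka:kafka-clients:jar:3.2.3:compile
EOF
if grep -q 'org.apache.flink:flink-connector-base' /tmp/deptree.txt; then
  echo "flink-connector-base IS pulled in -- check bundling"
else
  echo "flink-connector-base not pulled in"
fi
```

Against a real job jar, one would instead pipe actual `mvn dependency:tree` output (or `jar tf` of the uber jar) through the same grep.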

On Fri, Nov 24, 2023 at 9:19 AM Tzu-Li (Gordon) Tai 
wrote:

> Hi everyone,
>
> Please review and vote on release candidate #1 for version 3.0.2 of the
> Apache Flink Kafka Connector, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> Compared to the previous hotfix release, this release only additionally
> contains a fix for FLINK-30400 (
> https://issues.apache.org/jira/browse/FLINK-30400).
>
> The release candidate contains the source release as well as JAR artifacts
> to be released to Maven, built against Flink 1.17.1 and 1.18.0.
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which are signed with the key with fingerprint
> 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag v3.0.2-rc1 [5],
> * website pull request listing the new release [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority

[VOTE] Apache Flink Kafka Connectors v3.0.2, RC #1

2023-11-24 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on release candidate #1 for version 3.0.2 of the
Apache Flink Kafka Connector, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

Compared to the previous hotfix release, this release only additionally
contains a fix for FLINK-30400 (
https://issues.apache.org/jira/browse/FLINK-30400).

The release candidate contains the source release as well as JAR artifacts
to be released to Maven, built against Flink 1.17.1 and 1.18.0.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint
1C1E2394D3194E1944613488F320986D35C33D6A [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.0.2-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353768
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.2-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1675
[5] https://github.com/apache/flink-connector-kafka/commits/v3.0.2-rc1
[6] https://github.com/apache/flink-web/pull/700


Re: [VOTE] Release flink-connector-opensearch v1.1.0, release candidate #1

2023-11-23 Thread Tzu-Li (Gordon) Tai
Hi Danny,

Thanks for starting a RC for this.

From the looks of the staged POMs for 1.1.0-1.18, the Flink versions for
Flink dependencies still point to 1.17.1.

My understanding is that this is fine, as those provided scope
dependencies (e.g. flink-streaming-java) will have their versions
overwritten by the user POM if they do intend to compile their jobs against
Flink 1.18.x.
Can you clarify if this is the correct understanding of how we intend the
externalized connector artifacts to be published? Related discussion on [1].

Thanks,
Gordon

[1] https://lists.apache.org/thread/x1pyrrrq7o1wv1lcdovhzpo4qhd4tvb4
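To illustrate the understanding described above: a user job POM that re-declares the provided Flink dependency at 1.18.0 takes precedence over the 1.17.1 reference in the connector's POM under Maven's nearest-definition rule. The fragment below is a hypothetical user POM snippet for illustration, not the staged connector POM.

```shell
# Hedged sketch: a user job POM pinning the provided Flink dependency to
# 1.18.x; this declaration, being nearest, wins over the connector POM's
# 1.17.1. The fragment is an assumption for illustration.
cat > /tmp/job-pom-fragment.xml <<'EOF'
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java</artifactId>
  <version>1.18.0</version>
  <scope>provided</scope>
</dependency>
EOF
grep -o '<version>[^<]*</version>' /tmp/job-pom-fragment.xml
```

In a real project, `mvn dependency:tree -Dincludes=org.apache.flink` would show which version actually resolved.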

On Thu, Nov 23, 2023 at 3:14 PM Sergey Nuyanzin  wrote:

> +1 (non-binding)
>
> - downloaded artifacts
> - built from source
> - verified checksums and signatures
> - reviewed web pr
>
>
> On Mon, Nov 6, 2023 at 5:31 PM Ryan Skraba 
> wrote:
>
> > Hello! +1 (non-binding) Thanks for the release!
> >
> > I've validated the source for the RC1:
> > * flink-connector-opensearch-1.1.0-src.tgz at r64995
> > * The sha512 checksum is OK.
> > * The source file is signed correctly.
> > * The signature 0F79F2AFB2351BC29678544591F9C1EC125FD8DB is found in the
> > KEYS file, and on https://keyserver.ubuntu.com/
> > * The source file is consistent with the GitHub tag v1.1.0-rc1, which
> > corresponds to commit 0f659cc65131c9ff7c8c35eb91f5189e80414ea1
> > - The files explicitly excluded by create_pristine_sources (such as
> > .gitignore and the submodule tools/releasing/shared) are not present.
> > * Has a LICENSE file and a NOTICE file
> > * Does not contain any compiled binaries.
> >
> > * The sources can be compiled and unit tests pass with flink.version
> 1.17.1
> > and flink.version 1.18.0
> >
> > * Nexus has three staged artifact ids for 1.1.0-1.17 and 1.1.0-1.18
> > - flink-connector-opensearch (.jar, -javadoc.jar, -sources.jar,
> > -tests.jar and .pom)
> > - flink-sql-connector-opensearch (.jar, -sources.jar and .pom)
> > - flink-connector-gcp-pubsub-parent (only .pom)
> >
> > All my best, Ryan
> >
> > On Fri, Nov 3, 2023 at 10:29 AM Danny Cranmer 
> > wrote:
> > >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #1 for the version
> 1.1.0
> > of
> > > flink-connector-opensearch, as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org
> > [2],
> > > which are signed with the key with fingerprint
> > > 0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v1.1.0-rc1 [5],
> > > * website pull request listing the new release [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Danny
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353141
> > > [2]
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-opensearch-1.1.0-rc1/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1666/
> > > [5]
> https://github.com/apache/flink-connector-opensearch/tree/v1.1.0-rc1
> > > [6] https://github.com/apache/flink-web/pull/694
> >
>
>
> --
> Best regards,
> Sergey
>


[ANNOUNCE] Apache Flink Kafka Connectors 3.0.1 released

2023-10-31 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache
Flink Kafka Connectors 3.0.1. This release is compatible with the Apache
Flink 1.17.x and 1.18.x release series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352910

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Gordon


[RESULT] [VOTE] Apache Flink Kafka connector version 3.0.1, RC1

2023-10-30 Thread Tzu-Li (Gordon) Tai
I'm happy to announce that we have unanimously approved this release.

There are 10 approving votes, 4 of which are binding:
* Qingsheng Ren (binding)
* Martijn Visser (binding)
* Xianxun Ye
* Mystic Lama
* Leonard Xu (binding)
* Ahmed Hamdy
* Samrat Deb
* Tzu-Li (Gordon) Tai (binding)
* Sergey Nuyanzin
* Mason Chen

There are no disapproving votes.

Thanks everyone! I'll now release the artifacts, and separately announce
once everything is ready.

Best,
Gordon

On Mon, Oct 30, 2023 at 3:35 PM Tzu-Li (Gordon) Tai 
wrote:

> Thanks for the catch on the docs and fixing it, Xianxun and Mason!
>
> On Mon, Oct 30, 2023 at 12:36 PM Mason Chen 
> wrote:
>
>> I submitted PR to fix it since I was looking at the Kafka code already:
>> https://github.com/apache/flink-connector-kafka/pull/63
>>
>> On Mon, Oct 30, 2023 at 12:19 PM Mason Chen 
>> wrote:
>>
>> > +1 (non-binding)
>> >
>> > * Verified hashes and signatures
>> > * Verified no binaries
>> > * Verified poms point to 3.0.1
>> > * Reviewed web PR
>> > * Built from source
>> > * Verified git tag
>> >
>> > @Xianxun, good catch. The datastream docs should be automatically
>> updated
>> > via the doc shortcode. However, it seems that the sql connector doc
>> > shortcode doesn't support the new format of
>> > `{connector-release-version}-{flink-version}`.
>> >
>> > Best,
>> > Mason
>> >
>> > On Mon, Oct 30, 2023 at 9:27 AM Sergey Nuyanzin 
>> > wrote:
>> >
>> >> +1 (non-binding)
>> >> * Verified hashes and checksums
>> >> * Built from source
>> >> * Checked release tag
>> >> * Reviewed the web PR
>> >>
>> >> On Mon, Oct 30, 2023 at 5:13 PM Tzu-Li (Gordon) Tai <
>> tzuli...@apache.org>
>> >> wrote:
>> >>
>> >> > +1 (binding)
>> >> >
>> >> > - Hashes and checksums
>> >> > - Build succeeds against 1.18.0: mvn clean install
>> >> -Dflink.version=1.18.0
>> >> > - Verified that memory leak issue is fixed for idle topics. Tested
>> >> against
>> >> > Flink 1.18.0 cluster.
>> >> >
>> >> > Thanks,
>> >> > Gordon
>> >> >
>> >> >
>> >> > On Mon, Oct 30, 2023 at 8:20 AM Samrat Deb 
>> >> wrote:
>> >> >
>> >> > > +1 (non-binding)
>> >> > >
>> >> > > - Verified signatures
>> >> > > - Verified Checksum
>> >> > > - Built with Java 8/11 - build success
>> >> > > - Started MSK cluster and EMR cluster with flink, successfully ran
>> >> some
>> >> > > examples to read and write data to MSK.
>> >> > > - Checked release tag exists
>> >> > >
>> >> > >
>> >> > > Bests,
>> >> > > Samrat
>> >> > >
>> >> > > On Mon, Oct 30, 2023 at 3:47 PM Ahmed Hamdy 
>> >> > wrote:
>> >> > >
>> >> > > > +1 (non-binding)
>> >> > > > - Verified signatures
>> >> > > > - Verified checksum
>> >> > > > - Built source successfully
>> >> > > > - Checked release tag exists
>> >> > > > - Reviewed the web PR
>> >> > > > Best Regards
>> >> > > > Ahmed Hamdy
>> >> > > >
>> >> > > >
>> >> > > > On Sun, 29 Oct 2023 at 08:02, Leonard Xu 
>> wrote:
>> >> > > >
>> >> > > > > +1 (binding)
>> >> > > > >
>> >> > > > > - Verified signatures
>> >> > > > > - Verified hashsums
>> >> > > > > - Checked Github release tag
>> >> > > > > - Built from source code succeeded
>> >> > > > > - Checked release notes
>> >> > > > > - Reviewed the web PR
>> >> > > > >
>> >> > > > > Best,
>> >> > > > > Leonard
>> >> > > > >
>> >> > > > >
>> >> > > > > > On Oct 29, 2023, at 11:34 AM, mystic lama wrote:
>> >> > > > > >
>> >> > > > > > +1 (non-binding)
>> >> > > > > >
>> >> > > >

Re: [VOTE] Apache Flink Kafka connector version 3.0.1, RC1

2023-10-30 Thread Tzu-Li (Gordon) Tai
Thanks for the catch on the docs and fixing it, Xianxun and Mason!

On Mon, Oct 30, 2023 at 12:36 PM Mason Chen  wrote:

> I submitted PR to fix it since I was looking at the Kafka code already:
> https://github.com/apache/flink-connector-kafka/pull/63
>
> On Mon, Oct 30, 2023 at 12:19 PM Mason Chen 
> wrote:
>
> > +1 (non-binding)
> >
> > * Verified hashes and signatures
> > * Verified no binaries
> > * Verified poms point to 3.0.1
> > * Reviewed web PR
> > * Built from source
> > * Verified git tag
> >
> > @Xianxun, good catch. The datastream docs should be automatically updated
> > via the doc shortcode. However, it seems that the sql connector doc
> > shortcode doesn't support the new format of
> > `{connector-release-version}-{flink-version}`.
> >
> > Best,
> > Mason
> >
> > On Mon, Oct 30, 2023 at 9:27 AM Sergey Nuyanzin 
> > wrote:
> >
> >> +1 (non-binding)
> >> * Verified hashes and checksums
> >> * Built from source
> >> * Checked release tag
> >> * Reviewed the web PR
> >>
> >> On Mon, Oct 30, 2023 at 5:13 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> >> wrote:
> >>
> >> > +1 (binding)
> >> >
> >> > - Hashes and checksums
> >> > - Build succeeds against 1.18.0: mvn clean install
> >> -Dflink.version=1.18.0
> >> > - Verified that memory leak issue is fixed for idle topics. Tested
> >> against
> >> > Flink 1.18.0 cluster.
> >> >
> >> > Thanks,
> >> > Gordon
> >> >
> >> >
> >> > On Mon, Oct 30, 2023 at 8:20 AM Samrat Deb 
> >> wrote:
> >> >
> >> > > +1 (non-binding)
> >> > >
> >> > > - Verified signatures
> >> > > - Verified Checksum
> >> > > - Built with Java 8/11 - build success
> >> > > - Started MSK cluster and EMR cluster with flink, successfully ran
> >> some
> >> > > examples to read and write data to MSK.
> >> > > - Checked release tag exists
> >> > >
> >> > >
> >> > > Bests,
> >> > > Samrat
> >> > >
> >> > > On Mon, Oct 30, 2023 at 3:47 PM Ahmed Hamdy 
> >> > wrote:
> >> > >
> >> > > > +1 (non-binding)
> >> > > > - Verified signatures
> >> > > > - Verified checksum
> >> > > > - Built source successfully
> >> > > > - Checked release tag exists
> >> > > > - Reviewed the web PR
> >> > > > Best Regards
> >> > > > Ahmed Hamdy
> >> > > >
> >> > > >
> >> > > > On Sun, 29 Oct 2023 at 08:02, Leonard Xu 
> wrote:
> >> > > >
> >> > > > > +1 (binding)
> >> > > > >
> >> > > > > - Verified signatures
> >> > > > > - Verified hashsums
> >> > > > > - Checked Github release tag
> >> > > > > - Built from source code succeeded
> >> > > > > - Checked release notes
> >> > > > > - Reviewed the web PR
> >> > > > >
> >> > > > > Best,
> >> > > > > Leonard
> >> > > > >
> >> > > > >
> >> > > > > > On Oct 29, 2023, at 11:34 AM, mystic lama wrote:
> >> > > > > >
> >> > > > > > +1 (non-binding)
> >> > > > > >
> >> > > > > > - verified signatures
> >> > > > > > - built with Java 8 and Java 11 - build success
> >> > > > > >
> >> > > > > > Minor observation
> >> > > > > > - RAT check flagged that README.md is missing ASL
> >> > > > > >
> >> > > > > > On Fri, 27 Oct 2023 at 23:40, Xianxun Ye <
> >> yesorno828...@gmail.com>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > >> +1(non-binding)
> >> > > > > >>
> >> > > > > >> - Started a local Flink 1.18 cluster, read and wrote with
> Kafka
> >> > and
> >> > > > > Upsert
> >> > > > > >> Kafka connector successfully to Kafka 2.2 cluster
> >> > > > > >>
> >> > 

Re: [DISCUSS] AWS Connectors v4.2.0 release + 1.18 support

2023-10-30 Thread Tzu-Li (Gordon) Tai
+1

On Mon, Oct 30, 2023 at 9:00 AM Danny Cranmer 
wrote:

> Hey,
>
> > Did you mean skip 4.1.1, since 4.1.0 has already been released?
>
> I meant skip "4.1.0-1.18" since we could release this with the existing
> source. We will additionally skip 4.1.1 and jump to 4.2.0: since this
> version has new features, it should be a minor version rather than a patch [1].
>
> > Does this imply that the 4.1.x series will be reserved for Flink 1.17,
> and the 4.2.x series will correspond to Flink 1.18?
>
> 4.1.x will receive bug fixes for Flink 1.17.
> 4.2.x will receive bug fixes and features for Flink 1.17 and 1.18.
>
> Thanks,
> Danny
>
> [1] https://semver.org/
>
>
> On Mon, Oct 30, 2023 at 3:47 PM Samrat Deb  wrote:
>
> > Hi Danny ,
> >
> > Thank you for driving it.
> >
> > +1 (non binding )
> >
> >
> > >  I am proposing we skip 4.1.0 for Flink 1.18 and go
> > straight to 4.2.0.
> >
> > Does this imply that the 4.1.x series will be reserved for Flink 1.17,
> and
> > the 4.2.x series will correspond to Flink 1.18?
> >
> > Bests,
> > Samrat
> >
> >
> > On Mon, Oct 30, 2023 at 7:32 PM Jing Ge 
> > wrote:
> >
> > > Hi Danny,
> > >
> > > +1 Thanks for driving it. Did you mean skip 4.1.1, since 4.1.0 has
> > already
> > > been released?
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Mon, Oct 30, 2023 at 11:49 AM Danny Cranmer <
> dannycran...@apache.org>
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I would like to start the discussion to release Apache Flink AWS
> > > > connectors v4.2.0. We released v4.1.0 over six months ago on
> > > > 2023-04-03. Since then we have resolved 23 issues [1]. Additionally,
> > > > now that Flink 1.18 is live, we need to add support for it. I am
> > > > proposing we skip 4.1.0 for Flink 1.18 and go straight to 4.2.0. The
> > > > CI is stable [2].
> > > >
> > > > I volunteer myself as the release manager.
> > > >
> > > > Thanks,
> > > > Danny
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/browse/FLINK-33021?jql=statusCategory%20%3D%20done%20AND%20project%20%3D%2012315522%20AND%20fixVersion%20%3D%2012353011%20ORDER%20BY%20priority%20DESC%2C%20key%20ASC
> > > > [2] https://github.com/apache/flink-connector-aws/actions
> > > >
> > >
> >
>
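As a side note for readers following the versioning discussion above: the SemVer rule Danny cites (new features bump the minor version, not the patch) can be sketched mechanically. This is an illustrative sketch only, not part of any release tooling.

```shell
# Hedged sketch of the SemVer rule referenced above: a release that adds
# features bumps the minor version and resets the patch (4.1.0 -> 4.2.0),
# rather than producing a patch release (4.1.1).
current=4.1.0
major=${current%%.*}
rest=${current#*.}
minor=${rest%%.*}
next="$major.$((minor + 1)).0"
echo "$next"
```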


Re: [VOTE] Apache Flink Kafka connector version 3.0.1, RC1

2023-10-30 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Hashes and checksums
- Build succeeds against 1.18.0: mvn clean install -Dflink.version=1.18.0
- Verified that memory leak issue is fixed for idle topics. Tested against
Flink 1.18.0 cluster.

Thanks,
Gordon
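For readers less familiar with the "hashes and checksums" step in checklists like the one above, it can be sketched as follows. The file names here are local stand-ins created for illustration, not the actual release artifacts.

```shell
# Hedged sketch of checksum verification: create a stand-in artifact,
# record its SHA-512, then verify it the way one would verify a downloaded
# source release against its published .sha512 file.
printf 'release payload' > /tmp/flink-connector-src.tgz
sha512sum /tmp/flink-connector-src.tgz > /tmp/flink-connector-src.tgz.sha512
sha512sum -c /tmp/flink-connector-src.tgz.sha512 && echo "checksum OK"
# Signature verification would similarly use a key from the published KEYS
# file, e.g.:
#   gpg --verify <source tarball>.asc <source tarball>
```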


On Mon, Oct 30, 2023 at 8:20 AM Samrat Deb  wrote:

> +1 (non-binding)
>
> - Verified signatures
> - Verified Checksum
> - Built with Java 8/11 - build success
> - Started MSK cluster and EMR cluster with flink, successfully ran some
> examples to read and write data to MSK.
> - Checked release tag exists
>
>
> Bests,
> Samrat
>
> On Mon, Oct 30, 2023 at 3:47 PM Ahmed Hamdy  wrote:
>
> > +1 (non-binding)
> > - Verified signatures
> > - Verified checksum
> > - Built source successfully
> > - Checked release tag exists
> > - Reviewed the web PR
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Sun, 29 Oct 2023 at 08:02, Leonard Xu  wrote:
> >
> > > +1 (binding)
> > >
> > > - Verified signatures
> > > - Verified hashsums
> > > - Checked Github release tag
> > > - Built from source code succeeded
> > > - Checked release notes
> > > - Reviewed the web PR
> > >
> > > Best,
> > > Leonard
> > >
> > >
> > > > On Oct 29, 2023, at 11:34 AM, mystic lama wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - verified signatures
> > > > - built with Java 8 and Java 11 - build success
> > > >
> > > > Minor observation
> > > > - RAT check flagged that README.md is missing ASL
> > > >
> > > > On Fri, 27 Oct 2023 at 23:40, Xianxun Ye 
> > > wrote:
> > > >
> > > >> +1(non-binding)
> > > >>
> > > >> - Started a local Flink 1.18 cluster, read and wrote with Kafka and
> > > Upsert
> > > >> Kafka connector successfully to Kafka 2.2 cluster
> > > >>
> > > >> One minor question: should we update the dependency manual of these
> > two
> > > >> documentations[1][2]?
> > > >>
> > > >> [1]
> > > >>
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/#dependencies
> > > >> [2]
> > > >>
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/upsert-kafka/#dependencies
> > > >>
> > > >> Best regards,
> > > >> Xianxun
> > > >>
> > > >>> On Oct 26, 2023, at 16:12, Martijn Visser wrote:
> > > >>>
> > > >>> +1 (binding)
> > > >>>
> > > >>> - Validated hashes
> > > >>> - Verified signature
> > > >>> - Verified that no binaries exist in the source archive
> > > >>> - Built the source with Maven via mvn clean install
> > > >>> -Pcheck-convergence -Dflink.version=1.18.0
> > > >>> - Verified licenses
> > > >>> - Verified web PR
> > > >>> - Started a cluster and the Flink SQL client, successfully read and
> > > >>> wrote with the Kafka connector to Confluent Cloud with AVRO and
> > Schema
> > > >>> Registry enabled
> > > >>>
> > > >>> On Thu, Oct 26, 2023 at 5:09 AM Qingsheng Ren 
> > > wrote:
> > > >>>>
> > > >>>> +1 (binding)
> > > >>>>
> > > >>>> - Verified signature and checksum
> > > >>>> - Verified that no binary exists in the source archive
> > > >>>> - Built from source with Java 8 using -Dflink.version=1.18
> > > >>>> - Started a local Flink 1.18 cluster, submitted jobs with SQL
> client
> > > >>>> reading from and writing (with exactly-once) to Kafka 3.2.3
> cluster
> > > >>>> - Nothing suspicious in LICENSE and NOTICE file
> > > >>>> - Reviewed web PR
> > > >>>>
> > > >>>> Thanks for the effort, Gordon!
> > > >>>>
> > > >>>> Best,
> > > >>>> Qingsheng
> > > >>>>
> > > >>>> On Thu, Oct 26, 2023 at 5:13 AM Tzu-Li (Gordon) Tai <
> > > >> tzuli...@apache.org>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi everyone,
> > > >>>>>
> > > 

[VOTE] Apache Flink Kafka connector version 3.0.1, RC1

2023-10-25 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on release candidate #1 for version 3.0.1 of the
Apache Flink Kafka Connector, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This release contains important changes for the following:
- Supports Flink 1.18.x series
- [FLINK-28303] EOS violation when using LATEST_OFFSETS startup mode
- [FLINK-33231] Memory leak causing OOM when there are no offsets to commit
back to Kafka
- [FLINK-28758] FlinkKafkaConsumer fails to stop with savepoint

The release candidate contains the source release as well as JAR artifacts
to be released to Maven, built against Flink 1.17.1 and 1.18.0.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint
1C1E2394D3194E1944613488F320986D35C33D6A [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.0.1-rc1 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352910
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.1-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1664
[5] https://github.com/apache/flink-connector-kafka/commits/v3.0.1-rc1
[6] https://github.com/apache/flink-web/pull/692


Re: [DISCUSS] Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2023-10-24 Thread Tzu-Li (Gordon) Tai
Also one more meta comment regarding interplay of interface changes across
FLIP-371 and FLIP-372:

With FLIP-371, we're already thinking about some signature changes on the
TwoPhaseCommittingSink interface, such as 1)  introducing
`createCommitter(CommitterInitContext)`, and 2) renaming the writer side
`InitContext` to `WriterInitContext` to align on naming conventions with
the new `CommitterInitContext`.

Instead of trying to adapt the existing TwoPhaseCommittingSink interface,
since we're likely going to have to introduce a completely new interface with
FLIP-372 anyway + deprecate the existing TwoPhaseCommittingSink, what do you
think about only introducing the above signature changes in the new
interface and just leaving the old one as is? Having access to the new
features (transforming committables / runtime context for committer
initialization) would be motivation for implementations to migrate as soon
as possible.

On Tue, Oct 24, 2023 at 5:00 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi Peter,
>
> Thanks a lot for starting this FLIP!
>
> I agree that the current TwoPhaseCommittingSink interface is limiting in
> that it assumes 1) committers have the same parallelism as writers, and 2)
> writers immediately produce finalized committables. This FLIP captures the
> problem pretty well, and I do think there are use cases for a more general
> flexible interface outside of the Iceberg connector as well.
>
> In terms of the abstraction layering, I was wondering if you've also
> considered this approach which I've quickly sketched in my local fork:
> https://github.com/tzulitai/flink/commit/e84e3ac57ce023c35037a8470fefdfcad877bcae
>
> With this approach, the sink translator always expects that 2PC sink
> implementations extend `TwoPhaseCommittingSinkBase` and therefore
> assumes that a pre-commit topology always exists. For simple 2PC sinks that
> do not require transforming committables, we would ship (for convenience)
> an additional `SimpleTwoPhaseCommittingSinkBase` where the pre-commit
> topology is a no-op passthrough. With that we avoid some of the
> "boilerplate" where 2PC sinks with a pre-commit topology require
> implementing two interfaces, as proposed in the FLIP.
>
> Quick thought: regarding the awkwardness you mention in the end with sinks
> that have post commit topologies, but no pre-commit topologies -
> Alternative to the mixin approach in the FLIP, it might make sense to
> consider a builder approach for constructing 2PC sinks, which should also
> give users type-safety at compile time while not having the awkwardness
> with all the types involved. Something along the lines of:
>
> ```
> new TwoPhaseCommittingSinkBuilder(writerSupplier, committerSupplier)
> .withPreCommitTopology(writerResultsStream -> ...)   //
> Function<DataStream<WriterResultT>, DataStream<CommittableT>>
> .withPostCommitTopology(committablesStream -> ...) //
> Consumer<DataStream<CommittableT>>
> .withPreWriteTopology(inputStream -> ...)  //
> Function<DataStream<InputT>, DataStream<InputT>>
> .build();
> ```
>
> We could probably do some validation in the build() method, e.g. if writer
> / committer have different types, then clearly a pre-commit topology should
> have been defined to transform intermediate writer results.
>
> Obviously, this would take generalization of the TwoPhaseCommittingSink
> interface to the extreme, where we just have one interface with all of the
> pre-commit / pre-write / post-commit methods built-in, and users would use
> the builder as the entrypoint to opt-in / opt-out as needed. The upside is
> that the SinkTransformationTranslator logic will become much less cluttered.
>
> I'll need to experiment with the builder approach a bit more to see if it makes
> sense at all, but wanted to throw out the idea earlier to see what you
> think.
>
> On Mon, Oct 9, 2023 at 6:59 AM Péter Váry 
> wrote:
>
>> Hi Team,
>>
>> Did some experimenting and found the originally proposed solution to be a
>> bit awkward for cases where WithPostCommitTopology was needed but we do
>> not
>> need the WithPreCommitTopology transformation.
>> The flexibility of the new API would be better if we used a mixin-like
>> approach. The interfaces would only be used to define the specific
>> required methods, and they would not need to extend the original
>> TwoPhaseCommittingSink interface.
>>
>> Since the WithPreCommitTopology and WithPostCommitTopology interfaces are
>> still Experimental, after talking to Gyula offline, I have
>> updated the FLIP to use this new approach.
>>
>> Any comments, thoughts are welcome.
>>
>> Thanks,
>> Peter
>>
>> Péter Váry  ezt írta (időpont: 2023. okt.
>> 5.,
>> Cs, 16:04):

Re: [DISCUSS] Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2023-10-24 Thread Tzu-Li (Gordon) Tai
Hi Peter,

Thanks a lot for starting this FLIP!

I agree that the current TwoPhaseCommittingSink interface is limiting in
that it assumes 1) committers have the same parallelism as writers, and 2)
writers immediately produce finalized committables. This FLIP captures the
problem pretty well, and I do think there are use cases for a more general
flexible interface outside of the Iceberg connector as well.

In terms of the abstraction layering, I was wondering if you've also
considered this approach which I've quickly sketched in my local fork:
https://github.com/tzulitai/flink/commit/e84e3ac57ce023c35037a8470fefdfcad877bcae

With this approach, the sink translator always expects that 2PC sink
implementations extend `TwoPhaseCommittingSinkBase` and therefore
assumes that a pre-commit topology always exists. For simple 2PC sinks that
do not require transforming committables, we would ship (for convenience)
an additional `SimpleTwoPhaseCommittingSinkBase` where the pre-commit
topology is a no-op passthrough. With that we avoid some of the
"boilerplate" where 2PC sinks with a pre-commit topology require
implementing two interfaces, as proposed in the FLIP.
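As an illustration of the layering described here, below is a minimal, framework-free sketch. The class names mirror the ones in the linked commit, but `java.util.List` stands in for Flink's `DataStream` and none of this is the actual SinkV2 API:

```java
import java.util.List;

// Framework-free sketch only: List stands in for Flink's DataStream, and
// these classes are illustrations, not the real SinkV2 interfaces.
abstract class TwoPhaseCommittingSinkBase<InputT, WriterResultT, CommittableT> {
    // Every 2PC sink defines how intermediate writer results are turned
    // into the committables handed to the committer.
    abstract List<CommittableT> addPreCommitTopology(List<WriterResultT> writerResults);
}

// Convenience base for simple 2PC sinks whose writers already emit final
// committables: the pre-commit "topology" is a no-op passthrough.
abstract class SimpleTwoPhaseCommittingSinkBase<InputT, CommittableT>
        extends TwoPhaseCommittingSinkBase<InputT, CommittableT, CommittableT> {
    @Override
    List<CommittableT> addPreCommitTopology(List<CommittableT> writerResults) {
        return writerResults; // identity: nothing to transform
    }
}
```

With this shape the translator can unconditionally invoke the pre-commit step, and simple sinks pay no extra boilerplate.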

Quick thought: regarding the awkwardness you mention in the end with sinks
that have post commit topologies, but no pre-commit topologies -
Alternative to the mixin approach in the FLIP, it might make sense to
consider a builder approach for constructing 2PC sinks, which should also
give users type-safety at compile time while not having the awkwardness
with all the types involved. Something along the lines of:

```
new TwoPhaseCommittingSinkBuilder(writerSupplier, committerSupplier)
.withPreCommitTopology(writerResultsStream -> ...)   //
Function<DataStream<WriterResultT>, DataStream<CommittableT>>
.withPostCommitTopology(committablesStream -> ...) //
Consumer<DataStream<CommittableT>>
.withPreWriteTopology(inputStream -> ...)  //
Function<DataStream<InputT>, DataStream<InputT>>
.build();
```

We could probably do some validation in the build() method, e.g. if writer
/ committer have different types, then clearly a pre-commit topology should
have been defined to transform intermediate writer results.
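A rough, framework-free sketch of that build() validation idea follows. Here `Class` tokens stand in for the type information Flink would actually have, and the builder is a hypothetical illustration, not the proposed API:

```java
import java.util.function.Function;

// Hypothetical illustration of build-time validation for a 2PC sink builder.
// Class tokens stand in for real type information, and build() returns a
// boolean stand-in instead of an actual sink.
class TwoPhaseCommittingSinkBuilder<W, C> {
    private final Class<W> writerResultType;
    private final Class<C> committableType;
    private Function<W, C> preCommitTopology; // null = not configured

    TwoPhaseCommittingSinkBuilder(Class<W> writerResultType, Class<C> committableType) {
        this.writerResultType = writerResultType;
        this.committableType = committableType;
    }

    TwoPhaseCommittingSinkBuilder<W, C> withPreCommitTopology(Function<W, C> topology) {
        this.preCommitTopology = topology;
        return this;
    }

    boolean build() {
        // If writer results and committables have different types, a pre-commit
        // topology must have been defined to transform between them.
        if (preCommitTopology == null && !writerResultType.equals(committableType)) {
            throw new IllegalStateException(
                    "writer result type differs from committable type; "
                            + "a pre-commit topology is required");
        }
        return true;
    }
}
```

The compile-time type parameters plus this runtime check give the fail-fast behavior described above without the awkward generics on the sink interface itself.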

Obviously, this would take generalization of the TwoPhaseCommittingSink
interface to the extreme, where we just have one interface with all of the
pre-commit / pre-write / post-commit methods built-in, and users would use
the builder as the entrypoint to opt-in / opt-out as needed. The upside is
that the SinkTransformationTranslator logic will become much less cluttered.

I'll need to experiment with the builder approach a bit more to see if it makes
sense at all, but wanted to throw out the idea earlier to see what you
think.

On Mon, Oct 9, 2023 at 6:59 AM Péter Váry 
wrote:

> Hi Team,
>
> Did some experimenting and found the originally proposed solution to be a
> bit awkward for cases where WithPostCommitTopology was needed but we do not
> need the WithPreCommitTopology transformation.
> The flexibility of the new API would be better if we used a mixin-like
> approach. The interfaces would only be used to define the specific required
> methods, and they would not need to extend the original
> TwoPhaseCommittingSink interface.
>
> Since the WithPreCommitTopology and WithPostCommitTopology interfaces are
> still Experimental, after talking to Gyula offline, I have
> updated the FLIP to use this new approach.
>
> Any comments, thoughts are welcome.
>
> Thanks,
> Peter
>
> Péter Váry  ezt írta (időpont: 2023. okt. 5.,
> Cs, 16:04):
>
> > Hi Team,
> >
> > In my previous email[1] I have described our challenges migrating the
> > existing Iceberg SinkFunction based implementation, to the new SinkV2
> based
> > implementation.
> >
> > As a result of the discussion around that topic, I have created the
> > FLIP-371 [2] to address the Committer related changes, and now I have
> > created a companion FLIP-372 [3] to address the WithPreCommitTopology
> > related issues.
> >
> > FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the
> > type of the Committable
> >
> > The main goal of the FLIP-372 is to extend the currently existing
> > TwoPhaseCommittingSink API by adding the possibility to have a
> > PreCommitTopology where the input and output types of the pre-commit
> > transformation are different.
> >
> > Here is the FLIP: FLIP-372: Allow TwoPhaseCommittingSink
> > WithPreCommitTopology to alter the type of the Committable
> > <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
> >
> >
> > Please use this thread to discuss the FLIP related questions, proposals,
> > and feedback.
> >
> > Thanks,
> > Peter
> >
> > - [1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
> > - [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+Provide+initialization+context+for+Committer+creation+in+TwoPhaseCommittingSink
> > - [3]
> >
> 

Re: [ANNOUNCE] Release 1.18.0, release candidate #0

2023-10-23 Thread Tzu-Li (Gordon) Tai
Hi David,

Just to follow-up on that last question: I can confirm that there are no
regressions for the Flink Kafka connector working with Flink 1.18. The
previous nightly build failures were caused by breaking changes in test
code, which has been resolved by now.

I'll be creating new releases for flink-connector-kafka 3.0.1-1.18 as soon as
the 1.18.0 artifacts are released.

Thanks,
Gordon

On Fri, Oct 6, 2023 at 1:40 AM David Radley  wrote:

> Hi Martijn,
> Thanks for your comments. I also think it is better to decouple the
> connectors – I agree they need to have their own release cycles. I was
> worried that moving to Flink 1.18 is somehow causing the Kafka connector
> to fail – i.e. a regression. I think you are saying that there is no
> regression like this,
>   Kind regards, David.
> From: Martijn Visser 
> Date: Thursday, 5 October 2023 at 21:39
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [ANNOUNCE] Release 1.18.0, release candidate #0
> Hi David,
>
> It’s a deliberate choice to decouple the connectors. We shouldn’t block
> Flink 1.18 on connector statuses. There’s already work being done to fix
> the Flink Kafka connector. Any Flink connector comes after the new minor
> version, similar to how it has been for all other connectors with Flink
> 1.17.
>
> Best regards,
>
> Martijn Visser
>
> Op do 5 okt 2023 om 11:33 schreef David Radley 
>
> > Hi Jing,
> > Yes I agree that if we can get them resolved then that would be ideal.
> >
> > I guess the worry is that at 1.17, we had a released Flink core and Kafka
> > connector.
> > At 1.18 we will have a released Core Flink but no new Kafka connector. So
> > the last released Kafka connector would now be
> >
> https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka/3.0.0-1.17
> > which should be the same as the Kafka connector in 1.17. I guess this is
> > the combination that people would pick up to deploy in production – and I
> > assume this has been tested.
> >
> > These issues with the nightly builds refer to the Kafka connector main
> > branch. If they are not regressions, you are suggesting that pragmatically
> > we go forward with the release; I think that makes sense to do, but do
> > these issues affect 3.0.0-1.17?
> >
> > I suspect we should release a new Kafka connector asap, so we have a
> > matching connector built outside of the Flink repo. We may want to not
> > include the Flink core version in the connector – or we might end up
> > wanting to release a Kafka connector when there are no changes just to
> have
> > a match with the Flink core version.
> >
> > Kind regards, David.
> >
> >
> >
> > From: Jing Ge 
> > Date: Wednesday, 4 October 2023 at 17:36
> > To: dev@flink.apache.org 
> > Subject: [EXTERNAL] Re: [ANNOUNCE] Release 1.18.0, release candidate #0
> > Hi David,
> >
> > First of all, we should have enough time to wait for those issues to
> > be resolved. Secondly, it makes less sense to block upstream release by
> > downstream build issues. In case, those issues might need more time, we
> > should move forward with the Flink release without waiting for them.
> WDYT?
> >
> > Best regards,
> > Jing
> >
> > On Wed, Oct 4, 2023 at 6:15 PM David Radley 
> > wrote:
> >
> > > Hi ,
> > > As release 1.18 removes  the kafka connector from the core Flink
> > > repository, I assume we will wait until the kafka connector nightly
> build
> > > issues https://issues.apache.org/jira/browse/FLINK-33104   and
> > > https://issues.apache.org/jira/browse/FLINK-33017   are resolved
> before
> > > releasing 1.18?
> > >
> > >  Kind regards, David.
> > >
> > >
> > > From: Jing Ge 
> > > Date: Wednesday, 27 September 2023 at 15:11
> > > To: dev@flink.apache.org 
> > > Subject: [EXTERNAL] Re: [ANNOUNCE] Release 1.18.0, release candidate #0
> > > Hi Folks,
> > >
> > > @Ryan FYI: CI passed and the PR has been merged. Thanks!
> > >
> > > If there are no more other concerns, I will start publishing 1.18-rc1.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Mon, Sep 25, 2023 at 1:40 PM Jing Ge  wrote:
> > >
> > > > Hi Ryan,
> > > >
> > > > Thanks for reaching out. It is fine to include it but we need to wait
> > > > until the CI passes. I am not sure how long it will take, since there
> > > seems
> > > > to be some infra issues.
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Mon, Sep 25, 2023 at 11:34 AM Ryan Skraba
> > > 
> > > > wrote:
> > > >
> > > >> Hello!  There's a security fix that probably should be applied to
> 1.18
> > > >> in the next RC1 : https://github.com/apache/flink/pull/23461
> (bump
> > to
> > > >> snappy-java).  Do you think this would be possible to include?
> > > >>
> > > >> All my best, Ryan
> > > >>
> > > >> [1]: https://issues.apache.org/jira/browse/FLINK-33149 "Bump
> > > >> snappy-java to 1.1.10.4"
> > > >>
> > > >>
> > > >>
> > > >> On Mon, Sep 25, 2023 at 3:54 PM Jing Ge  >
> > > >> wrote:
> > > >> >
> > > >> > Thanks Zakelly for the update! Appreciate it!
> > > >> >
> > > >> > 

Re: [VOTE] FLIP-371: Provide initialization context for Committer creation in TwoPhaseCommittingSink

2023-10-16 Thread Tzu-Li (Gordon) Tai
+1 (binding)

On Mon, Oct 16, 2023 at 4:57 AM Martijn Visser 
wrote:

> +1 (binding)
>
> On Mon, Oct 16, 2023 at 1:14 PM Leonard Xu  wrote:
> >
> > No more comments,  give my +1(binding)
> >
> > Best,
> > Leonard
> >
> > > 2023年10月16日 下午6:20,Péter Váry  写道:
> > >
> > > Any more comments?
> > >
> > > Leonard Xu  ezt írta (időpont: 2023. okt. 16., H,
> 8:22):
> > >
> > >> Thanks Peter for driving this work.
> > >>
> > >>
> > >> +1(binding)
> > >>
> > >> Best,
> > >> Leonard
> > >>
> > >>> 2023年10月12日 下午10:58,Márton Balassi  写道:
> > >>>
> > >>> +1 (binding)
> > >>>
> > >>> Marton
> > >>>
> > >>> On Wed, Oct 11, 2023 at 8:20 PM Gyula Fóra 
> wrote:
> > >>>
> >  Thanks , Peter.
> > 
> >  +1
> > 
> >  Gyula
> > 
> >  On Wed, 11 Oct 2023 at 14:47, Péter Váry <
> peter.vary.apa...@gmail.com>
> >  wrote:
> > 
> > > Hi all,
> > >
> > > Thank you to everyone for the feedback on FLIP-371[1].
> > > Based on the discussion thread [2], I think we are ready to take a
> vote
> >  to
> > > contribute this to Flink.
> > > I'd like to start a vote for it.
> > > The vote will be open for at least 72 hours (excluding weekends,
> unless
> > > there is an objection or an insufficient number of votes).
> > >
> > > Thanks,
> > > Peter
> > >
> > >
> > > [1]
> > >
> > >
> > 
> > >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+Provide+initialization+context+for+Committer+creation+in+TwoPhaseCommittingSink
> > > [2]
> https://lists.apache.org/thread/v3mrspdlrqrzvbwm0lcgr0j4v03dx97c
> > >
> > 
> > >>
> > >>
> >
>


Re: [DISCUSS] FLIP-371: SinkV2 Committer imporvements

2023-10-05 Thread Tzu-Li (Gordon) Tai
Thanks Peter for starting the FLIP.

Overall, this seems pretty straightforward and overdue, +1.

Two quick questions / comments:

   1. Can you rename the FLIP to something less generic? Perhaps "Provide
   initialization context for Committer creation in TwoPhaseCommittingSink"?
   2. Can you describe why the job ID needs to be exposed? Although
   it's out of scope for this FLIP, I'm wondering if there's a need to do the
   same for the sink writer InitContext.

Thanks,
Gordon

On Wed, Oct 4, 2023 at 11:20 AM Martijn Visser 
wrote:

> Hi all,
>
> Peter, Marton, Gordon and I had an offline sync on SinkV2 and I'm
> happy with this first FLIP on the topic. +1
>
> Best regards,
>
> Martijn
>
> On Wed, Oct 4, 2023 at 5:48 PM Márton Balassi 
> wrote:
> >
> > Thanks, Peter. I agree that this is needed for Iceberg and beneficial for
> > other connectors too.
> >
> > +1
> >
> > On Wed, Oct 4, 2023 at 3:56 PM Péter Váry 
> > wrote:
> >
> > > Hi Team,
> > >
> > > In my previous email[1] I have described our challenges migrating the
> > > existing Iceberg SinkFunction based implementation, to the new SinkV2
> based
> > > implementation.
> > >
> > > As a result of the discussion around that topic, I have created the
> first
> > > [2] of the FLIP-s addressing the missing features there.
> > >
> > > The main goal of the FLIP-371 is to extend the currently existing
> Committer
> > > API by providing an initial context on Committer creation. This context
> > > will contain - among other, less interesting data -
> > > the SinkCommitterMetricGroup which could be used to store the generic
> > > commit related metrics, and also provide a way for the Committer to
> publish
> > > its own metrics.
> > >
> > > The feature has already been requested through FLINK-25857 [3].
> > >
> > > Please use this thread to discuss the FLIP related questions,
> proposals,
> > > feedback.
> > >
> > > Thanks,
> > > Peter
> > >
> > > - [1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
> > > - [2]
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+SinkV2+Committer+imporvements
> > > - [3] https://issues.apache.org/jira/browse/FLINK-25857
> > >
>


Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

2023-09-29 Thread Tzu-Li (Gordon) Tai
Hi everyone,

It’s been a while since this topic was last discussed, but nevertheless, it
still remains very desirable to figure out a clear path towards making
SinkV2 @Public.

There’s a new thread [1] that has a pretty good description on missing
features in SinkV2 from the Iceberg connector’s perspective, which prevents
them from migrating - anything related to those new requirements, let's
discuss there.

Nevertheless, I think we should also revive and reuse this thread to reach
consensus / closure on all concerns already brought up here.

It’s quite a long thread where a lot of various concerns were brought up,
but I’d start by addressing two very specific ones: FLIP-287 [2] and
FLINK-30238 [3]

First of all, FLIP-287 has been approved and merged already, and will be
released with 1.18.0. So, connector migrations that were waiting on this
should hopefully be unblocked after the release. So this seems to no longer
be a concern - let’s see things through with those connectors actually
being migrated.

FLINK-30238 is sort of a confusing one, and I do believe it is (partially)
a false alarm. After looking into this, the problem reported there
essentially breaks down to two things:
1) TwoPhaseCommittingSink is unable to take a new savepoint after restoring
from a savepoint generated via `stop-with-savepoint --drain`
2) SinkV2 sinks with a PostCommitTopology do not properly have post-commits
completed after a stop-with-savepoint operation, since committed
committables are not emitted to the post-commit topology after the committer
receives the end-of-input signal.

My latest comment in [3] explains this in a bit more detail.

I believe we can conclude that problem 1) is a non-concern - users should
not restore from a job that is drained on stop-with-savepoint and cannot
expect the restored job to function normally.
Problem 2) remains a real issue though, and to help clear things up I think
we should close FLINK-30238 in favor of a new ticket scoped to the specific
PostCommitTopology problem.

The other open concerns seem to mostly be around graduation criteria and
process - I've yet to go through those and will follow up with a separate
reply (or perhaps Martijn can help wrap up that part?).

Thanks,
Gordon

[1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
[2]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
[3] https://issues.apache.org/jira/browse/FLINK-30238

On Mon, Feb 13, 2023 at 2:50 AM Jing Ge  wrote:

> @Martijn
>
> What I shared previously is the fact of the current KafkaSink. Following
> your suggestion, the KafkaSink should still be marked as @Experimental for
> now, which will need an even longer time to graduate. BTW, KafkaSink does not
> depend on any @Internal interfaces. The @Internal is used for methods
> coming from @PublicEvolving SinkV2 interfaces, not the interfaces themselves.
> Thanks for bringing this topic up. Currently, there is no rule defined to
> say that no @Internal is allowed for methods implemented
> from @PublicEvolving interfaces. Further (off-this-topic) discussion might
> be required to check if it really makes sense to define such a rule, since
> some methods defined in interfaces might only be used internally, i.e. no
> connector user would be aware of them.
>
> @Dong
>
> I agree with everything you said and especially can't agree more to let
> developers who will own it make the decision.
>
> Best regards,
> Jing
>
>
> On Sun, Feb 12, 2023 at 2:53 AM Dong Lin  wrote:
>
> > Hi Martijn,
> >
> > Thanks for the reply. Please see my comments inline.
> >
> > Regards,
> > Dong
> >
> > On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser  >
> > wrote:
> >
> > > Hi all,
> > >
> > > I wanted to get back on a couple of comments and have a proposal at the
> > end
> > > for the next steps:
> > >
> > > @Steven Wu
> > > If I look back at the original discussion of FLIP-191 and also the
> thread
> > > that you're referring to, it appears from the discussion with Yun Gao
> > that
> > > the solution was in near-sight, but just not finished. Perhaps it needs
> > to
> > > be restarted once more so it can be brought to a closure. Also when I
> > > looked back at the original introduction of SinkV2, there was
> FLINK-25726
> > > [1] which looks like it was built specifically for Iceberg and Delta
> > Sink?
> > >
> > > @Jing
> > > > All potential unstable methods coming from SinkV2 interfaces will be
> > kept
> > > marked as @internal.
> > >
> > > This is a situation that I think should be avoided: if it's unstable,
> it
> > > should be marked as @Experimental. Connectors should not need to rely
> > > on @Internal interfaces; if they are needed by a connector, a
> maintainer
> > > should create a proposal to expose.
> > >
> > > On the production readiness of SinkV2:
> > >
> > > @Dong
> > > > And Yuxia mentioned earlier in this thread that there are bugs such
> as
> > > FLINK-30238  and
> > > FLINK-29459 

Re: [DISCUSS] FLIP-319: Integrating with Kafka’s proper support for 2PC participation (KIP-939).

2023-08-31 Thread Tzu-Li (Gordon) Tai
ouldn't be
> any
>   dangling txns.
>3. Shouldn't we call completeTransaction on restore instead of
>commitTransaction? In what situations would the flink Kafka connector
> abort
>the transaction?
>4. Do we need to keep the current KafkaInternalProducer for a while to
>remain compatible with older Kafka versions that do not support KIP-939?
>5. How will the connector handle
>transaction.two.phase.commit.enable=false on the broker (not client)
> level?
>6. Does it make sense for the connector users to override
>transaction.two.phase.commit.enable? If it does not make sense, would
> the
>connector ignore the config or throw an exception when it is passed?
>
>
> Best regards,
> Alex
>
> On Wed, Aug 23, 2023 at 6:09 AM Gyula Fóra  wrote:
>
> > Hi Gordon!
> >
> > Thank you for preparing the detailed FLIP, I think this is a huge
> > improvement that enables the exactly-once Kafka sink in many
> environments /
> > use-cases where this was previously unfeasible due to the limitations
> > highlighted in the FLIP.
> >
> > Big +1
> >
> > Cheers,
> > Gyula
> >
> > On Fri, Aug 18, 2023 at 7:54 PM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > Hi Flink devs,
> > >
> > > I’d like to officially start a discussion for FLIP-319: Integrating
> with
> > > Kafka’s proper support for 2PC participation (KIP-939) [1].
> > >
> > > This is the “sister” joint FLIP for KIP-939 [2] [3]. It has been a
> > > long-standing issue that Flink’s Kafka connector doesn’t work fully
> > > correctly under exactly-once mode due to lack of distributed
> transaction
> > > support in the Kafka transaction protocol. This has led to subpar hacks
> > in
> > > the connector such as Java reflections to workaround the protocol's
> > > limitations (which causes a bunch of problems on its own, e.g. long
> > > recovery times for the connector), while still having corner case
> > scenarios
> > > that can lead to data loss.
> > >
> > > This joint effort with the Kafka community attempts to address this so
> > that
> > > the Flink Kafka connector can finally work against public Kafka APIs,
> > which
> > > should result in a much more robust integration between the two
> systems,
> > > and for Flink developers, easier maintainability of the code.
> > >
> > > Obviously, actually implementing this FLIP relies on the joint KIP
> being
> > > implemented and released first. Nevertheless, I'd like to start the
> > > discussion for the design as early as possible so we can benefit from
> the
> > > new Kafka changes as soon as it is available.
> > >
> > > Looking forward to feedback and comments on the proposal!
> > >
> > > Thanks,
> > > Gordon
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071710
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC
> > > [3] https://lists.apache.org/thread/wbs9sqs3z1tdm7ptw5j4o9osmx9s41nf
> > >
> >
>


Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

The original intent of having a separate repo for the playground was
that StateFun users can just go to that and start running simple examples
without any other distractions from the core code. I personally don't have
a strong preference here and can understand how it would make the workflow
more streamlined, but just FYI on the reasoning why they are separate in the
first place.

re: paths for locating StateFun artifacts.
Can this be solved by simply passing in the path to the artifacts? As well
as the image tag for the locally build base StateFun image. They could
probably be environment variables.

Cheers,
Gordon

On Fri, Aug 18, 2023 at 12:13 PM Galen Warren via user <
u...@flink.apache.org> wrote:

> Yes, exactly! And in addition to the base Statefun jars and the jar for
> the Java SDK, it does an equivalent copy/register operation for each of the
> other SDK libraries (Go, Python, Javascript) so that those libraries are
> also available when building the playground examples.
>
> One more question: In order to copy the various build artifacts into the
> Docker containers, those artifacts need to be part of the Docker context.
> With the playground being a separate project, that's slightly tricky to do,
> as there is no guarantee (other than convention) about the relative paths
> of *flink-statefun* and* flink-statefun-playground *in someone's local
> filesystem. The way I've set this up locally is to copy the playground into
> the* flink-statefun* project -- i.e. to *flink-statefun*/playground --
> such that I can set the Docker context to the root folder of
> *flink-statefun* and then have access to any local code and/or build
> artifacts.
>
> I'm wondering if there might be any appetite for making that move
> permanent, i.e. moving the playground to *flink-statefun*/playground and
> deprecating the standalone playground project. In addition to making the
> problem of building with unreleased artifacts a bit simpler to solve, it
> would also simplify the process of releasing a new Statefun version, since
> the entire process could be handled with a single PR and associated
> build/deploy tasks. In other words, a single PR could both update and
> deploy the Statefun package and the playground code and images.
>
> As it stands, at least two PRs would be required for each Statefun version
> update -- one for *flink-statefun* and one for *flink-statefun-playground*
> .
>
> Anyway, just an idea. Maybe there's an important reason for these projects
> to remain separate. If we do want to keep the playground project where it
> is, I could solve the copying problem by requiring the two projects to be
> siblings in the file system and by pre-copying the local build artifacts
> into a location accessible by the existing Docker contexts. This would
> still leave us with the need to have two PRs and releases instead of one,
> though.
>
> Thanks for your help!
>
>
> On Fri, Aug 18, 2023 at 2:45 PM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi Galen,
>>
>> > locally built code is copied into the build containers
>> so that it can be accessed during the build.
>>
>> That's exactly what we had been doing for release testing, yes. Sorry I
>> missed that detail in my previous response.
>>
>> And yes, that sounds like a reasonable approach. If I understand you
>> correctly, the workflow would become this:
>>
>>1. Build the StateFun repo locally to install the snapshot artifact
>>jars + have a local base StateFun image.
>>2. Run the playground in "local" mode, so that it uses the local base
>>StateFun image + builds the playground code using copied artifact jars
>>(instead of pulling from Maven).
>>
>> That looks good to me!
>>
>> Thanks,
>> Gordon
>>
>> On Fri, Aug 18, 2023 at 11:33 AM Galen Warren
>>  wrote:
>>
>> > Thanks.
>> >
>> > If you were to build a local image, as you suggest, how do you access
>> that
>> > image when building the playground images? All the compilation of
>> > playground code happens inside containers, so local images on the host
>> > aren't available in those containers. Unless I'm missing something?
>> >
>> > I've slightly reworked things such that the playground images can be
>> run in
>> > one of two modes -- the default mode, which works like before, and a
>> > "local" mode where locally built code is copied into the build
>> containers
>> > so that it can be accessed during the build. It works fine, you just
>> have
>> > to define a couple of environment variables when running docker-compose
>> to
>> > specify default

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

> locally built code is copied into the build containers
so that it can be accessed during the build.

That's exactly what we had been doing for release testing, yes. Sorry I
missed that detail in my previous response.

And yes, that sounds like a reasonable approach. If I understand you
correctly, the workflow would become this:

   1. Build the StateFun repo locally to install the snapshot artifact
   jars + have a local base StateFun image.
   2. Run the playground in "local" mode, so that it uses the local base
   StateFun image + builds the playground code using copied artifact jars
   (instead of pulling from Maven).
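As a rough sketch, the two steps above could be scripted along these lines. The image tags (`apache/flink-statefun:local`, `3.2.0`) and the `STATEFUN_IMAGE` variable are illustrative assumptions, not the project's actual conventions:

```shell
# Sketch only: tags and variable names below are assumptions for
# illustration, not the project's actual conventions.

# Step 1 (shown commented out): build snapshot jars and a local base image.
#   (cd flink-statefun && mvn clean install -DskipTests)
#   (cd flink-statefun && docker build -t apache/flink-statefun:local .)

# Step 2: a mode switch that resolves which base StateFun image the
# playground's docker-compose should use.
resolve_image() {
  mode="${1:-default}"
  if [ "$mode" = "local" ]; then
    echo "apache/flink-statefun:local"   # locally built snapshot image
  else
    echo "apache/flink-statefun:3.2.0"   # released image from a registry
  fi
}

resolve_image local   # prints apache/flink-statefun:local

# A compose invocation could then consume the result, e.g.:
#   STATEFUN_IMAGE=$(resolve_image local) docker-compose up --build
```

The same function keeps the default mode working unchanged: calling `resolve_image` with no argument falls back to the released image.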

That looks good to me!

Thanks,
Gordon

On Fri, Aug 18, 2023 at 11:33 AM Galen Warren
 wrote:

> Thanks.
>
> If you were to build a local image, as you suggest, how do you access that
> image when building the playground images? All the compilation of
> playground code happens inside containers, so local images on the host
> aren't available in those containers. Unless I'm missing something?
>
> I've slightly reworked things such that the playground images can be run in
> one of two modes -- the default mode, which works like before, and a
> "local" mode where locally built code is copied into the build containers
> so that it can be accessed during the build. It works fine, you just have
> to define a couple of environment variables when running docker-compose to
> specify default vs. local mode and what versions of Flink and Statefun
> should be referenced, and then you can build and run the local examples
> without any additional steps. Does that sound like a reasonable approach?
>
>
> On Fri, Aug 18, 2023 at 2:17 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi Galen,
> >
> > > Gordon, is there a trick to running the sample code in
> > flink-statefun-playground against yet-unreleased code that I'm missing?
> >
> > You'd have to locally build an image from the release branch, with a
> > temporary image version tag. Then, in the flink-statefun-playground,
> change
> > the image versions in the docker-compose files to use that locally built
> > image. IIRC, that's what we have been doing in the past. Admittedly, it's
> > pretty manual - I don't think the CI manages this workflow.
> >
> > Thanks,
> > Gordon
> >
> > On Mon, Aug 14, 2023 at 10:42 AM Galen Warren 
> > wrote:
> >
> > > I created a pull request for this: [FLINK-31619] Upgrade Stateful
> > > Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> > > apache/flink-statefun (github.com)
> > > <https://github.com/apache/flink-statefun/pull/331>.
> > >
> > > JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink 1.16.1
> -
> > > ASF JIRA (apache.org)
> > > <https://issues.apache.org/jira/browse/FLINK-31619?filter=-1>.
> > >
> > > Statefun references 1.16.2, despite the title -- that version has come
> > out
> > > since the issue was created.
> > >
> > > I figured out how to run all the playground tests locally, but it took
> a
> > > bit of reworking of the playground setup with respect to Docker;
> > > specifically, the Docker contexts used to build the example functions
> > > needed to be broadened (i.e. moved up the tree) so that, if needed,
> local
> > > artifacts/code can be accessed from within the containers at build
> time.
> > > Then I made the Docker compose.yml configurable through environment
> > > variables to allow for them to run in either the original manner --
> i.e.
> > > pulling artifacts from public repos -- or in a "local" mode, where
> > > artifacts are pulled from local builds.
> > >
> > > This process is cleaner if the playground is a subfolder of the
> > > flink-statefun project rather than be its own repository
> > > (flink-statefun-playground), because then all the relative paths
> between
> > > the playground files and the build artifacts are fixed. So, I'd like to
> > > propose to move the playground files, modified as described above, to
> > > flink-statefun/playground and retire flink-statefun-playground. I can
> > > submit separate PRs for those changes if everyone is on board.
> > >
> > > Also, should I plan to do the same upgrade to handle Flink 1.17.x? It
> > > should be easy to do, especially while the 1.16.x upgrade is fresh on
> my
> > > mind.
> > >
> > > Thanks.
> > >
> > >
> > > On Fri, Aug 11, 2023 at 6:40 PM Galen Warren 
> > > wrote:
> > >
> > >> I'm done with the c

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

> Gordon, is there a trick to running the sample code in
flink-statefun-playground against yet-unreleased code that I'm missing?

You'd have to locally build an image from the release branch, with a
temporary image version tag. Then, in the flink-statefun-playground, change
the image versions in the docker-compose files to use that locally built
image. IIRC, that's what we have been doing in the past. Admittedly, it's
pretty manual - I don't think the CI manages this workflow.
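For illustration, the manual steps described above could look roughly like the following sketch; the snapshot tag `3.3-SNAPSHOT-local` and the compose fragment are assumptions, not taken from the actual repos:

```shell
# Sketch only: the tag "3.3-SNAPSHOT-local" and the compose fragment below
# are illustrative assumptions.

# Build a local image from the release branch (shown commented out):
#   docker build -t apache/flink-statefun:3.3-SNAPSHOT-local flink-statefun/

# Then rewrite the image version in a playground docker-compose file to
# point at the locally built tag:
compose='services:
  statefun:
    image: apache/flink-statefun:3.2.0'

echo "$compose" | sed 's|apache/flink-statefun:[0-9A-Za-z.-]*|apache/flink-statefun:3.3-SNAPSHOT-local|'
```

In practice `sed -i` over each docker-compose file would make the same substitution in place, which is the manual edit the workflow currently requires.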

Thanks,
Gordon

On Mon, Aug 14, 2023 at 10:42 AM Galen Warren 
wrote:

> I created a pull request for this: [FLINK-31619] Upgrade Stateful
> Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> apache/flink-statefun (github.com)
> <https://github.com/apache/flink-statefun/pull/331>.
>
> JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink 1.16.1 -
> ASF JIRA (apache.org)
> <https://issues.apache.org/jira/browse/FLINK-31619?filter=-1>.
>
> Statefun references 1.16.2, despite the title -- that version has come out
> since the issue was created.
>
> I figured out how to run all the playground tests locally, but it took a
> bit of reworking of the playground setup with respect to Docker;
> specifically, the Docker contexts used to build the example functions
> needed to be broadened (i.e. moved up the tree) so that, if needed, local
> artifacts/code can be accessed from within the containers at build time.
> Then I made the Docker compose.yml configurable through environment
> variables to allow for them to run in either the original manner -- i.e.
> pulling artifacts from public repos -- or in a "local" mode, where
> artifacts are pulled from local builds.
>
> This process is cleaner if the playground is a subfolder of the
> flink-statefun project rather than be its own repository
> (flink-statefun-playground), because then all the relative paths between
> the playground files and the build artifacts are fixed. So, I'd like to
> propose to move the playground files, modified as described above, to
> flink-statefun/playground and retire flink-statefun-playground. I can
> submit separate PRs for those changes if everyone is on board.
>
> Also, should I plan to do the same upgrade to handle Flink 1.17.x? It
> should be easy to do, especially while the 1.16.x upgrade is fresh on my
> mind.
>
> Thanks.
>
>
> On Fri, Aug 11, 2023 at 6:40 PM Galen Warren 
> wrote:
>
>> I'm done with the code to make Statefun compatible with Flink 1.16, and
>> all the tests (including e2e) succeed. The required changes were pretty
>> minimal.
>>
>> I'm running into a bit of a chicken/egg problem executing the tests in
>> flink-statefun-playground
>> <https://github.com/apache/flink-statefun-playground>, though. That
>> project seems to assume that all the various Statefun artifacts are built
>> and deployed to the various public repositories already. I've looked into
>> trying to redirect references to local artifacts; however, that's also
>> tricky since all the building occurs in Docker containers.
>>
>> Gordon, is there a trick to running the sample code in
>> flink-statefun-playground against yet-unreleased code that I'm missing?
>>
>> Thanks.
>>
>> On Sat, Jun 24, 2023 at 12:40 PM Galen Warren 
>> wrote:
>>
>>> Great -- thanks!
>>>
>>> I'm going to be out of town for about a week but I'll take a look at
>>> this when I'm back.
>>>
>>> On Tue, Jun 20, 2023 at 8:46 AM Martijn Visser 
>>> wrote:
>>>
>>>> Hi Galen,
>>>>
>>>> Yes, I'll be more than happy to help with Statefun releases.
>>>>
>>>> Best regards,
>>>>
>>>> Martijn
>>>>
>>>> On Tue, Jun 20, 2023 at 2:21 PM Galen Warren 
>>>> wrote:
>>>>
>>>>> Thanks.
>>>>>
>>>>> Martijn, to answer your question, I'd need to do a small amount of
>>>>> work to get a PR ready, but not much. Happy to do it if we're deciding to
>>>>> restart Statefun releases -- are we?
>>>>>
>>>>> -- Galen
>>>>>
>>>>> On Sat, Jun 17, 2023 at 9:47 AM Tzu-Li (Gordon) Tai <
>>>>> tzuli...@apache.org> wrote:
>>>>>
>>>>>> > Perhaps he could weigh in on whether the combination of automated
>>>>>> tests plus those smoke tests should be sufficient for testing with new
>>>>>> Flink versions
>>>>>>
>>>>>> What we usually did at the bare minimum for new StateFun releases was
>>>

[DISCUSS] FLIP-319: Integrating with Kafka’s proper support for 2PC participation (KIP-939).

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Flink devs,

I’d like to officially start a discussion for FLIP-319: Integrating with
Kafka’s proper support for 2PC participation (KIP-939) [1].

This is the “sister” joint FLIP for KIP-939 [2] [3]. It has been a
long-standing issue that Flink’s Kafka connector doesn’t work fully
correctly under exactly-once mode due to lack of distributed transaction
support in the Kafka transaction protocol. This has led to subpar hacks in
the connector such as Java reflections to workaround the protocol's
limitations (which causes a bunch of problems on its own, e.g. long
recovery times for the connector), while still having corner case scenarios
that can lead to data loss.

This joint effort with the Kafka community attempts to address this so that
the Flink Kafka connector can finally work against public Kafka APIs, which
should result in a much more robust integration between the two systems,
and for Flink developers, easier maintainability of the code.

Obviously, actually implementing this FLIP relies on the joint KIP being
implemented and released first. Nevertheless, I'd like to start the
discussion for the design as early as possible so we can benefit from the
new Kafka changes as soon as it is available.

Looking forward to feedback and comments on the proposal!

Thanks,
Gordon

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071710
[2]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC
[3] https://lists.apache.org/thread/wbs9sqs3z1tdm7ptw5j4o9osmx9s41nf


[jira] [Created] (FLINK-32455) Breaking change in TypeSerializerUpgradeTestBase prevents flink-connector-kafka from building against 1.18-SNAPSHOT

2023-06-27 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-32455:
---

 Summary: Breaking change in TypeSerializerUpgradeTestBase prevents 
flink-connector-kafka from building against 1.18-SNAPSHOT
 Key: FLINK-32455
 URL: https://issues.apache.org/jira/browse/FLINK-32455
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Kafka, Test Infrastructure
Affects Versions: 1.18.0
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.18.0


FLINK-27518 introduced a breaking signature change to the abstract class 
{{TypeSerializerUpgradeTestBase}}, specifically the abstract 
{{createTestSpecifications}} method signature was changed. This breaks 
downstream test code in externalized connector repos, e.g. 
flink-connector-kafka's {{KafkaSerializerUpgradeTest}}

Moreover, {{flink-migration-test-utils}} needs to be transitively pulled in by 
downstream test code that depends on flink-core test-jar.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32453) flink-connector-kafka does not build against Flink 1.18-SNAPSHOT

2023-06-27 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-32453:
---

 Summary: flink-connector-kafka does not build against Flink 
1.18-SNAPSHOT
 Key: FLINK-32453
 URL: https://issues.apache.org/jira/browse/FLINK-32453
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.18.0
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.18.0


There are a few breaking changes in test utility code that prevents 
{{apache/flink-connector-kafka}} from building against Flink 1.18-SNAPSHOT. 
This umbrella ticket captures all breaking changes, and should only be closed 
once we make things build again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Status of Statefun Project

2023-06-17 Thread Tzu-Li (Gordon) Tai
> Perhaps he could weigh in on whether the combination of automated tests
plus those smoke tests should be sufficient for testing with new Flink
versions

What we usually did at the bare minimum for new StateFun releases was the
following:

   1. Build tests (including the smoke tests in the e2e module, which
   covers important tests like exactly-once verification)
   2. Updating the flink-statefun-playground repo and manually running all
   language examples there.
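As a sketch, those two minimum checks could be scripted as a dry run first; the Maven profile name `run-e2e-tests` and the playground paths are assumptions and may not match the actual build setup:

```shell
# Dry-run sketch: the profile name and paths below are illustrative
# assumptions, not the project's verified build setup.
DRY_RUN=1
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"   # print the command for review instead of executing
  else
    "$@"
  fi
}

# 1. Build tests, including the e2e/smoke module (exactly-once verification).
run mvn -f flink-statefun/pom.xml clean verify -Prun-e2e-tests

# 2. Update flink-statefun-playground and run each language example, e.g.:
run docker-compose -f flink-statefun-playground/java/greeter/docker-compose.yml up
```

Flipping `DRY_RUN=0` would execute the commands for real once the printed checklist looks right.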

If upgrading Flink versions was the only change in the release, I'd
probably say that this is sufficient.

Best,
Gordon

On Thu, Jun 15, 2023 at 5:25 AM Martijn Visser 
wrote:

> Let me know if you have a PR for a Flink update :)
>
> On Thu, Jun 8, 2023 at 5:52 PM Galen Warren via user <
> u...@flink.apache.org> wrote:
>
>> Thanks Martijn.
>>
>> Personally, I'm already using a local fork of Statefun that is compatible
>> with Flink 1.16.x, so I wouldn't have any need for a released version
>> compatible with 1.15.x. I'd be happy to do the PRs to modify Statefun to
>> work with new versions of Flink as they come along.
>>
>> As for testing, Statefun does have unit tests and Gordon also sent me
>> instructions a while back for how to do some additional smoke tests which
>> are pretty straightforward. Perhaps he could weigh in on whether the
>> combination of automated tests plus those smoke tests should be sufficient
>> for testing with new Flink versions (I believe the answer is yes).
>>
>> -- Galen
>>
>>
>>
>> On Thu, Jun 8, 2023 at 8:01 AM Martijn Visser 
>> wrote:
>>
>>> Hi all,
>>>
>>> Apologies for the late reply.
>>>
>>> I'm willing to help out with merging requests in Statefun to keep them
>>> compatible with new Flink releases and create new releases. I do think
>>> that
>>> validation of the functionality of these releases depends a lot on those
>>> who do these compatibility updates, with PMC members helping out with the
>>> formal process.
>>>
>>> > Why can't the Apache Software Foundation allow community members to
>>> bring
>>> it up to date?
>>>
>>> There's nothing preventing anyone from reviewing any of the current PRs
>>> or
>>> opening new ones. However, none of them are approved [1], so there's also
>>> nothing to merge.
>>>
>>> > I believe that there are people and companies on this mailing list
>>> interested in supporting Apache Flink Stateful Functions.
>>>
>>> If so, then now is the time to show.
>>>
>>> Would there be a preference to create a release with Galen's merged
>>> compatibility update to Flink 1.15.2, or do we want to skip that and go
>>> straight to a newer version?
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1]
>>>
>>> https://github.com/apache/flink-statefun/pulls?q=is%3Apr+is%3Aopen+review%3Aapproved
>>>
>>> On Tue, Jun 6, 2023 at 3:55 PM Marco Villalobos <
>>> mvillalo...@kineteque.com>
>>> wrote:
>>>
>>> > Why can't the Apache Software Foundation allow community members to
>>> bring
>>> > it up to date?
>>> >
>>> > What's the process for that?
>>> >
>>> > I believe that there are people and companies on this mailing list
>>> > interested in supporting Apache Flink Stateful Functions.
>>> >
>>> > You already had two people on this thread express interest.
>>> >
>>> > At the very least, we could keep the library versions up to date.
>>> >
>>> > There are only a small list of new features that might be worthwhile:
>>> >
>>> > 1. event time processing
>>> > 2. state rest api
>>> >
>>> >
>>> > On Jun 6, 2023, at 3:06 AM, Chesnay Schepler 
>>> wrote:
>>> >
>>> > If you were to fork it *and want to redistribute it* then the short
>>> > version is that
>>> >
>>> >1. you have to adhere to the Apache licensing requirements
>>> >2. you have to make it clear that your fork does not belong to the
>>> >Apache Flink project. (Trademarks and all that)
>>> >
>>> > Neither should be significant hurdles (there should also be plenty of
>>> > online resources regarding 1), and if you do this then you can freely
>>> share
>>> > your fork with others.
>>> >
>>> > I've also pinged Martijn to take a look at this thread.
>>> > To my knowledge the project hasn't decided anything yet.
>>> >
>>> > On 27/05/2023 04:05, Galen Warren wrote:
>>> >
>>> > Ok, I get it. No interest.
>>> >
>>> > If this project is being abandoned, I guess I'll work with my own
>>> fork. Is
>>> > there anything I should consider here? Can I share it with other
>>> people who
>>> > use this project?
>>> >
>>> > On Tue, May 16, 2023 at 10:50 AM Galen Warren 
>>> 
>>> > wrote:
>>> >
>>> >
>>> > Hi Martijn, since you opened this discussion thread, I'm curious what
>>> your
>>> > thoughts are in light of the responses? Thanks.
>>> >
>>> > On Wed, Apr 19, 2023 at 1:21 PM Galen Warren 
>>> 
>>> > wrote:
>>> >
>>> >
>>> > I use Apache Flink for stream processing, and StateFun as a hand-off
>>> >
>>> > point for the rest of the application.
>>> > It serves well as a bridge between a Flink Streaming job and
>>> > micro-services.
>>> >
>>> > This is 

Re: [VOTE] FLIP-287: Extend Sink#InitContext to expose TypeSerializer, ObjectReuse and JobID

2023-06-16 Thread Tzu-Li (Gordon) Tai
+1 (binding)

On Fri, Jun 16, 2023, 09:53 Jing Ge  wrote:

> +1(binding)
>
> Best Regards,
> Jing
>
> On Fri, Jun 16, 2023 at 10:10 AM Lijie Wang 
> wrote:
>
> > +1 (binding)
> >
> > Thanks for driving it, Joao.
> >
> > Best,
> > Lijie
> >
> > > Joao Boto  wrote on Fri, Jun 16, 2023 at 15:53:
> >
> > > Hi all,
> > >
> > > Thank you to everyone for the feedback on FLIP-287[1]. Based on the
> > > discussion thread [2], we have come to a consensus on the design and
> are
> > > ready to take a vote to contribute this to Flink.
> > >
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours(excluding weekends, unless there is an objection or an
> insufficient
> > > number of votes.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > > [2]https://lists.apache.org/thread/wb3myhqsdz81h08ygwx057mkn1hc3s8f
> > >
> > >
> > > Best,
> > > Joao Boto
> > >
> >
>


Re: [VOTE] FLIP-246: Dynamic Kafka Source (originally Multi Cluster Kafka Source)

2023-06-16 Thread Tzu-Li (Gordon) Tai
+1 (binding)

+1 for either DynamicKafkaSource or DiscoveringKafkaSource

Cheers,
Gordon

On Thu, Jun 15, 2023, 10:56 Mason Chen  wrote:

> Hi all,
>
> Thank you to everyone for the feedback on FLIP-246 [1]. Based on the
> discussion thread [2], we have come to a consensus on the design and are
> ready to take a vote to contribute this to Flink.
>
> This voting thread will be open for at least 72 hours (excluding weekends,
> until June 20th 10:00AM PST) unless there is an objection or an
> insufficient number of votes.
>
> (Optional) If you have an opinion on the naming of the connector, please
> include it in your vote:
> 1. DynamicKafkaSource
> 2. MultiClusterKafkaSource
> 3. DiscoveringKafkaSource
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217389320
> [2] https://lists.apache.org/thread/vz7nw5qzvmxwnpktnofc9p13s1dzqm6z
>
> Best,
> Mason
>


Re: [DISCUSS] FLIP-246: Multi Cluster Kafka Source

2023-06-14 Thread Tzu-Li (Gordon) Tai
Hi Mason,

Thanks for addressing my comments. I agree that option 3 seems more
reasonable.

> Reorganize the metadata in a Map in
`KafkaStream` where the String is the proposed
`KafkaClusterIdentifier.name` field.

Why not just Map?

Regarding naming, I like DynamicKafkaSource as that's what I immediately
thought of when reading the FLIP, but I'm not married to the name :)

In principle, it looks like the FLIP is in good shape and generally people
seem to like the idea of having this connector in Flink.
I'd be in favor of an official vote to allow this to move forward.

Thanks,
Gordon

On Mon, Jun 12, 2023 at 1:57 PM Mason Chen  wrote:

> >
> > My main worry for doing this as a later iteration is that this would
> > probably be a breaking change for the public interface. If that can be
> > avoided and planned ahead, I'm fine with moving forward with how it is
> > right now.
>
>
> Make sense. Considering the public interfaces, I think we still want to
> provide clients the ability to pin certain configurations in the
> builder--however, cluster specific configurations may not be known upfront
> or generalize to all clusters so there would need to be changes in the
> `KafkaMetadataService` interface. This could be achieved by exposing via:
>
> 1. A separate API (e.g. `Map
> getKafkaClusterProperties()`) in KafkaMetadataService
> 2. In `KafkaClusterIdentifier` as this already contains some configuration
> (e.g. Bootstrap server) in which case we should rename the class to
> something like `KafkaCluster` as it is no longer just an identifier
> 3. Reorganize the metadata in a Map in
> `KafkaStream` where the String is the proposed
> `KafkaClusterIdentifier.name` field.
>
> I am preferring option 3 since this simplifies equals() checks on
> KafkaClusterIdentifier (e.g. is it the name, bootstrap, or both?).
>
> Small correction for the MultiClusterKafkaSourceEnumerator section: "This
> > reader is responsible for discovering and assigning splits from 1+
> cluster"
>
> Thanks for the catch!
>
> the defining characteristic is the dynamic discovery vs. the fact that
> > multiple clusters [...]
>
>
>
> I think the "Table" in the name of those SQL connectors should avoid
> > confusion. Perhaps we can also solicit other ideas? I would throw
> > "DiscoveringKafkaSource" into the mix.
>
>  Agreed with Gordon's and your suggestions. Right, the only public facing
> name for SQL is `kafka` for the SQL connector identifier. Based on your
> suggestions:
>
> 1. MultiClusterKafkaSource
> 2. DynamicKafkaSource
> 3. DiscoveringKafkaSource
> 4. MutableKafkaSource
> 5. AdaptiveKafkaSource
>
> I added a few of my own. I do prefer 2. What do others think?
>
> Best,
> Mason
>
> On Sun, Jun 11, 2023 at 1:12 PM Thomas Weise  wrote:
>
> > Hi Mason,
> >
> > Thanks for the iterations on the FLIP, I think this is in a very good
> shape
> > now.
> >
> > Small correction for the MultiClusterKafkaSourceEnumerator section: "This
> > reader is responsible for discovering and assigning splits from 1+
> cluster"
> >
> > Regarding the user facing name of the connector: I agree with Gordon that
> > the defining characteristic is the dynamic discovery vs. the fact that
> > multiple clusters may be consumed in parallel. (Although, as described in
> > the FLIP, lossless consumer migration only works with a strategy that
> > involves intermittent parallel consumption of old and new clusters to
> drain
> > and switch.)
> >
> > I think the "Table" in the name of those SQL connectors should avoid
> > confusion. Perhaps we can also solicit other ideas? I would throw
> > "DiscoveringKafkaSource" into the mix.
> >
> > Cheers,
> > Thomas
> >
> >
> >
> >
> > On Fri, Jun 9, 2023 at 3:40 PM Tzu-Li (Gordon) Tai 
> > wrote:
> >
> > > > Regarding (2), definitely. This is something we planned to add later
> on
> > > but
> > > so far keeping things common has been working well.
> > >
> > > My main worry for doing this as a later iteration is that this would
> > > probably be a breaking change for the public interface. If that can be
> > > avoided and planned ahead, I'm fine with moving forward with how it is
> > > right now.
> > >
> > > > DynamicKafkaSource may be confusing because it is really similar to
> the
> > > KafkaDynamicSource/Sink (table connectors).
> > >
> > > The table / sql Kafka connectors (KafkaDynamicTableFactory,
> > > KafkaDynamicTableSource / KafkaDynamicTableSink) are all intern

Re: [DISCUSS] FLIP-246: Multi Cluster Kafka Source

2023-06-09 Thread Tzu-Li (Gordon) Tai
> Regarding (2), definitely. This is something we planned to add later on
but
so far keeping things common has been working well.

My main worry for doing this as a later iteration is that this would
probably be a breaking change for the public interface. If that can be
avoided and planned ahead, I'm fine with moving forward with how it is
right now.

> DynamicKafkaSource may be confusing because it is really similar to the
KafkaDynamicSource/Sink (table connectors).

The table / sql Kafka connectors (KafkaDynamicTableFactory,
KafkaDynamicTableSource / KafkaDynamicTableSink) are all internal classes
not really meant to be exposed to the user though.
It can cause some confusion internally for the code maintainers, but on the
actual public surface I don't see this being an issue.

Thanks,
Gordon

On Wed, Jun 7, 2023 at 8:55 PM Mason Chen  wrote:

> Hi Gordon,
>
> Thanks for taking a look!
>
> Regarding (1), there is a need from the readers to send this event at
> startup because the reader state may reflect outdated metadata. Thus, the
> reader should not start without fresh metadata. With fresh metadata, the
> reader can filter splits from state--this filtering capability is
> ultimately how we solve the common issue of "I re-configured my Kafka
> source and removed some topic, but it refers to the old topic due to state
> *[1]*". I did not mention this because I thought this is more of a detail
> but I'll make a brief note of it.
>
> Regarding (2), definitely. This is something we planned to add later on but
> so far keeping things common has been working well. In that regard, yes the
> metadata service should expose these configurations but the source should
> not check it into state unlike the other metadata. I'm going to add it to a
> section called "future enhancements". This is also feedback that Ryan, an
> interested user, gave earlier in this thread.
>
> Regarding (3), that's definitely a good point and there are some real use
> cases, in addition to what you mentioned, to use this in single cluster
> mode (see *[1]* above). DynamicKafkaSource may be confusing because it is
> really similar to the KafkaDynamicSource/Sink (table connectors).
>
> Best,
> Mason
>
> On Wed, Jun 7, 2023 at 10:40 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi Mason,
> >
> > Thanks for updating the FLIP. In principle, I believe this would be a
> > useful addition. Some comments so far:
> >
> > 1. In this sequence diagram [1], why is there a need for a
> > GetMetadataUpdateEvent from the MultiClusterSourceReader going to the
> > MultiClusterSourceEnumerator? Shouldn't the enumerator simply start
> sending
> > metadata update events to the reader once it is registered at the
> > enumerator?
> >
> > 2. Looking at the new builder API, there's a few configurations that are
> > common across *all* discovered Kafka clusters / topics, specifically the
> > deserialization schema, offset initialization strategy, Kafka client
> > properties, and consumer group ID. Is there any use case that users would
> > want to have these configurations differ across different Kafka clusters?
> > If that's the case, would it make more sense to encapsulate these
> > configurations to be owned by the metadata service?
> >
> > 3. Is MultiClusterKafkaSource the best name for this connector? I find
> that
> > the dynamic aspect of Kafka connectivity to be a more defining
> > characteristic, and that is the main advantage it has compared to the
> > static KafkaSource. A user may want to use this new connector over
> > KafkaSource even if they're just consuming from a single Kafka cluster;
> for
> > example, one immediate use case I can think of is Kafka repartitioning
> with
> > zero Flink job downtime. They create a new topic with higher parallelism
> > and repartition their Kafka records from the old topic to the new topic,
> > and they want the consuming Flink job to be able to move from the old
> topic
> > to the new topic with zero-downtime while retaining exactly-once
> > guarantees. So, perhaps DynamicKafkaSource is a better name for this
> > connector?
> >
> > Thanks,
> > Gordon
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-246%3A+Multi+Cluster+Kafka+Source?preview=/217389320/255072018/image-2023-6-7_2-29-13.png
> >
> > On Wed, Jun 7, 2023 at 3:07 AM Mason Chen 
> wrote:
> >
> > > Hi Jing,
> > >
> > > Thanks for the prompt feedback! I had some confusion with how to resize
> > > images in confluence--anyways, I have made the font bigger, added white
> > > background, and also made the diagrams the

Re: [DISCUSS] FLIP-246: Multi Cluster Kafka Source

2023-06-07 Thread Tzu-Li (Gordon) Tai
Hi Mason,

Thanks for updating the FLIP. In principle, I believe this would be a
useful addition. Some comments so far:

1. In this sequence diagram [1], why is there a need for a
GetMetadataUpdateEvent from the MultiClusterSourceReader going to the
MultiClusterSourceEnumerator? Shouldn't the enumerator simply start sending
metadata update events to the reader once it is registered at the
enumerator?

2. Looking at the new builder API, there's a few configurations that are
common across *all* discovered Kafka clusters / topics, specifically the
deserialization schema, offset initialization strategy, Kafka client
properties, and consumer group ID. Is there any use case that users would
want to have these configurations differ across different Kafka clusters?
If that's the case, would it make more sense to encapsulate these
configurations to be owned by the metadata service?

3. Is MultiClusterKafkaSource the best name for this connector? I find that
the dynamic aspect of Kafka connectivity to be a more defining
characteristic, and that is the main advantage it has compared to the
static KafkaSource. A user may want to use this new connector over
KafkaSource even if they're just consuming from a single Kafka cluster; for
example, one immediate use case I can think of is Kafka repartitioning with
zero Flink job downtime. They create a new topic with higher parallelism
and repartition their Kafka records from the old topic to the new topic,
and they want the consuming Flink job to be able to move from the old topic
to the new topic with zero-downtime while retaining exactly-once
guarantees. So, perhaps DynamicKafkaSource is a better name for this
connector?

Thanks,
Gordon

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-246%3A+Multi+Cluster+Kafka+Source?preview=/217389320/255072018/image-2023-6-7_2-29-13.png

On Wed, Jun 7, 2023 at 3:07 AM Mason Chen  wrote:

> Hi Jing,
>
> Thanks for the prompt feedback! I had some confusion with how to resize
> images in confluence--anyways, I have made the font bigger, added white
> background, and also made the diagrams themselves bigger.
>
> Regarding the exactly once semantics, that's definitely good to point out
> in the doc. Thus, I have broken out my "Basic Idea" section into:
> 1. an intro
> 2. details about KafkaMetadataService
> 3. details about KafkaStream and KafkaClusterId (the metadata)
> 4. details about exactly once semantics and consistency guarantees
>
> This should give readers enough context about the design goals and
> interactions before deep diving into the class interfaces.
>
> Best,
> Mason
>
> On Tue, Jun 6, 2023 at 1:25 PM Jing Ge  wrote:
>
> > Hi Mason,
> >
> > It is a very practical feature that many users are keen to use. Thanks to
> > the previous discussion, the FLIP now looks informative. Thanks for your
> > proposal. One small suggestion is that the attached images are quite
> small
> > to read if we don't click and enlarge them. Besides that, It is difficult
> > to read the text on the current sequence diagram because it has a
> > transparent background. Would you like to replace it with a white
> > background?
> >
> > Exactly-one is one of the key features of Kafka connector. I have the
> same
> > concern as Qingsheng. Since you have answered questions about it
> > previously, would you like to create an extra section in your FLIP to
> > explicitly describe scenarios when exactly-one is supported and when it
> is
> > not?
> >
> > Best regards,
> > Jing
> >
> > On Mon, Jun 5, 2023 at 11:41 PM Mason Chen 
> wrote:
> >
> > > Hi all,
> > >
> > > I'm working on FLIP-246 again, for the Multi Cluster Kafka Source
> > > contribution. The document has been updated with some more context
> about
> > > how it can solve the Kafka topic removal scenario and a sequence
> diagram
> > to
> > > illustrate how the components interact.
> > >
> > > Looking forward to any feedback!
> > >
> > > Best,
> > > Mason
> > >
> > > On Wed, Oct 12, 2022 at 11:12 PM Mason Chen 
> > > wrote:
> > >
> > > > Hi Ryan,
> > > >
> > > > Thanks for the additional context! Yes, the offset initializer would
> > need
> > > > to take a cluster as a parameter and the MultiClusterKafkaSourceSplit
> > can
> > > > be exposed in an initializer.
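As a quick illustration of the idea being discussed (an offset initializer that takes the cluster as a parameter), here is a minimal self-contained sketch. The interface and names are hypothetical, not the MultiClusterKafkaSource's actual API:

```java
import java.util.Map;

// Hypothetical sketch of a cluster-aware offsets initializer; the real
// MultiClusterKafkaSource API may differ.
public class ClusterAwareOffsetsSketch {

    /** Resolves a starting offset given the cluster, topic, and partition. */
    interface ClusterAwareOffsetsInitializer {
        long startingOffset(String clusterId, String topic, int partition);
    }

    /**
     * Example: offsets come from an external service (modelled here as a
     * pre-fetched map), falling back to earliest (modelled as 0) when the
     * service has no entry for a partition.
     */
    static ClusterAwareOffsetsInitializer fromService(Map<String, Long> serviceOffsets) {
        return (clusterId, topic, partition) -> {
            String key = clusterId + "/" + topic + "/" + partition;
            return serviceOffsets.getOrDefault(key, 0L); // 0L stands in for EARLIEST
        };
    }
}
```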
> > > >
> > > > Best,
> > > > Mason
> > > >
> > > > On Thu, Oct 6, 2022 at 11:00 AM Ryan van Huuksloot <
> > > > ryan.vanhuuksl...@shopify.com> wrote:
> > > >
> > > >> Hi Mason,
> > > >>
> > > >> Thanks for the clarification! In regards to the addition to the
> > > >> OffsetInitializer of this API - this would be an awesome addition
> and
> > I
> > > >> think this entire FLIP would be a great addition to the Flink.
> > > >>
> > > >> To provide more context as to why we need particular offsets, we use
> > > >> Hybrid Source to currently backfill from buckets prior to reading
> from
> > > >> Kafka. We have a service that will tell us what offset has last been
> > > loaded
> > > >> into said bucket which we will use to initialize the 

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-05-30 Thread Tzu-Li (Gordon) Tai
Hi,

> I think we can get the serializer directly in InitContextImpl through
`getOperatorConfig().getTypeSerializerIn(0,
getUserCodeClassloader()).duplicate()`.

This should work, yes.
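The `.duplicate()` call in the snippet above matters because a `TypeSerializer` may hold mutable state and must not be shared across threads or operator instances. A minimal stand-in (not Flink code) illustrating that contract:

```java
// Minimal stand-in illustrating the duplicate() contract of Flink's
// TypeSerializer: a stateful serializer must be duplicated per consumer
// instead of shared. This is not Flink code.
public class DuplicateContractSketch {

    abstract static class StatefulSerializer {
        // a reusable buffer that makes the instance non-thread-safe
        final StringBuilder scratch = new StringBuilder();

        /** Deep copy so each user gets an independent instance. */
        abstract StatefulSerializer duplicate();
    }

    static final class UpperCaseSerializer extends StatefulSerializer {
        String serialize(String value) {
            scratch.setLength(0);               // mutates shared state
            scratch.append(value.toUpperCase());
            return scratch.toString();
        }

        @Override
        UpperCaseSerializer duplicate() {
            return new UpperCaseSerializer();   // fresh scratch buffer
        }
    }
}
```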

+1 to the updated FLIP so far. Thank you, Joao, for being on top of this!

Thanks,
Gordon

On Tue, May 30, 2023 at 12:34 AM João Boto  wrote:

> Hi Lijie,
>
> I left the two options to use whatever you want, but I can clean the FLIP
> to have only one..
>
> Updated the FLIP
>
> Regards
>
> On 2023/05/23 07:23:45 Lijie Wang wrote:
> > Hi Joao,
> >
> > I noticed the FLIP currently contains the following 2 methods about type
> > serializer:
> >
> > (1)  TypeSerializer createInputSerializer();
> > (2)  TypeSerializer createSerializer(TypeInformation inType);
> >
> > Is the method (2) still needed now?
> >
> > Best,
> > Lijie
> >
> > João Boto  于2023年5月19日周五 16:53写道:
> >
> > > Updated the FLIP to use this option.
> > >
> >
>


[ANNOUNCE] Apache Flink Kafka Connectors 3.0.0 released

2023-04-21 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache
Flink Kafka Connectors 3.0.0. This release is compatible with the Apache
Flink 1.17.x release series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Gordon


Re: [VOTE] FLIP-288: Enable Dynamic Partition Discovery by Default in Kafka Source

2023-04-21 Thread Tzu-Li (Gordon) Tai
+1

On Thu, Apr 20, 2023 at 11:52 PM Hongshun Wang 
wrote:

> Dear Flink Developers,
>
>
> Thank you for providing feedback on FLIP-288: Enable Dynamic Partition
> Discovery by Default in Kafka Source[1] on the discussion thread[2].
>
> The goal of the FLIP is to enable partition discovery by default and set
> EARLIEST offset strategy for later discovered partitions.
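The offset selection rule the FLIP proposes can be summarized in a few lines. The sketch below is a self-contained model of that rule (illustrative names, not Flink's actual API): the user-configured startup strategy applies only to partitions found on the first discovery, and partitions discovered later always start from EARLIEST so no records are skipped.

```java
// Self-contained sketch of the FLIP-288 offset selection rule.
// Names are illustrative, not Flink's actual API.
public class DiscoveryOffsetRule {

    enum Strategy { EARLIEST, LATEST, TIMESTAMP, SPECIFIC_OFFSET }

    static Strategy strategyFor(boolean firstDiscovery, Strategy startupStrategy) {
        // Later-discovered partitions always use EARLIEST, so that records
        // produced before the partition was discovered are not dropped.
        return firstDiscovery ? startupStrategy : Strategy.EARLIEST;
    }
}
```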
>
>
> I am initiating a vote for this FLIP. The vote will be open for at least 72
> hours, unless there is an objection or insufficient votes.
>
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source
>
> [2]:
> https://lists.apache.org/thread/581f2xq5d1tlwc8gcr27gwkp3zp0wrg6
>
>
> Best regards,
> Hongshun
>


Re: [DISCUSS] FLIP-288:Enable Dynamic Partition Discovery by Default in Kafka Source

2023-04-21 Thread Tzu-Li (Gordon) Tai
> I have already modified FLIP-288 to provide a
newDiscoveryOffsetsInitializer in the KafkaSourceBuilder and
KafkaSourceEnumerator. Users can use
KafkaSourceBuilder#setNewDiscoveryOffsets to change the strategy for new
partitions.

Thanks for addressing my comment Hongshun.

> Considering these reasons and facts, I’m +1 to only use EARLIEST for  new
discovered partitions.

Sounds good to me.


Overall, +1 to this proposal in principle (I'll formally vote on the vote
thread as well).

Thanks,
Gordon

On Tue, Apr 18, 2023 at 9:12 PM Leonard Xu  wrote:

> Thanks Hongshun for deeper analysis of the existing KafkaSource
> implementation details, Cool!
> There’s no specific use case to use a future TIMESTAMP and SPECIFIC-OFFSET
> for new discovered partitions
>  The existing SpecifiedOffsetsInitializer will use the EARLIEST offset for
> unspecified partitions as well as new discovered partitions
>  The existing TimestampOffsetsInitializer will use the LATEST offset for
> future timestamp, the  LATEST offset is similar to  EARLIEST offset for new
> discovered partitions  in this case,  and EARLIEST is safer as it covers
> all records.
> Considering these reasons and facts, I’m +1 to only use EARLIEST for  new
> discovered partitions.
>
> The updated FLIP looks good to me, we can start a vote thread soon if
> there are no new divergences.
>
> Best,
> Leonard
>
> > On Apr 18, 2023, at 4:58 PM, Hongshun Wang 
> wrote:
> >
> > Hi Shammon,
> >
> > Thank you for your advice. I have carefully considered whether to show
> > this in SQL DDL, and recently studied whether it is feasible.
> >
> > However,  after reading the corresponding code more thoroughly, it
> appears
> > that SpecifiedOffsetsInitializer and TimestampOffsetsInitializer do not
> > work as we initially thought. Finally, I have decided to only use
> > "EARLIEST" instead of allowing the user to make a free choice.
> >
> > Now, let me show my new understanding.
> >
> > The actual work of SpecifiedOffsetsInitializer and
> > TimestampOffsetsInitializer:
> >
> >
> >   - *SpecifiedOffsetsInitializer*: Use *Specified offset* for specified
> >   partitions while use *EARLIEST* for unspecified partitions. Specified
> >   partitions offset should be less than the latest offset, otherwise it
> will
> >   start from the *EARLIEST*.
> >   - *TimestampOffsetsInitializer*: Initialize the offsets based on a
> >   timestamp. If the message meeting the requirement of the timestamp
> have not
> >   been produced to Kafka yet, just use the *LATEST* offset.
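The two fallback behaviors described above can be modelled compactly. The following is a self-contained stand-in (not Flink code) using sentinel values for EARLIEST and LATEST:

```java
import java.util.Map;
import java.util.OptionalLong;

// Stand-in model of the fallback behavior described above (not Flink code):
// - specified offsets: use the given offset where present; an unspecified or
//   out-of-range (beyond latest) offset falls back to EARLIEST.
// - timestamp: if no record meeting the timestamp exists yet, use LATEST.
public class InitializerFallbackSketch {

    static final long EARLIEST = Long.MIN_VALUE;
    static final long LATEST = Long.MAX_VALUE;

    static long specified(Map<Integer, Long> specifiedOffsets, int partition, long latestOffset) {
        Long offset = specifiedOffsets.get(partition);
        if (offset == null || offset > latestOffset) {
            return EARLIEST; // unspecified or out-of-range partitions start from EARLIEST
        }
        return offset;
    }

    static long byTimestamp(OptionalLong offsetForTimestamp) {
        // empty = no record meeting the timestamp has been produced yet
        return offsetForTimestamp.orElse(LATEST);
    }
}
```

This makes the problem for newly discovered partitions visible: under the specified-offsets path they silently fall into the EARLIEST branch, and under the timestamp path a future timestamp resolves to LATEST, which is why the FLIP settles on EARLIEST explicitly.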
> >
> > So, some problems will occur when new partition use
> > SpecifiedOffsetsInitializer or TimestampOffsetsInitializer. You can find
> > more information in the "Rejected Alternatives" section of Flip-288,
> which
> > includes details of the code and process of deductive reasoning.
> > All these problems can be reproducible in the current version. The reason
> > why they haven't been exposed is probably because users usually set the
> > existing specified offset or timestamp, so it appears as earliest in
> > production.
> >
> > WDYT?
> > CC:Ruan, Shammon, Gordon, Leonard and Qingsheng.
> >
> > Yours
> >
> > Hongshun
> >
> >
> >
> >
> > On Fri, Apr 14, 2023 at 5:48 PM Shammon FY  wrote:
> >
> >> Hi Hongshun
> >>
> >> Thanks for updating the FLIP, it totally sounds good to me.
> >>
> >> I just have one comment: how does a SQL job set the new discovery offsets
> >> initializer?
> >> I found `DataStream` jobs can set different offsets initializers for new
> >> discovery partitions in `KafkaSourceBuilder.setNewDiscoveryOffsets`. Do
> SQL
> >> jobs need to support this feature?
> >>
> >> Best,
> >> Shammon FY
> >>
> >> On Wed, Apr 12, 2023 at 2:27 PM Hongshun Wang 
> >> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I have already modified FLIP-288 to provide a
> >>> newDiscoveryOffsetsInitializer in the KafkaSourceBuilder and
> >>> KafkaSourceEnumerator. Users can use
> >>> KafkaSourceBuilder#setNewDiscoveryOffsets to change the strategy for
> new
> >>> partitions.
> >>>
> >>> Surely, enabling the partition discovery strategy by default and
> >> modifying
> >>> the offset strategy for new partitions should be brought to the user's
> >>> attention. Therefore, it will be explained in the 1.18 release notes.
> >>>
> >>> WDYT?CC, Ruan, Shammon, Gordon and Leonard.
> >>>
> >>>
> >>> Best,
> >>>
> >>> Hongshun
> >>>
> >>> On Fri, Mar 31, 2023 at 2:56 PM Hongshun Wang  >
> >>> wrote:
> >>>
>  Hi everyone,
>  Thanks for your participation.
> 
>  @Gordon, I looked at the several questions you raised:
> 
>    1. Should we use the firstDiscovery flag or two separate
>    OffsetsInitializers? Actually, I have considered later. If we follow
>    my initial idea, we can provide a default earliest
> >> OffsetsInitializer
>    for a new partition. However, According to @Shammon's suggestion,
> >>> different
>    startup OffsetsInitializers correspond to different post-startup
>    

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-04-21 Thread Tzu-Li (Gordon) Tai
Do we have to introduce `InitContext#createSerializer(TypeInformation)`
which returns TypeSerializer, or is it sufficient to only provide
`InitContext#createInputSerializer()` which returns TypeSerializer?

I had the impression that buffering sinks like JDBC only need the
latter. @Joao, could you confirm?

If that's the case, +1 to adding the following method signatures to
InitContext:
* TypeSerializer createInputSerializer()
* boolean isObjectReuseEnabled()
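To sketch why a buffering sink wants exactly these two methods, here is a minimal self-contained model (stand-in interfaces, not Flink's actual classes): object reuse decides whether buffered records must be defensively copied via the input serializer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Stand-in sketch (not Flink's actual interfaces) of how a buffering sink
// could use the two proposed InitContext methods.
public class InitContextSketch {

    /** Minimal serializer stand-in: copying is what a buffering sink needs. */
    interface Copier<T> extends UnaryOperator<T> {}

    interface InitContext<T> {
        boolean isObjectReuseEnabled();
        Copier<T> createInputSerializer(); // stands in for TypeSerializer#copy
    }

    static final class BufferingWriter<T> {
        private final List<T> buffer = new ArrayList<>();
        private final boolean copyOnBuffer;
        private final Copier<T> copier;

        BufferingWriter(InitContext<T> context) {
            // With object reuse enabled, the same input instance may be mutated
            // by upstream operators, so buffered records must be copied.
            this.copyOnBuffer = context.isObjectReuseEnabled();
            this.copier = context.createInputSerializer();
        }

        void write(T record) {
            buffer.add(copyOnBuffer ? copier.apply(record) : record);
        }

        List<T> buffered() { return buffer; }
    }
}
```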

Thanks,
Gordon

On Fri, Apr 21, 2023 at 3:04 AM Zhu Zhu  wrote:

> Good point! @Gordon
> Introducing an `InitContext#createSerializer(TypeInformation)` looks a
> better option to me, so we do not need to introduce an unmodifiable
> `ExecutionConfig` at this moment.
>
> Hope that we can make `ExecutionConfig` a read-only interface in
> Flink 2.0. It is exposed in `RuntimeContext` to user functions already,
> while mutating the values at runtime is actually an undefined behavior.
>
> Thanks,
> Zhu
>
> Tzu-Li (Gordon) Tai  于2023年4月18日周二 01:02写道:
> >
> > Hi,
> >
> > Sorry for chiming in late.
> >
> > I'm not so sure that exposing ExecutionConfig / ReadExecutionConfig
> > directly through Sink#InitContext is the right thing to do.
> >
> > 1. A lot of the read-only getter methods on ExecutionConfig are
> irrelevant
> > for sinks. Expanding the scope of the InitContext interface with so many
> > irrelevant methods is probably going to make writing unit tests a pain.
> >
> > 2. There's actually a few getter methods on `InitContext` that have
> > duplicate/redundant info for what ExecutionConfig exposes. For example,
> > InitContext#getNumberOfParallelSubtasks and InitContext#getAttemptNumber
> > currently exist and it can be confusing if users find 2 sources of that
> > information (either via the `InitContext` and via the wrapped
> > `ExecutionConfig`).
> >
> > All in all, it feels like `Sink#InitContext` was introduced initially as
> a
> > means to selectively only expose certain information to sinks.
> >
> > It looks like right now, the only requirement is that some sinks require
> 1)
> > isObjectReuseEnabled, and 2) TypeSerializer for the input type. Would it
> > make sense to follow the original intent and only selectively expose
> these?
> > For 1), we can just add a new method to `InitContext` and forward the
> > information from `ExecutionConfig` accessible at the operator level.
> > For 2), would it make sense to create the serializer at the operator
> level
> > and then provide it through `InitContext`?
> >
> > Thanks,
> > Gordon
> >
> > On Mon, Apr 17, 2023 at 8:23 AM Zhu Zhu  wrote:
> >
> > > We can let the `InitContext` return `ExecutionConfig` in the interface.
> > > However, a `ReadableExecutionConfig` implementation should be returned
> > > so that exceptions will be thrown if users tries to modify the
> > > `ExecutionConfig`.
> > >
> > > We can rework all the setters of `ExecutionConfig` to internally
> invoke a
> > > `setConfiguration(...)` method. Then the `ReadableExecutionConfig` can
> > > just override that method. But pay attention to a few exceptional
> > > setters, i.e. those for globalJobParameters and serializers.
> > >
> > > We should also explicitly state in the documentation of
> > > `InitContext #getExecutionConfig()`, that the returned
> `ExecutionConfig`
> > > is unmodifiable.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > João Boto  于2023年4月17日周一 16:51写道:
> > > >
> > > > Hi Zhu,
> > > >
> > > > Thanks for your time reviewing this.
> > > >
> > > > Extending `ExecutionConfig` would allow modifying the values in the
> > > config (this is what we want to prevent with Option 2).
> > > >
> > > > If we have to extend `ExecutionConfig`, is it not simpler to do Option 1
> > > (expose `ExecutionConfig` directly)?
> > > >
> > > > Regards
> > > >
> > > >
> > > >
> > > > On 2023/04/03 09:42:28 Zhu Zhu wrote:
> > > > > Hi João,
> > > > >
> > > > > Thanks for creating this FLIP!
> > > > > I'm overall +1 for it to unblock the migration of sinks to SinkV2.
> > > > >
> > > > > Yet I think it's better to let the `ReadableExecutionConfig` extend
> > > > > `ExecutionConfig`, because otherwise we have to introduce a new
> method
> > > > > `TypeInformation#createSerializer(ReadableExecutionConfig)`. The
> new
> > > > > met

Re: [VOTE] Release flink-connector-kafka 3.0.0 for Flink 1.17, release candidate #2

2023-04-20 Thread Tzu-Li (Gordon) Tai
We have unanimously approved this release.

There are 6 approving votes, 3 of which are binding:

* Alexander Sorokoumov
* Martijn Visser (binding)
* Tzu-Li (Gordon) Tai (binding)
* Danny Cranmer (binding)
* Ahmed Hamdy
* Mason Chen

Thanks so much everyone for testing and voting! I will now finalize the
release.

Thanks,
Gordon

On Thu, Apr 20, 2023 at 3:04 PM Mason Chen  wrote:

> +1 (non-binding)
>
> * Verified hashes and signatures
> * Verified no binaries
> * Verified LICENSE and NOTICE files, pointing to 2023 as well
> * Verified poms point to 3.0.0
> * Reviewed web PR
> * Built from source
> * Verified git tag
>
> Best,
> Mason
>
> On Thu, Apr 20, 2023 at 10:04 AM Ahmed Hamdy  wrote:
>
> > +1 (non-binding)
> >
> > - Release notes look good.
> > - verified signatures and checksums are correct.
> > - Verified no binaries in source archive.
> > - Built from source
> > - Approved Web PR (no comments if we are supporting 1.17+)
> > Best Regards
> > Ahmed
> >
> > On Thu, 20 Apr 2023 at 17:08, Danny Cranmer 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > - +1 on skipping 1.16
> > > - Release notes look ok
> > > - Verified signature/hashes of source archive
> > > - Verified there are no binaries in the source archive
> > > - Built from source
> > > - Contents of Maven repo look good
> > > - Verified NOTICE files
> > > - Tag exists in Github
> > > - Reviewed web PR (looks good apart from the open comment from Martijn)
> > >
> > >
> > > On Tue, Apr 18, 2023 at 6:38 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - Checked hashes and signatures
> > > > - Built from source mvn clean install -Pcheck-convergence
> > > > -Dflink.version=1.17.0
> > > > - Eyeballed NOTICE license files
> > > > - Started a Flink 1.17.0 cluster + Kafka 3.2.3 cluster, submitted a
> SQL
> > > > statement using the Kafka connector under exactly-once mode.
> > > Checkpointing
> > > > and restoring works, with or without throughput on the Kafka topic.
> > > >
> > > > Thanks,
> > > > Gordon
> > > >
> > > > On Fri, Apr 14, 2023 at 2:13 AM Martijn Visser <
> > martijnvis...@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - Validated hashes
> > > > > - Verified signature
> > > > > - Verified that no binaries exist in the source archive
> > > > > - Build the source with Maven via mvn clean install
> > -Pcheck-convergence
> > > > > -Dflink.version=1.17.0
> > > > > - Verified licenses
> > > > > - Verified web PR
> > > > > - Started a cluster and the Flink SQL client, successfully read and
> > > wrote
> > > > > with the Kafka connector to Confluent Cloud with AVRO and Schema
> > > Registry
> > > > > enabled
> > > > >
> > > > > On Fri, Apr 14, 2023 at 12:24 AM Alexander Sorokoumov
> > > > >  wrote:
> > > > >
> > > > > > +1 (nb).
> > > > > >
> > > > > > Checked:
> > > > > >
> > > > > >- checksums are correct
> > > > > >- source code builds (JDK 8+11)
> > > > > >- release notes are correct
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Alex
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 12, 2023 at 5:07 PM Tzu-Li (Gordon) Tai <
> > > > tzuli...@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > A few important remarks about this release candidate:
> > > > > > >
> > > > > > > - As mentioned in the previous voting thread of RC1 [1], we've
> > > > decided
> > > > > to
> > > > > > > skip releasing a version of the externalized Flink Kafka
> > Connector
> > > > > > matching
> > > > > > > with Flink 1.16.x since the original vote thread stalled, and
> > > > meanwhile
> > > > > > > we've already completed externalizing all Kafka connector code
>

Re: [VOTE] Release flink-connector-kafka 3.0.0 for Flink 1.17, release candidate #2

2023-04-18 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Checked hashes and signatures
- Built from source mvn clean install -Pcheck-convergence
-Dflink.version=1.17.0
- Eyeballed NOTICE license files
- Started a Flink 1.17.0 cluster + Kafka 3.2.3 cluster, submitted a SQL
statement using the Kafka connector under exactly-once mode. Checkpointing
and restoring works, with or without throughput on the Kafka topic.

Thanks,
Gordon

On Fri, Apr 14, 2023 at 2:13 AM Martijn Visser 
wrote:

> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven via mvn clean install -Pcheck-convergence
> -Dflink.version=1.17.0
> - Verified licenses
> - Verified web PR
> - Started a cluster and the Flink SQL client, successfully read and wrote
> with the Kafka connector to Confluent Cloud with AVRO and Schema Registry
> enabled
>
> On Fri, Apr 14, 2023 at 12:24 AM Alexander Sorokoumov
>  wrote:
>
> > +1 (nb).
> >
> > Checked:
> >
> >- checksums are correct
> >- source code builds (JDK 8+11)
> >- release notes are correct
> >
> >
> > Best,
> > Alex
> >
> >
> > On Wed, Apr 12, 2023 at 5:07 PM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > A few important remarks about this release candidate:
> > >
> > > - As mentioned in the previous voting thread of RC1 [1], we've decided
> to
> > > skip releasing a version of the externalized Flink Kafka Connector
> > matching
> > > with Flink 1.16.x since the original vote thread stalled, and meanwhile
> > > we've already completed externalizing all Kafka connector code as of
> > Flink
> > > 1.17.0.
> > >
> > > - As such, this RC is basically identical to the Kafka connector code
> > > bundled with the Flink 1.17.0 release, PLUS a few critical fixes for
> > > exactly-once violations, namely FLINK-31305, FLINK-31363, and
> FLINK-31620
> > > (please see release notes [2]).
> > >
> > > - As part of preparing this RC, I've also deleted the original v3.0
> > branch
> > > and re-named the v4.0 branch to replace it instead. Effectively, this
> > > resets the versioning numbers for the externalized Flink Kafka
> Connector
> > > code repository, so that this first release of the repo starts from
> > v3.0.0.
> > >
> > > Thanks,
> > > Gordon
> > >
> > > [1] https://lists.apache.org/thread/r97y5qt8x0c72460vs5cjm5c729ljmh6
> > > [2]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
> > >
> > > On Wed, Apr 12, 2023 at 4:55 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Please review and vote on release candidate #2 for version 3.0.0 of
> the
> > > > Apache Flink Kafka Connector, as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release to be deployed to
> dist.apache.org
> > > > [2], which are signed with the key with fingerprint
> > > > 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag v3.0.0-rc2 [5],
> > > > * website pull request listing the new release [6].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Gordon
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
> > > > [2]
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc2/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1607
> > > > [5]
> > > >
> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.0-rc2
> > > > [6] https://github.com/apache/flink-web/pull/632
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-04-17 Thread Tzu-Li (Gordon) Tai
Hi,

Sorry for chiming in late.

I'm not so sure that exposing ExecutionConfig / ReadExecutionConfig
directly through Sink#InitContext is the right thing to do.

1. A lot of the read-only getter methods on ExecutionConfig are irrelevant
for sinks. Expanding the scope of the InitContext interface with so many
irrelevant methods is probably going to make writing unit tests a pain.

2. There's actually a few getter methods on `InitContext` that have
duplicate/redundant info for what ExecutionConfig exposes. For example,
InitContext#getNumberOfParallelSubtasks and InitContext#getAttemptNumber
currently exist and it can be confusing if users find 2 sources of that
information (either via the `InitContext` and via the wrapped
`ExecutionConfig`).

All in all, it feels like `Sink#InitContext` was introduced initially as a
means to selectively only expose certain information to sinks.

It looks like right now, the only requirement is that some sinks require 1)
isObjectReuseEnabled, and 2) TypeSerializer for the input type. Would it
make sense to follow the original intent and only selectively expose these?
For 1), we can just add a new method to `InitContext` and forward the
information from `ExecutionConfig` accessible at the operator level.
For 2), would it make sense to create the serializer at the operator level
and then provide it through `InitContext`?

Thanks,
Gordon

On Mon, Apr 17, 2023 at 8:23 AM Zhu Zhu  wrote:

> We can let the `InitContext` return `ExecutionConfig` in the interface.
> However, a `ReadableExecutionConfig` implementation should be returned
> so that exceptions will be thrown if users tries to modify the
> `ExecutionConfig`.
>
> We can rework all the setters of `ExecutionConfig` to internally invoke a
> `setConfiguration(...)` method. Then the `ReadableExecutionConfig` can
> just override that method. But pay attention to a few exceptional
> setters, i.e. those for globalJobParameters and serializers.
>
> We should also explicitly state in the documentation of
> `InitContext #getExecutionConfig()`, that the returned `ExecutionConfig`
> is unmodifiable.
>
> Thanks,
> Zhu
>
> João Boto  于2023年4月17日周一 16:51写道:
> >
> > Hi Zhu,
> >
> > Thanks for your time reviewing this.
> >
> > Extending `ExecutionConfig` would allow modifying the values in the
> config (this is what we want to prevent with Option 2).
> >
> > If we have to extend `ExecutionConfig`, is it not simpler to do Option 1
> (expose `ExecutionConfig` directly)?
> >
> > Regards
> >
> >
> >
> > On 2023/04/03 09:42:28 Zhu Zhu wrote:
> > > Hi João,
> > >
> > > Thanks for creating this FLIP!
> > > I'm overall +1 for it to unblock the migration of sinks to SinkV2.
> > >
> > > Yet I think it's better to let the `ReadableExecutionConfig` extend
> > > `ExecutionConfig`, because otherwise we have to introduce a new method
> > > `TypeInformation#createSerializer(ReadableExecutionConfig)`. The new
> > > method may require every `TypeInformation` to implement it, including
> > > Flink built-in ones and custom ones, otherwise exceptions will happen.
> > > That goal, however, is pretty hard to achieve.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > João Boto  于2023年2月28日周二 23:34写道:
> > > >
> > > > I have update the FLIP with the 2 options that we have discussed..
> > > >
> > > > Option 1: Expose ExecutionConfig directly on InitContext
> > > > this have a minimal impact as we only have to expose the new methods
> > > >
> > > > Option 2: Expose ReadableExecutionConfig on InitContext
> > > > with this option we have more impact as we need to add a new method
> to TypeInformation and change all implementations (current exists 72
> implementations)
> > > >
> > > > Waiting for feedback or concerns about the two options
> > >
>


Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-04-12 Thread Tzu-Li (Gordon) Tai
RC2 (for Flink 1.17.0) vote has started in a separate thread:
https://lists.apache.org/thread/mff76c2hzcb1mk8fm5m2h4z0j73qz2vk

Please test and cast your votes!

On Tue, Apr 11, 2023 at 11:45 AM Martijn Visser 
wrote:

> +1, thanks for driving this Gordon.
>
> On Tue, Apr 11, 2023 at 8:15 PM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi all,
>>
>> Martijn and I discussed offline to cancel this vote.
>>
>> Moreover, now that Flink 1.17 is out and we still haven't released
>> anything yet for the newly externalized Kafka connector, we've decided to
>> skip releasing a version that matches with Flink 1.16 all together, and
>> instead go straight to supporting Flink 1.17 for our first release.
>>
>> Practically this means:
>>
>>1. The code as of branch `flink-connector-kafka:v4.0` will be
>>re-versioned as `v3.0` and that will be the actual first release of
>>flink-connector-kafka.
>>2. v3.0.0 will be the first release of `flink-connector-kafka` and it
>>will initially support Flink 1.17.x series.
>>
>> I'm happy to drive the release efforts for this and will create a new RC
>> shortly over the next day or two.
>>
>> Thanks,
>> Gordon
>>
>> On Wed, Apr 5, 2023 at 9:32 PM Mason Chen  wrote:
>>
>>> +1 for new RC!
>>>
>>> Best,
>>> Mason
>>>
>>> On Tue, Apr 4, 2023 at 11:32 AM Tzu-Li (Gordon) Tai >> >
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > I've ported the critical fixes I mentioned to v3.0 and v4.0 branches of
>>> > apache/flink-connector-kafka now.
>>> >
>>> > @martijnvis...@apache.org  let me know if
>>> you'd
>>> > need help with creating a new RC, if there's too much to juggle on
>>> > your end. Happy to help out.
>>> >
>>> > Thanks,
>>> > Gordon
>>> >
>>> > On Sun, Apr 2, 2023 at 11:21 PM Konstantin Knauf 
>>> > wrote:
>>> >
>>> > > +1. Thanks, Gordon!
>>> > >
>>> > > Am Mo., 3. Apr. 2023 um 06:37 Uhr schrieb Tzu-Li (Gordon) Tai <
>>> > > tzuli...@apache.org>:
>>> > >
>>> > > > Hi Martijn,
>>> > > >
>>> > > > Since this RC vote was opened, we had three critical bug fixes
>>> that was
>>> > > > merged for the Kafka connector:
>>> > > >
>>> > > >- https://issues.apache.org/jira/browse/FLINK-31363
>>> > > >- https://issues.apache.org/jira/browse/FLINK-31305
>>> > > >- https://issues.apache.org/jira/browse/FLINK-31620
>>> > > >
>>> > > > Given the severity of these issues (all of them are violations of
>>> > > > exactly-once semantics), and the fact that they are currently not
>>> > > included
>>> > > > yet in any released version, do you think it makes sense to cancel
>>> this
>>> > > RC
>>> > > > in favor of a new one that includes these?
>>> > > > Since this RC vote has been stale for quite some time already, it
>>> > doesn't
>>> > > > seem like we're throwing away too much effort that has already been
>>> > done
>>> > > if
>>> > > > we start a new RC with these critical fixes included.
>>> > > >
>>> > > > What do you think?
>>> > > >
>>> > > > Thanks,
>>> > > > Gordon
>>> > > >
>>> > > > On Thu, Feb 9, 2023 at 3:26 PM Tzu-Li (Gordon) Tai <
>>> > tzuli...@apache.org>
>>> > > > wrote:
>>> > > >
>>> > > > > +1 (binding)
>>> > > > >
>>> > > > > - Verified legals (license headers and root LICENSE / NOTICE
>>> file).
>>> > > > AFAICT
>>> > > > > no dependencies require explicit acknowledgement in the NOTICE
>>> files.
>>> > > > > - No binaries in staging area
>>> > > > > - Built source with tests
>>> > > > > - Verified signatures and hashes
>>> > > > > - Web PR changes LGTM
>>> > > > >
>>> > > > > Thanks Martijn!
>>> > > > >
>>> > > > > Cheers,
>>> > > > > Gordon
>>> > > > >

Re: [VOTE] Release flink-connector-kafka 3.0.0 for Flink 1.17, release candidate #2

2023-04-12 Thread Tzu-Li (Gordon) Tai
A few important remarks about this release candidate:

- As mentioned in the previous voting thread of RC1 [1], we've decided to
skip releasing a version of the externalized Flink Kafka Connector matching
with Flink 1.16.x since the original vote thread stalled, and meanwhile
we've already completed externalizing all Kafka connector code as of Flink
1.17.0.

- As such, this RC is basically identical to the Kafka connector code
bundled with the Flink 1.17.0 release, PLUS a few critical fixes for
exactly-once violations, namely FLINK-31305, FLINK-31363, and FLINK-31620
(please see release notes [2]).

- As part of preparing this RC, I've also deleted the original v3.0 branch
and re-named the v4.0 branch to replace it instead. Effectively, this
resets the versioning numbers for the externalized Flink Kafka Connector
code repository, so that this first release of the repo starts from v3.0.0.

Thanks,
Gordon

[1] https://lists.apache.org/thread/r97y5qt8x0c72460vs5cjm5c729ljmh6
[2]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577

On Wed, Apr 12, 2023 at 4:55 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi everyone,
>
> Please review and vote on release candidate #2 for version 3.0.0 of the
> Apache Flink Kafka Connector, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which are signed with the key with fingerprint
> 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag v3.0.0-rc2 [5],
> * website pull request listing the new release [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Gordon
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
> [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1607
> [5]
> https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.0-rc2
> [6] https://github.com/apache/flink-web/pull/632
>


[VOTE] Release flink-connector-kafka 3.0.0 for Flink 1.17, release candidate #2

2023-04-12 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on release candidate #2 for version 3.0.0 of the
Apache Flink Kafka Connector, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint
1C1E2394D3194E1944613488F320986D35C33D6A [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.0.0-rc2 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1607
[5] https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.0-rc2
[6] https://github.com/apache/flink-web/pull/632


Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-04-11 Thread Tzu-Li (Gordon) Tai
Hi all,

Martijn and I discussed offline to cancel this vote.

Moreover, now that Flink 1.17 is out and we still haven't released anything
yet for the newly externalized Kafka connector, we've decided to skip
releasing a version that matches Flink 1.16 altogether, and instead
go straight to supporting Flink 1.17 for our first release.

Practically this means:

   1. The code as of branch `flink-connector-kafka:v4.0` will be
   re-versioned as `v3.0` and that will be the actual first release of
   flink-connector-kafka.
   2. v3.0.0 will be the first release of `flink-connector-kafka` and it
   will initially support Flink 1.17.x series.

I'm happy to drive the release efforts for this and will create a new RC
shortly over the next day or two.

Thanks,
Gordon

On Wed, Apr 5, 2023 at 9:32 PM Mason Chen  wrote:

> +1 for new RC!
>
> Best,
> Mason
>
> On Tue, Apr 4, 2023 at 11:32 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi all,
> >
> > I've ported the critical fixes I mentioned to v3.0 and v4.0 branches of
> > apache/flink-connector-kafka now.
> >
> > @martijnvis...@apache.org  let me know if
> you'd
> > need help with creating a new RC, if there's too much to juggle on
> > your end. Happy to help out.
> >
> > Thanks,
> > Gordon
> >
> > On Sun, Apr 2, 2023 at 11:21 PM Konstantin Knauf 
> > wrote:
> >
> > > +1. Thanks, Gordon!
> > >
> > > Am Mo., 3. Apr. 2023 um 06:37 Uhr schrieb Tzu-Li (Gordon) Tai <
> > > tzuli...@apache.org>:
> > >
> > > > Hi Martijn,
> > > >
> > > > Since this RC vote was opened, we had three critical bug fixes that
> was
> > > > merged for the Kafka connector:
> > > >
> > > >- https://issues.apache.org/jira/browse/FLINK-31363
> > > >- https://issues.apache.org/jira/browse/FLINK-31305
> > > >- https://issues.apache.org/jira/browse/FLINK-31620
> > > >
> > > > Given the severity of these issues (all of them are violations of
> > > > exactly-once semantics), and the fact that they are currently not
> > > included
> > > > yet in any released version, do you think it makes sense to cancel
> this
> > > RC
> > > > in favor of a new one that includes these?
> > > > Since this RC vote has been stale for quite some time already, it
> > doesn't
> > > > seem like we're throwing away too much effort that has already been
> > done
> > > if
> > > > we start a new RC with these critical fixes included.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Gordon
> > > >
> > > > On Thu, Feb 9, 2023 at 3:26 PM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - Verified legals (license headers and root LICENSE / NOTICE file).
> > > > AFAICT
> > > > > no dependencies require explicit acknowledgement in the NOTICE
> files.
> > > > > - No binaries in staging area
> > > > > - Built source with tests
> > > > > - Verified signatures and hashes
> > > > > - Web PR changes LGTM
> > > > >
> > > > > Thanks Martijn!
> > > > >
> > > > > Cheers,
> > > > > Gordon
> > > > >
> > > > > On Mon, Feb 6, 2023 at 6:12 PM Mason Chen 
> > > > wrote:
> > > > >
> > > > >> That makes sense, thanks for the clarification!
> > > > >>
> > > > >> Best,
> > > > >> Mason
> > > > >>
> > > > >> On Wed, Feb 1, 2023 at 7:16 AM Martijn Visser <
> > > martijnvis...@apache.org
> > > > >
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Mason,
> > > > >> >
> > > > >> > Thanks, [4] is indeed a copy-paste error and you've made the
> right
> > > > >> > assumption that
> > > > >> >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> > > > >> > is the correct maven central link.
> > > > >> >
> > > > >> > I think we should use FLINK-30052 to move the Kafka connector
> code
> > > > from
&g

[jira] [Created] (FLINK-31740) Allow setting boundedness for upsert-kafka SQL connector

2023-04-05 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-31740:
---

 Summary: Allow setting boundedness for upsert-kafka SQL connector
 Key: FLINK-31740
 URL: https://issues.apache.org/jira/browse/FLINK-31740
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Kafka
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


With FLINK-24456, we added boundedness options for streaming mode to the SQL 
Kafka Connector. This was mostly just an exposure of existing functionality 
that was already available at the DataStream API level.

We should do the same for the SQL Upsert Kafka Connector.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-04-04 Thread Tzu-Li (Gordon) Tai
Hi all,

I've ported the critical fixes I mentioned to v3.0 and v4.0 branches of
apache/flink-connector-kafka now.

@martijnvis...@apache.org  let me know if you'd
need help with creating a new RC, if there's too much to juggle on
your end. Happy to help out.

Thanks,
Gordon

On Sun, Apr 2, 2023 at 11:21 PM Konstantin Knauf  wrote:

> +1. Thanks, Gordon!
>
> Am Mo., 3. Apr. 2023 um 06:37 Uhr schrieb Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>:
>
> > Hi Martijn,
> >
> > Since this RC vote was opened, we had three critical bug fixes that was
> > merged for the Kafka connector:
> >
> >- https://issues.apache.org/jira/browse/FLINK-31363
> >- https://issues.apache.org/jira/browse/FLINK-31305
> >- https://issues.apache.org/jira/browse/FLINK-31620
> >
> > Given the severity of these issues (all of them are violations of
> > exactly-once semantics), and the fact that they are currently not
> included
> > yet in any released version, do you think it makes sense to cancel this
> RC
> > in favor of a new one that includes these?
> > Since this RC vote has been stale for quite some time already, it doesn't
> > seem like we're throwing away too much effort that has already been done
> if
> > we start a new RC with these critical fixes included.
> >
> > What do you think?
> >
> > Thanks,
> > Gordon
> >
> > On Thu, Feb 9, 2023 at 3:26 PM Tzu-Li (Gordon) Tai 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > - Verified legals (license headers and root LICENSE / NOTICE file).
> > AFAICT
> > > no dependencies require explicit acknowledgement in the NOTICE files.
> > > - No binaries in staging area
> > > - Built source with tests
> > > - Verified signatures and hashes
> > > - Web PR changes LGTM
> > >
> > > Thanks Martijn!
> > >
> > > Cheers,
> > > Gordon
> > >
> > > On Mon, Feb 6, 2023 at 6:12 PM Mason Chen 
> > wrote:
> > >
> > >> That makes sense, thanks for the clarification!
> > >>
> > >> Best,
> > >> Mason
> > >>
> > >> On Wed, Feb 1, 2023 at 7:16 AM Martijn Visser <
> martijnvis...@apache.org
> > >
> > >> wrote:
> > >>
> > >> > Hi Mason,
> > >> >
> > >> > Thanks, [4] is indeed a copy-paste error and you've made the right
> > >> > assumption that
> > >> >
> > >> >
> > >>
> >
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> > >> > is the correct maven central link.
> > >> >
> > >> > I think we should use FLINK-30052 to move the Kafka connector code
> > from
> > >> the
> > >> > 1.17 release also over the Kafka connector repo (especially since
> > >> there's
> > >> > now a v3.0 branch for the Kafka connector, so it can be merged in
> > main).
> > >> > When those commits have been merged, we can make a next Kafka
> > connector
> > >> > release (which is equivalent to the 1.17 release, which can only be
> > done
> > >> > when 1.17 is done because of the split level watermark alignment)
> and
> > >> then
> > >> > FLINK-30859 can be finished.
> > >> >
> > >> > Best regards,
> > >> >
> > >> > Martijn
> > >> >
> > >> > Op wo 1 feb. 2023 om 09:16 schreef Mason Chen <
> mas.chen6...@gmail.com
> > >:
> > >> >
> > >> > > +1 (non-binding)
> > >> > >
> > >> > > * Verified hashes and signatures
> > >> > > * Verified no binaries
> > >> > > * Verified LICENSE and NOTICE files
> > >> > > * Verified poms point to 3.0.0-1.16
> > >> > > * Reviewed web PR
> > >> > > * Built from source
> > >> > > * Verified git tag
> > >> > >
> > >> > > I think [4] your is a copy-paste error and I did all the
> > verification
> > >> > > assuming that
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> > >> > > is the correct maven central link.
> > >> > >
> > >> > > Regarding the release notes, should we close
> 

Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-04-02 Thread Tzu-Li (Gordon) Tai
Hi Martijn,

Since this RC vote was opened, we had three critical bug fixes that were
merged for the Kafka connector:

   - https://issues.apache.org/jira/browse/FLINK-31363
   - https://issues.apache.org/jira/browse/FLINK-31305
   - https://issues.apache.org/jira/browse/FLINK-31620

Given the severity of these issues (all of them are violations of
exactly-once semantics), and the fact that they are currently not included
yet in any released version, do you think it makes sense to cancel this RC
in favor of a new one that includes these?
Since this RC vote has been stale for quite some time already, it doesn't
seem like we're throwing away too much effort that has already been done if
we start a new RC with these critical fixes included.

What do you think?

Thanks,
Gordon

On Thu, Feb 9, 2023 at 3:26 PM Tzu-Li (Gordon) Tai 
wrote:

> +1 (binding)
>
> - Verified legals (license headers and root LICENSE / NOTICE file). AFAICT
> no dependencies require explicit acknowledgement in the NOTICE files.
> - No binaries in staging area
> - Built source with tests
> - Verified signatures and hashes
> - Web PR changes LGTM
>
> Thanks Martijn!
>
> Cheers,
> Gordon
>
> On Mon, Feb 6, 2023 at 6:12 PM Mason Chen  wrote:
>
>> That makes sense, thanks for the clarification!
>>
>> Best,
>> Mason
>>
>> On Wed, Feb 1, 2023 at 7:16 AM Martijn Visser 
>> wrote:
>>
>> > Hi Mason,
>> >
>> > Thanks, [4] is indeed a copy-paste error and you've made the right
>> > assumption that
>> >
>> >
>> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
>> > is the correct maven central link.
>> >
>> > I think we should use FLINK-30052 to move the Kafka connector code from
>> the
>> > 1.17 release also over the Kafka connector repo (especially since
>> there's
>> > now a v3.0 branch for the Kafka connector, so it can be merged in main).
>> > When those commits have been merged, we can make a next Kafka connector
>> > release (which is equivalent to the 1.17 release, which can only be done
>> > when 1.17 is done because of the split level watermark alignment) and
>> then
>> > FLINK-30859 can be finished.
>> >
>> > Best regards,
>> >
>> > Martijn
>> >
>> > Op wo 1 feb. 2023 om 09:16 schreef Mason Chen :
>> >
>> > > +1 (non-binding)
>> > >
>> > > * Verified hashes and signatures
>> > > * Verified no binaries
>> > > * Verified LICENSE and NOTICE files
>> > > * Verified poms point to 3.0.0-1.16
>> > > * Reviewed web PR
>> > > * Built from source
>> > > * Verified git tag
>> > >
>> > > I think [4] your is a copy-paste error and I did all the verification
>> > > assuming that
>> > >
>> > >
>> >
>> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
>> > > is the correct maven central link.
>> > >
>> > > Regarding the release notes, should we close
>> > > https://issues.apache.org/jira/browse/FLINK-30052 and link it there?
>> > I've
>> > > created https://issues.apache.org/jira/browse/FLINK-30859 to remove
>> the
>> > > existing code from the master branch.
>> > >
>> > > Best,
>> > > Mason
>> > >
>> > > On Tue, Jan 31, 2023 at 6:23 AM Martijn Visser <
>> martijnvis...@apache.org
>> > >
>> > > wrote:
>> > >
>> > > > Hi everyone,
>> > > > Please review and vote on the release candidate #1 for
>> > > > flink-connector-kafka version 3.0.0, as follows:
>> > > > [ ] +1, Approve the release
>> > > > [ ] -1, Do not approve the release (please provide specific
>> comments)
>> > > >
>> > > > Note: this is the same code as the Kafka connector for the Flink
>> 1.16
>> > > > release.
>> > > >
>> > > > The complete staging area is available for your review, which
>> includes:
>> > > > * JIRA release notes [1],
>> > > > * the official Apache source release to be deployed to
>> dist.apache.org
>> > > > [2],
>> > > > which are signed with the key with fingerprint
>> > > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
>> > > > * all artifacts to be deployed to the Maven Central Repository [4],
>> > > > * source code tag v3.0.0-rc1 [5],
>> > > > * website pull request listing the new release [6].
>> > > >
>> > > > The vote will be open for at least 72 hours. It is adopted by
>> majority
>> > > > approval, with at least 3 PMC affirmative votes.
>> > > >
>> > > > Thanks,
>> > > > Release Manager
>> > > >
>> > > > [1]
>> > > >
>> > > >
>> > >
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
>> > > > [2]
>> > > >
>> > > >
>> > >
>> >
>> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc1
>> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> > > > [4]
>> > > >
>> > > >
>> > >
>> >
>> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc1/
>> > > > [5]
>> > > >
>> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.0-rc1
>> > > > [6] https://github.com/apache/flink-web/pull/606
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS] FLIP-288:Enable Dynamic Partition Discovery by Default in Kafka Source

2023-03-29 Thread Tzu-Li (Gordon) Tai
Hi Hongshun,

Thank you for drafting the FLIP for this.

Overall, the intent of the FLIP makes sense to me. I'm actually surprised
that the partition discovery feature works as it is in the KafkaSource
today - in older versions of the Kafka source connector (implemented on
source v1), newly discovered partitions are always consumed from EARLIEST
regardless of the initial startup offset strategy.

A few comments:


*1. The proposed change to the OffsetsInitializer#getPartitionOffsets
method is a breaking API change. Can we avoid it?*
Instead of the additional
`firstDiscovery` flag and letting implementations have to check against
that flag to differentiate the offset strategy, it seems to be a cleaner
API if we simply let users provide two separate OffsetsInitializers - a
startup OffsetsInitializer, and a post-startup OffsetsInitializer. By
default, the post-startup OffsetsInitializer can be inferred from the
startup OffsetsInitializer (e.g. EARLIEST startup is coupled with EARLIEST
for post-startup, LATEST startup is coupled with EARLIEST for post-startup,
etc.). If the users want to, they can also override and define their own
post-startup OffsetsInitializer. This seems to provide more flexibility and
a better API design for composability.
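To make the default coupling concrete, here is a minimal, self-contained sketch of the mapping described above (the class, enum, and method names are illustrative only and are not the actual Flink OffsetsInitializer API):

```java
// Illustrative sketch only -- not the Flink OffsetsInitializer API.
// Models the proposed default: infer the post-startup (partition
// discovery) offsets strategy from the user-configured startup strategy.
public class OffsetsInitializerDefaults {

    enum Strategy { EARLIEST, LATEST, TIMESTAMP, SPECIFIC_OFFSET }

    // A LATEST startup strategy maps to EARLIEST for newly discovered
    // partitions, so records in a new partition are not silently skipped;
    // the other strategies carry over unless the user overrides them.
    static Strategy postStartupDefault(Strategy startup) {
        return startup == Strategy.LATEST ? Strategy.EARLIEST : startup;
    }

    public static void main(String[] args) {
        System.out.println(postStartupDefault(Strategy.LATEST));    // EARLIEST
        System.out.println(postStartupDefault(Strategy.EARLIEST));  // EARLIEST
        System.out.println(postStartupDefault(Strategy.TIMESTAMP)); // TIMESTAMP
    }
}
```

A user-supplied post-startup OffsetsInitializer would simply bypass this inferred default.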

*2. Clarification on "future-time" TIMESTAMP OffsetsInitializer*
Here I'm referring to this comment made earlier between Shammon and
Hongshun:
> Users can freely set the starting position according to their needs, it
may be before the latest Kafka data, or it may be at a certain point in the
future.
Does future-time TIMESTAMP actually work? In my understanding, if you try
to retrieve offsets for an "out-of-bound" timestamp, you'll by default just
get the "auto.offset.reset" position. Can you clarify the behaviour here?
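As a rough mental model of that behaviour (this is not kafka-clients code; all names here are made up for illustration), an offsets-for-times style lookup returns the first record at or after the target timestamp and finds nothing when the target lies beyond every record, at which point the consumer can only fall back to its auto.offset.reset position:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only -- not kafka-clients code. Shows why a
// "future" TIMESTAMP initializer degenerates to auto.offset.reset.
public class TimestampLookupModel {

    // offsetsForTimes-style lookup for one partition: offset of the first
    // record whose timestamp is >= targetTs, or null if none exists.
    static Long offsetForTimestamp(TreeMap<Long, Long> tsToOffset, long targetTs) {
        Map.Entry<Long, Long> entry = tsToOffset.ceilingEntry(targetTs);
        return entry == null ? null : entry.getValue();
    }

    // Out-of-bound (future) timestamp: fall back to whatever position
    // auto.offset.reset would yield.
    static long resolveStartingOffset(TreeMap<Long, Long> tsToOffset,
                                      long targetTs, long resetOffset) {
        Long found = offsetForTimestamp(tsToOffset, targetTs);
        return found != null ? found : resetOffset;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> partition = new TreeMap<>(); // timestamp -> offset
        partition.put(1000L, 0L);
        partition.put(2000L, 1L);
        System.out.println(resolveStartingOffset(partition, 1500L, 2L)); // 1
        System.out.println(resolveStartingOffset(partition, 9999L, 2L)); // 2 (fallback)
    }
}
```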

*3. Clarification on coupling SPECIFIC-OFFSET startup with SPECIFIC-OFFSET
post-startup*
I'm wondering if this is the correct default combo. What happens if a newly
discovered partition was not designated an offset in the provided offsets
map?
In that case, I believe you'll also just default to the "auto.offset.reset"
position, but that means users have an extra config dimension to worry
about for partition discovery.

Thanks,
Gordon


On Mon, Mar 27, 2023 at 9:14 PM Shammon FY  wrote:

> Hi hongshun
>
> Thanks for updating, the FLIP's pretty good and looks good to me!
>
> Best,
> Shammon FY
>
>
> On Tue, Mar 28, 2023 at 11:16 AM Hongshun Wang 
> wrote:
>
> > Hi Shammon,
> >
> >
> > Thanks a lot for your advise. I agree with your opinion now. It seems
> that
> > I forgot to consider that it may be at a certain point in the future.
> >
> >
> > I will modify OffsetsInitializer to provide a different strategy for
> later
> > discovered partitions, by which users can also customize strategies for
> new
> > and old partitions.
> >
> >  WDYT?
> >
> >
> > Yours
> >
> > Hongshun
> >
> > On Tue, Mar 28, 2023 at 9:00 AM Shammon FY  wrote:
> >
> > > Hi Hongshun
> > >
> > > Thanks for your answer.
> > >
> > > I think the startup offset of Kafka such as timestamp or
> > > specific_offset has no relationship with `Window Operator`. Users can
> > > freely set the starting position according to their needs, it may be
> > before
> > > the latest Kafka data, or it may be at a certain point in the future.
> > >
> > > The offsets set by users in Kafka can be divided into four types at the
> > > moment: EARLIEST, LATEST, TIMESTAMP, SPECIFIC_OFFSET. The new
> discovered
> > > partitions may need to be handled with different strategies for these
> > four
> > > types:
> > >
> > > 1. EARLIEST, use EARLIEST for the new discovered partitions
> > > 2. LATEST, use EARLIEST for the new discovered partitions
> > > 3. TIMESTAMP, use TIMESTAMP for the new discovered partitions
> > > 4. SPECIFIC_OFFSET, use SPECIFIC_OFFSET for the new discovered
> partitions
> > >
> > > From above, it seems that we only need to do special processing for
> > > EARLIEST. What do you think of it?
> > >
> > > Best,
> > > Shammon FY
> > >
> > >
> > > On Fri, Mar 24, 2023 at 11:23 AM Hongshun Wang <
> loserwang1...@gmail.com>
> > > wrote:
> > >
> > > > "If all new messages in old partitions should be consumed, all new
> > > messages
> > > > in new partitions should also be consumed."
> > > >
> > > > Sorry, I wrote the last sentence incorrectly.
> > > >
> > > > On Fri, Mar 24, 2023 at 11:15 AM Hongshun Wang <
> > loserwang1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Shammon,
> > > > >
> > > > > Thanks for your advise!  I learn a lot about
> > TIMESTAMP/SPECIFIC_OFFSET.
> > > > > That's interesting.
> > > > >
> > > > > However, I have a different opinion.
> > > > >
> > > > > If a user employs the SPECIFIC_OFFSET strategy and enables
> > > > auto-discovery,
> > > > > they will be able to find new partitions beyond the specified
> offset.
> > > > > Otherwise, enabling auto-discovery is no sense.
> > > > >
> > > > > When it comes to the TIMESTAMP strategy, it seems to be trivial. I
> > > > > understand your 

Re: [ANNOUNCE] Kafka Connector Code Removal from apache/flink:main branch and code freezing

2023-03-27 Thread Tzu-Li (Gordon) Tai
Thanks for the updates.

So far the above mentioned issues seem to all have PRs against
apache/flink-connector-kafka now.

To be clear, this notice isn't about discussing _what_ PRs we will be
merging or not merging - we should try to review all of them eventually.
The only reason I've made a list of PRs in the original notice is just to
make it visible which PRs we need to reopen against
apache/flink-connector-kafka due to the code removal.

Thanks,
Gordon

On Sun, Mar 26, 2023 at 7:07 PM Jacky Lau  wrote:

> Hi Gordon. https://issues.apache.org/jira/browse/FLINK-31006, which is
> also
> a critical bug in kafka. it will not exit after all partitions consumed
> when jobmanager failover in pipeline mode running unbounded source. and i
> talked with   @PatrickRen <https://github.com/PatrickRen> offline, don't
> have a suitable way to fix it before. and we will solved it in this week
>
> Shammon FY  于2023年3月25日周六 13:13写道:
>
> > Thanks Jing and Gordon, I have closed the pr
> > https://github.com/apache/flink/pull/21965 and will open a new one for
> > kafka connector
> >
> >
> > Best,
> > shammon FY
> >
> >
> > On Saturday, March 25, 2023, Ran Tao  wrote:
> >
> > > Thank you Gordon and all the people who have worked on the externalized
> > > kafka implementation.
> > > I have another pr related to Kafka[1]. I will be very appreciative if
> you
> > > can help me review it in your free time.
> > >
> > > [1] https://github.com/apache/flink-connector-kafka/pull/10
> > >
> > > Best Regards,
> > > Ran Tao
> > >
> > >
> > > Tzu-Li (Gordon) Tai  于2023年3月24日周五 23:21写道:
> > >
> > > > Thanks Jing! I missed https://github.com/apache/flink/pull/21965
> > indeed.
> > > >
> > > > Please let us know if anything else was overlooked.
> > > >
> > > > On Fri, Mar 24, 2023 at 8:13 AM Jing Ge 
> > > > wrote:
> > > >
> > > > > Thanks Gordon for driving this! There is another PR related to
> Kafka
> > > > > connector: https://github.com/apache/flink/pull/21965
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Fri, Mar 24, 2023 at 4:06 PM Tzu-Li (Gordon) Tai <
> > > tzuli...@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Now that Flink 1.17 has been released, and given that we've
> already
> > > > > synced
> > > > > > the latest Kafka connector code up to Flink 1.17 to the
> > > > > > apache/flink-connector-kafka repo (thanks to Mason and Martijn
> for
> > > most
> > > > > of
> > > > > > the effort!), we're now in the final step of completely removing
> > the
> > > > > Kafka
> > > > > > connector code from apache/flink:main branch, tracked by
> > FLINK-30859
> > > > [1].
> > > > > >
> > > > > > As such, we'd like to ask that no more Kafka connector changes
> gets
> > > > > merged
> > > > > > to apache/flink:main, effective now. Going forward, all Kafka
> > > connector
> > > > > PRs
> > > > > > should be opened directly against the
> apache/flink-connector-kafka:
> > > main
> > > > > > branch.
> > > > > >
> > > > > > Meanwhile, there's a couple of "dangling" Kafka connector PRs
> over
> > > the
> > > > > last
> > > > > > 2 months that is opened against apache/flink:main:
> > > > > >
> > > > > >1. [FLINK-31305] Propagate producer exceptions outside of
> > mailbox
> > > > > >executor [2]
> > > > > >2. [FLINK-31049] Add support for Kafka record headers to
> > KafkaSink
> > > > [3]
> > > > > >3. [FLINK-31262] Move kafka sql connector fat jar test to
> > > > > >SmokeKafkaITCase [4 ]
> > > > > >4. [hotfix] Add writeTimestamp option to
> > > > > >KafkaRecordSerializationSchemaBuilder [5]
> > > > > >
> > > > > > Apart from 1. [FLINK-31305] which is a critical bug and is
> already
> > in
> > > > > > review closed to being merged, for the rest we will be reaching
> out
> > > on
> > > > > the
> > > > > > PRs to ask the authors to close the PR and reopen them against
> > > > > > apache/flink-connector-kafka:main.
> > > > > >
> > > > > > Thanks,
> > > > > > Gordon
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-30859
> > > > > > [2] https://github.com/apache/flink/pull/22150
> > > > > > [3] https://github.com/apache/flink/pull/8
> > > > > > [4] https://github.com/apache/flink/pull/22060
> > > > > > [5] https://github.com/apache/flink/pull/22037
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] Kafka Connector Code Removal from apache/flink:main branch and code freezing

2023-03-24 Thread Tzu-Li (Gordon) Tai
Thanks Jing! I missed https://github.com/apache/flink/pull/21965 indeed.

Please let us know if anything else was overlooked.

On Fri, Mar 24, 2023 at 8:13 AM Jing Ge  wrote:

> Thanks Gordon for driving this! There is another PR related to Kafka
> connector: https://github.com/apache/flink/pull/21965
>
> Best regards,
> Jing
>
> On Fri, Mar 24, 2023 at 4:06 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi all,
> >
> > Now that Flink 1.17 has been released, and given that we've already
> synced
> > the latest Kafka connector code up to Flink 1.17 to the
> > apache/flink-connector-kafka repo (thanks to Mason and Martijn for most
> of
> > the effort!), we're now in the final step of completely removing the
> Kafka
> > connector code from apache/flink:main branch, tracked by FLINK-30859 [1].
> >
> > As such, we'd like to ask that no more Kafka connector changes gets
> merged
> > to apache/flink:main, effective now. Going forward, all Kafka connector
> PRs
> > should be opened directly against the apache/flink-connector-kafka:main
> > branch.
> >
> > Meanwhile, there's a couple of "dangling" Kafka connector PRs over the
> last
> > 2 months that is opened against apache/flink:main:
> >
> >1. [FLINK-31305] Propagate producer exceptions outside of mailbox
> >executor [2]
> >2. [FLINK-31049] Add support for Kafka record headers to KafkaSink [3]
> >3. [FLINK-31262] Move kafka sql connector fat jar test to
> >SmokeKafkaITCase [4 ]
> >4. [hotfix] Add writeTimestamp option to
> >KafkaRecordSerializationSchemaBuilder [5]
> >
> > Apart from 1. [FLINK-31305] which is a critical bug and is already in
> > review closed to being merged, for the rest we will be reaching out on
> the
> > PRs to ask the authors to close the PR and reopen them against
> > apache/flink-connector-kafka:main.
> >
> > Thanks,
> > Gordon
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-30859
> > [2] https://github.com/apache/flink/pull/22150
> > [3] https://github.com/apache/flink/pull/8
> > [4] https://github.com/apache/flink/pull/22060
> > [5] https://github.com/apache/flink/pull/22037
> >
>


[ANNOUNCE] Kafka Connector Code Removal from apache/flink:main branch and code freezing

2023-03-24 Thread Tzu-Li (Gordon) Tai
Hi all,

Now that Flink 1.17 has been released, and given that we've already synced
the latest Kafka connector code up to Flink 1.17 to the
apache/flink-connector-kafka repo (thanks to Mason and Martijn for most of
the effort!), we're now in the final step of completely removing the Kafka
connector code from apache/flink:main branch, tracked by FLINK-30859 [1].

As such, we'd like to ask that no more Kafka connector changes get merged
to apache/flink:main, effective now. Going forward, all Kafka connector PRs
should be opened directly against the apache/flink-connector-kafka:main
branch.

Meanwhile, there are a couple of "dangling" Kafka connector PRs from the last
2 months that are open against apache/flink:main:

   1. [FLINK-31305] Propagate producer exceptions outside of mailbox
   executor [2]
   2. [FLINK-31049] Add support for Kafka record headers to KafkaSink [3]
   3. [FLINK-31262] Move kafka sql connector fat jar test to
   SmokeKafkaITCase [4]
   4. [hotfix] Add writeTimestamp option to
   KafkaRecordSerializationSchemaBuilder [5]

Apart from 1. [FLINK-31305] which is a critical bug and is already in
review, close to being merged; for the rest we will be reaching out on the
PRs to ask the authors to close the PR and reopen them against
apache/flink-connector-kafka:main.

Thanks,
Gordon

[1] https://issues.apache.org/jira/browse/FLINK-30859
[2] https://github.com/apache/flink/pull/22150
[3] https://github.com/apache/flink/pull/8
[4] https://github.com/apache/flink/pull/22060
[5] https://github.com/apache/flink/pull/22037


Re: [VOTE] Release flink-connector-parent 1.0.0, release candidate #1

2023-03-16 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Verified hashes and signature
- Read web PR
- Staging Maven artifacts looks good

Thanks,
Gordon

On Thu, Mar 16, 2023 at 1:24 PM Martijn Visser 
wrote:

> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Verified web PR
>
> On Thu, Mar 16, 2023 at 2:51 PM Danny Cranmer 
> wrote:
>
> > Thanks for driving this release, Chesnay.
> >
> > +1 (binding)
> >
> > - Verified signatures and hashes of source archive
> > - Maven repo looks good
> > - Passes maven "build"
> > - Tag present in github
> > - Reviewed web PR
> >
> > Thanks,
> > Danny
> >
> >
> > On Thu, Mar 16, 2023 at 8:55 AM Etienne Chauchot 
> > wrote:
> >
> > > Hi all,
> > >
> > > - checked the release notes
> > >
> > > - tested the connector-parent with under development Cassandra
> connector.
> > >
> > > +1 (non-biding)
> > >
> > > Best
> > >
> > > Etienne
> > >
> > > Le 15/03/2023 à 16:40, Chesnay Schepler a écrit :
> > > > Hi everyone,
> > > > Please review and vote on the release candidate, as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > This is the first release of the flink-connector-parent pom by the
> > > > Flink project. This subsumes the previous release that I made myself.
> > > >
> > > > A few minor changes have been made; see the release notes.
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release to be deployed to
> dist.apache.org
> > > > [2], which are signed with the key with fingerprint 11D464BA [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag [5],
> > > > * website pull request listing the new release [6].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Chesnay
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352762
> > > > [2]
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.0.0-rc1/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > >
> https://repository.apache.org/content/repositories/orgapacheflink-1597
> > > > [5]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.0.0-rc1
> > > > [6] https://github.com/apache/flink-web/pull/620
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-02-10 Thread Tzu-Li (Gordon) Tai
re: @Mason about a pluggable error handling strategy

+1 to that, we've had users ask for that in the past for other connectors
as well, e.g. Elasticsearch.
As long as the exposure doesn't lend itself to potentially breaking
processing semantics, I don't see why not!
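
For illustration, one minimal shape such a pluggable error-handling strategy could take is the consecutive-failure threshold Mason described earlier in the thread. This is a purely hypothetical sketch — every name below is invented and none of it reflects Flink's actual code:

```java
// Hypothetical sketch of tolerating transient partition-discovery failures
// by counting consecutive errors. All names are invented for illustration;
// this is not Flink's actual implementation.
public class DiscoveryFailureTracker {
    private final int maxConsecutiveFailures;
    private int consecutiveFailures = 0;

    public DiscoveryFailureTracker(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    // Record a failed discovery attempt; returns true once the threshold
    // of consecutive failures is exceeded and the job should fail.
    public boolean onFailure() {
        consecutiveFailures++;
        return consecutiveFailures > maxConsecutiveFailures;
    }

    // Any successful discovery resets the counter, so only sustained
    // outages (not transient hiccups) surface as job failures.
    public void onSuccess() {
        consecutiveFailures = 0;
    }

    public static void main(String[] args) {
        DiscoveryFailureTracker tracker = new DiscoveryFailureTracker(2);
        System.out.println(tracker.onFailure()); // false: 1 <= 2
        System.out.println(tracker.onFailure()); // false: 2 <= 2
        System.out.println(tracker.onFailure()); // true: 3 > 2
    }
}
```

The reset-on-success behavior is the part that keeps this from masking genuine outages: a broker that stays unreachable still fails the job after the threshold.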

On Fri, Feb 10, 2023 at 1:54 PM Tzu-Li (Gordon) Tai 
wrote:

> re: @Mason about TimestampOffsetInitializer default offset reset strategy
>
> Should the OffsetsInitializer even be respected for partitions from new
> discoveries? By "new discoveries", I mean partitions that are discovered
> outside of the initial metadata pull on job startup.
>
> I had the impression that we should be differentiating between 1) initial
> discovery after job startup without restore, and 2) discovery of restored
> partitions written in Flink state at job restore time, and 3) dynamic
> discoveries post restore / startup phases.
>
> For each case, the suitable strategies for initializing offsets, and for
> resetting offsets in the out-of-range case, differ:
> For case 1), we should respect the user-configured OffsetsInitializer
> strategy when initializing. And it makes sense that the default reset
> strategy is LATEST, to align with Kafka's default.
> For case 2), obviously we always initialize partition offsets based on the
> checkpointed offsets. It's probably debatable whether the reset strategy
> should be EARLIEST or LATEST, but intuitively it should be EARLIEST, to
> minimize data loss.
> For case 3), the initialization strategy should always be EARLIEST, given
> the inherent delay of poll-based periodic discovery. I don't think offset
> resets should ever occur here, as we would never bump into an out-of-range
> situation.
>
> TLDR, the user-configured OffsetsInitializer should only be respected when
> starting a fresh job without checkpoints, and only for the first set of
> partitions that already exist at startup time.
> In all other scenarios, we require dedicated handling, which we probably
> shouldn't expose to the user for configuration.
>
> On Fri, Feb 10, 2023 at 1:34 PM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Joined the discussion a bit late, but just want to add in my +1 as well
>> :-)
>>
>> Historically, when dynamic partition discovery was implemented in the
>> earlier versions of the FlinkKafkaConsumer, it was implemented such that
>> multiple source subtasks would in parallel query Kafka brokers for
>> topic/partition metadata at the configured discovery interval. There were
>> concerns about this at a larger scale, hence the feature was disabled by
>> default.
>>
>> I don't see any reason not to enable this by default for the latest
>> implementations of the KafkaSourceEnumerator.
>>
>> That being said - this is essentially a breaking user-facing change in
>> that it has functional side effects, but I don't see any way of introducing
>> this without a breaking change either.
>>
>> I imagine that the group of users that are most likely to be caught by
>> surprise are users who use regex topic pattern subscription, but did not
>> enable partition discovery. We need to be diligent in documenting
>> (including releasing blog posts) about this change.
>>
>> removed partition handling is not yet added
>>
>>
>> I agree with @Qingsheng that this can be an orthogonal topic outside the
>> scope of the planned changes here as it isn't straightforward.
>>
>> On Fri, Feb 10, 2023 at 12:56 PM Martijn Visser 
>> wrote:
>>
>>> Oh and Mason, definitely interesting! :)
>>>
>>> > On Fri, Feb 10, 2023 at 9:51 PM Martijn Visser
>>> wrote:
>>>
>>> > @Qingsheng what are your next steps for this proposal?
>>> >
>>> > On Thu, Jan 19, 2023 at 9:14 AM Mason Chen 
>>> wrote:
>>> >
>>> >> Hi all,
>>> >>
>>> >> Sorry to come into the discussion late--I saw the thread earlier.
>>> >>
>>> >> I'm also +1 for the change in general. I think most users have this
>>> turned
>>> >> on by default since the overhead is quite low. A default in the two
>>> digit
>>> >> seconds range works well for us. However, I do have two main concerns
>>> that
>>> >> are related, but don't necessarily block this FLIP:
>>> >>
>>> >> 1. Timestamp Offset Initializer
>>> >>
>>> >> Currently, the timestamp offset initializer defaults the offset reset
>>> >> strategy to LATEST. This can present some problems if the discovery

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-02-10 Thread Tzu-Li (Gordon) Tai
re: @Mason about TimestampOffsetInitializer default offset reset strategy

Should the OffsetsInitializer even be respected for partitions from new
discoveries? By "new discoveries", I mean partitions that are discovered
outside of the initial metadata pull on job startup.

I had the impression that we should be differentiating between 1) initial
discovery after job startup without restore, and 2) discovery of restored
partitions written in Flink state at job restore time, and 3) dynamic
discoveries post restore / startup phases.

For each case, the suitable strategies for initializing offsets, and for
resetting offsets in the out-of-range case, differ:
For case 1), we should respect the user-configured OffsetsInitializer
strategy when initializing. And it makes sense that the default reset
strategy is LATEST, to align with Kafka's default.
For case 2), obviously we always initialize partition offsets based on the
checkpointed offsets. It's probably debatable whether the reset strategy
should be EARLIEST or LATEST, but intuitively it should be EARLIEST, to
minimize data loss.
For case 3), the initialization strategy should always be EARLIEST, given
the inherent delay of poll-based periodic discovery. I don't think offset
resets should ever occur here, as we would never bump into an out-of-range
situation.

TLDR, the user-configured OffsetsInitializer should only be respected when
starting a fresh job without checkpoints, and only for the first set of
partitions that already exist at startup time.
In all other scenarios, we require dedicated handling, which we probably
shouldn't expose to the user for configuration.
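
As a concrete reference for case 1, the user-facing knob being discussed is the `OffsetsInitializer` passed to the source builder. A hedged sketch of what that configuration surface looks like — broker address and topic name are placeholders, and this shows only the user-facing API, not the enumerator's internal case handling:

```java
// Sketch of case 1: the user-configured OffsetsInitializer applies only to
// a fresh start without checkpoints, for partitions existing at startup.
// Bootstrap servers and topic name are illustrative placeholders.
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class OffsetsExample {
    public static void main(String[] args) {
        KafkaSource<String> source =
                KafkaSource.<String>builder()
                        .setBootstrapServers("localhost:9092")
                        .setTopics("my-topic")
                        // Respected only on a fresh start (case 1); restored
                        // jobs (case 2) use checkpointed offsets, and newly
                        // discovered partitions (case 3) start from EARLIEST.
                        .setStartingOffsets(OffsetsInitializer.earliest())
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .build();
    }
}
```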

On Fri, Feb 10, 2023 at 1:34 PM Tzu-Li (Gordon) Tai 
wrote:

> Joined the discussion a bit late, but just want to add in my +1 as well :-)
>
> Historically, when dynamic partition discovery was implemented in the
> earlier versions of the FlinkKafkaConsumer, it was implemented such that
> multiple source subtasks would in parallel query Kafka brokers for
> topic/partition metadata at the configured discovery interval. There were
> concerns about this at a larger scale, hence the feature was disabled by
> default.
>
> I don't see any reason not to enable this by default for the latest
> implementations of the KafkaSourceEnumerator.
>
> That being said - this is essentially a breaking user-facing change in
> that it has functional side effects, but I don't see any way of introducing
> this without a breaking change either.
>
> I imagine that the group of users that are most likely to be caught by
> surprise are users who use regex topic pattern subscription, but did not
> enable partition discovery. We need to be diligent in documenting
> (including releasing blog posts) about this change.
>
> removed partition handling is not yet added
>
>
> I agree with @Qingsheng that this can be an orthogonal topic outside the
> scope of the planned changes here as it isn't straightforward.
>
> On Fri, Feb 10, 2023 at 12:56 PM Martijn Visser 
> wrote:
>
>> Oh and Mason, definitely interesting! :)
>>
>> On Fri, Feb 10, 2023 at 9:51 PM Martijn Visser 
>> wrote:
>>
>> > @Qingsheng what are your next steps for this proposal?
>> >
>> > On Thu, Jan 19, 2023 at 9:14 AM Mason Chen 
>> wrote:
>> >
>> >> Hi all,
>> >>
>> >> Sorry to come into the discussion late--I saw the thread earlier.
>> >>
>> >> I'm also +1 for the change in general. I think most users have this
>> turned
>> >> on by default since the overhead is quite low. A default in the two
>> digit
>> >> seconds range works well for us. However, I do have two main concerns
>> that
>> >> are related, but don't necessarily block this FLIP:
>> >>
>> >> 1. Timestamp Offset Initializer
>> >>
>> >> Currently, the timestamp offset initializer defaults the offset reset
>> >> strategy to LATEST. This can present some problems if the discovery
>> >> interval is set too large since records from new partitions could be
>> >> skipped (the set timestamp is not found in Kafka, thus resetting to the
>> >> latest). Here is a ticket to allow customizations:
>> >> https://issues.apache.org/jira/browse/FLINK-30200 (Qingsheng, you
>> might
>> >> remember this from a PR review). Thanks for mentioning this in your
>> FLIP!
>> >>
>> >> 2. AdminClient Fault Tolerance
>> >>
>> >> AdminClient, which is used for partition discovery, seems not to handle
>> >> Kafka timeouts as robustly as the KafkaConsumer API, and we have
>> noticed
>> >> that tran

Re: [DISCUSS] Enabling dynamic partition discovery by default in Kafka source

2023-02-10 Thread Tzu-Li (Gordon) Tai
Joined the discussion a bit late, but just want to add in my +1 as well :-)

Historically, when dynamic partition discovery was implemented in the
earlier versions of the FlinkKafkaConsumer, it was implemented such that
multiple source subtasks would in parallel query Kafka brokers for
topic/partition metadata at the configured discovery interval. There were
concerns about this at a larger scale, hence the feature was disabled by
default.

I don't see any reason not to enable this by default for the latest
implementations of the KafkaSourceEnumerator.

That being said - this is essentially a breaking user-facing change in that
it has functional side effects, but I don't see any way of introducing this
without a breaking change either.

I imagine that the group of users that are most likely to be caught by
surprise are users who use regex topic pattern subscription, but did not
enable partition discovery. We need to be diligent in documenting
(including releasing blog posts) about this change.

removed partition handling is not yet added


I agree with @Qingsheng that this can be an orthogonal topic outside the
scope of the planned changes here as it isn't straightforward.
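
For context, on the new KafkaSource the discovery interval is a connector property rather than a builder method. Explicitly enabling it — which matters especially for the regex topic-pattern users mentioned above — looks roughly like the following sketch; broker address, pattern, and the 10-second interval are made-up values:

```java
// Sketch: explicitly enabling dynamic partition discovery on the new
// KafkaSource via the partition.discovery.interval.ms property. The
// pattern, servers, and interval value are illustrative only.
import java.util.regex.Pattern;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.KafkaSourceOptions;

public class DiscoveryExample {
    public static void main(String[] args) {
        KafkaSource<String> source =
                KafkaSource.<String>builder()
                        .setBootstrapServers("localhost:9092")
                        // Regex subscription only picks up newly created
                        // topics if periodic discovery is enabled.
                        .setTopicPattern(Pattern.compile("events-.*"))
                        .setProperty(
                                KafkaSourceOptions.PARTITION_DISCOVERY_INTERVAL_MS.key(),
                                "10000")
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .build();
    }
}
```

Unlike the legacy FlinkKafkaConsumer, only the enumerator on the JobManager polls for metadata here, which is why the earlier scale concern no longer applies.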

On Fri, Feb 10, 2023 at 12:56 PM Martijn Visser 
wrote:

> Oh and Mason, definitely interesting! :)
>
> On Fri, Feb 10, 2023 at 9:51 PM Martijn Visser 
> wrote:
>
> > @Qingsheng what are your next steps for this proposal?
> >
> > On Thu, Jan 19, 2023 at 9:14 AM Mason Chen 
> wrote:
> >
> >> Hi all,
> >>
> >> Sorry to come into the discussion late--I saw the thread earlier.
> >>
> >> I'm also +1 for the change in general. I think most users have this
> turned
> >> on by default since the overhead is quite low. A default in the two
> digit
> >> seconds range works well for us. However, I do have two main concerns
> that
> >> are related, but don't necessarily block this FLIP:
> >>
> >> 1. Timestamp Offset Initializer
> >>
> >> Currently, the timestamp offset initializer defaults the offset reset
> >> strategy to LATEST. This can present some problems if the discovery
> >> interval is set too large since records from new partitions could be
> >> skipped (the set timestamp is not found in Kafka, thus resetting to the
> >> latest). Here is a ticket to allow customizations:
> >> https://issues.apache.org/jira/browse/FLINK-30200 (Qingsheng, you might
> >> remember this from a PR review). Thanks for mentioning this in your
> FLIP!
> >>
> >> 2. AdminClient Fault Tolerance
> >>
> >> AdminClient, which is used for partition discovery, seems not to handle
> >> Kafka timeouts as robustly as the KafkaConsumer API, and we have noticed
> >> that transient network hiccups cause full job restarts (since the
> >> jobmanager fails) in numerous incidents. Internally, we have introduced
> an
> >> error handling strategy based on the number of consecutive partition
> >> discovery failures. I'm interested in opening a JIRA ticket to
> >> contribute this feature back to Flink and to make the error handling
> >> more pluggable. What do you think?
> >>
> >> Best,
> >> Mason
> >>
> >> On Sun, Jan 15, 2023 at 11:39 PM Qingsheng Ren 
> wrote:
> >>
> >> > Thanks for the input Becket!
> >> >
> >> > I reorganized this proposal into FLIP-288 [1].
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source
> >> >
> >> > Best,
> >> > Qingsheng
> >> >
> >> > On Sun, Jan 15, 2023 at 9:18 AM Becket Qin 
> >> wrote:
> >> >
> >> > > Thanks for the proposal, Qingsheng.
> >> > >
> >> > > +1 to enable auto partition discovery by default. Just a reminder,
> we
> >> > need
> >> > > a FLIP for this.
> >> > >
> >> > > A bit more background on this.
> >> > >
> >> > > Most Kafka users simply subscribe to a topic and let the consumer
> >> > > automatically adapt to partition changes. So enabling auto partition
> >> > > discovery would align with that experience. The counter-argument,
> >> > > the last time I proposed enabling auto partition discovery, was
> >> > > mainly concern from Flink users: there were arguments that sometimes
> >> > > users don't want partition changes to be picked up automatically,
> >> > > but rather want to pick them up by restarting the job manually, so
> >> > > they can avoid unnoticed changes in their jobs.
> >> > >
> >> > > Given that in the old Flink source, by default the auto partition
> >> > discovery
> >> > > was disabled, and there are use cases from both sides, we simply
> kept
> >> the
> >> > > behavior unchanged. From the discussion we have here, it looks like
> >> > > enabling auto partition discovery is much preferred. So I think we
> >> should
> >> > > do it.
> >> > >
> >> > > I am not worried about the performance. The new Kafka source will
> only
> >> > have
> >> > > the SplitEnumerator sending metadata requests when the feature is
> >> > enabled.
> >> > > It is actually much 

Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-02-09 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Verified legals (license headers and root LICENSE / NOTICE file). AFAICT
no dependencies require explicit acknowledgement in the NOTICE files.
- No binaries in staging area
- Built source with tests
- Verified signatures and hashes
- Web PR changes LGTM

Thanks Martijn!

Cheers,
Gordon

On Mon, Feb 6, 2023 at 6:12 PM Mason Chen  wrote:

> That makes sense, thanks for the clarification!
>
> Best,
> Mason
>
> On Wed, Feb 1, 2023 at 7:16 AM Martijn Visser 
> wrote:
>
> > Hi Mason,
> >
> > Thanks, [4] is indeed a copy-paste error and you've made the right
> > assumption that
> >
> >
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> > is the correct maven central link.
> >
> > I think we should use FLINK-30052 to move the Kafka connector code from
> > the 1.17 release over to the Kafka connector repo as well (especially
> > since there's now a v3.0 branch for the Kafka connector, so it can be
> > merged in main). When those commits have been merged, we can make the
> > next Kafka connector release (which is equivalent to the 1.17 release,
> > and which can only be done once 1.17 is done because of the split-level
> > watermark alignment), and then FLINK-30859 can be finished.
> >
> > Best regards,
> >
> > Martijn
> >
> > Op wo 1 feb. 2023 om 09:16 schreef Mason Chen :
> >
> > > +1 (non-binding)
> > >
> > > * Verified hashes and signatures
> > > * Verified no binaries
> > > * Verified LICENSE and NOTICE files
> > > * Verified poms point to 3.0.0-1.16
> > > * Reviewed web PR
> > > * Built from source
> > > * Verified git tag
> > >
> > > I think your [4] is a copy-paste error and I did all the verification
> > > assuming that
> > >
> > >
> >
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> > > is the correct maven central link.
> > >
> > > Regarding the release notes, should we close
> > > https://issues.apache.org/jira/browse/FLINK-30052 and link it there?
> > I've
> > > created https://issues.apache.org/jira/browse/FLINK-30859 to remove
> the
> > > existing code from the master branch.
> > >
> > > Best,
> > > Mason
> > >
> > > On Tue, Jan 31, 2023 at 6:23 AM Martijn Visser <
> martijnvis...@apache.org
> > >
> > > wrote:
> > >
> > > > Hi everyone,
> > > > Please review and vote on the release candidate #1 for
> > > > flink-connector-kafka version 3.0.0, as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > Note: this is the same code as the Kafka connector for the Flink 1.16
> > > > release.
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release to be deployed to
> dist.apache.org
> > > > [2],
> > > > which are signed with the key with fingerprint
> > > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag v3.0.0-rc1 [5],
> > > > * website pull request listing the new release [6].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Release Manager
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352577
> > > > [2]
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc1
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc1/
> > > > [5]
> > > >
> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.0-rc1
> > > > [6] https://github.com/apache/flink-web/pull/606
> > > >
> > >
> >
>


Re: Stateful Functions with Flink 1.15 and onwards

2022-11-04 Thread Tzu-Li (Gordon) Tai
@Galen The next step is essentially for someone to be the release manager
and drive the release process for StateFun 3.3.0 [1]. The document for the
release process might be slightly outdated in some places, but overall
outlines the process pretty clearly.

As I mentioned earlier, there are quite a few steps in the process that
require committer write access, so it would be easier if a committer picks
this up as the release manager.
I'll be happy to take this on, but I'll only have availability after next
week. If someone else is willing to take this on earlier it would certainly
be better to get the release out ASAP.

[1]
https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Stateful+Functions+Release

On Fri, Nov 4, 2022 at 5:54 AM Galen Warren  wrote:

> Thanks Gordon.
>
> What is the next step here?
>
> On Thu, Nov 3, 2022 at 1:45 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > FYI, release-3.3 branch has been cut and is ready for the release process
> > for StateFun 3.3.0:
> > https://github.com/apache/flink-statefun/tree/release-3.3
> >
> > On Tue, Nov 1, 2022 at 10:21 AM Tzu-Li (Gordon) Tai
> > wrote:
> >
> >> Btw, I'll assume that we're using this thread to gather consensus for
> >> code-freezing for 3.3.x series of StateFun. I know there hasn't been
> much
> >> activity on the repo, so this is just a formality really :)
> >>
> >> From the commit history, it looks like we're mainly including the below
> >> major changes and bug fixes for 3.3.x:
> >> - Flink upgrade to 1.15.2
> >> - https://issues.apache.org/jira/browse/FLINK-26340
> >> - https://issues.apache.org/jira/browse/FLINK-25866
> >> - https://issues.apache.org/jira/browse/FLINK-25936
> >> - https://issues.apache.org/jira/browse/FLINK-25933
> >>
> >> I'll wait for 24 hours before cutting the release branch for 3.3.x,
> >> unless anyone raises any objections before that.
> >>
> >> Thanks,
> >> Gordon
> >>
> >> On Tue, Nov 1, 2022 at 10:09 AM Galen Warren 
> >> wrote:
> >>
> >>> Thanks Gordon and Filip, I appreciate your help on this one.
> >>>
> >>> On Tue, Nov 1, 2022 at 1:07 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> >>> wrote:
> >>>
> >>>> PR for upgrading to Flink 1.15.2 has been merged. Thanks for the
> >>>> efforts,
> >>>> Galen and Filip!
> >>>>
> >>>> We should be ready to kick off a new release for StateFun with the
> Flink
> >>>> version upgrade.
> >>>> I'll cut off a release branch now on apache/flink-statefun for
> >>>> release-3.3.x to move things forward.
> >>>> @Galen, @Filip if you want to, after the release branch is cut, you
> >>>> could
> >>>> probably upgrade the master branch to Flink 1.16.x as well.
> >>>>
> >>>> Afterwards we should decide who is available to drive the actual
> release
> >>>> process for 3.3.0.
> >>>> There's quite a few steps that would require committer write access.
> >>>> Unless someone else is up for this earlier, I'll have some
> availability
> >>>> towards the end of next week to help drive this.
> >>>>
> >>>> Thanks,
> >>>> Gordon
> >>>>
> >>>> On Mon, Oct 31, 2022 at 12:17 PM Galen Warren <
> ga...@cvillewarrens.com>
> >>>> wrote:
> >>>>
> >>>> > Yes, that makes sense.
> >>>> >
> >>>> > PR is here: [FLINK-29814][statefun] Change supported Flink version
> to
> >>>> > 1.15.2 by galenwarren · Pull Request #319 · apache/flink-statefun
> >>>> > (github.com) <https://github.com/apache/flink-statefun/pull/319>.
> >>>> >
> >>>> > On Mon, Oct 31, 2022 at 11:35 AM Till Rohrmann <
> trohrm...@apache.org>
> >>>> > wrote:
> >>>> >
> >>>> > > I think there might still be value in supporting 1.15 since not
> >>>> everyone
> >>>> > > upgrades Flink very fast. Hopefully, for Statefun the diff between
> >>>> Flink
> >>>> > > 1.15 and 1.16 boils down to changing the Flink dependencies.
> >>>> > >
> >>>> > > Cheers,
> >>>> > > Till
> >>>> > >
> >>>> > > On Mon, Oct 31, 2022 at 2:06 PM Galen Warren <

Re: Stateful Functions with Flink 1.15 and onwards

2022-11-03 Thread Tzu-Li (Gordon) Tai
FYI, release-3.3 branch has been cut and is ready for the release process
for StateFun 3.3.0:
https://github.com/apache/flink-statefun/tree/release-3.3

On Tue, Nov 1, 2022 at 10:21 AM Tzu-Li (Gordon) Tai 
wrote:

> Btw, I'll assume that we're using this thread to gather consensus for
> code-freezing for 3.3.x series of StateFun. I know there hasn't been much
> activity on the repo, so this is just a formality really :)
>
> From the commit history, it looks like we're mainly including the below
> major changes and bug fixes for 3.3.x:
> - Flink upgrade to 1.15.2
> - https://issues.apache.org/jira/browse/FLINK-26340
> - https://issues.apache.org/jira/browse/FLINK-25866
> - https://issues.apache.org/jira/browse/FLINK-25936
> - https://issues.apache.org/jira/browse/FLINK-25933
>
> I'll wait for 24 hours before cutting the release branch for 3.3.x, unless
> anyone raises any objections before that.
>
> Thanks,
> Gordon
>
> On Tue, Nov 1, 2022 at 10:09 AM Galen Warren 
> wrote:
>
>> Thanks Gordon and Filip, I appreciate your help on this one.
>>
>> On Tue, Nov 1, 2022 at 1:07 PM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> PR for upgrading to Flink 1.15.2 has been merged. Thanks for the efforts,
>>> Galen and Filip!
>>>
>>> We should be ready to kick off a new release for StateFun with the Flink
>>> version upgrade.
>>> I'll cut off a release branch now on apache/flink-statefun for
>>> release-3.3.x to move things forward.
>>> @Galen, @Filip if you want to, after the release branch is cut, you could
>>> probably upgrade the master branch to Flink 1.16.x as well.
>>>
>>> Afterwards we should decide who is available to drive the actual release
>>> process for 3.3.0.
>>> There's quite a few steps that would require committer write access.
>>> Unless someone else is up for this earlier, I'll have some availability
>>> towards the end of next week to help drive this.
>>>
>>> Thanks,
>>> Gordon
>>>
>>> On Mon, Oct 31, 2022 at 12:17 PM Galen Warren 
>>> wrote:
>>>
>>> > Yes, that makes sense.
>>> >
>>> > PR is here: [FLINK-29814][statefun] Change supported Flink version to
>>> > 1.15.2 by galenwarren · Pull Request #319 · apache/flink-statefun
>>> > (github.com) <https://github.com/apache/flink-statefun/pull/319>.
>>> >
>>> > On Mon, Oct 31, 2022 at 11:35 AM Till Rohrmann 
>>> > wrote:
>>> >
>>> > > I think there might still be value in supporting 1.15 since not
>>> everyone
>>> > > upgrades Flink very fast. Hopefully, for Statefun the diff between
>>> Flink
>>> > > 1.15 and 1.16 boils down to changing the Flink dependencies.
>>> > >
>>> > > Cheers,
>>> > > Till
>>> > >
>>> > > On Mon, Oct 31, 2022 at 2:06 PM Galen Warren <
>>> ga...@cvillewarrens.com>
>>> > > wrote:
>>> > >
>>> > >> Sure thing. One question -- Flink 1.16 was just released a few days
>>> ago.
>>> > >> Should I support 1.15, or just go straight to 1.16?
>>> > >>
> >>> > >> On Mon, Oct 31, 2022 at 8:49 AM Till Rohrmann
>>> > >> wrote:
>>> > >>
>>> > >>> Hi folks,
>>> > >>>
>>> > >>> if you can open a PR for supporting Flink 1.15 Galen, then this
>>> would
>>> > be
>>> > >>> awesome. I've assigned you to this ticket. The next thing after
>>> merging
>>> > >>> this PR would be creating a new StateFun release. Once we have
>>> merged
>>> > the
>>> > >>> PR, let's check who can help with it the fastest.
>>> > >>>
>>> > >>> Cheers,
>>> > >>> Till
>>> > >>>
>>> > >>> On Mon, Oct 31, 2022 at 1:10 PM Galen Warren <
>>> ga...@cvillewarrens.com>
>>> > >>> wrote:
>>> > >>>
>>> > >>>> Yes, I could do that.
>>> > >>>>
>>> > >>>> On Mon, Oct 31, 2022 at 7:48 AM Filip Karnicki <
>>> > >>>> filip.karni...@gmail.com> wrote:
>>> > >>>>
>>> > >>>>> Hi All
>>> > >>>>>
>>> > >>>>> So what's the play here?

Re: Stateful Functions with Flink 1.15 and onwards

2022-11-01 Thread Tzu-Li (Gordon) Tai
Btw, I'll assume that we're using this thread to gather consensus for
code-freezing for 3.3.x series of StateFun. I know there hasn't been much
activity on the repo, so this is just a formality really :)

From the commit history, it looks like we're mainly including the below
major changes and bug fixes for 3.3.x:
- Flink upgrade to 1.15.2
- https://issues.apache.org/jira/browse/FLINK-26340
- https://issues.apache.org/jira/browse/FLINK-25866
- https://issues.apache.org/jira/browse/FLINK-25936
- https://issues.apache.org/jira/browse/FLINK-25933

I'll wait for 24 hours before cutting the release branch for 3.3.x, unless
anyone raises any objections before that.

Thanks,
Gordon

On Tue, Nov 1, 2022 at 10:09 AM Galen Warren 
wrote:

> Thanks Gordon and Filip, I appreciate your help on this one.
>
> On Tue, Nov 1, 2022 at 1:07 PM Tzu-Li (Gordon) Tai 
> wrote:
>
>> PR for upgrading to Flink 1.15.2 has been merged. Thanks for the efforts,
>> Galen and Filip!
>>
>> We should be ready to kick off a new release for StateFun with the Flink
>> version upgrade.
>> I'll cut off a release branch now on apache/flink-statefun for
>> release-3.3.x to move things forward.
>> @Galen, @Filip if you want to, after the release branch is cut, you could
>> probably upgrade the master branch to Flink 1.16.x as well.
>>
>> Afterwards we should decide who is available to drive the actual release
>> process for 3.3.0.
>> There's quite a few steps that would require committer write access.
>> Unless someone else is up for this earlier, I'll have some availability
>> towards the end of next week to help drive this.
>>
>> Thanks,
>> Gordon
>>
>> On Mon, Oct 31, 2022 at 12:17 PM Galen Warren 
>> wrote:
>>
>> > Yes, that makes sense.
>> >
>> > PR is here: [FLINK-29814][statefun] Change supported Flink version to
>> > 1.15.2 by galenwarren · Pull Request #319 · apache/flink-statefun
>> > (github.com) <https://github.com/apache/flink-statefun/pull/319>.
>> >
>> > On Mon, Oct 31, 2022 at 11:35 AM Till Rohrmann 
>> > wrote:
>> >
>> > > I think there might still be value in supporting 1.15 since not
>> everyone
>> > > upgrades Flink very fast. Hopefully, for Statefun the diff between
>> Flink
>> > > 1.15 and 1.16 boils down to changing the Flink dependencies.
>> > >
>> > > Cheers,
>> > > Till
>> > >
> >> > > On Mon, Oct 31, 2022 at 2:06 PM Galen Warren
>> > > wrote:
>> > >
>> > >> Sure thing. One question -- Flink 1.16 was just released a few days
>> ago.
>> > >> Should I support 1.15, or just go straight to 1.16?
>> > >>
>> > >> On Mon, Oct 31, 2022 at 8:49 AM Till Rohrmann 
>> > >> wrote:
>> > >>
>> > >>> Hi folks,
>> > >>>
>> > >>> if you can open a PR for supporting Flink 1.15 Galen, then this
>> would
>> > be
>> > >>> awesome. I've assigned you to this ticket. The next thing after
>> merging
>> > >>> this PR would be creating a new StateFun release. Once we have
>> merged
>> > the
>> > >>> PR, let's check who can help with it the fastest.
>> > >>>
>> > >>> Cheers,
>> > >>> Till
>> > >>>
>> > >>> On Mon, Oct 31, 2022 at 1:10 PM Galen Warren <
>> ga...@cvillewarrens.com>
>> > >>> wrote:
>> > >>>
>> > >>>> Yes, I could do that.
>> > >>>>
>> > >>>> On Mon, Oct 31, 2022 at 7:48 AM Filip Karnicki <
>> > >>>> filip.karni...@gmail.com> wrote:
>> > >>>>
>> > >>>>> Hi All
>> > >>>>>
>> > >>>>> So what's the play here?
>> > >>>>>
>> > >>>>> Galen, what do you think about taking this on? Perhaps ++Till
>> would
>> > >>>>> assign this jira to you (with your permission) given he's helped
>> me
>> > out
>> > >>>>> with statefun work before
>> > >>>>> https://issues.apache.org/jira/browse/FLINK-29814
>> > >>>>>
> >>>>> I can try to move statefun to flink 1.16 when it's out
>> > >>>>>
>> > >>>>>
>> > >>>>> Kind regards

Re: Stateful Functions with Flink 1.15 and onwards

2022-11-01 Thread Tzu-Li (Gordon) Tai
; |on failure, delayed message to retry| remote
> >>>>>> remote --> |async puts/gets with side effects| other(other
> >>>>>> systems)
> >>>>>>
> >>>>>> Having the processing happen outside of Flink is nice-to-have from
> an
> >>>>>> independent scalability point of view, but is not strictly required.
> >>>>>>
> >>>>>> So long story short - no cyclic messaging, but also no way I can
> >>>>>> think of to use existing native Flink operators like async i/o
> (which when
> >>>>>> I last checked a few years back didn't have access to keyed state)
> >>>>>>
> >>>>>>
> >>>>>> P.S. Please note that there is already a pull request that has
> >>>>>> something to do wtih Flink 1.15, albeit without a description or a
> jira:
> >>>>>> https://github.com/apache/flink-statefun/pull/314
> >>>>>>
> >>>>>>
> >>>>>> On Wed, 26 Oct 2022 at 19:54, Galen Warren
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Gordon (and others),
> >>>>>>>
> >>>>>>> I'm also using this project for stateful messaging, including
> >>>>>>> messaging
> >>>>>>> among functions.
> >>>>>>>
> >>>>>>> I've contributed a small amount of code in the past and have also
> >>>>>>> enabled
> >>>>>>> Flink 1.15 compatibility in a local fork, so I might be able to
> help
> >>>>>>> out
> >>>>>>> here.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Galen
> >>>>>>>
> >>>>>>> On Wed, Oct 26, 2022 at 1:34 PM Ken Krugler <
> >>>>>>> kkrugler_li...@transpac.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> > Hi Gordon,
> >>>>>>> >
> >>>>>>> > We’re using it for stateful messaging, and also calling remote
> >>>>>>> > Python-based functions.
> >>>>>>> >
> >>>>>>> > So yes, also very interested in what is going to happen with the
> >>>>>>> this
> >>>>>>> > subproject in the future.
> >>>>>>> >
> >>>>>>> > — Ken
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>> > > Begin forwarded message:
> >>>>>>> > >
> >>>>>>> > > From: "Tzu-Li (Gordon) Tai" 
> >>>>>>> > > Subject: Re: Stateful Functions with Flink 1.15 and onwards
> >>>>>>> > > Date: October 26, 2022 at 10:25:26 AM PDT
> >>>>>>> > > To: dev@flink.apache.org
> >>>>>>> > > Reply-To: dev@flink.apache.org
> >>>>>>> > >
> >>>>>>> > > Hi Filip,
> >>>>>>> > >
> >>>>>>> > > Thanks for bringing this up.
> >>>>>>> > >
> >>>>>>> > > The hard truth is that committers who were previously active on
> >>>>>>> the
> >>>>>>> > > StateFun subproject, including myself, all currently have other
> >>>>>>> focuses.
> >>>>>>> > > Indeed, we may need to discuss with the community on how to
> >>>>>>> proceed if
> >>>>>>> > > there seems to be no continued committer coverage.
> >>>>>>> > >
> >>>>>>> > > If it's just a matter of upgrading the supported Flink version,
> >>>>>>> I'm still
> >>>>>>> > > familiar enough with the subproject to probably be able to
> drive
> >>>>>>> this (or
> >>>>>>> > > if your team is up to it, I can assist you on that).
> >>>>>>> > >
> >>>>>>> > > For the long-term, as a data point I'm curious to see how many
> >>>>>>> users are
> >>>>>>> > > using StateFun in production today, and how you're using it?
> >>>>>>> > >
> >>>>>>> > >   - Do your applications have arbitrary / cyclic /
> bi-directional
> >>>>>>> > >   messaging between individual functions?
> >>>>>>> > >   - Or are you utilizing StateFun simply to allow your stateful
> >>>>>>> functions
> >>>>>>> > >   to run remotely as separate processes?
> >>>>>>> > >
> >>>>>>> > > If the majority is only the latter category, there might be a
> >>>>>>> case to
> >>>>>>> > > support remote functions natively in Flink (which has been a
> >>>>>>> discussion
> >>>>>>> > in
> >>>>>>> > > the past).
> >>>>>>> > >
> >>>>>>> > > Thanks,
> >>>>>>> > > Gordon
> >>>>>>> > >
> >>>>>>> > > On Wed, Oct 26, 2022 at 3:30 AM Filip Karnicki <
> >>>>>>> filip.karni...@gmail.com
> >>>>>>> > >
> >>>>>>> > > wrote:
> >>>>>>> > >
> >>>>>>> > >> Hi, I noticed that the development on stateful functions has
> >>>>>>> come to a
> >>>>>>> > bit
> >>>>>>> > >> of a halt, with a pull request to update statefun to use Flink
> >>>>>>> 1.15
> >>>>>>> > being
> >>>>>>> > >> in the `open` state since May 2022.
> >>>>>>> > >>
> >>>>>>> > >> What do we think is the future of this sub-project?
> >>>>>>> > >>
> >>>>>>> > >> The background to this question is that my team is on a shared
> >>>>>>> Flink
> >>>>>>> > >> cluster which will soon be upgrading to Flink 1.15. If I need
> to
> >>>>>>> > re-write
> >>>>>>> > >> all our code as a native Flink job (rather than a remote
> >>>>>>> stateful
> >>>>>>> > function)
> >>>>>>> > >> then I need to get started right away.
> >>>>>>> > >>
> >>>>>>> > >> Many thanks,
> >>>>>>> > >> Fil
> >>>>>>> > >>
> >>>>>>> >
> >>>>>>> > --
> >>>>>>> > Ken Krugler
> >>>>>>> > http://www.scaleunlimited.com
> >>>>>>> > Custom big data solutions
> >>>>>>> > Flink, Pinot, Solr, Elasticsearch
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>>
> >>>>>>
>


Re: [VOTE] FLIP-263: Improve resolving schema compatibility

2022-10-31 Thread Tzu-Li (Gordon) Tai
+1. Thanks Hangxiang, the proposed plan looks good!

On Sun, Oct 30, 2022 at 3:38 AM Yu Li  wrote:

> +1 (binding)
>
> Thanks Hangxiang for driving this and thanks all for the thorough
> discussion.
>
> Best Regards,
> Yu
>
>
> On Fri, 28 Oct 2022 at 16:01, Dawid Wysakowicz 
> wrote:
>
> > +1,
> >
> > Best,
> >
> > Dawid
> >
> > On 28/10/2022 08:08, godfrey he wrote:
> > > +1 (binding)
> > >
> > > Thanks for driving this!
> > >
> > > Best,
> > > Godfrey
> > >
> > >> Yun Gao  wrote on Fri, Oct 28, 2022 at 13:50:
> > >> +1 (binding)
> > >>
> > >> Thanks Hangxiang for driving the FLIP.
> > >>
> > >> Best,
> > >> Yun Gao
> > >>
> > >>
> > >>
> > >>
> > >> ------ Original Mail ------
> > >> Sender:Zakelly Lan 
> > >> Send Date:Fri Oct 28 12:27:01 2022
> > >> Recipients:Flink Dev 
> > >> Subject:Re: [VOTE] FLIP-263: Improve resolving schema compatibility
> > >> Hi Hangxiang,
> > >>
> > >> The current plan looks good to me, +1 (non-binding). Thanks for
> driving
> > this.
> > >>
> > >> Best,
> > >> Zakelly
> > >>
> > >> On Fri, Oct 28, 2022 at 11:18 AM Yuan Mei 
> > wrote:
> > >>> +1 (binding)
> > >>>
> > >>> Thanks for driving this.
> > >>>
> > >>> Best
> > >>> Yuan
> > >>>
> > >>> On Fri, Oct 28, 2022 at 11:17 AM yanfei lei 
> > wrote:
> > >>>
> >  +1 (non-binding), and thanks to Hangxiang for driving this.
> > 
> > 
> > 
> >  Hangxiang Yu  wrote on Fri, Oct 28, 2022 at 09:24:
> > 
> > > Hi everyone,
> > >
> > > I'd like to start the vote for FLIP-263 [1].
> > >
> > > Thanks for your feedback and the discussion in [2][3].
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > Best regards,
> > > Hangxiang.
> > >
> > > [1]
> > >
> > >
> > 
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility
> > > [2]
> https://lists.apache.org/thread/4w36oof8dh28b9f593sgtk21o8qh8qx4
> > >
> > > [3]
> https://lists.apache.org/thread/t0bdkx1161rlbnsf06x0kswb05mch164
> > >
> > 
> >  --
> >  Best,
> >  Yanfei
> > 
> >
>


Re: [VOTE] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

2022-10-28 Thread Tzu-Li (Gordon) Tai
+1

On Fri, Oct 28, 2022 at 10:21 AM Konstantin Knauf  wrote:

> +1 (binding)
>
> Am Fr., 28. Okt. 2022 um 16:58 Uhr schrieb Piotr Nowojski <
> pnowoj...@apache.org>:
>
> > Hi,
> >
> > As discussed on the dev mailing list [0] I would like to start a vote to
> > drop support of older savepoint formats (for Flink versions older than
> > 1.8). You can find the original explanation from the aforementioned dev
> > mailing list thread at the bottom of this message.
> >
> > Draft PR containing the proposed change you can find here:
> > https://github.com/apache/flink/pull/21056
> >
> > Vote will be open at least until Wednesday, November 2nd 18:00 CET.
> >
> > Best,
> > Piotrek
> >
> > [0] https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w
> >
> > I would like to open a discussion to remove the long deprecated
> > (@PublicEvolving) TypeSerializerConfigSnapshot class [1] and the related
> > code.
> >
> > The motivation behind this move is two fold. One reason is that it
> > complicates our code base unnecessarily and creates confusion on how to
> > actually implement custom serializers. The immediate reason is that I
> > wanted to clean up Flink's configuration stack a bit and refactor the
> > ExecutionConfig class [2]. This refactor would keep the API compatibility
> > of the ExecutionConfig, but it would break savepoint compatibility with
> > snapshots written with some of the old serializers, which had
> > ExecutionConfig as a field and were serialized in the snapshot. This
> issue
> > has been resolved by the introduction of TypeSerializerSnapshot in Flink
> > 1.7 [3], where serializers are no longer part of the snapshot.
> >
> > TypeSerializerConfigSnapshot has been deprecated and no longer used by
> > built-in serializers since Flink 1.8 [4] and [5]. Users were encouraged
> to
> > migrate to TypeSerializerSnapshot since then with their own custom
> > serializers. That has been plenty of time for the migration.
> >
> > This proposal would have the following impact for the users:
> > 1. we would drop support for recovery from savepoints taken with Flink <
> > 1.7.0 for all built in types serializers
> > 2. we would drop support for recovery from savepoints taken with Flink <
> > 1.8.0 for built in kryo serializers
> > 3. we would drop support for recovery from savepoints taken with Flink <
> > 1.17 for custom serializers using deprecated TypeSerializerConfigSnapshot
> >
> > 1. and 2. would have a simple migration path. Users migrating from those
> > old savepoints would have to first start their jobs using a Flink version
> from
> > the [1.8, 1.16] range, and take a new savepoint that would be compatible
> > with Flink 1.17.
> > 3. This is a bit more problematic, because users would have to first
> > migrate their own custom serializers to use TypeSerializerSnapshot
> (using a
> > Flink version from the [1.8, 1.16]), take a savepoint, and only then
> > migrate to Flink 1.17. However, users have already had 4 years to migrate,
> which
> > in my opinion has been plenty of time to do so.
> >
> > As a side effect, we could also drop support for some of the legacy
> > metadata serializers from LegacyStateMetaInfoReaders and potentially
> other
> > places that we are keeping for the sake of compatibility with old
> > savepoints.
> >
> > [1]
> >
> >
> https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/typeutils/TypeSerializerConfigSnapshot.html
> > [2] https://issues.apache.org/jira/browse/FLINK-29379
> > [3] https://issues.apache.org/jira/browse/FLINK-9377
> > [4] https://issues.apache.org/jira/browse/FLINK-9376
> > [5] https://issues.apache.org/jira/browse/FLINK-11323
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>


Re: Stateful Functions with Flink 1.15 and onwards

2022-10-26 Thread Tzu-Li (Gordon) Tai
Hi Filip,

Thanks for bringing this up.

The hard truth is that committers who were previously active on the
StateFun subproject, including myself, all currently have other focuses.
Indeed, we may need to discuss with the community on how to proceed if
there seems to be no continued committer coverage.

If it's just a matter of upgrading the supported Flink version, I'm still
familiar enough with the subproject to probably be able to drive this (or
if your team is up to it, I can assist you on that).

For the long-term, as a data point I'm curious to see how many users are
using StateFun in production today, and how you're using it?

   - Do your applications have arbitrary / cyclic / bi-directional
   messaging between individual functions?
   - Or are you utilizing StateFun simply to allow your stateful functions
   to run remotely as separate processes?

If the majority is only the latter category, there might be a case to
support remote functions natively in Flink (which has been a discussion in
the past).

Thanks,
Gordon

On Wed, Oct 26, 2022 at 3:30 AM Filip Karnicki 
wrote:

> Hi, I noticed that the development on stateful functions has come to a bit
> of a halt, with a pull request to update statefun to use Flink 1.15 being
> in the `open` state since May 2022.
>
> What do we think is the future of this sub-project?
>
> The background to this question is that my team is on a shared Flink
> cluster which will soon be upgrading to Flink 1.15. If I need to re-write
> all our code as a native Flink job (rather than a remote stateful function)
> then I need to get started right away.
>
> Many thanks,
> Fil
>


Re: [VOTE] FLIP-252: Amazon DynamoDB Sink Connector

2022-07-20 Thread Tzu-Li (Gordon) Tai
+1

On Wed, Jul 20, 2022 at 6:13 AM Danny Cranmer 
wrote:

> Hi there,
>
> After the discussion in [1], I’d like to open a voting thread for FLIP-252
> [2], which proposes the addition of an Amazon DynamoDB sink based on the
> Async Sink [3].
>
> The vote will be open until July 23rd earliest (72h), unless there are any
> binding vetos.
>
> Cheers, Danny
>
> [1] https://lists.apache.org/thread/ssmf2c86n3xyd5qqmcdft22sqn4qw8mw
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector
> [3]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
>


Re: [VOTE] Apache Flink Stateful Functions 3.2.0, release candidate #1

2022-01-28 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- verified checksums and signature
- built from source with e2e tests
- Ran Js greeter example
- Checked new updated Dockerfile to be released

Thanks,
Gordon

On Fri, Jan 28, 2022 at 8:58 AM Seth Wiesman  wrote:

> +1 (non-binding)
>
> - verified checksums and signatures
> - built from source and ran e2e tests
> - checked dependencies / licenses
> - deployed greeter with golang sdk
> - reviewed blog post
>
> Cheers,
>
> Seth
>
>
> On Fri, Jan 28, 2022 at 4:22 AM Igal Shilman  wrote:
>
> > +1 (binding)
> >
> > - verified checksum and signatures
> > - build from source code
> > - run e2e tests
> > - verified that there are no binary artifacts in the source release
> > - built the python sdk from sources, and run the python greeter example
> > - built from source the js sdk and run the javascript greeter example.
> >
> >
> > Thanks,
> > Igal.
> >
> >
> >
> >
> >
> >
> > On Thu, Jan 27, 2022 at 8:20 PM Mingmin Xu  wrote:
> >
> > > +1 (non-binding)
> > >
> > > - verified checksum and signatures
> > > - build from source code
> > > - version checked
> > > - test docker PR
> > > - test flink-statefun-playground/greeter with 3.2.0
> > >
> > > Misc, do we want to upgrade flink-statefun-playground together?
> Currently
> > > the README information is a little behind.
> > >
> > > B.R.
> > > Mingmin
> > >
> > >
> > > On Wed, Jan 26, 2022 at 4:55 AM Till Rohrmann 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > a quick update on the vote:
> > > >
> > > > The correct link for the artifacts at the Apache Nexus repository is
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1485/.
> > > >
> > > > Moreover, there is now also a tag for the GoLang SDK:
> > > >
> > https://github.com/apache/flink-statefun/tree/statefun-sdk-go/v3.2.0-rc1
> > > .
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Jan 25, 2022 at 10:49 PM Till Rohrmann  >
> > > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Please review and vote on the release candidate #1 for the version
> > > 3.2.0
> > > > > of Apache Flink Stateful Functions, as follows:
> > > > >
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > > > >
> > > > > **Release Overview**
> > > > >
> > > > > As an overview, the release consists of the following:
> > > > > a) Stateful Functions canonical source distribution, to be deployed
> > to
> > > > the
> > > > > release repository at dist.apache.org
> > > > > b) Stateful Functions Python SDK distributions to be deployed to
> PyPI
> > > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > > > d) New Dockerfiles for the release
> > > > > e) GoLang SDK (contained in the repository)
> > > > > f) JavaScript SDK (contained in the repository; will be uploaded to
> > npm
> > > > > after the release)
> > > > >
> > > > > **Staging Areas to Review**
> > > > >
> > > > > The staging areas containing the above mentioned artifacts are as
> > > > follows,
> > > > > for your review:
> > > > > * All artifacts for a) and b) can be found in the corresponding dev
> > > > > repository at dist.apache.org [2]
> > > > > * All artifacts for c) can be found at the Apache Nexus Repository
> > [3]
> > > > >
> > > > > All artifacts are signed with the key
> > > > > B9499FA69EFF5DEEEBC3C1F5BA7E4187C6F73D82 [4]
> > > > >
> > > > > Other links for your review:
> > > > > * JIRA release notes [5]
> > > > > * source code tag "release-3.2.0-rc1" [6]
> > > > > * PR for the new Dockerfiles [7]
> > > > > * PR to update the website Downloads page to include Stateful
> > Functions
> > > > > links [8]
> > > > > * GoLang SDK [9]
> > > > > * JavaScript SDK [10]
> > > > >
> > > > > **Vote Duration**
> > > > >
> > > > > The voting time will run for at least 72 hours.
> > > > > It is adopted by majority approval, with at least 3 PMC affirmative
> > > > votes.
> > > > >
> > > > > Thanks,
> > > > > Till
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
> > > > > [2]
> > > >
> https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.2.0-rc1/
> > > > > [3]
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1483/
> > > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [5]
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350540
> > > > > [6]
> https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1
> > > > > [7] https://github.com/apache/flink-statefun-docker/pull/19
> > > > > [8] https://github.com/apache/flink-web/pull/501
> > > > > [9]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1/statefun-sdk-go
> > > > > [10]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink-statefun/tree/release-3.2.0-rc1/statefun-sdk-js
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Stateful Functions 3.2.0 release?

2022-01-24 Thread Tzu-Li (Gordon) Tai
+1

On Mon, Jan 24, 2022 at 9:43 AM Igal Shilman  wrote:

> +1 and thanks for volunteering to be the release manager!
>
> Cheers,
> Igal.
>
> On Mon, Jan 24, 2022 at 4:13 PM Seth Wiesman  wrote:
>
> > +1
> >
> > These are already a useful set of features to release to our users.
> >
> > Seth
> >
> > On Mon, Jan 24, 2022 at 8:45 AM Till Rohrmann 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > We would like to kick off a new StateFun release 3.2.0. The new release
> > > will include the new JavaScript SDK and some useful major features:
> > >
> > > * JavaScript SDK [1]
> > > * Flink version upgrade to 1.14.3 [2]
> > > * Support different remote functions module names [3]
> > > * Allow creating custom metrics [4]
> > >
> > > The only missing ticket for this release is the documentation of the
> > > JavaScript SDK [5]. We plan to complete this in the next few days.
> > >
> > > Please let us know if you have any concerns.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-24256
> > > [2] https://issues.apache.org/jira/browse/FLINK-25708
> > > [3] https://issues.apache.org/jira/browse/FLINK-25308
> > > [4] https://issues.apache.org/jira/browse/FLINK-22533
> > > [5] https://issues.apache.org/jira/browse/FLINK-25775
> > >
> > > Cheers,
> > > Till
> > >
> >
>


Re: [VOTE] Stateful functions 3.1.1 release

2021-12-20 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Checked hash and signatures
- Checked diff contains Flink upgrade
- mvn clean install with e2e

Thanks,
Gordon

On Mon, Dec 20, 2021, 13:55 Seth Wiesman  wrote:

> +1 (non-binding)
>
> - Verified signatures
> - Checked diff
> - Checked site PR
> - Build from source and ran e2e tests
>
> Seth
>
> On Mon, Dec 20, 2021 at 10:59 AM Igal Shilman  wrote:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #2 for the version 3.1.1
> of
> > Apache Flink Stateful Functions, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > This release updates the Flink version to fix the log4j CVEs
> >
> > **Testing Guideline**
> >
> > You can find here [1] a page in the project wiki on instructions for
> > testing.
> > To cast a vote, it is not necessary to perform all listed checks,
> > but please mention which checks you have performed when voting.
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Stateful Functions canonical source distribution, to be deployed to
> the
> > release repository at dist.apache.org
> > b) Stateful Functions Python SDK distributions to be deployed to PyPI
> > c) Maven artifacts to be deployed to the Maven Central Repository
> > d) New Dockerfiles for the release
> > e) GoLang SDK tag v3.1.1-rc2
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as
> follows,
> > for your review:
> > * All artifacts for a) and b) can be found in the corresponding dev
> > repository at dist.apache.org [2]
> > * All artifacts for c) can be found at the Apache Nexus Repository [3]
> >
> > All artifacts are signed with the key
> > 73BC0A2B04ABC80BF0513382B0ED0E338D622A92 [4]
> >
> > Other links for your review:
> > * JIRA release notes [5]
> > * source code tag "release-3.1.1-rc2" [6]
> > * PR for the new Dockerfiles [7]
> > * PR for the flink website [8]
> >
> > **Vote Duration**
> >
> > The voting time will run for 24 hours. We are targeting this vote to last
> > until December 21st, 22:00 CET.
> > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> >
> > Thanks,
> > Seth & Igal
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
> > [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.1.1-rc2/
> > [3]
> https://repository.apache.org/content/repositories/orgapacheflink-1466
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351096&styleName=&projectId=12315522
> > [6] https://github.com/apache/flink-statefun/tree/release-3.1.1-rc2
> > [7] https://github.com/apache/flink-statefun-docker/pull/18
> > [8] https://github.com/apache/flink-web/pull/492
> >
>


Re: [DISCUSS] Immediate dedicated StateFun releases for log4j vulnerability

2021-12-16 Thread Tzu-Li (Gordon) Tai
+1

On Thu, Dec 16, 2021, 10:30 Igal Shilman  wrote:

> Hi All,
>
> Following the recent Apache Flink releases due to the log4j vulnerability,
> I'd like to propose an immediate StateFun release - 3.1.1.
> This release is basically the same as 3.1 but updates the Flink dependency
> to 1.13.3.
>
> Please raise your concerns if any, otherwise we'll proceed with the
> release.
>
> Thanks,
> Igal.
>


Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1

2021-12-14 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- verified hashes and signatures
- checked that diff of all RCs contain only the log4j version upgrade

On Tue, Dec 14, 2021 at 4:06 AM Yun Gao 
wrote:

> +1 (non-binding)
>
> * Reviewed the blog post.
> * Verified each version could run normally with example jobs.
> * Checked each version only contains the log4j2 fix.
>
> Thanks Chesnay for driving the emergency fix releases!
>
> Best,
> Yun
>
>
> --
> From:Yun Tang 
> Send Time:2021 Dec. 14 (Tue.) 18:25
> To:dev@flink.apache.org ; Till Rohrmann <
> trohrm...@apache.org>
> Subject:Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate
> #1
>
+1 (non-binding) for releasing flink-1.13.4 and flink-1.14.1 currently
>
>
>   *   reviewed blog post
>   *   checked that the hotfix version only contains the log4j2 version bump
>
> Best
> Yun Tang
> 
> From: Chesnay Schepler 
> Sent: Tuesday, December 14, 2021 17:12
> To: dev@flink.apache.org ; Till Rohrmann <
> trohrm...@apache.org>
> Subject: Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate
> #1
>
> I think that should be possible.
>
> On 14/12/2021 10:06, Till Rohrmann wrote:
> > +1 (binding)
> >
> > - reviewed blog post
> > - verified shasum and signatures
> > - checked that diff only contains the log4j version bump
> >
> > Can we simply add the missing Python binaries for MacOS after the release
> > of the other artifacts?
> >
> > Cheers,
> > Till
> >
> > On Tue, Dec 14, 2021 at 4:56 AM Yun Tang  wrote:
> >
> >> Hi Chesnay,
> >>
> >> Thanks a lot for driving these emergency patch releases!
> >>
> >> I just noticed that current flink-1.11.4 offers python files on mac os
> >> [1]. Is it okay to release Flink-1.11.5 and flink-1.12.6 without those
> >> python binaries on mac os?
> >>
> >>
> >> [1] https://pypi.org/project/apache-flink/1.11.4/#files
> >>
> >> Best
> >> Yun Tang
> >> 
> >> From: Zhu Zhu 
> >> Sent: Tuesday, December 14, 2021 11:00
> >> To: dev 
> >> Subject: Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release
> candidate
> >> #1
> >>
> >> +1 (binding)
> >>
> >> - verified the differences of source releases to the corresponding
> latest
> >> releases, there are only dependency updates and release version update
> >> commits
> >> - verified versions of log4j dependencies in the all binary releases are
> >> 2.15.0
> >> - ran example jobs against all the binary releases, logs look good
> >> - release notes and blogpost look good
> >>
> >> Thanks,
> >> Zhu
> >>
> >> Xintong Song  wrote on Tue, Dec 14, 2021 at 10:23:
> >>
> >>> +1 (binding)
> >>>
> >>> - verified checksum and signature
> >>> - verified that release candidates only contain the log4j dependency
> >>> changes compared to previous releases.
> >>> - release notes and blogpost LGTM
> >>>
> >>> Thanks a lot for driving these emergency patch releases, Chesnay!
> >>>
> >>> Thank you~
> >>>
> >>> Xintong Song
> >>>
> >>>
> >>>
> >>> On Tue, Dec 14, 2021 at 7:45 AM Chesnay Schepler 
> >>> wrote:
> >>>
>  I forgot to mention something important:
> 
>  The 1.11/1.12 releases do *NOT* contain flink-python releases for
> *mac*
>  due to compile problems.
> 
>  On 13/12/2021 20:28, Chesnay Schepler wrote:
> > Hi everyone,
> >
> > This vote is for the emergency patch releases for 1.11, 1.12, 1.13
> >> and
> > 1.14 to address CVE-2021-44228.
> > It covers all 4 releases as they contain the same changes (upgrading
> > Log4j to 2.15.0) and were prepared simultaneously by the same person.
> > (Hence, if something is broken, it likely applies to all releases)
> >
> > Please review and vote on the release candidate #1 for the versions
> > 1.11.5, 1.12.6, 1.13.4 and 1.14.1, as follows:
> > [ ] +1, Approve the releases
> > [ ] -1, Do not approve the releases (please provide specific
> >> comments)
> > The complete staging area is available for your review, which
> >> includes:
> > * JIRA release notes [1],
> > * the official Apache source releases and binary convenience releases
> > to be deployed to dist.apache.org [2], which are signed with the key
> > with fingerprint C2EED7B111D464BA [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> >  * *the jars for 1.13/1.14 are still being built*
> > * source code tags [5],
> > * website pull request listing the new releases and adding
> > announcement blog post [6].
> >
> > The vote will be open for at least 24 hours. The minimum vote time
> >> has
> > been shortened as the changes are minimal and the matter is urgent.
> > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> >
> > Thanks,
> > Chesnay
> >
> > [1]
> > 1.11:
> >
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350327
> > 1.12:
> >
> >>
> 

Re: [ANNOUNCE] Apache Flink Stateful Functions 3.1.0 released

2021-08-31 Thread Tzu-Li (Gordon) Tai
Congrats on the release!

And thank you for driving this release, Igal.

Cheers
Gordon

On Tue, Aug 31, 2021, 23:13 Igal Shilman  wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink Stateful Functions (StateFun) 3.1.0.
>
> StateFun is a cross-platform stack for building Stateful Serverless
> applications, making it radically simpler to develop scalable, consistent,
> and elastic distributed applications.
>
> Please check out the release blog post for an overview of the release:
> https://flink.apache.org/news/2021/08/31/release-statefun-3.1.0.html
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Maven artifacts for StateFun can be found at:
> https://search.maven.org/search?q=g:org.apache.flink%20statefun
>
> Python SDK for StateFun published to the PyPI index can be found at:
> https://pypi.org/project/apache-flink-statefun/
>
> Official Docker images for StateFun are published to Docker Hub:
> https://hub.docker.com/r/apache/flink-statefun
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350038&projectId=12315522
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Thanks,
> Igal
>


Re: [VOTE] Apache Flink Stateful Functions 3.1.0, release candidate #1

2021-08-26 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- Built from source with Java 11 and Java 8 (mvn clean install
-Prun-e2e-tests)
- verified signatures and hashes
- verified NOTICE files of Maven artifacts properly list actual bundled
dependencies
- Ran GoLang greeter and showcase with the proposed Dockerfiles for 3.1.0
- Ran a local smoke E2E against the Java SDK, with adjusted parameters to
run for a longer period of time

Thanks for driving the release Igal!

Cheers,
Gordon

On Thu, Aug 26, 2021 at 4:06 AM Seth Wiesman  wrote:

> +1 (non-binding)
>
> - verified signatures and hashes
> - Checked licenses
> - ran mvn clean install -Prun-e2e-tests
> - ran golang greeter and showcase from the playground [1]
>
> Seth
>
> [1] https://github.com/apache/flink-statefun-playground/pull/12
>
> On Wed, Aug 25, 2021 at 9:44 AM Igal Shilman  wrote:
>
> > +1 from my side:
> >
> > Here are the results of my RC2 testing:
> >
> > - verified the signatures and hashes
> > - verified that the source distribution doesn't contain any binary files
> > - ran mvn clean install -Prun-e2e-tests
> > - ran Java and Python greeters from the playground [1] with the new
> module
> > structure, and async transport enabled.
> > - verified that the docker image [2] builds and inspected the contents
> > manually.
> >
> > Thanks,
> > Igal
> >
> > [1] https://github.com/apache/flink-statefun-playground/tree/dev
> > [2] https://github.com/apache/flink-statefun-docker/pull/15
> >
> >
> > On Tue, Aug 24, 2021 at 3:34 PM Igal Shilman  wrote:
> >
> > > Sorry, the subject of the previous message should have said "[VOTE]
> > Apache
> > > Flink Stateful Functions 3.1.0, release candidate #2".
> > >
> > >
> > > On Tue, Aug 24, 2021 at 3:24 PM Igal Shilman  wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Please review and vote on the release candidate #2 for the version
> 3.1.0
> > >> of Apache Flink Stateful Functions, as follows:
> > >> [ ] +1, Approve the release
> > >> [ ] -1, Do not approve the release (please provide specific comments)
> > >>
> > >> **Testing Guideline**
> > >>
> > >> You can find here [1] a page in the project wiki on instructions for
> > >> testing.
> > >> To cast a vote, it is not necessary to perform all listed checks,
> > >> but please mention which checks you have performed when voting.
> > >>
> > >> **Release Overview**
> > >>
> > >> As an overview, the release consists of the following:
> > >> a) Stateful Functions canonical source distribution, to be deployed to
> > >> the release repository at dist.apache.org
> > >> b) Stateful Functions Python SDK distributions to be deployed to PyPI
> > >> c) Maven artifacts to be deployed to the Maven Central Repository
> > >> d) New Dockerfiles for the release
> > >> e) GoLang SDK tag statefun-sdk-go/v3.1.0-rc2
> > >>
> > >> **Staging Areas to Review**
> > >>
> > >> The staging areas containing the above mentioned artifacts are as
> > >> follows, for your review:
> > >> * All artifacts for a) and b) can be found in the corresponding dev
> > >> repository at dist.apache.org [2]
> > >> * All artifacts for c) can be found at the Apache Nexus Repository [3]
> > >>
> > >> All artifacts are signed with the key
> > >> 73BC0A2B04ABC80BF0513382B0ED0E338D622A92 [4]
> > >>
> > >> Other links for your review:
> > >> * JIRA release notes [5]
> > >> * source code tag "release-3.1.0-rc2" [6]
> > >> * PR for the new Dockerfiles [7]
> > >>
> > >> **Vote Duration**
> > >>
> > >> The voting time will run for at least 72 hours (since RC1). We are
> > >> targeting this vote to last until Thursday, 26th of August, 6pm CET.
> > >> If it is adopted by majority approval, with at least 3 PMC affirmative
> > >> votes, it will be released.
> > >>
> > >> Thanks,
> > >> Igal
> > >>
> > >> [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
> > >> [2]
> > >>
> https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.1.0-rc2/
> > >> [3]
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1446/
> > >> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >> [5]
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350038&projectId=12315522
> > >> [6] https://github.com/apache/flink-statefun/tree/release-3.1.0-rc2
> > >> [7] https://github.com/apache/flink-statefun-docker/pull/15
> > >>
> > >>
> >
>


Re: [NOTICE] StateFun docs have been migrated to new ASF Buildbot at ci2.apache.org

2021-08-22 Thread Tzu-Li (Gordon) Tai
Hi Chesnay,

I didn't have plans to migrate the Flink builders, but I'd be happy to take
a look.

The StateFun builders were migrated as part of an effort to help ASF infra
test the new CI infrastructure. They were broken due to issues with the old
CI, so ASF infra decided they might as well move them to the new CI.
That's why StateFun's docs were migrated first.

For the Flink builders, I would suggest waiting for ASF infra to do the
initial move. Then, we can take it from there and update the builders to
rsync to nightlies.apache.org instead of uploading to master (which I'm
happy to do).

One discussion point would be: would the community prefer to continue using
SVN for maintaining the builder scripts, or move to git (which StateFun
did)?

Thanks,
Gordon

On Mon, Aug 16, 2021 at 3:37 PM Chesnay Schepler  wrote:

> Do you plan to migrate the Flink builders accordingly?
>
> On 16/08/2021 08:59, Tzu-Li (Gordon) Tai wrote:
> > Hi all,
> >
> > Just a quick announcement that we've officially migrated the StateFun
> docs
> > to ASF's new buildbot at https://ci2.apache.org/.
> >
> > There are a few other notable changes that took part in this migration:
> >
> > - The buildbot configuration file is now managed in Git, and lives at
> >
> https://github.com/apache/infrastructure-bb2/blob/master/flink-statefun.py
> .
> >
> > - All Flink committers, and not only PMC members, have read/write access
> to
> > that config file (as opposed to the PMC-only SVN access before).
> >
> > - Directly publishing and serving docs in the Buildbot master is
> > discouraged and deprecated in the new buildbot infrastructure. As such,
> the
> > built docs are now being rsync'ed to https://nightlies.apache.org/. Our
> new
> > doc URLs are now
> > https://nightlies.apache.org/flink/flink-statefun-docs-master/ (or
> replace
> > "master" with target branch).
> >
> > This change is now live.
> > StateFun doc links on the Flink website have been updated to point to
> > https://nightlies.apache.org/, and I've also reflected the changes to
> the
> > documentation instructions on our community wiki [1].
> >
> > Thanks,
> > Gordon
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/Managing+Documentation
> >
>
>


[jira] [Created] (FLINK-23853) Update StateFun's Flink dependency to 1.13.2

2021-08-18 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23853:
---

 Summary: Update StateFun's Flink dependency to 1.13.2
 Key: FLINK-23853
 URL: https://issues.apache.org/jira/browse/FLINK-23853
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.1.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Stateful Functions 3.1.0 release?

2021-08-16 Thread Tzu-Li (Gordon) Tai
Hi Igal,

Thanks a lot for starting this discussion!

The list of features sounds like this will be a great release. As the list
is already quite packed with major features, +1 to feature freeze on August
20 so we can move towards 3.1.0.

Thanks,
Gordon

On Mon, Aug 16, 2021, 18:57 Igal Shilman  wrote:

> Hi everyone!
>
> We would like to kick off a new StateFun release (3.1.0), this release will
> have some major features that users would benefit from, such as:
>
> * [FLINK-21308] Support delayed message cancellation.
> * [FLINK-23296] Support pluggable transports for remote functions
> invocation.
> * [FLINK-23711] Add a non blocking, asynchronous transport for remote
> function invocation.
> * [FLINK-23600] Adopting kubernetes style resource formats for the
> module.yaml definition.
> * [FLINK-18810] Add a GoLang SDK for remote functions.
>
> We would like to propose a feature freeze for this Friday 20th of Aug, and
> would like to kick off the release candidates early next week.
>
> Please let us know if you have any concerns.
>
> Thanks,
> Igal
>


[NOTICE] StateFun docs have been migrated to new ASF Buildbot at ci2.apache.org

2021-08-16 Thread Tzu-Li (Gordon) Tai
Hi all,

Just a quick announcement that we've officially migrated the StateFun docs
to ASF's new buildbot at https://ci2.apache.org/.

There are a few other notable changes that took part in this migration:

- The buildbot configuration file is now managed in Git, and lives at
https://github.com/apache/infrastructure-bb2/blob/master/flink-statefun.py.

- All Flink committers, and not only PMC members, have read/write access to
that config file (as opposed to the PMC-only SVN access before).

- Directly publishing and serving docs in the Buildbot master is
discouraged and deprecated in the new buildbot infrastructure. As such, the
built docs are now being rsync'ed to https://nightlies.apache.org/. Our new
doc URLs are now
https://nightlies.apache.org/flink/flink-statefun-docs-master/ (or replace
"master" with target branch).

This change is now live.
StateFun doc links on the Flink website have been updated to point to
https://nightlies.apache.org/, and I've also reflected the changes to the
documentation instructions on our community wiki [1].

Thanks,
Gordon

[1] https://cwiki.apache.org/confluence/display/FLINK/Managing+Documentation


[jira] [Created] (FLINK-23762) Revisit RequestReplyFunctionTest unit tests

2021-08-13 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23762:
---

 Summary: Revisit RequestReplyFunctionTest unit tests
 Key: FLINK-23762
 URL: https://issues.apache.org/jira/browse/FLINK-23762
 Project: Flink
  Issue Type: Technical Debt
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


There's some tech debt piling up in the {{RequestReplyFunctionTest}}. We need 
to revisit how we're unit testing the {{RequestReplyFunction}}.

Some outstanding issues:
* We're explicitly calling `invoke` with async results to simulate function 
responses. With the changes in FLINK-20574, that's no longer always the case (the 
first request is a blocking call). This hints at the fact that those unit tests 
are leaking implementation details, which makes them hard to extend.
* State restore is not properly mocked, which becomes apparent in 
{{retryBatchOnUnkownAsyncResponseAfterRestore}}. The states "batch" and 
"requestState" start fresh again in that test. To properly simulate a 
snapshot and restore, we probably want to move all persisted state of 
{{RequestReplyFunction}} (including "batch", "requestState", and the remote 
function values) into a separate wrapper class.





[jira] [Created] (FLINK-23718) Manage StateFun Python SDK as a Maven module

2021-08-10 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23718:
---

 Summary: Manage StateFun Python SDK as a Maven module
 Key: FLINK-23718
 URL: https://issues.apache.org/jira/browse/FLINK-23718
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


As of now, the StateFun Python SDK lives as a "dangling" directory in the 
repository that is not managed by Maven. We'd like to include the directory as a 
Maven module. To keep things simple at the start, the Maven POM can do nothing - 
the purpose is just to ensure the directory is included in the build process 
(e.g. for ASF license checks on the source files).





[jira] [Created] (FLINK-23717) Allow setting configs using plain strings in StatefulFunctionsAppContainers

2021-08-10 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23717:
---

 Summary: Allow setting configs using plain strings in 
StatefulFunctionsAppContainers
 Key: FLINK-23717
 URL: https://issues.apache.org/jira/browse/FLINK-23717
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


The {{StatefulFunctionsAppContainers}} has a {{withConfiguration(ConfigOption, 
value)}} method that allows setting Flink configurations.

While this is useful, having to include external dependencies (most of the 
time, core) just to get access to the ConfigOption is often too much. It would 
be nice if the utility supported setting configs using plain strings as well.
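A minimal sketch of what such a string-based setter could look like; the class and method names below are illustrative assumptions, not the actual {{StatefulFunctionsAppContainers}} API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a builder-style utility accepting plain string keys,
// avoiding a dependency on flink-core's ConfigOption just to set a value.
public class AppContainersConfigSketch {
    private final Map<String, String> config = new HashMap<>();

    // Proposed overload: plain string key and value.
    public AppContainersConfigSketch withConfiguration(String key, String value) {
        config.put(key, value);
        return this;
    }

    public String get(String key) {
        return config.get(key);
    }

    public static void main(String[] args) {
        AppContainersConfigSketch containers = new AppContainersConfigSketch()
                .withConfiguration("state.backend", "rocksdb");
        if (!"rocksdb".equals(containers.get("state.backend"))) {
            throw new AssertionError("config not applied");
        }
        System.out.println("ok");
    }
}
```

The existing {{withConfiguration(ConfigOption, value)}} method could simply delegate to this string-based variant internally.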





[jira] [Created] (FLINK-23714) Expose both master and worker logs when using StatefulFunctionsAppContainers

2021-08-10 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23714:
---

 Summary: Expose both master and worker logs when using 
StatefulFunctionsAppContainers
 Key: FLINK-23714
 URL: https://issues.apache.org/jira/browse/FLINK-23714
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


We currently only support exposing master logs when using the 
{{StatefulFunctionsAppContainers}} E2E test utility. A lot of the time, the 
worker logs are also insightful, e.g. when checking logs of state restore operations.

Let's extend the {{exposeMasterLogs(Logger)}} method to simply be 
{{exposeLogs(Logger)}}, which exposes both the master and worker logs.





[jira] [Created] (FLINK-23709) Remove SanityVerificationE2E and ExactlyOnceRemoteE2E

2021-08-10 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23709:
---

 Summary: Remove SanityVerificationE2E and ExactlyOnceRemoteE2E
 Key: FLINK-23709
 URL: https://issues.apache.org/jira/browse/FLINK-23709
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


Over time, the smoke E2E tests have proven to be extensive enough to subsume 
the {{SanityVerificationE2E}} and {{ExactlyOnceRemoteE2E}}, which cover far less 
in terms of test scope. As a matter of fact, a large majority (if not 
all) of the more important bugs we have discovered over the last few releases 
were surfaced by the smoke E2Es.

As build times in StateFun keep growing larger and larger, we suggest removing 
{{SanityVerificationE2E}} and {{ExactlyOnceRemoteE2E}} to be conservative 
with the build times.





Re: [VOTE] FLIP-180: Adjust StreamStatus and Idleness definition

2021-08-09 Thread Tzu-Li (Gordon) Tai
+1 (binding)

On Tue, Aug 10, 2021 at 12:15 AM Piotr Nowojski 
wrote:

> +1 (binding)
>
> Piotrek
>
> pon., 9 sie 2021 o 18:00 Stephan Ewen  napisał(a):
>
> > +1 (binding)
> >
> > On Mon, Aug 9, 2021 at 12:08 PM Till Rohrmann 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Aug 5, 2021 at 9:09 PM Arvid Heise  wrote:
> > >
> > > > Dear devs,
> > > >
> > > > I'd like to open a vote on FLIP-180: Adjust StreamStatus and Idleness
> > > > definition [1] which was discussed in this thread [2].
> > > > The vote will be open for at least 72 hours unless there is an
> > objection
> > > or
> > > > not enough votes.
> > > >
> > > > Best,
> > > >
> > > > Arvid
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-180%3A+Adjust+StreamStatus+and+Idleness+definition
> > > > [2]
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r8357d64b9cfdf5a233c53a20d9ac62b75c07c925ce2c43e162f1e39c%40%3Cdev.flink.apache.org%3E
> > > >
> > >
> >
>


[jira] [Created] (FLINK-23600) Rework StateFun's remote module parsing and binding

2021-08-03 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23600:
---

 Summary: Rework StateFun's remote module parsing and binding
 Key: FLINK-23600
 URL: https://issues.apache.org/jira/browse/FLINK-23600
 Project: Flink
  Issue Type: New Feature
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


Currently, we have a {{JsonModule}} class that is responsible for parsing 
users' module YAML specifications, resolving the specification into application 
components (i.e. function providers, ingresses, routers, and egresses) that are 
then bound to the application universe.

Over time, the {{JsonModule}} class has become overgrown with several changes as we 
progressively adapted the YAML format.
* The class handles ALL kinds of components, including ingresses / functions / 
egresses etc. The code is extremely fragile and is becoming hard to extend.
* Users have no way to extend this class if they somehow need to plug in 
custom components (e.g. adding an unsupported ingress / egress, custom protocol 
implementations etc).

We aim to rework this with the following goals in mind:
# The system should only handle {{module.yaml}} parsing up to the point where 
it extracts a list of JSON objects that individually represent an application 
component.
# The system has no knowledge of what each JSON object contains, other than 
its {{TypeName}} which would map to a corresponding {{ComponentBinder}}.
# A {{ComponentBinder}} is essentially an extension bound to the system that 
knows how to parse a specific JSON object, and bind components to the 
application universe.
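The dispatch described in the goals above could be sketched roughly as follows. All names here are illustrative assumptions (the {{ComponentBinder}} signature, the typename strings, and the StringBuilder standing in for the application universe), not the actual StateFun API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: the module loader only extracts (typename, payload) pairs
// and routes each raw JSON payload to the ComponentBinder registered for it.
public class ComponentBindingSketch {
    interface ComponentBinder {
        void bind(String jsonPayload, StringBuilder universe);
    }

    static void bindAll(List<Map.Entry<String, String>> components,
                        Map<String, ComponentBinder> binders,
                        StringBuilder universe) {
        for (Map.Entry<String, String> component : components) {
            ComponentBinder binder = binders.get(component.getKey());
            if (binder == null) {
                throw new IllegalStateException("No binder for " + component.getKey());
            }
            // The binder alone knows how to parse this payload.
            binder.bind(component.getValue(), universe);
        }
    }

    public static void main(String[] args) {
        Map<String, ComponentBinder> binders = new HashMap<>();
        binders.put("io.statefun.endpoints/http",
                (payload, universe) -> universe.append("endpoint:").append(payload));

        StringBuilder universe = new StringBuilder();
        bindAll(List.of(Map.entry("io.statefun.endpoints/http", "{...}")),
                binders, universe);
        System.out.println(universe);
    }
}
```

The key property is that the loop never inspects the payload itself; adding a new component kind only means registering a new binder.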





[jira] [Created] (FLINK-23296) Add RequestReplyClientFactory as a pluggable extension

2021-07-07 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23296:
---

 Summary: Add RequestReplyClientFactory as a pluggable extension
 Key: FLINK-23296
 URL: https://issues.apache.org/jira/browse/FLINK-23296
 Project: Flink
  Issue Type: Sub-task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.1.0


Currently, we ship and always use by default a {{RequestReplyClient}} 
implementation based on OkHttp. We'd like to allow users to use their own 
implementations of the {{RequestReplyClient}} for remote invocations. This is 
strictly for extending the means of transport, and should not leave room to 
touch the invocation protocol.

The way this would look in the module YAML files for remote modules would 
be:

{code}
module:
  spec:
    endpoints:
      - endpoint:
          meta:
            kind: http
          spec:
            functions: com.foo.bar/*
            urlPathTemplate: http://bar.foo.com:8080/functions/{function.name}
            transport:
              extension: com.foo.bar/some.custom.client
              prop1: foobar
              prop2:
                - k: v
                - k2: v2
{code}

The important part is the transport section. If not specified, then the default 
OkHttp implementation will be used. Otherwise, if specified, an extension with 
the specified typename must be bound and exist in the application, and that 
extension needs to be a {{RequestReplyClientFactory}}:

{code}
interface RequestReplyClientFactory {
    RequestReplyClient create(JSONNode properties, URI endpointUrl);
}
{code}

The provided JSON node properties will be as is provided in the {{transport}} 
block of the module YAML endpoint specification.
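To illustrate the intended flow, here is a hedged sketch of a custom transport factory being resolved and used. {{RequestReplyClient}} and the properties type are simplified stand-ins for the real StateFun interfaces, and all names are assumptions:

```java
import java.net.URI;
import java.util.Map;

// Hedged sketch: a factory receives the `transport` properties from the
// module YAML plus the endpoint URL, and produces a transport client.
public class CustomTransportSketch {
    interface RequestReplyClient {
        String call(byte[] requestBody);
    }

    interface RequestReplyClientFactory {
        RequestReplyClient create(Map<String, String> properties, URI endpointUrl);
    }

    public static void main(String[] args) {
        // A factory bound under "com.foo.bar/some.custom.client" could look like:
        RequestReplyClientFactory factory = (props, url) ->
                body -> "POST " + url + " prop1=" + props.get("prop1");

        RequestReplyClient client = factory.create(
                Map.of("prop1", "foobar"),
                URI.create("http://bar.foo.com:8080/functions/greeter"));
        System.out.println(client.call(new byte[0]));
    }
}
```

Only the means of transport is swappable here; the invocation protocol stays untouched, matching the intent stated above.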





[jira] [Created] (FLINK-23295) Introduce extension module SPI to StateFun

2021-07-07 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23295:
---

 Summary: Introduce extension module SPI to StateFun
 Key: FLINK-23295
 URL: https://issues.apache.org/jira/browse/FLINK-23295
 Project: Flink
  Issue Type: Sub-task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.1.0


To support pluggable extensions, e.g. custom HTTP clients for remote 
invocation, we'd like to add a generic way for users to provide their own 
implementations of various pluggable components.

This will take the form of {{ExtensionModule}}s, where a user providing 
an extension implements such a module to be included in the StateFun app. 
Each {{ExtensionModule}} may bind multiple extensions identified by unique 
typenames. Other components of the application, such as functions and IO 
modules, may access these extensions through their typenames.

The SPI would likely look like this:
{code}
public interface ExtensionModule {
    void configure(Map<String, String> globalConfiguration, Binder binder);

    interface Binder {
        <T> void bindExtension(TypeName typeName, T extension);
    }
}
{code} 
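A small illustration of how the SPI could be used, with {{TypeName}} simplified to a plain String; this is a sketch of the proposal, not the final API:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the ExtensionModule SPI: an extension provider binds
// an object under a typename, and other components look it up by that typename.
public class ExtensionModuleSketch {
    interface ExtensionModule {
        void configure(Map<String, String> globalConfiguration, Binder binder);
    }

    interface Binder {
        <T> void bindExtension(String typeName, T extension);
    }

    public static void main(String[] args) {
        Map<String, Object> boundExtensions = new HashMap<>();
        Binder binder = new Binder() {
            @Override
            public <T> void bindExtension(String typeName, T extension) {
                boundExtensions.put(typeName, extension);
            }
        };

        // A user-provided module binding a custom HTTP client extension.
        ExtensionModule module = (globalConfig, b) ->
                b.bindExtension("com.foo.bar/some.custom.client", "my-http-client");
        module.configure(Map.of(), binder);

        System.out.println(boundExtensions.get("com.foo.bar/some.custom.client"));
    }
}
```

Lookup by typename is what lets IO modules and functions stay decoupled from the concrete extension implementation.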





[jira] [Created] (FLINK-23293) Support pluggable / extendable HTTP transport for StateFun remote invocations

2021-07-07 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-23293:
---

 Summary: Support pluggable / extendable HTTP transport for 
StateFun remote invocations
 Key: FLINK-23293
 URL: https://issues.apache.org/jira/browse/FLINK-23293
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.1.0


For some of our advanced users, it is required for them to use their own HTTP 
client implementations for remote function invocations.

Towards that end, we'd like to support a generic way to plugin custom 
implementations, with the HTTP client being one of the initially supported 
extensions.

This includes a few separate sub-tasks that will be added under this ticket.





Re: [DISCUSS] Definition of idle partitions

2021-06-09 Thread Tzu-Li (Gordon) Tai
Forgot to provide the link to the [1] reference:

[1] https://issues.apache.org/jira/browse/FLINK-5017

On Thu, Jun 10, 2021 at 11:43 AM Tzu-Li (Gordon) Tai 
wrote:

> Hi everyone,
>
> Sorry for chiming in late here.
>
> Regarding the topic of changing the definition of StreamStatus and
> changing the name as well:
> After digging into some of the roots of this implementation [1], initially
> the StreamStatus was actually defined to mark "watermark idleness", and not
> "record idleness" (in fact, the alternative name "WatermarkStatus" was
> considered at the time).
>
> The concern at the time causing us to alter the definition to be "record
> idleness" in the final implementation was due to the existence of periodic
> timestamp / watermark generators within the pipeline. Those would continue
> to generate non-increasing watermarks in the absence of any input records
> from upstream. In this scenario, downstream operators would not be able to
> consider that channel as idle and therefore watermark progress is locked.
> We could consider a timeout-based approach on those specific operators to
> toggle watermark idleness if the values remain constant for a period of
> time, but then again, this is very ill-defined and most likely wrong.
>
> I have not followed the newest changes to the watermark generator
> operators and am not sure if this issue is still relevant.
> Otherwise, I don't see other problems with changing the definition here.
>
> Thanks,
> Gordon
>
> On Wed, Jun 9, 2021 at 3:06 PM Arvid Heise  wrote:
>
>> Hi Eron,
>>
>> again to recap from the other thread:
>> - You are right that idleness is correct with static assignment and fully
>> active partitions. In this case, the source defines idleness. (case A)
>> - For the more pressing use cases of idle, assigned partitions, the user
>> defines an idleness threshold, and it becomes potentially incorrect, when
>> the partition becomes active again. (case B)
>> - Same holds for dynamic assignment of splits. If a source without a split
>> gets a split assigned dynamically, there is a realistic chance that the
>> watermark advanced past the first record of the newly assigned split.
>> (case
>> C)
>> You can certainly insist that only the first case is valid (as it's
>> correct) but we know that users use it in other ways and that was also the
>> intent of the devs.
>>
>> Now the question could be if it makes sense to distinguish these cases.
>> Would you treat the idleness information differently (especially in the
>> sink/source that motivated FLIP-167) if you knew that the idleness is
>> guaranteed correct?
>> We could have some WatermarkStatus with ACTIVE, IDLE (case A), TIMEOUT
>> (case B).
>>
>> However, that would still leave case C, which probably would need to be
>> solved completely differently. I could imagine that a source with dynamic
>> assignments should never have IDLE subtasks and rather manage the idleness
>> itself. For example, it could emit a watermark per second/minute that is
>> directly fetched from the source system. I'm just not sure if the current
>> WatermarkAssigner interface suffices in that regard...
>>
>>
>> On Wed, Jun 9, 2021 at 7:31 AM Piotr Nowojski 
>> wrote:
>>
>> > Hi Eron,
>> >
> > Can you elaborate a bit more on what you mean? I don’t understand what
> > you mean by a more general solution.
>> >
>> > As of now, stream is marked idle by a source/watermark generator, which
>> > has an effect of temporarily ignoring this stream/partition from
>> > calculating min watermark in the downstream tasks. However stream is
>> > switching back to active when any record is emitted. This is what’s
>> causing
>> > problems described by Arvid.
>> >
>> > The core of our proposal is very simple. Keep everything as it is except
>> > stating that stream will be changed back to active only once a
>> watermark is
>> > emitted again - not record. In other words disconnecting idleness from
>> > presence of records and connecting it only to presence or lack of
>> > watermarks and allowing to emit records while “stream status” is “idle”
>> >
>> > Piotrek
>> >
>> >
>> > > Wiadomość napisana przez Eron Wright > .invalid>
>> > w dniu 09.06.2021, o godz. 06:01:
>> > >
>> > > It seems to me that idleness was introduced to deal with a very
>> specific
>> > > issue.  In the pipeline, watermarks are aggregated not on a per-split
>> &

Re: [DISCUSS] Definition of idle partitions

2021-06-09 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Sorry for chiming in late here.

Regarding the topic of changing the definition of StreamStatus and changing
the name as well:
After digging into some of the roots of this implementation [1], initially
the StreamStatus was actually defined to mark "watermark idleness", and not
"record idleness" (in fact, the alternative name "WatermarkStatus" was
considered at the time).

The concern at the time causing us to alter the definition to be "record
idleness" in the final implementation was due to the existence of periodic
timestamp / watermark generators within the pipeline. Those would continue
to generate non-increasing watermarks in the absence of any input records
from upstream. In this scenario, downstream operators would not be able to
consider that channel as idle and therefore watermark progress is locked.
We could consider a timeout-based approach on those specific operators to
toggle watermark idleness if the values remain constant for a period of
time, but then again, this is very ill-defined and most likely wrong.

I have not followed the newest changes to the watermark generator operators
and am not sure if this issue is still relevant.
Otherwise, I don't see other problems with changing the definition here.

Thanks,
Gordon

On Wed, Jun 9, 2021 at 3:06 PM Arvid Heise  wrote:

> Hi Eron,
>
> again to recap from the other thread:
> - You are right that idleness is correct with static assignment and fully
> active partitions. In this case, the source defines idleness. (case A)
> - For the more pressing use cases of idle, assigned partitions, the user
> defines an idleness threshold, and it becomes potentially incorrect, when
> the partition becomes active again. (case B)
> - Same holds for dynamic assignment of splits. If a source without a split
> gets a split assigned dynamically, there is a realistic chance that the
> watermark advanced past the first record of the newly assigned split. (case
> C)
> You can certainly insist that only the first case is valid (as it's
> correct) but we know that users use it in other ways and that was also the
> intent of the devs.
>
> Now the question could be if it makes sense to distinguish these cases.
> Would you treat the idleness information differently (especially in the
> sink/source that motivated FLIP-167) if you knew that the idleness is
> guaranteed correct?
> We could have some WatermarkStatus with ACTIVE, IDLE (case A), TIMEOUT
> (case B).
>
> However, that would still leave case C, which probably would need to be
> solved completely differently. I could imagine that a source with dynamic
> assignments should never have IDLE subtasks and rather manage the idleness
> itself. For example, it could emit a watermark per second/minute that is
> directly fetched from the source system. I'm just not sure if the current
> WatermarkAssigner interface suffices in that regard...
>
>
> On Wed, Jun 9, 2021 at 7:31 AM Piotr Nowojski 
> wrote:
>
> > Hi Eron,
> >
> > Can you elaborate a bit more on what you mean? I don’t understand what
> > you mean by a more general solution.
> >
> > As of now, stream is marked idle by a source/watermark generator, which
> > has an effect of temporarily ignoring this stream/partition from
> > calculating min watermark in the downstream tasks. However stream is
> > switching back to active when any record is emitted. This is what’s
> causing
> > problems described by Arvid.
> >
> > The core of our proposal is very simple. Keep everything as it is except
> > stating that stream will be changed back to active only once a watermark
> is
> > emitted again - not record. In other words disconnecting idleness from
> > presence of records and connecting it only to presence or lack of
> > watermarks and allowing to emit records while “stream status” is “idle”
> >
> > Piotrek
> >
> >
> > > Wiadomość napisana przez Eron Wright 
> > w dniu 09.06.2021, o godz. 06:01:
> > >
> > > It seems to me that idleness was introduced to deal with a very
> specific
> > > issue.  In the pipeline, watermarks are aggregated not on a per-split
> > basis
> > > but on a per-subtask basis.  This works well when each subtask has
> > exactly
> > > one split.  When a sub-task has multiple splits, various complications
> > > occur involving the commingling of watermarks.  And when a sub-task has
> > no
> > > splits, the pipeline stalls altogether.  To deal with the latter
> problem,
> > > idleness was introduced.  The sub-task simply declares itself to be
> idle
> > to
> > > be taken out of consideration for purposes of watermark aggregation.
> > >
> > > If we're looking for a more general solution, I would suggest we
> discuss
> > > how to track watermarks on a per-split basis.  Or, as Till mentioned
> > > recently, an alternate solution may be to dynamically adjust the
> > > parallelism of the task.
> > >
> > > I don't agree with the notion that idleness involves a correctness
> > > tradeoff.  The facility I described above has no impact on 

Re: [VOTE] Watermark propagation with Sink API

2021-06-09 Thread Tzu-Li (Gordon) Tai
+1

On Thu, Jun 10, 2021 at 9:04 AM jincheng sun 
wrote:

> +1 (binding)  // Sorry for the late reply.
>
> Best,
> Jincheng
>
>
> Piotr Nowojski  于2021年6月9日周三 下午10:04写道:
>
> > Thanks for driving this Eron, and sorry for causing the delay.
> >
> > +1 (binding) from my side
> >
> > Piotrek
> >
> > wt., 8 cze 2021 o 23:48 Eron Wright 
> > napisał(a):
> >
> > > Voting is re-open for FLIP-167 as-is (without idleness support as was
> the
> > > point of contention).
> > >
> > > On Fri, Jun 4, 2021 at 10:45 AM Eron Wright 
> > > wrote:
> > >
> > > > Little update on this, more good discussion over the last few days,
> and
> > > > the FLIP will probably be amended to incorporate idleness.   Voting
> > will
> > > > remain open until, let's say, mid-next week.
> > > >
> > > > On Thu, Jun 3, 2021 at 8:00 AM Piotr Nowojski 
> > > > wrote:
> > > >
> > > >> I would like to ask you to hold on with counting the votes until I
> get
> > > an
> > > >> answer for my one question in the dev mailing list thread (sorry if
> it
> > > was
> > > >> already covered somewhere).
> > > >>
> > > >> Best, Piotrek
> > > >>
> > > >> czw., 3 cze 2021 o 16:12 Jark Wu  napisał(a):
> > > >>
> > > >> > +1 (binding)
> > > >> >
> > > >> > Best,
> > > >> > Jark
> > > >> >
> > > >> > On Thu, 3 Jun 2021 at 21:34, Dawid Wysakowicz <
> > dwysakow...@apache.org
> > > >
> > > >> > wrote:
> > > >> >
> > > >> > > +1 (binding)
> > > >> > >
> > > >> > > Best,
> > > >> > >
> > > >> > > Dawid
> > > >> > >
> > > >> > > On 03/06/2021 03:50, Zhou, Brian wrote:
> > > >> > > > +1 (non-binding)
> > > >> > > >
> > > >> > > > Thanks Eron, looking forward to seeing this feature soon.
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Brian
> > > >> > > >
> > > >> > > > -Original Message-
> > > >> > > > From: Arvid Heise 
> > > >> > > > Sent: Wednesday, June 2, 2021 15:44
> > > >> > > > To: dev
> > > >> > > > Subject: Re: [VOTE] Watermark propagation with Sink API
> > > >> > > >
> > > >> > > >
> > > >> > > > [EXTERNAL EMAIL]
> > > >> > > >
> > > >> > > > +1 (binding)
> > > >> > > >
> > > >> > > > Thanks Eron for driving this effort; it will open up new
> > exciting
> > > >> use
> > > >> > > cases.
> > > >> > > >
> > > >> > > > On Tue, Jun 1, 2021 at 7:17 PM Eron Wright <
> > > ewri...@streamnative.io
> > > >> > > .invalid>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > >> After some good discussion about how to enhance the Sink API
> to
> > > >> > > >> process watermarks, I believe we're ready to proceed with a
> > vote.
> > > >> > > >> Voting will be open until at least Friday, June 4th, 2021.
> > > >> > > >>
> > > >> > > >> Reference:
> > > >> > > >>
> > > >> > > >>
> > > >> >
> > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/displa
> > > >> > > >>
> > > >>
> y/FLINK/FLIP-167*3A*Watermarks*for*Sink*API__;JSsrKys!!LpKI!zkBBhuqEEi
> > > >> > > >> fxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viFWXPLul-GCBb-PTq$
> > > >> > > >> [cwiki[.]apache[.]org]
> > > >> > > >>
> > > >> > > >> Discussion thread:
> > > >> > > >>
> > > >> > > >>
> > > >> >
> > > https://urldefense.com/v3/__https://lists.apache.org/thread.html/r5194
> > > >> > > >>
> > > >>
> e1cf157d1fd5ba7ca9b567cb01723bd582aa12dda57d25bca37e*40*3Cdev.flink.ap
> > > >> > > >> ache.org
> > > >> > *3E__;JSUl!!LpKI!zkBBhuqEEifxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viF
> > > >> > > >> WXPLul-GJXlxwqN$ [lists[.]apache[.]org]
> > > >> > > >>
> > > >> > > >> Implementation Issue:
> > > >> > > >>
> > > >> >
> > > https://urldefense.com/v3/__https://issues.apache.org/jira/browse/FLIN
> > > >> > > >>
> > > >>
> K-22700__;!!LpKI!zkBBhuqEEifxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viFWXPLul-G
> > > >> > > >> N6AJm4h$ [issues[.]apache[.]org]
> > > >> > > >>
> > > >> > > >> Thanks,
> > > >> > > >> Eron Wright
> > > >> > > >> StreamNative
> > > >> > > >>
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>


[jira] [Created] (FLINK-22529) StateFun Kinesis ingresses should support configs that are available via FlinkKinesisConsumer's ConsumerConfigConstants

2021-04-29 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-22529:
---

 Summary: StateFun Kinesis ingresses should support configs that 
are available via FlinkKinesisConsumer's ConsumerConfigConstants
 Key: FLINK-22529
 URL: https://issues.apache.org/jira/browse/FLINK-22529
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai


The Kinesis ingress should support the configs that are available in 
{{FlinkKinesisConsumer}}'s {{ConsumerConfigConstants}}. Currently, all property 
keys provided to the Kinesis ingress are instead assumed to be AWS-client-related 
keys, and therefore all have the `aws.clientconfigs` string appended to them.

I'd suggest to avoid mixing the {{ConsumerConfigConstants}} configs within the 
properties as well. Having named methods on the {{KinesisIngressBuilder}} for 
those configuration would provide a cleaner solution.
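
The suggestion above can be illustrated with a small sketch. Everything here is a hypothetical API shape only — the class, method, and key names are invented for illustration and are not the actual StateFun SDK:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder shape; none of these names exist in the actual SDK.
public class KinesisIngressBuilderSketch {
    private final Map<String, String> awsClientProps = new LinkedHashMap<>();
    private final Map<String, String> consumerProps = new LinkedHashMap<>();

    // Free-form AWS client property: prefixed, as the ingress does today.
    public KinesisIngressBuilderSketch withClientConfig(String key, String value) {
        awsClientProps.put("aws.clientconfigs." + key, value);
        return this;
    }

    // Named method for one consumer setting: no blanket prefixing, and the
    // value is type-checked at the call site. (Key string assumed for illustration.)
    public KinesisIngressBuilderSketch withShardGetRecordsIntervalMillis(long millis) {
        consumerProps.put("flink.shard.getrecords.intervalmillis", Long.toString(millis));
        return this;
    }

    Map<String, String> buildProperties() {
        Map<String, String> merged = new LinkedHashMap<>(awsClientProps);
        merged.putAll(consumerProps);
        return merged;
    }

    public static void main(String[] args) {
        new KinesisIngressBuilderSketch()
                .withClientConfig("maxConnections", "10")
                .withShardGetRecordsIntervalMillis(500L)
                .buildProperties()
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With named methods, consumer settings stay discoverable and type-checked at the call site, while only the free-form AWS client properties receive the `aws.clientconfigs` prefix.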



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[ANNOUNCE] Apache Flink Stateful Functions 3.0.0 released

2021-04-15 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache
Flink Stateful Functions (StateFun) 3.0.0.

StateFun is a cross-platform stack for building Stateful Serverless
applications, making it radically simpler to develop scalable, consistent,
and elastic distributed applications.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/news/2021/04/15/release-statefun-3.0.0.html

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for StateFun can be found at:
https://search.maven.org/search?q=g:org.apache.flink%20statefun

Python SDK for StateFun published to the PyPI index can be found at:
https://pypi.org/project/apache-flink-statefun/

Official Docker images for StateFun are published to Docker Hub:
https://hub.docker.com/r/apache/flink-statefun

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348822

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Cheers,
Gordon


Re: [PSA] Documentation is currently not being updated

2021-04-14 Thread Tzu-Li (Gordon) Tai
The same thing happened to the StateFun builds as well. They also have not
been running since April 7th.

After asking infra to reset the builds, the issue persisted.

I recently switched the node to asf945_ubuntu for the StateFun doc builds
because of this.

On Thu, Apr 15, 2021, 12:48 AM Chesnay Schepler  wrote:

> Just a quick PSA that our buildbot builds haven't been running since
> April 7th, meaning that the documentation has not been updated since.
>
> This appears to be an issue on the INFRA side of things, for which I
> have opened a ticket.
>
> So don't be surprised if some change you made isn't visible ;)
>
>


Re: [VOTE] Apache Flink Stateful Functions 3.0.0, release candidate #1

2021-04-08 Thread Tzu-Li (Gordon) Tai
Thanks for voting and testing everyone!

We have a total of 6 +1 votes, 3 of which are binding:
- Igal Shilman
- Gordon Tai (binding)
- Konstantin Knauf
- Seth Wiesman
- Yu Li (binding)
- Robert Metzger (binding)

I'll proceed now with finalizing the release of StateFun 3.0.0.
The official announcement will likely happen next week, as we're finishing
up the announcement blog post, which will probably also take a few days to
be reviewed.

Thanks,
Gordon

On Thu, Apr 8, 2021 at 1:50 PM Robert Metzger  wrote:

> I see. Thanks a lot for clarifying.
>
> I then vote
>
> +1 (binding)
>
> on this release. Thanks a lot for driving this!
>
>
> On Thu, Apr 8, 2021 at 5:19 AM Tzu-Li (Gordon) Tai 
> wrote:
>
>> @Robert Metzger 
>>
>> Sorry, this is the correct link to the class file you are referring to
>> (previous link I mentioned is incorrect):
>>
>> https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-common/src/main/java/com/google/protobuf/MoreByteStrings.java
>>
>> On Thu, Apr 8, 2021 at 11:17 AM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> @Robert Metzger 
>>>
>>> I assume the com/google/protobuf classfile you found is this one:
>>>
>>> https://github.com/apache/flink-statefun/blob/master/statefun-shaded/statefun-protobuf-shaded/src/main/java/org/apache/flink/statefun/sdk/shaded/com/google/protobuf/MoreByteStrings.java
>>>
>>> This actually isn't a class pulled from a Protobuf dependency - it's
>>> code developed under StateFun.
>>> The package com/google/protobuf was required because the class exists
>>> essentially as a workaround to access some package-private protected
>>> methods on Protobuf.
>>>
>>> I believe that in this case, a NOTICE acknowledgement is not required as
>>> we actually own that piece of code.
>>>
>>> Let me know what you think and if this clears things up!
>>>
>>> Cheers,
>>> Gordon
>>>
>>> On Thu, Apr 8, 2021 at 4:00 AM Robert Metzger 
>>> wrote:
>>>
>>>> This jar contains a com/google/protobuf classfile, which is not
>>>> declared in
>>>> any NOTICE file (and doesn't ship the license file of protobuf):
>>>>
>>>> https://repository.apache.org/content/repositories/orgapacheflink-1415/org/apache/flink/statefun-flink-common/3.0.0/statefun-flink-common-3.0.0.jar
>>>>
>>>> I fear that this could be a blocker for the release?
>>>>
>>>> Otherwise, I did the following check:
>>>>
>>>> - src distribution looks fine: No binaries, js related files are
>>>> declared
>>>> (the copyright in the NOTICE file could be updated to 2021, but that's
>>>> not
>>>> a blocker)
>>>>
>>>>
>>>> On Fri, Apr 2, 2021 at 8:29 AM Yu Li  wrote:
>>>>
>>>> > +1 (binding)
>>>> >
>>>> > Checked sums and signatures: OK
>>>> > Checked RAT and end-to-end tests: OK
>>>> > Checked version in pom/README/setup.py files: OK
>>>> > Checked release notes: OK
>>>> > Checked docker PR: OK
>>>> >
>>>> > Thanks for driving this release, Gordon!
>>>> >
>>>> > Best Regards,
>>>> > Yu
>>>> >
>>>> >
>>>> > On Fri, 2 Apr 2021 at 09:22, Seth Wiesman 
>>>> wrote:
>>>> >
>>>> > > +1 (non-binding)
>>>> > >
>>>> > > - Built from source and executed end to end tests
>>>> > > - Checked licenses and signatures
>>>> > > - Deployed remote Java SDK to gke cluster
>>>> > > - Took savepoint and statefully rescaled
>>>> > >
>>>> > > Seth
>>>> > >
>>>> > > On Thu, Apr 1, 2021 at 9:05 AM Konstantin Knauf 
>>>> > wrote:
>>>> > >
>>>> > > > +1 (non-binding)
>>>> > > >
>>>> > > > - mvn clean install -Prun-e2e-tests (java 8) from source
>>>> > > > - python3 -m unittest tests
>>>> > > > - spin up Statefun Cluster on EKS with an image built from the
>>>> > > Dockerfiles
>>>> > > > of [1]
>>>> > > > - run Python & Java Greeter example on AWS Lambda
> >>>> > > > - read through documentation (opened [2] to fix some typos)
>>>> > > >
>>>> &

Re: [VOTE] Apache Flink Stateful Functions 3.0.0, release candidate #1

2021-04-07 Thread Tzu-Li (Gordon) Tai
@Robert Metzger 

Sorry, this is the correct link to the class file you are referring to
(previous link I mentioned is incorrect):
https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-common/src/main/java/com/google/protobuf/MoreByteStrings.java

On Thu, Apr 8, 2021 at 11:17 AM Tzu-Li (Gordon) Tai 
wrote:

> @Robert Metzger 
>
> I assume the com/google/protobuf classfile you found is this one:
>
> https://github.com/apache/flink-statefun/blob/master/statefun-shaded/statefun-protobuf-shaded/src/main/java/org/apache/flink/statefun/sdk/shaded/com/google/protobuf/MoreByteStrings.java
>
> This actually isn't a class pulled from a Protobuf dependency - it's code
> developed under StateFun.
> The package com/google/protobuf was required because the class exists
> essentially as a workaround to access some package-private protected
> methods on Protobuf.
>
> I believe that in this case, a NOTICE acknowledgement is not required as
> we actually own that piece of code.
>
> Let me know what you think and if this clears things up!
>
> Cheers,
> Gordon
>
> On Thu, Apr 8, 2021 at 4:00 AM Robert Metzger  wrote:
>
>> This jar contains a com/google/protobuf classfile, which is not declared
>> in
>> any NOTICE file (and doesn't ship the license file of protobuf):
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1415/org/apache/flink/statefun-flink-common/3.0.0/statefun-flink-common-3.0.0.jar
>>
>> I fear that this could be a blocker for the release?
>>
>> Otherwise, I did the following check:
>>
>> - src distribution looks fine: No binaries, js related files are declared
>> (the copyright in the NOTICE file could be updated to 2021, but that's not
>> a blocker)
>>
>>
>> On Fri, Apr 2, 2021 at 8:29 AM Yu Li  wrote:
>>
>> > +1 (binding)
>> >
>> > Checked sums and signatures: OK
>> > Checked RAT and end-to-end tests: OK
>> > Checked version in pom/README/setup.py files: OK
>> > Checked release notes: OK
>> > Checked docker PR: OK
>> >
>> > Thanks for driving this release, Gordon!
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> > On Fri, 2 Apr 2021 at 09:22, Seth Wiesman  wrote:
>> >
>> > > +1 (non-binding)
>> > >
>> > > - Built from source and executed end to end tests
>> > > - Checked licenses and signatures
>> > > - Deployed remote Java SDK to gke cluster
>> > > - Took savepoint and statefully rescaled
>> > >
>> > > Seth
>> > >
>> > > On Thu, Apr 1, 2021 at 9:05 AM Konstantin Knauf 
>> > wrote:
>> > >
>> > > > +1 (non-binding)
>> > > >
>> > > > - mvn clean install -Prun-e2e-tests (java 8) from source
>> > > > - python3 -m unittest tests
>> > > > - spin up Statefun Cluster on EKS with an image built from the
>> > > Dockerfiles
>> > > > of [1]
>> > > > - run Python & Java Greeter example on AWS Lambda
>> > > > - read through documentation (opened [2] to fix some typos)
>> > > >
>> > > > [1] https://github.com/apache/flink-statefun-docker/pull/13
>> > > > [2] https://github.com/apache/flink-statefun/pull/219
>> > > >
>> > > > On Thu, Apr 1, 2021 at 6:46 AM Tzu-Li (Gordon) Tai <
>> > tzuli...@apache.org>
>> > > > wrote:
>> > > >
>> > > > > +1 (binding)
>> > > > >
>> > > > > - verified signatures and hashes
>> > > > > - NOTICE and LICENSE files in statefun-flink-distribution,
>> > > > > > statefun-protobuf-shaded, and statefun-sdk-java look sane
>> > > > > - maven clean install -Prun-e2e-tests (java 8) from source
>> > > > > - ran all examples and tutorials in
>> apache/flink-statefun-playground
>> > > with
>> > > > > the new artifacts
>> > > > > - Ran my SDK verifier utility [1] against the new Java and Python
>> > SDKs.
>> > > > >
>> > > > > Cheers,
>> > > > > Gordon
>> > > > >
>> > > > > [1] https://github.com/tzulitai/statefun-sdk-verifier
>> > > > >
>> > > > > On Wed, Mar 31, 2021 at 8:50 PM Igal Shilman 
>> > > wrote:
>> > > > >
>> > > > > > Thanks Gordon for managing the release!
>> > > > > >
>

Re: [VOTE] Apache Flink Stateful Functions 3.0.0, release candidate #1

2021-04-07 Thread Tzu-Li (Gordon) Tai
@Robert Metzger 

I assume the com/google/protobuf classfile you found is this one:
https://github.com/apache/flink-statefun/blob/master/statefun-shaded/statefun-protobuf-shaded/src/main/java/org/apache/flink/statefun/sdk/shaded/com/google/protobuf/MoreByteStrings.java

This actually isn't a class pulled from a Protobuf dependency - it's code
developed under StateFun.
The package com/google/protobuf was required because the class exists
essentially as a workaround to access some package-private
methods on Protobuf.
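
A minimal, self-contained sketch of the visibility rule at play (all names below are invented; this is not the actual Protobuf or StateFun code): a class compiled into the same package as another class may call that class's package-private methods, which are inaccessible from any other package.

```java
// Both classes live in the same (default) package, mimicking how
// MoreByteStrings is placed under com/google/protobuf to reach
// package-private members there. Names are illustrative only.
class ProtobufInternalSketch {                 // stands in for a Protobuf class
    static String wrap(byte[] bytes) {         // package-private: same-package only
        return "wrapped:" + bytes.length;
    }
}

public class MoreByteStringsSketch {           // stands in for MoreByteStrings
    public static String wrapUnsafe(byte[] bytes) {
        // Legal only because this class shares a package with
        // ProtobufInternalSketch; from any other package this would not compile.
        return ProtobufInternalSketch.wrap(bytes);
    }

    public static void main(String[] args) {
        System.out.println(wrapUnsafe(new byte[3]));
    }
}
```

Running it prints `wrapped:3`, demonstrating the same-package access that motivated placing the class under com/google/protobuf.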

I believe that in this case, a NOTICE acknowledgement is not required as we
actually own that piece of code.

Let me know what you think and if this clears things up!

Cheers,
Gordon

On Thu, Apr 8, 2021 at 4:00 AM Robert Metzger  wrote:

> This jar contains a com/google/protobuf classfile, which is not declared in
> any NOTICE file (and doesn't ship the license file of protobuf):
>
> https://repository.apache.org/content/repositories/orgapacheflink-1415/org/apache/flink/statefun-flink-common/3.0.0/statefun-flink-common-3.0.0.jar
>
> I fear that this could be a blocker for the release?
>
> Otherwise, I did the following check:
>
> - src distribution looks fine: No binaries, js related files are declared
> (the copyright in the NOTICE file could be updated to 2021, but that's not
> a blocker)
>
>
> On Fri, Apr 2, 2021 at 8:29 AM Yu Li  wrote:
>
> > +1 (binding)
> >
> > Checked sums and signatures: OK
> > Checked RAT and end-to-end tests: OK
> > Checked version in pom/README/setup.py files: OK
> > Checked release notes: OK
> > Checked docker PR: OK
> >
> > Thanks for driving this release, Gordon!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 2 Apr 2021 at 09:22, Seth Wiesman  wrote:
> >
> > > +1 (non-binding)
> > >
> > > - Built from source and executed end to end tests
> > > - Checked licenses and signatures
> > > - Deployed remote Java SDK to gke cluster
> > > - Took savepoint and statefully rescaled
> > >
> > > Seth
> > >
> > > On Thu, Apr 1, 2021 at 9:05 AM Konstantin Knauf 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - mvn clean install -Prun-e2e-tests (java 8) from source
> > > > - python3 -m unittest tests
> > > > - spin up Statefun Cluster on EKS with an image built from the
> > > Dockerfiles
> > > > of [1]
> > > > - run Python & Java Greeter example on AWS Lambda
> > > > - read through documentation (opened [2] to fix some typos)
> > > >
> > > > [1] https://github.com/apache/flink-statefun-docker/pull/13
> > > > [2] https://github.com/apache/flink-statefun/pull/219
> > > >
> > > > On Thu, Apr 1, 2021 at 6:46 AM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - verified signatures and hashes
> > > > > - NOTICE and LICENSE files in statefun-flink-distribution,
> > > > > > statefun-protobuf-shaded, and statefun-sdk-java look sane
> > > > > - maven clean install -Prun-e2e-tests (java 8) from source
> > > > > - ran all examples and tutorials in
> apache/flink-statefun-playground
> > > with
> > > > > the new artifacts
> > > > > - Ran my SDK verifier utility [1] against the new Java and Python
> > SDKs.
> > > > >
> > > > > Cheers,
> > > > > Gordon
> > > > >
> > > > > [1] https://github.com/tzulitai/statefun-sdk-verifier
> > > > >
> > > > > On Wed, Mar 31, 2021 at 8:50 PM Igal Shilman 
> > > wrote:
> > > > >
> > > > > > Thanks Gordon for managing the release!
> > > > > >
> > > > > > +1 (non binding) from my side:
> > > > > >
> > > > > > Here are the results of my testing:
> > > > > > - verified the signatures
> > > > > > - verified that the source distribution doesn't contain any
> binary
> > > > files
> > > > > > - ran mvn clean install -Prun-e2e-tests with java8
> > > > > > - ran the smoke test that sends 100 million messages locally.
> > > > > > - extended the smoke test to include the remote sdks (1 function
> in
> > > the
> > > > > > Java SDK, 1 function in the Python SDK), and it passes.
> > > > > > - deployed to kubernetes with minio as an S3 replacement.

Re: [VOTE] Apache Flink Stateful Functions 3.0.0, release candidate #1

2021-03-31 Thread Tzu-Li (Gordon) Tai
+1 (binding)

- verified signatures and hashes
- NOTICE and LICENSE files in statefun-flink-distribution,
statefun-protobuf-shaded, and statefun-sdk-java look sane
- maven clean install -Prun-e2e-tests (java 8) from source
- ran all examples and tutorials in apache/flink-statefun-playground with
the new artifacts
- Ran my SDK verifier utility [1] against the new Java and Python SDKs.

Cheers,
Gordon

[1] https://github.com/tzulitai/statefun-sdk-verifier

On Wed, Mar 31, 2021 at 8:50 PM Igal Shilman  wrote:

> Thanks Gordon for managing the release!
>
> +1 (non binding) from my side:
>
> Here are the results of my testing:
> - verified the signatures
> - verified that the source distribution doesn't contain any binary files
> - ran mvn clean install -Prun-e2e-tests with java8
> - ran the smoke test that sends 100 million messages locally.
> - extended the smoke test to include the remote sdks (1 function in the
> Java SDK, 1 function in the Python SDK), and it passes.
> - deployed to kubernetes with minio as an S3 replacement.
>
>
> On Tue, Mar 30, 2021 at 12:29 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version 3.0.0
> of
> > Apache Flink Stateful Functions, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > **Testing Guideline**
> >
> > You can find here [1] a page in the project wiki on instructions for
> > testing.
> > To cast a vote, it is not necessary to perform all listed checks,
> > but please mention which checks you have performed when voting.
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Stateful Functions canonical source distribution, to be deployed to
> the
> > release repository at dist.apache.org
> > b) Stateful Functions Python SDK distributions to be deployed to PyPI
> > c) Maven artifacts to be deployed to the Maven Central Repository
> > d) New Dockerfiles for the release
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as
> follows,
> > for your review:
> > * All artifacts for a) and b) can be found in the corresponding dev
> > repository at dist.apache.org [2]
> > * All artifacts for c) can be found at the Apache Nexus Repository [3]
> >
> > All artifacts are signed with the key
> > 1C1E2394D3194E1944613488F320986D35C33D6A [4]
> >
> > Other links for your review:
> > * JIRA release notes [5]
> > * source code tag “release-3.0.0-rc1” [6]
> > * PR for the new Dockerfiles [7]
> >
> > **Vote Duration**
> >
> > The voting time will run for at least 72 hours. I’m targeting this vote
> to
> > last until April 2nd, 12pm CET.
> > It is adopted by majority approval, with at least 3 PMC affirmative
> votes.
> >
> > Thanks,
> > Gordon
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
> > [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.0.0-rc1/
> > [3]
> > https://repository.apache.org/content/repositories/orgapacheflink-1415/
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348822
> > [6] https://github.com/apache/flink-statefun/tree/release-3.0.0-rc1
> > [7] https://github.com/apache/flink-statefun-docker/pull/13
> >
>


Re: [DISCUSS] Releasing Stateful Functions 3.0.0

2021-03-30 Thread Tzu-Li (Gordon) Tai
@Konstantin Knauf 
Yes, we'll make sure that migration docs are added before finalizing the
release.

I've proceeded to create RC1 for StateFun 3.0.0. Voting is happening on a
separate thread.

On Tue, Mar 30, 2021 at 3:06 AM Robert Metzger  wrote:

> Wow, the feature list sounds really exciting!
>
> No concerns from my side!
>
> On Thu, Mar 25, 2021 at 1:57 PM Konstantin Knauf 
> wrote:
>
> > Hi Gordon,
> >
> > Thank you for the update. +1 for a timely release. For existing Statefun
> > users, is there already something in the documentation that describes the
> > breaking changes/migration in more detail in order to prepare?
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Thu, Mar 25, 2021 at 9:27 AM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > Hi everyone,
> > >
> > > We'd like to prepare to release StateFun 3.0.0 over the next few days,
> > > ideally starting the first release candidate early next week.
> > >
> > > This is a version bump from 2.x to 3.x, with some major new features
> and
> > > reworks:
> > >
> > >- New request-reply protocol:
> > >the protocol has been reworked as StateFun is moving forward
> towards a
> > >"remote functions first" design. The new protocol enhances StateFun
> > > apps to
> > >be hot upgraded without restarting the StateFun runtime, including
> > >registering new state for functions, and adding new functions to the
> > > app.
> > >- Cross-language type system:
> > >The new protocol also enables a much more ergonomic, cross-language
> > type
> > >system. This makes it much easier and natural for users to send
> > > messages of
> > >various types around between their functions (primitive types, or
> > custom
> > >types such as JSON messages).
> > >- Java SDK for remote functions: Going remote first, we've now also
> > >added a new Java SDK for remote functions.
> > >
> > > These are some major features that users would benefit from greatly.
> > > Since this release also contains breaking changes, it's nice if we can
> > get
> > > this out earlier so that new users would not be onboarded with APIs
> that
> > > are going to be immediately deprecated.
> > >
> > > We're in the final stages of preparing documentation and examples [1]
> to
> > go
> > > with the release, and would like to kick off the release candidates
> early
> > > next week.
> > >
> > > Please let us know if you have any concerns.
> > >
> > > Thanks,
> > > Gordon
> > >
> > > [1] https://github.com/apache/flink-statefun-playground
> > >
> >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>


[VOTE] Apache Flink Stateful Functions 3.0.0, release candidate #1

2021-03-30 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on the release candidate #1 for the version 3.0.0 of
Apache Flink Stateful Functions, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Testing Guideline**

You can find here [1] a page in the project wiki on instructions for
testing.
To cast a vote, it is not necessary to perform all listed checks,
but please mention which checks you have performed when voting.

**Release Overview**

As an overview, the release consists of the following:
a) Stateful Functions canonical source distribution, to be deployed to the
release repository at dist.apache.org
b) Stateful Functions Python SDK distributions to be deployed to PyPI
c) Maven artifacts to be deployed to the Maven Central Repository
d) New Dockerfiles for the release

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as follows,
for your review:
* All artifacts for a) and b) can be found in the corresponding dev
repository at dist.apache.org [2]
* All artifacts for c) can be found at the Apache Nexus Repository [3]

All artifacts are signed with the key
1C1E2394D3194E1944613488F320986D35C33D6A [4]

Other links for your review:
* JIRA release notes [5]
* source code tag “release-3.0.0-rc1” [6]
* PR for the new Dockerfiles [7]

**Vote Duration**

The voting time will run for at least 72 hours. I’m targeting this vote to
last until April 2nd, 12pm CET.
It is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
[2] https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.0.0-rc1/
[3] https://repository.apache.org/content/repositories/orgapacheflink-1415/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348822
[6] https://github.com/apache/flink-statefun/tree/release-3.0.0-rc1
[7] https://github.com/apache/flink-statefun-docker/pull/13


[jira] [Created] (FLINK-22023) Remove outdated StateFun quickstart archetype

2021-03-30 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-22023:
---

 Summary: Remove outdated StateFun quickstart archetype
 Key: FLINK-22023
 URL: https://issues.apache.org/jira/browse/FLINK-22023
 Project: Flink
  Issue Type: Task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.0.0


The StateFun Maven quickstart archetype should be removed, because it is 
outdated (only works for embedded functions).

We can add a quickstart archetype for Java remote SDKs in the future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Releasing Stateful Functions 3.0.0

2021-03-25 Thread Tzu-Li (Gordon) Tai
Hi everyone,

We'd like to prepare to release StateFun 3.0.0 over the next few days,
ideally starting the first release candidate early next week.

This is a version bump from 2.x to 3.x, with some major new features and
reworks:

   - New request-reply protocol:
   the protocol has been reworked as StateFun is moving forward towards a
   "remote functions first" design. The new protocol enhances StateFun apps to
   be hot upgraded without restarting the StateFun runtime, including
   registering new state for functions, and adding new functions to the app.
   - Cross-language type system:
   The new protocol also enables a much more ergonomic, cross-language type
   system. This makes it much easier and natural for users to send messages of
   various types around between their functions (primitive types, or custom
   types such as JSON messages).
   - Java SDK for remote functions: Going remote first, we've now also
   added a new Java SDK for remote functions.

These are some major features that users would benefit from greatly.
Since this release also contains breaking changes, it's nice if we can get
this out earlier so that new users would not be onboarded with APIs that
are going to be immediately deprecated.

We're in the final stages of preparing documentation and examples [1] to go
with the release, and would like to kick off the release candidates early
next week.

Please let us know if you have any concerns.

Thanks,
Gordon

[1] https://github.com/apache/flink-statefun-playground


[jira] [Created] (FLINK-21904) parseJmJvmArgsAndExportLogs: command not found warning when starting StateFun

2021-03-21 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-21904:
---

 Summary: parseJmJvmArgsAndExportLogs: command not found warning 
when starting StateFun
 Key: FLINK-21904
 URL: https://issues.apache.org/jira/browse/FLINK-21904
 Project: Flink
  Issue Type: Bug
  Components: Stateful Functions
Affects Versions: statefun-2.2.2
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.0.0


I'm seeing this warning in E2E logs:
{code}
11:37:12,572 ERROR 
org.apache.flink.statefun.e2e.remote.ExactlyOnceWithRemoteFnE2E  - 
/opt/flink/bin/standalone-job.sh: line 43: parseJmJvmArgsAndExportLogs: command 
not found
{code}

This was caused by FLINK-19662, which renamed {{parseJmJvmArgsAndExportLogs}} 
to {{parseJmArgsAndExportLogs}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21901) Update StateFun version to 3.0-SNAPSHOT

2021-03-21 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-21901:
---

 Summary: Update StateFun version to 3.0-SNAPSHOT
 Key: FLINK-21901
 URL: https://issues.apache.org/jira/browse/FLINK-21901
 Project: Flink
  Issue Type: Task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.0.0


Our version is still 2.3-SNAPSHOT in the main repository, since the decision to 
jump directly to 3.0-SNAPSHOT was made during the development cycle.

To prepare for the upcoming release, we should update the main branch to 
3.0-SNAPSHOT already.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

