Re: [VOTE] Release 2.0-preview1, release candidate #1

2024-10-16 Thread Jingsong Li
+1 (binding)

- Downloaded artifacts from dist
- Verified SHA512 checksum
- Verified GPG signature
- Built the source with Java 11
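As a side note for anyone reproducing the verification above: the SHA512 step amounts to computing a local digest and comparing it with the published `.sha512` value. A minimal, self-contained Java sketch (the file here is a temporary placeholder, not a real release artifact):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class Sha512Check {

    // Hex-encode a byte array.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Compare the locally computed SHA-512 digest of an artifact with the
    // published checksum (the first token of the corresponding .sha512 file).
    static boolean verify(Path artifact, String publishedHex) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        byte[] digest = md.digest(Files.readAllBytes(artifact));
        return hex(digest).equalsIgnoreCase(publishedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        // Demo with a temporary file instead of a real release artifact.
        Path tmp = Files.createTempFile("artifact", ".tgz");
        Files.write(tmp, "release bytes".getBytes());
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        String published = hex(md.digest(Files.readAllBytes(tmp)));
        System.out.println(verify(tmp, published));   // prints "true"
        System.out.println(verify(tmp, "deadbeef"));  // prints "false"
    }
}
```

In practice most verifiers simply run `sha512sum -c` on the downloaded checksum file and `gpg --verify` on the `.asc` signature against the KEYS file linked from the vote thread.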

Best,
Jingsong

On Thu, Oct 17, 2024 at 11:53 AM Yunfeng Zhou
 wrote:
>
> +1 (non-binding)
>
> - Verified checksums
> - Built from source
> - Reviewed release notes
> - Ran WordCount example and it works as expected
>
> Best,
> Yunfeng
>
>
> > > On Oct 13, 2024, at 00:46, Xintong Song wrote:
> >
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version
> > 2.0-preview1, as follows:
> >
> > [ ] +1, Approve the release
> >
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1]
> > * the official Apache source release and binary convenience releases to be
> > deployed to dist.apache.org [2] (PyFlink artifacts are excluded because
> > PyPI does not support preview versions), which are signed with the key with
> > fingerprint 8D56AE6E7082699A4870750EA4E8C4C05EE6861F [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-2.0-preview1-rc1" [5],
> > * website pull request listing the new release and adding announcement blog
> > post [6].
> >
> > *Please note that Flink 2.0-preview1 is not a stable version and should
> > not be used in production environments. Therefore, functionality tests
> > should not be the focus of verifications for this release candidate.*
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1]
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12355070
> >
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-2.0-preview1-rc1/
> >
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >
> > [4] https://repository.apache.org/content/repositories/orgapacheflink-1761/
> >
> > [5] https://github.com/apache/flink/releases/tag/release-2.0-preview1-rc1
> >
> > [6] https://github.com/apache/flink-web/pull/754
>


Re: [ANNOUNCE] Apache Flink 1.20.0 released

2024-08-03 Thread Jingsong Li
Congrats! Thanks all

Ferenc Csaky wrote on Fri, Aug 2, 2024 at 19:43:

> Congrats!
>
> Thanks for the effort to the release mgrs and everyone involved!
>
> Best,
> Ferenc
>
>
>
> On Friday, 2 August 2024 at 12:16, Jiadong Lu  wrote:
>
> >
> >
> > Congrats!
> > Best regards,
> > Jiadong Lu
> >
> > On 2024/8/2 18:13, Aleksandr Pilipenko wrote:
> >
> > > Congrats!
> > >
> > > Best,
> > > Aleksandr
> > >
> > > On Fri, 2 Aug 2024 at 11:11, Feng Jin jinfeng1...@gmail.com wrote:
> > >
> > > > Congratulations!
> > > >
> > > > Thanks to release managers and everyone involved!
> > > >
> > > > Best,
> > > > Feng Jin
> > > >
> > > > On Fri, Aug 2, 2024 at 5:45 PM Yubin Li lyb5...@gmail.com wrote:
> > > >
> > > > > Congrats!
> > > > > Thanks to release managers and everyone involved for the excellent
> work !
> > > > >
> > > > > Best,
> > > > > Yubin Li
> > > > >
> > > > > On Fri, Aug 2, 2024 at 5:41 PM Paul Lam paullin3...@gmail.com
> wrote:
> > > > >
> > > > > > Congrats!
> > > > > >
> > > > > > Best,
> > > > > > Paul Lam
> > > > > >
> > > > > > > On Aug 2, 2024, at 17:26, Ahmed Hamdy hamdy10...@gmail.com wrote:
> > > > > > >
> > > > > > > Congratulations!
> > > > > > > Thanks Weijie and the release managers for the huge efforts.
> > > > > > > Best Regards
> > > > > > > Ahmed Hamdy
> > > > > > >
> > > > > > > On Fri, 2 Aug 2024 at 10:15, Zakelly Lan zakelly@gmail.com
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations! Thanks to release managers and everyone
> involved!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Zakelly
> > > > > > > >
> > > > > > > > On Fri, Aug 2, 2024 at 5:05 PM weijie guo <
> > > > > > > > guoweijieres...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > The Apache Flink community is very happy to announce the
> release of
> > > > > > > > > Apache
> > > > > > > > >
> > > > > > > > > Flink 1.20.0, which is the first release for the Apache
> Flink 1.20
> > > > > > > > > series.
> > > > > > > > >
> > > > > > > > > Apache Flink® is an open-source stream processing
> framework for
> > > > > > > > >
> > > > > > > > > distributed, high-performing, always-available, and
> accurate data
> > > > > > > > > streaming
> > > > > > > > >
> > > > > > > > > applications.
> > > > > > > > >
> > > > > > > > > The release is available for download at:
> > > > > > > > >
> > > > > > > > > https://flink.apache.org/downloads.html
> > > > > > > > >
> > > > > > > > > Please check out the release blog post for an overview of
> the
> > > > > > > > > improvements
> > > > > > > > > for this release:
> > > >
> > > >
> https://flink.apache.org/2024/08/02/announcing-the-release-of-apache-flink-1.20/
> > > >
> > > > > > > > > The full release notes are available in Jira:
> > > >
> > > >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354210
> > > >
> > > > > > > > > We would like to thank all contributors of the Apache Flink
> > > > > > > > > community who
> > > > > > > > >
> > > > > > > > > made this release possible!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > >
> > > > > > > > > Robert, Rui, Ufuk, Weijie
>


Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-06-26 Thread Jingsong Li
+1 to release a preview version.

Best,
Jingsong

On Wed, Jun 26, 2024 at 10:12 AM Jark Wu  wrote:
>
> I also think this should not block new feature development.
> Having "nice-to-have" and "must-have" tags on the FLIPs is a good idea.
>
> For the downstream projects, I think we need to release a 2.0 preview
> version one or
> two months before the formal release. This can leave some time for the
> downstream
> projects to integrate and provide feedback. So we can fix the problems
> (e.g. unexpected
> breaking changes, Java versions) before 2.0.
>
> Best,
> Jark
>
> On Wed, 26 Jun 2024 at 09:39, Xintong Song  wrote:
>
> > I also don't think we should block new feature development until 2.0. From
> > my understanding, the new major release is no different from the regular
> > minor releases for new features.
> >
> > I think tracking new features, either as nice-to-have items or in a
> > separate list, is necessary. It helps us understand what's going on in the
> > release cycle, and what to announce and promote. Maybe we should start a
> > discussion on updating the 2.0 item list, to 1) collect new items that are
> > proposed / initiated after the original list being created and 2) to remove
> > some items that are no longer suitable. I'll discuss this with the other
> > release managers first.
> >
> > For the connectors and operators, I think it depends on whether they depend
> > on any deprecated APIs or internal implementations of Flink. Ideally,
> > all @Public APIs and @PublicEvolving APIs that we plan to change / remove
> > should have been deprecated in 1.19 and 1.20 respectively. That means if
> > the connectors and operators only use non-deprecated @Public
> > and @PublicEvolving APIs in 1.20, hopefully there should not be any
> > problems upgrading to 2.0.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Wed, Jun 26, 2024 at 5:20 AM Becket Qin  wrote:
> >
> > > Thanks for the question, Matthias.
> > >
> > > My two cents, I don't think we are blocking new feature development. My
> > > understanding is that the community will just prioritize removing
> > > deprecated APIs in the 2.0 dev cycle. Because of that, it is possible
> > that
> > > some new feature development may slow down a little bit since some
> > > contributors may be working on the must-have features for 2.0. But policy
> > > wise, I don't see a reason to block the new feature development for the
> > 2.0
> > > release feature plan[1].
> > >
> > > Process wise, I like your idea of adding the new features as nice-to-have
> > > in the 2.0 feature list.
> > >
> > > Re: David,
> > > Given it is a major version bump, it is possible that some of the
> > > downstream projects (e.g. connectors, Paimon, etc) will have to see if a
> > > major version bump is also needed there. And it is probably going to be
> > > decisions made on a per-project basis.
> > > Regarding the Java version specifically, this is probably worth a separate
> > > discussion. According to a recent report[2] on the state of Java, it
> > might
> > > be a little early to drop support for Java 11. We can discuss this
> > > separately.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > [2]
> > >
> > >
> > https://newrelic.com/sites/default/files/2024-04/new-relic-state-of-the-java-ecosystem-report-2024-04-30.pdf
> > >
> > > On Tue, Jun 25, 2024 at 4:58 AM David Radley 
> > > wrote:
> > >
> > > > Hi,
> > > > I think this is a great question. I am not sure if this has been
> > covered
> > > > elsewhere, but it would be good to be clear how this affects the
> > > connectors
> > > > and operator repos, with potentially v1 and v2 oriented new features. I
> > > > suspect this will be a connector-by-connector investigation. I am
> > > thinking
> > > > connectors with Hadoop ecosystem dependencies (e.g. Paimon) which may
> > > not
> > > > work nicely with Java 17.
> > > >
> > > >  Kind regards, David.
> > > >
> > > >
> > > > From: Matthias Pohl 
> > > > Date: Tuesday, 25 June 2024 at 09:57
> > > > To: dev@flink.apache.org 
> > > > Cc: Xintong Song , martijnvis...@apache.org <
> > > > martijnvis...@apache.org>, imj...@gmail.com ,
> > > > becket@gmail.com 
> > > > Subject: [EXTERNAL] [2.0] How to handle on-going feature development in
> > > > Flink 2.0?
> > > > Hi 2.0 release managers,
> > > > With the 1.20 release branch being cut [1], master is now referring to
> > > > 2.0-SNAPSHOT. I remember that, initially, the community had the idea of
> > > > keeping the 2.0 release as small as possible focusing on API changes
> > [2].
> > > >
> > > > What does this mean for new features? I guess blocking them until 2.0
> > is
> > > > released is not a good option. Shall we treat new features as
> > > > "nice-to-have" items as documented in the 2.0 release overview [3] and
> > > > merge them to master like it was done for minor releases in the past?
> > Do
> > > > you want to add a separate section

Re: [VOTE] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-18 Thread Jingsong Li
+1 binding

On Tue, Jun 18, 2024 at 11:54 AM Feng Jin  wrote:
>
> +1 (non-binding)
>
>
> Best,
> Feng Jin
>
>
> On Tue, Jun 18, 2024 at 10:24 AM Lincoln Lee  wrote:
>
> > +1 (binding)
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Xintong Song wrote on Mon, Jun 17, 2024 at 13:39:
> >
> > > +1 (binding)
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Mon, Jun 17, 2024 at 11:41 AM Zhanghao Chen <
> > zhanghao.c...@outlook.com>
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: weijie guo 
> > > > Sent: Monday, June 17, 2024 10:13
> > > > To: dev 
> > > > Subject: [VOTE] FLIP-462: Support Custom Data Distribution for Input
> > > > Stream of Lookup Join
> > > >
> > > > Hi everyone,
> > > >
> > > >
> > > > Thanks for all the feedback about the FLIP-462: Support Custom Data
> > > > Distribution for Input Stream of Lookup Join [1]. The discussion
> > > > thread is here [2].
> > > >
> > > >
> > > > The vote will be open for at least 72 hours unless there is an
> > > > objection or insufficient votes.
> > > >
> > > >
> > > > Best,
> > > >
> > > > Weijie
> > > >
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join
> > > >
> > > >
> > > > [2] https://lists.apache.org/thread/kds2zrcdmykrz5lmn0hf9m4phdl60nfb
> > > >
> > >
> >


Re: Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-11 Thread Jingsong Li
Hi all,

+1 to this FLIP, very thanks all for your proposal.

isDeterministic looks good to me too.

We can consider stating the following points:

1. How to enable custom data distribution? Is it a dynamic hint? Can
you provide an SQL example?

2. What impact will it have when the main input stream is a changelog?
Could it cause out-of-orderness? This may need to be emphasized.

3. Does this feature work in batch mode too?

Best,
Jingsong

On Tue, Jun 11, 2024 at 8:22 PM Wencong Liu  wrote:
>
> Hi Lincoln,
>
>
> Thanks for your reply. Weijie and I discussed these two issues offline,
> and here are the results of our discussion:
> 1. When the user utilizes the hash lookup join hint introduced by FLIP-204[1],
> the `SupportsLookupCustomShuffle` interface should be ignored. This is because
> the hash lookup join hint is directly specified by the user through a SQL 
> HINT,
> which is more in line with user intuition. WDYT?
> 2. We agree with the introduction of the `isDeterministic` method. The
> `SupportsLookupCustomShuffle` interface introduces a custom shuffle, which
> can cause ADD/UPDATE_AFTER events (+I, +U) to appear
> after UPDATE_BEFORE/DELETE events (-D, -U), thus breaking the current
> limitations of the Flink Sink Operator[2]. If `isDeterministic` returns false 
> and the
> changelog event type is not insert-only, the Planner should not apply the 
> shuffle
> provided by `SupportsLookupCustomShuffle`.
>
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
> [2] 
> https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
>
>
> Best,
> Wencong
>
>
>
>
>
>
>
>
>
> At 2024-06-11 00:02:57, "Lincoln Lee"  wrote:
> >Hi Weijie,
> >
> >Thanks for your proposal, this will be a useful advanced optimization for
> >connector developers!
> >
> >I have two questions:
> >
> >1. FLIP-204[1] hash lookup join hint is mentioned in this FLIP, what's the
> >order of precedence between the two features? For example, a connector that
> >implements the `SupportsLookupCustomShuffle` interface also has a
> >`SHUFFLE_HASH` lookup join hint specified by the user in sql, what's
> >the expected behavior?
> >
> >2. This FLIP considers the relationship with NDU processing, and I agree
> >with the current choice to prioritize NDU first. However, we should also
> >consider another issue: out-of-orderness of the changelog events in
> >streaming[2]. If the connector developer supplies a non-deterministic
> >partitioner, e.g., a random partitioner for anti-skew purpose, then it'll
> >break the assumption relied by current SQL operators in streaming: the
> >ADD/UPDATE_AFTER events (+I, +U) always occur before their related
> >UPDATE_BEFORE/DELETE events (-D, -U) and they are always
> >processed by the same task even if a data shuffle is involved. So a
> >straightforward approach would be to add method `isDeterministic` to
> >the `InputDataPartitioner` interface to explicitly tell the planner whether
> >the partitioner is deterministic or not(then the planner can reject the
> >non-deterministic custom partitioner for correctness requirements).
> >
> >[1]
> >https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
> >[2]
> >https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
> >
> >
> >Best,
> >Lincoln Lee
> >
> >
> >Xintong Song wrote on Fri, Jun 7, 2024 at 13:53:
> >
> >> +1 for this proposal.
> >>
> >> This FLIP will make it possible for each lookup join parallel task to only
> >> access and cache a subset of the data. This will significantly improve the
> >> performance and reduce the overhead when using Paimon for the dimension
> >> table. And it's general enough to also be leveraged by other connectors.
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Fri, Jun 7, 2024 at 10:01 AM weijie guo 
> >> wrote:
> >>
> >> > Hi devs,
> >> >
> >> >
> >> > I'd like to start a discussion about FLIP-462[1]: Support Custom Data
> >> > Distribution for Input Stream of Lookup Join.
> >> >
> >> >
> >> > Lookup Join is an important feature in Flink. It is typically used to
> >> > enrich a table with data that is queried from an external system.
> >> > If we interact with the external systems for each incoming record, we
> >> > incur significant network IO and RPC overhead.
> >> >
> >> > Therefore, most connectors introduce caching to reduce the per-record
> >> > level query overhead. However, because the data distribution of Lookup
> >> > Join's input stream is arbitrary, the cache hit rate is sometimes
> >> > unsatisfactory.
> >> >
> >> >
> >> > We want to introduce a mechanism for the connector to tell the Flink
> >> > planner its desired input stream data distribution or partitioning
> >> > strategy. This can significantly reduce the amount of cached data and
> >> > improve performance of Lookup Join.
> >> >
> >> >
> >> > You can find more details in this FLIP[1]. Looking forward to hearing
> >> > from you, thanks!
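Pulling the points of this thread together, the proposed contract could look roughly like the Java sketch below. The type and method names mirror the discussion (`SupportsLookupCustomShuffle`, `InputDataPartitioner`, `isDeterministic`), but the exact signatures are assumptions for illustration, not the final FLIP-462 API:

```java
import java.util.Optional;

// Hypothetical sketch of the FLIP-462 contract; names mirror the discussion
// above, but the signatures are illustrative assumptions, not the final API.
public class CustomShuffleSketch {

    interface InputDataPartitioner {
        // Route a lookup key to one of numPartitions parallel subtasks,
        // so each subtask only needs to cache that subset of keys.
        int partition(String joinKey, int numPartitions);

        // If false (e.g. a random anti-skew partitioner), the planner must
        // not apply this shuffle on a non-insert-only changelog, because
        // +I/+U events could otherwise overtake their related -U/-D events.
        default boolean isDeterministic() {
            return true;
        }
    }

    interface SupportsLookupCustomShuffle {
        Optional<InputDataPartitioner> getPartitioner();
    }

    public static void main(String[] args) {
        // A deterministic hash partitioner: the same key always lands on the
        // same subtask, keeping per-subtask caches disjoint and preserving
        // changelog ordering guarantees.
        InputDataPartitioner hash =
                (key, n) -> Math.floorMod(key.hashCode(), n);
        System.out.println(hash.partition("user-42", 4)
                == hash.partition("user-42", 4)); // prints "true"
        System.out.println(hash.isDeterministic()); // prints "true"
    }
}
```

With such a contract, a non-deterministic partitioner would override `isDeterministic` to return false, and the planner could then reject it whenever the input is not insert-only, as discussed above.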

Re: Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-19 Thread Jingsong Li
CC to the Paimon community.

Best,
Jingsong

On Mon, May 20, 2024 at 9:55 AM Jingsong Li  wrote:
>
> Amazing, congrats!
>
> Best,
> Jingsong
>
> On Sat, May 18, 2024 at 3:10 PM 大卫415 <2446566...@qq.com.invalid> wrote:
> >
> > Unsubscribe
> >
> > Original Email
> >
> >
> >
> > Sender:"gongzhongqiang"< gongzhongqi...@apache.org >;
> >
> > Sent Time:2024/5/17 23:10
> >
> > To:"Qingsheng Ren"< re...@apache.org >;
> >
> > Cc recipient:"dev"< dev@flink.apache.org >;"user"< u...@flink.apache.org 
> > >;"user-zh"< user...@flink.apache.org >;"Apache Announce List"< 
> > annou...@apache.org >;
> >
> > Subject:Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released
> >
> >
> > Congratulations !
> > Thanks for all contributors.
> >
> >
> > Best,
> >
> > Zhongqiang Gong
> >
> > Qingsheng Ren wrote on Fri, May 17, 2024 at 17:33:
> >
> > > The Apache Flink community is very happy to announce the release of
> > > Apache Flink CDC 3.1.0.
> > >
> > > Apache Flink CDC is a distributed data integration tool for real time
> > > data and batch data, bringing the simplicity and elegance of data
> > > integration via YAML to describe the data movement and transformation
> > > in a data pipeline.
> > >
> > > Please check out the release blog post for an overview of the release:
> > >
> > > 
> > https://flink.apache.org/2024/05/17/apache-flink-cdc-3.1.0-release-announcement/
> > >
> > > The release is available for download at:
> > > https://flink.apache.org/downloads.html
> > >
> > > Maven artifacts for Flink CDC can be found at:
> > > https://search.maven.org/search?q=g:org.apache.flink%20cdc
> > >
> > > The full release notes are available in Jira:
> > >
> > > 
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> > >
> > > We would like to thank all contributors of the Apache Flink community
> > > who made this release possible!
> > >
> > > Regards,
> > > Qingsheng Ren
> > >


Re: [VOTE] Apache Flink CDC Release 3.1.0, release candidate #3

2024-05-16 Thread Jingsong Li
+1 (binding)

- Verified signature and checksum hash
- Verified that no binaries exist in the source archive
- Build source code successful
- Reviewed the release web PR
- Check tag and branch exist

Best,
Jingsong

On Thu, May 16, 2024 at 4:25 PM Jark Wu  wrote:
>
> +1 (binding)
>
> - checked signatures
> - checked hashes
> - checked release notes
> - reviewed the release web PR
> - checked the jars in the staging repo
> - build and compile the source code locally with jdk8
>
> Best,
> Jark
>
> On Wed, 15 May 2024 at 16:05, gongzhongqiang 
> wrote:
>
> > +1 (non-binding)
> >
> > - Verified signature and checksum hash
> > - Verified that no binaries exist in the source archive
> > - Build source code successful on ubuntu 22.04 with jdk8
> > - Check tag and branch exist
> > - Check jars are built by jdk8
> >
> > Best,
> > Zhongqiang Gong
> >
> > Qingsheng Ren wrote on Sat, May 11, 2024 at 10:10:
> >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #3 for the version 3.1.0
> > of
> > > Apache Flink CDC, as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > **Release Overview**
> > >
> > > As an overview, the release consists of the following:
> > > a) Flink CDC source release to be deployed to dist.apache.org
> > > b) Maven artifacts to be deployed to the Maven Central Repository
> > >
> > > **Staging Areas to Review**
> > >
> > > The staging areas containing the above mentioned artifacts are as
> > follows,
> > > for your review:
> > > * All artifacts for a) can be found in the corresponding dev repository
> > at
> > > dist.apache.org [1], which are signed with the key with fingerprint
> > > A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> > > * All artifacts for b) can be found at the Apache Nexus Repository [3]
> > >
> > > Other links for your review:
> > > * JIRA release notes [4]
> > > * Source code tag "release-3.1.0-rc3" with commit hash
> > > 5452f30b704942d0ede64ff3d4c8699d39c63863 [5]
> > > * PR for release announcement blog post of Flink CDC 3.1.0 in flink-web
> > [6]
> > >
> > > **Vote Duration**
> > >
> > > The voting time will run for at least 72 hours, adopted by majority
> > > approval with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Qingsheng Ren
> > >
> > > [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc3/
> > > [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [3]
> > https://repository.apache.org/content/repositories/orgapacheflink-1733
> > > [4]
> > >
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> > > [5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc3
> > > [6] https://github.com/apache/flink-web/pull/739
> > >
> >


Re: [VOTE] Apache Flink CDC Release 3.1.0, release candidate #1

2024-05-07 Thread Jingsong Li
-1

Thanks Qingsheng for preparing this RC.

If you bundle third-party dependencies (non Flink dependencies) in
your published jar, you need to write them in the NOTICE file.

I recommend adding a test using flink-ci-tools to verify if the NOTICE
file is correct. Of course, you cannot rely entirely on the tooling; you
still need to verify the result manually.
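A rough illustration of the kind of cross-check meant here (flink-ci-tools does this far more thoroughly, and the coordinates below are made up): every third-party dependency bundled into a published jar should appear in that jar's NOTICE file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NoticeCheck {

    // Return the bundled dependency coordinates that the NOTICE text
    // does not mention.
    static List<String> missingFromNotice(List<String> bundled, String notice) {
        List<String> missing = new ArrayList<>();
        for (String coordinate : bundled) {
            if (!notice.contains(coordinate)) {
                missing.add(coordinate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String notice = "flink-connector-foo\n"
                + "This product bundles com.example:lib-a:1.0\n";
        List<String> bundled =
                Arrays.asList("com.example:lib-a:1.0", "com.example:lib-b:2.0");
        // lib-b is bundled but missing from NOTICE -> a release blocker.
        System.out.println(missingFromNotice(bundled, notice));
        // prints "[com.example:lib-b:2.0]"
    }
}
```

A check like this only flags omissions mechanically; license texts, attribution wording, and transitively shaded dependencies still need a human eye, which is the point made above.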

Best,
Jingsong

On Sat, May 4, 2024 at 7:45 PM Ahmed Hamdy  wrote:
>
> Hi Qingsheng,
>
> +1 (non-binding)
>
> - Verified checksums and hashes
> - Verified signatures
> - Verified github tag exists
> - Verified no binaries in source
> - build source
>
>
> Best Regards
> Ahmed Hamdy
>
>
> On Fri, 3 May 2024 at 23:03, Jeyhun Karimov  wrote:
>
> > Hi Qingsheng,
> >
> > Thanks for driving the release.
> > +1 (non-binding)
> >
> > - No binaries in source
> > - Verified Signatures
> > - Github tag exists
> > - Build source
> >
> > Regards,
> > Jeyhun
> >
> > On Thu, May 2, 2024 at 10:52 PM Muhammet Orazov
> >  wrote:
> >
> > > Hey Qingsheng,
> > >
> > > Thanks a lot! +1 (non-binding)
> > >
> > > - Checked sha512sum hash
> > > - Checked GPG signature
> > > - Reviewed release notes
> > > - Reviewed GitHub web pr (added minor suggestions)
> > > - Built the source with JDK 11 & 8
> > > - Checked that src doesn't contain binary files
> > >
> > > Best,
> > > Muhammet
> > >
> > > On 2024-04-30 05:11, Qingsheng Ren wrote:
> > > > Hi everyone,
> > > >
> > > > Please review and vote on the release candidate #1 for the version
> > > > 3.1.0 of
> > > > Apache Flink CDC, as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > **Release Overview**
> > > >
> > > > As an overview, the release consists of the following:
> > > > a) Flink CDC source release to be deployed to dist.apache.org
> > > > b) Maven artifacts to be deployed to the Maven Central Repository
> > > >
> > > > **Staging Areas to Review**
> > > >
> > > > The staging areas containing the above mentioned artifacts are as
> > > > follows,
> > > > for your review:
> > > > * All artifacts for a) can be found in the corresponding dev repository
> > > > at
> > > > dist.apache.org [1], which are signed with the key with fingerprint
> > > > A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> > > > * All artifacts for b) can be found at the Apache Nexus Repository [3]
> > > >
> > > > Other links for your review:
> > > > * JIRA release notes [4]
> > > > * Source code tag "release-3.1.0-rc1" with commit hash
> > > > 63b42cb937d481f558209ab3c8547959cf039643 [5]
> > > > * PR for release announcement blog post of Flink CDC 3.1.0 in flink-web
> > > > [6]
> > > >
> > > > **Vote Duration**
> > > >
> > > > The voting time will run for at least 72 hours, adopted by majority
> > > > approval with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Qingsheng Ren
> > > >
> > > > [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.1.0-rc1/
> > > > [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [3]
> > > > https://repository.apache.org/content/repositories/orgapacheflink-1731
> > > > [4]
> > > >
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> > > > [5] https://github.com/apache/flink-cdc/releases/tag/release-3.1.0-rc1
> > > > [6] https://github.com/apache/flink-web/pull/739
> > >
> >


Re: [VOTE] FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-04-18 Thread Jingsong Li
+1

On Thu, Apr 18, 2024 at 11:09 AM Yun Tang  wrote:
>
> +1 (binding)
>
> Best,
> Yun Tang
> 
> From: Jark Wu 
> Sent: Thursday, April 18, 2024 9:54
> To: dev@flink.apache.org 
> Subject: Re: [VOTE] FLIP-435: Introduce a New Materialized Table for 
> Simplifying Data Pipelines
>
> +1 (binding)
>
> Best,
> Jark
>
> On Wed, 17 Apr 2024 at 20:52, Leonard Xu  wrote:
>
> > +1(binding)
> >
> > Best,
> > Leonard
> >
> > > On Apr 17, 2024, at 20:31, Lincoln Lee wrote:
> > >
> > > +1(binding)
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Ferenc Csaky wrote on Wed, Apr 17, 2024 at 19:58:
> > >
> > >> +1 (non-binding)
> > >>
> > >> Best,
> > >> Ferenc
> > >>
> > >>
> > >>
> > >>
> > >> On Wednesday, April 17th, 2024 at 10:26, Ahmed Hamdy <
> > hamdy10...@gmail.com>
> > >> wrote:
> > >>
> > >>>
> > >>>
> > >>> + 1 (non-binding)
> > >>>
> > >>> Best Regards
> > >>> Ahmed Hamdy
> > >>>
> > >>>
> > >>> On Wed, 17 Apr 2024 at 08:28, Yuepeng Pan panyuep...@apache.org wrote:
> > >>>
> >  +1(non-binding).
> > 
> >  Best,
> >  Yuepeng Pan
> > 
> >  At 2024-04-17 14:27:27, "Ron liu" ron9@gmail.com wrote:
> > 
> > > Hi Dev,
> > >
> > > Thank you to everyone for the feedback on FLIP-435: Introduce a New
> > > Materialized Table for Simplifying Data Pipelines[1][2].
> > >
> > > I'd like to start a vote for it. The vote will be open for at least
> > >> 72
> > > hours unless there is an objection or not enough votes.
> > >
> > > [1]
> > 
> > 
> > >>
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
> > 
> > > [2] https://lists.apache.org/thread/c1gnn3bvbfs8v1trlf975t327s4rsffs
> > >
> > > Best,
> > > Ron
> > >>
> >
> >


Re: Re: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee

2024-04-14 Thread Jingsong Li
Congratulations

On Mon, Apr 15, 2024 at 9:32 AM Xuyang  wrote:
>
> Congratulations, Lincoln.
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2024-04-14 21:49:53, "Benchao Li" wrote:
> >Congratulations, Lincoln. Well deserved!
> >
> >Leonard Xu wrote on Sun, Apr 14, 2024 at 21:19:
> >>
> >> Congratulations, Lincoln~
> >>
> >> Best,
> >> Leonard
> >>
> >>
> >>
> >> > On Apr 12, 2024, at 16:40, Yuepeng Pan wrote:
> >> >
> >> > Congratulations, Lincoln!
> >> >
> >> > Best,Yuepeng Pan
> >> > At 2024-04-12 16:24:01, "Yun Tang"  wrote:
> >> >> Congratulations, Lincoln!
> >> >>
> >> >>
> >> >> Best
> >> >> Yun Tang
> >> >> 
> >> >> From: Jark Wu 
> >> >> Sent: Friday, April 12, 2024 15:59
> >> >> To: dev 
> >> >> Cc: Lincoln Lee 
> >> >> Subject: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee
> >> >>
> >> >> Hi everyone,
> >> >>
> >> >> On behalf of the PMC, I'm very happy to announce that Lincoln Lee has
> >> >> joined the Flink PMC!
> >> >>
> >> >> Lincoln has been an active member of the Apache Flink community for
> >> >> many years. He mainly works on Flink SQL component and has driven
> >> >> /pushed many FLIPs around SQL, including FLIP-282/373/415/435 in
> >> >> the recent versions. He has a great technical vision of Flink SQL and
> >> >> participated in plenty of discussions in the dev mailing list. Besides
> >> >> that,
> >> >> he is community-minded, such as being the release manager of 1.19,
> >> >> verifying releases, managing release syncs, writing the release
> >> >> announcement etc.
> >> >>
> >> >> Congratulations and welcome Lincoln!
> >> >>
> >> >> Best,
> >> >> Jark (on behalf of the Flink PMC)
> >>
> >
> >
> >--
> >
> >Best,
> >Benchao Li


Re: [ANNOUNCE] New Apache Flink PMC Member - Jing Ge

2024-04-14 Thread Jingsong Li
Congratulations

On Mon, Apr 15, 2024 at 9:32 AM Yanquan Lv  wrote:
>
> Congratulations, Jing!
>
> Jark Wu wrote on Fri, Apr 12, 2024 at 16:03:
>
> > Hi everyone,
> >
> > On behalf of the PMC, I'm very happy to announce that Jing Ge has
> > joined the Flink PMC!
> >
> > Jing has been contributing to Apache Flink for a long time. He continuously
> > works on the SQL, connector, Source and Sink API, testing, and documentation
> > modules while contributing lots of code and insightful discussions. He is
> > one of the maintainers of Flink CI infra. He is also willing to help a lot
> > in the
> > community work, such as being the release manager for both 1.18 and 1.19,
> > verifying releases, and answering questions on the mailing list. Besides
> > that,
> > he is continuously helping with the expansion of the Flink community and
> > has
> > given several talks about Flink at many conferences, such as Flink Forward
> > 2022 and 2023.
> >
> > Congratulations and welcome Jing!
> >
> > Best,
> > Jark (on behalf of the Flink PMC)
> >


Re: [DISCUSS] Flink Website Menu Adjustment

2024-03-25 Thread Jingsong Li
+1 for the proposal

On Mon, Mar 25, 2024 at 10:01 PM Ferenc Csaky
 wrote:
>
> The suggested changes make sense, +1 for the proposed menus and order.
>
> Best,
> Ferenc
>
>
>
>
> On Monday, March 25th, 2024 at 14:50, Gyula Fóra  wrote:
>
> >
> >
> > +1 for the proposal
> >
> > Gyula
> >
> > On Mon, Mar 25, 2024 at 12:49 PM Leonard Xu xbjt...@gmail.com wrote:
> >
> > > Thanks Zhongqiang for starting this discussion, updating documentation
> > > menus according to sub-projects' activities makes sense to me.
> > >
> > > +1 for the proposed menus:
> > >
> > > > After:
> > > >
> > > > With Flink
> > > > With Flink Kubernetes Operator
> > > > With Flink CDC
> > > > With Flink ML
> > > > With Flink Stateful Functions
> > > > Training Course
> > >
> > > Best,
> > > Leonard
> > >
> > > > On Mar 25, 2024, at 15:48, gongzhongqiang gongzhongqi...@apache.org wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > I'd like to start a discussion on adjusting the Flink website [1] menu 
> > > > to
> > > > improve accuracy and usability. While migrating the Flink CDC documentation
> > > > to the website, I found outdated links, and we need to review and update the
> > > > menus so they surface the most relevant information for our users.
> > > >
> > > > Proposal:
> > > >
> > > > - Remove Paimon [2] from the "Getting Started" and "Documentation" 
> > > > menus:
> > > > Paimon [2] is now an independent top-level project of the ASF. CC: Jingsong Lee
> > > >
> > > > - Sort the projects in the submenu by their activity.
> > > > Here I list the number of releases for each project in the past year.
> > > >
> > > > Flink Kubernetes Operator : 7
> > > > Flink CDC : 5
> > > > Flink ML : 2
> > > > Flink Stateful Functions : 1
> > > >
> > > > Expected Outcome :
> > > >
> > > > - Menu "Getting Started"
> > > >
> > > > Before:
> > > >
> > > > With Flink
> > > >
> > > > With Flink Stateful Functions
> > > >
> > > > With Flink ML
> > > >
> > > > With Flink Kubernetes Operator
> > > >
> > > > With Paimon(incubating) (formerly Flink Table Store)
> > > >
> > > > With Flink CDC
> > > >
> > > > Training Course
> > > >
> > > > After:
> > > >
> > > > With Flink
> > > > With Flink Kubernetes Operator
> > > >
> > > > With Flink CDC
> > > >
> > > > With Flink ML
> > > >
> > > > With Flink Stateful Functions
> > > >
> > > > Training Course
> > > >
> > > > - The "Documentation" menu will be the same as "Getting Started".
> > > >
> > > > I look forward to hearing your thoughts and suggestions on this 
> > > > proposal.
> > > >
> > > > [1] https://flink.apache.org/
> > > > [2] https://github.com/apache/incubator-paimon
> > > > [3] https://github.com/apache/flink-statefun
> > > >
> > > > Best regards,
> > > >
> > > > Zhongqiang Gong
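The ordering rule proposed above — sub-projects sorted by their release activity over the past year, with "With Flink" kept first and "Training Course" kept last — can be sketched as follows. This is only an illustration using the release counts quoted in the email, not part of any actual website build tooling:

```python
# Release counts for the past year, as quoted in the email.
releases_last_year = {
    "With Flink Kubernetes Operator": 7,
    "With Flink CDC": 5,
    "With Flink ML": 2,
    "With Flink Stateful Functions": 1,
}

# "With Flink" stays first and "Training Course" stays last; the
# sub-projects in between are ordered by descending release count.
menu = (
    ["With Flink"]
    + sorted(releases_last_year, key=releases_last_year.get, reverse=True)
    + ["Training Course"]
)
print(menu)
```

Running this reproduces exactly the "After" menu order listed in the proposal.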


Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Jingsong Li
Congratulations!

On Mon, Mar 18, 2024 at 4:30 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> Congratulations, thanks for the great work!
>
> Best,
> Rui
>
> On Mon, Mar 18, 2024 at 4:26 PM Lincoln Lee  wrote:
>>
>> The Apache Flink community is very happy to announce the release of Apache 
>> Flink 1.19.0, which is the first release of the Apache Flink 1.19 series.
>>
>> Apache Flink® is an open-source stream processing framework for distributed, 
>> high-performing, always-available, and accurate data streaming applications.
>>
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>>
>> Please check out the release blog post for an overview of the improvements 
>> in this release:
>> https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
>>
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
>>
>> We would like to thank all contributors of the Apache Flink community who 
>> made this release possible!
>>
>>
>> Best,
>> Yun, Jing, Martijn and Lincoln


Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-13 Thread Jingsong Li
+1 for this.

We are missing a series of catalog-related syntaxes, especially
after the introduction of the catalog store [1].

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations

Best,
Jingsong

On Wed, Mar 13, 2024 at 5:09 PM Yubin Li  wrote:
>
> Hi devs,
>
> I'd like to start a discussion about FLIP-436: Introduce "SHOW CREATE
> CATALOG" Syntax [1].
>
> At present, the `SHOW CREATE TABLE` statement makes it easy for users to
> reuse created tables. However, despite the increasing importance of the
> `Catalog` in users' businesses, there is no similar statement for catalogs.
>
> According to the online discussion in FLINK-24939 [2] with Jark Wu and Feng
> Jin, since `CatalogStore` has been introduced in FLIP-295 [3], we could use
> this component to implement this long-awaited feature. Please refer to the
> document [1] for implementation details.
>
> An example follows:
>
> Flink SQL> create catalog cat2 WITH ('type'='generic_in_memory',
> > 'default-database'='db');
> > [INFO] Execute statement succeeded.
> > Flink SQL> show create catalog cat2;
> >
> > +---------------------------------+
> > |                          result |
> > +---------------------------------+
> > | CREATE CATALOG `cat2` WITH (
> >   'default-database' = 'db',
> >   'type' = 'generic_in_memory'
> > )
> >  |
> > +---------------------------------+
> > 1 row in set
>
>
>
> Looking forward to hearing from you, thanks!
>
> Best regards,
> Yubin
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> [2] https://issues.apache.org/jira/browse/FLINK-24939
> [3]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
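For illustration, the DDL shown in the example result above can be reconstructed from the catalog options persisted in the `CatalogStore`. The sketch below is a simplified Python illustration of that reconstruction step; it is an assumption-laden toy, not Flink's actual Java implementation, which must also handle quoting, escaping, and masking of sensitive options:

```python
def show_create_catalog(name, options):
    """Render a CREATE CATALOG DDL from stored catalog options.

    Simplified sketch only: real SHOW CREATE CATALOG support would read
    the options from the CatalogStore and handle escaping and masking.
    """
    props = ",\n".join(
        f"  '{k}' = '{v}'" for k, v in sorted(options.items())
    )
    return f"CREATE CATALOG `{name}` WITH (\n{props}\n)"

ddl = show_create_catalog(
    "cat2", {"type": "generic_in_memory", "default-database": "db"}
)
print(ddl)
```

With the options from the example above, this prints the same DDL body shown in the result table.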


Re: [DISCUSS] Add "Special Thanks" Page on the Flink Website

2024-03-05 Thread Jingsong Li
+1 for setting up

On Tue, Mar 5, 2024 at 5:39 PM Jing Ge  wrote:
>
> +1 and thanks for the proposal!
>
> Best regards,
> Jing
>
> On Tue, Mar 5, 2024 at 10:26 AM tison  wrote:
>
> > I like this idea, so +1 for setting up.
> >
> > For anyone who has access, here is a related thread about
> > project-wide sponsorship at the foundation level [1].
> >
> > Best,
> > tison.
> >
> > [1] https://lists.apache.org/thread/2nv0x9gfk9lfnpb2315xgywyx84y97v6
> >
> > On Tue, Mar 5, 2024 at 17:17, Jark Wu wrote:
> > >
> > > Sorry, I posted the wrong [7] link. The Flink benchmark ML link is:
> > > https://lists.apache.org/thread/bkw6ozoflgltwfwmzjtgx522hyssfko6
> > >
> > >
> > > On Tue, 5 Mar 2024 at 16:56, Jark Wu  wrote:
> > >
> > > > Hi all,
> > > >
> > > >
> > > > I want to propose adding a "Special Thanks" page to our Apache Flink
> > > > website [1] to honor and appreciate the companies and organizations
> > > > that have sponsored machines or services for our project. The
> > > > establishment of such a page serves as a public acknowledgment of our
> > > > sponsors' contributions and simultaneously acts as a positive
> > > > encouragement for other entities to consider supporting our project.
> > > >
> > > > Adding Per-Project Thanks pages is allowed by ASF policy [2], which
> > > > says "PMCs may wish to provide recognition for third parties that
> > > > provide software or services to the project's committers to further
> > > > the goals of the project. These are typically called Per-Project
> > > > Thanks pages". Many Apache projects have added such pages, for
> > > > example, Apache HBase [3] and Apache Mina [4].
> > > >
> > > > To initiate this idea, I have drafted a preliminary page under the
> > > > "About" menu on the Flink website to specifically thank Alibaba and
> > > > Ververica, following the ASF guidelines and the Apache Mina project.
> > > >
> > > > Page image:
> > > > https://github.com/apache/flink/assets/5378924/e51aaffe-565e-46d1-90af-3900904afcc0
> > > >
> > > > The companies below are on the thanks list for their donations to the
> > > > Flink testing infrastructure:
> > > >
> > > > - Alibaba donated 8 machines (32 vCPU, 64 GB) for running Flink CI
> > > > builds [5].
> > > > - Ververica donated 2 machines for hosting the flink-ci repositories
> > > > [6] and running Flink benchmarks [7].
> > > >
> > > > I may have missed some other donations or companies; please add them
> > > > if you know of any.
> > > >
> > > > Looking forward to your feedback about this proposal!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Jark
> > > >
> > > >
> > > > [1]: https://flink.apache.org/
> > > >
> > > > [2]: https://www.apache.org/foundation/marks/linking#projectthanks
> > > >
> > > > [3]: https://hbase.apache.org/sponsors.html
> > > >
> > > > [4]: https://mina.apache.org/special-thanks.html
> > > >
> > > > [5]: https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-AvailableCustomBuildMachines
> > > >
> > > > [6]: https://cwiki.apache.org/confluence/display/FLINK/Continuous+Integration
> > > >
> > > > [7]: https://lists.apache.org/thread.html/41a68c775753a7841896690c75438e0a497634102e676db880f30225@%3Cdev.flink.apache.org%3E
> > > >
> >


Re: [ANNOUNCE] Flink 1.19 Cross-team testing completed & sync summary on 02/27/2024

2024-02-27 Thread Jingsong Li
Thanks Lincoln and all contributors!

Best,
Jingsong

On Wed, Feb 28, 2024 at 12:15 PM Xintong Song  wrote:
>
> Matthias has already said it on the release sync, but I still want to say
> it again. It's amazing how smoothly the release testing went for this
> release. Great thanks to the release managers and all contributors who made
> this happen.
>
> Best,
>
> Xintong
>
>
>
> On Wed, Feb 28, 2024 at 10:10 AM Lincoln Lee  wrote:
>
> > Hi devs,
> >
> > I'd like to share some highlights from the release sync on 02/27/2024
> >
> > - Cross-team testing
> >
> > We've finished all of the testing work[1]. Huge thanks to all contributors
> > and volunteers for the effort on this!
> >
> > - Blockers
> >
> > Two API-change pull requests [2][3] were discussed. There was agreement on
> > the second PR, as it fixes an unintended behavior newly introduced in 1.19
> > that we need to avoid releasing to users. For the first PR, we suggest
> > continuing the discussion separately.
> > So we will wait for [3] to be done and then create the first release
> > candidate, 1.19.0-rc1 (expected within this week if there are no new
> > blockers).
> >
> > - Release notes
> >
> > The revision of the draft release notes [4] has been closed, and the
> > formal PR [5] has been submitted. The release announcement PR will also be
> > ready later this week. Please continue to help review before the 1.19
> > release, thanks!
> >
> > - Sync meeting (https://meet.google.com/vcx-arzs-trv)
> >
> > We've already switched to weekly release sync, so the next release sync
> > will be on Mar 5th, 2024. Feel free to join!
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-34285
> > [2] https://lists.apache.org/thread/2llhhbkcx5w7chp3d6cthoqc8kwfvw6x
> > [3]
> > https://github.com/apache/flink/pull/24387#pullrequestreview-1902749309
> > [4]
> > https://docs.google.com/document/d/1HLF4Nhvkln4zALKJdwRErCnPzufh7Z3BhhkWlk9Zh7w
> > [5] https://github.com/apache/flink/pull/24394
> >
> > Best,
> > Yun, Jing, Martijn and Lincoln
> >


Re: [ANNOUNCE] New Apache Flink Committer - Jiabao Sun

2024-02-21 Thread Jingsong Li
Congratulations! Well deserved!

On Wed, Feb 21, 2024 at 4:36 PM Yuepeng Pan  wrote:
>
> Congratulations~ :)
>
> Best,
> Yuepeng Pan
>
>
>
>
>
>
>
>
>
>
> On 2024-02-21 09:52:17, "Hongshun Wang"  wrote:
> >Congratulations, Jiabao :)
> >
> >Best,
> >Hongshun
> >
> >Congratulations Jiabao!
> >
> >Best regards,
> >
> >Weijie
> >
> >On Tue, Feb 20, 2024 at 2:19 PM Runkang He  wrote:
> >
> >> Congratulations Jiabao!
> >>
> >> Best,
> >> Runkang He
> >>
> >> > On Tue, Feb 20, 2024 at 14:18, Jane Chan wrote:
> >>
> >> > Congrats, Jiabao!
> >> >
> >> > Best,
> >> > Jane
> >> >
> >> > On Tue, Feb 20, 2024 at 10:32 AM Paul Lam  wrote:
> >> >
> >> > > Congrats, Jiabao!
> >> > >
> >> > > Best,
> >> > > Paul Lam
> >> > >
> >> > > > On Feb 20, 2024 at 10:29, Zakelly Lan wrote:
> >> > > >
> >> > > >> Congrats! Jiabao!
> >> > >
> >> > >
> >> >
> >>


Re: [DISCUSS] FLIP-415: Introduce a new join operator to support minibatch

2024-01-11 Thread Jingsong Li
Hi all,

This is a relatively large optimization that may pose a significant
risk of bugs, so I'd like to keep it disabled by default for now.

Best,
Jingsong

On Fri, Jan 12, 2024 at 3:01 PM shuai xu  wrote:
>
> Suppose we currently have a job that joins two CDC sources after
> de-duplicating them, and the output is used for audit analysis; the user has
> turned off the option
> "table.exec.deduplicate.mini-batch.compact-changes-enabled" to ensure that it
> does not lose update details. If we don't introduce this option, then after
> the user upgrades the version, some update details may be lost because
> mini-batch join would be enabled by default, resulting in distorted audit
> results.
>
> > On Jan 11, 2024 at 16:19, Benchao Li wrote:
> >
> >> the change might not be expected by the downstream of the job, which
> >> requires the details of the changelog
> >
> > Could you elaborate on this a bit? I've never encountered such a
> > requirement before; I'm curious what scenario requires this.
> >
> > On Thu, Jan 11, 2024 at 13:08, shuai xu wrote:
> >>
> >> Thanks for your response, Benchao.
> >>
> >> Here is my thought on the newly added option.
> >> Users' current jobs are running on a version without mini-batch join. If
> >> the existing option to enable mini-batch join were reused, then when
> >> users' jobs are migrated to the new version, the internal behavior of the
> >> join operation within those jobs would change. Although the semantic of
> >> the changelog emitted by the Join operator is eventual consistency, the
> >> change might not be expected by the downstream of the job, which requires
> >> the details of the changelog. This newly added option also follows the
> >> precedent of 'table.exec.deduplicate.mini-batch.compact-changes-enabled'.
> >>
> >> As for the implementation, the new operator shares the state of the
> >> original operator; it merely adds a mini-batch buffer for storing records
> >> to perform the optimization. The state layout remains consistent, and only
> >> minor modifications to the computational logic are needed.
> >>
> >> Best,
> >> Xu Shuai
> >>
> >>> On Jan 10, 2024 at 22:56, Benchao Li wrote:
> >>>
> >>> Thanks shuai for driving this, mini-batch Join is a very useful
> >>> optimization, +1 for the general idea.
> >>>
> >>> Regarding the configuration
> >>> "table.exec.stream.join.mini-batch-enabled", I'm not sure it's really
> >>> necessary. The semantic of the changelog emitted by the Join operator is
> >>> eventual consistency, so there is not much difference between the
> >>> original Join and mini-batch Join in this respect. Besides, introducing more
> >>> options would make it more complex for users, harder to understand and
> >>> maintain, which we should be careful about.
> >>>
> >>> One thing about the implementation, could you make the new operator
> >>> share the same state definition with the original one?
> >>>
> >>> On Wed, Jan 10, 2024 at 21:23, shuai xu wrote:
> 
>  Hi devs,
> 
>  I’d like to start a discussion on FLIP-415: Introduce a new join 
>  operator to support minibatch[1].
> 
>  Currently, when performing cascading joins in Flink, there is a pain
>  point of record amplification. Every record the join operator receives
>  triggers the join process. However, if a +I record and a -D record
>  match, they can be folded together, saving two join processes. Besides,
>  a -U/+U pair might output 4 records, two of which are redundant, in the
>  case of an outer join.
> 
>  To address this issue, this FLIP introduces a new
>  MiniBatchStreamingJoinOperator that processes records in batches, which
>  reduces the number of redundant output messages and avoids unnecessary
>  join processing.
>  A new option is added to control the operator so that existing jobs are
>  not affected.
> 
>  Please find more details in the FLIP wiki document [1]. Looking
>  forward to your feedback.
> 
>  [1]
>  https://cwiki.apache.org/confluence/display/FLINK/FLIP-415%3A+Introduce+a+new+join+operator+to+support+minibatch
> 
>  Best,
>  Xu Shuai
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Best,
> >>> Benchao Li
> >>
> >
> >
> > --
> >
> > Best,
> > Benchao Li
>
> Best,
> Xu Shuai
>
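The folding idea discussed in this thread — a matching +I (insert) / -D (delete) pair within one mini-batch cancels out before it ever triggers the join — can be illustrated with a small sketch. The record encoding and folding rule below are simplifying assumptions for illustration only, not the actual MiniBatchStreamingJoinOperator logic:

```python
# Toy illustration of mini-batch changelog folding: a +I followed by a
# -D of the same key/value within one batch cancels out, so neither
# record needs to trigger the downstream join process.
def fold_minibatch(records):
    buffer = []
    for kind, key, value in records:
        if kind == "-D" and ("+I", key, value) in buffer:
            # Fold the +I/-D pair away instead of emitting both.
            buffer.remove(("+I", key, value))
        else:
            buffer.append((kind, key, value))
    return buffer

batch = [("+I", "k1", "a"), ("+I", "k2", "b"), ("-D", "k1", "a")]
print(fold_minibatch(batch))  # only ("+I", "k2", "b") survives
```

Of three input records, only one reaches the join, which is the amplification reduction the FLIP describes; the real operator additionally handles -U/+U pairs and state access.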


Re: Re: [VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-09 Thread Jingsong Li
+1

On Wed, Jan 10, 2024 at 11:24 AM Xuyang  wrote:
>
> +1 (non-binding)
>
> Best!
> Xuyang
>
>
>
>
>
> On 2024-01-08 00:34:55, "Feng Jin"  wrote:
> >Hi Alexey
> >
> >Thank you for the reminder, the link has been updated.
> >
> >Best,
> >Feng Jin
> >
> >On Sat, Jan 6, 2024 at 12:55 AM Alexey Leonov-Vendrovskiy <
> >vendrov...@gmail.com> wrote:
> >
> >> Thanks for starting the vote!
> >> Do you mind adding a link from the FLIP to this thread?
> >>
> >> Thanks,
> >> Alexey
> >>
> >> On Thu, Jan 4, 2024 at 6:48 PM Feng Jin  wrote:
> >>
> >> > Hi everyone
> >> >
> >> > Thanks for all the feedback about the FLIP-387: Support named parameters
> >> > for functions and call procedures [1] [2] .
> >> >
> >> > I'd like to start a vote for it. The vote will be open for at least 72
> >> > hours (excluding weekends, until Jan 10, 12:00 AM GMT) unless there is an
> >> > objection or an insufficient number of votes.
> >> >
> >> >
> >> >
> >> > [1]
> >> >
> >> >
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> >> > [2] https://lists.apache.org/thread/bto7mpjvcx7d7k86owb00dwrm65jx8cn
> >> >
> >> >
> >> > Best,
> >> > Feng Jin
> >> >
> >>


Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-08 Thread Jingsong Li
+1

On Tue, Jan 9, 2024 at 3:23 PM Yuan Mei  wrote:
>
> +1 binding
>
>
>
> On Tue, Jan 9, 2024 at 3:21 PM Yuan Mei  wrote:
>
> > +1
> >
> > Best,
> > Yuan
> >
> > On Tue, Jan 9, 2024 at 3:06 PM tison  wrote:
> >
> >> +1 non-binding
> >>
> >> Best,
> >> tison.
> >>
> >> On Tue, Jan 9, 2024 at 15:05, Leonard Xu wrote:
> >> >
> >> > Hello all,
> >> >
> >> > This is the official vote on whether to accept the Flink CDC code
> >> > contribution to Apache Flink.
> >> >
> >> > The current Flink CDC code, documentation, and website can be
> >> > found here:
> >> > code: https://github.com/ververica/flink-cdc-connectors
> >> > docs: https://ververica.github.io/flink-cdc-connectors/
> >> >
> >> > This vote should capture whether the Apache Flink community is
> >> interested
> >> > in accepting, maintaining, and evolving Flink CDC.
> >> >
> >> > Regarding my original proposal[1] in the dev mailing list, I firmly
> >> believe
> >> > that this initiative aligns perfectly with Flink. For the Flink
> >> community,
> >> > it represents an opportunity to bolster Flink's competitive edge in
> >> streaming
> >> > data integration, fostering the robust growth and prosperity of the
> >> Apache Flink
> >> > ecosystem. For the Flink CDC project, becoming a sub-project of Apache
> >> Flink
> >> > means becoming an integral part of a neutral open-source community,
> >> capable of
> >> > attracting a more diverse pool of contributors.
> >> >
> >> > All Flink CDC maintainers are dedicated to continuously contributing to
> >> achieve
> >> > seamless integration with Flink. Additionally, PMC members like Jark,
> >> Qingsheng,
> >> > and I are willing to facilitate the expansion of contributors and
> >> committers to
> >> > effectively maintain this new sub-project.
> >> >
> >> > This is an "Adoption of a new Codebase" vote as per the Flink bylaws [2].
> >> > Only PMC votes are binding. The vote will be open at least 7 days
> >> > (excluding weekends), meaning until Thursday January 18 12:00 UTC, or
> >> until we
> >> > achieve the 2/3rd majority. We will follow the instructions in the
> >> Flink Bylaws
> >> > in the case of insufficient active binding voters:
> >> >
> >> > > 1. Wait until the minimum length of the voting passes.
> >> > > 2. Publicly reach out via personal email to the remaining binding
> >> voters in the
> >> > voting mail thread for at least 2 attempts with at least 7 days between
> >> two attempts.
> >> > > 3. If the binding voter being contacted still failed to respond after
> >> all the attempts,
> >> > the binding voter will be considered as inactive for the purpose of
> >> this particular voting.
> >> >
> >> > Welcome voting !
> >> >
> >> > Best,
> >> > Leonard
> >> > [1] https://lists.apache.org/thread/o7klnbsotmmql999bnwmdgo56b6kxx9l
> >> > [2]
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> >>
> >


Re: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-02 Thread Jingsong Li
Congratulations!

Best,
Jingsong

On Wed, Jan 3, 2024 at 10:28 AM Benchao Li  wrote:
>
> Congratulations, Alex!
>
> On Wed, Jan 3, 2024 at 10:10, Yuepeng Pan wrote:
> >
> > Congrats, Alex!
> >
> > Best,
> > Yuepeng Pan
> > At 2024-01-02 20:15:08, "Maximilian Michels"  wrote:
> > >Happy New Year everyone,
> > >
> > >I'd like to start the year off by announcing Alexander Fedulov as a
> > >new Flink committer.
> > >
> > >Alex has been active in the Flink community since 2019. He has
> > >contributed more than 100 commits to Flink, its Kubernetes operator,
> > >and various connectors [1][2].
> > >
> > >Especially noteworthy are his contributions on deprecating and
> > >migrating the old Source API functions and test harnesses, the
> > >enhancement to flame graphs, the dynamic rescale time computation in
> > >Flink Autoscaling, as well as all the small enhancements Alex has
> > >contributed which make a huge difference.
> > >
> > >Beyond code contributions, Alex has been an active community member
> > >with his activity on the mailing lists [3][4], as well as various
> > >talks and blog posts about Apache Flink [5][6].
> > >
> > >Congratulations Alex! The Flink community is proud to have you.
> > >
> > >Best,
> > >The Flink PMC
> > >
> > >[1] https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> > >[2] 
> > >https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> > >[3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> > >[4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> > >[5] 
> > >https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> > >[6] 
> > >https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
>
>
>
> --
>
> Best,
> Benchao Li


Re: [PROPOSAL] Contribute Flink CDC Connectors project to Apache Flink

2023-12-06 Thread Jingsong Li
Wow, Cool, Nice

CDC is playing an increasingly important role.

+1

Best,
Jingsong

On Thu, Dec 7, 2023 at 11:25 AM Leonard Xu  wrote:
>
> Dear Flink devs,
>
> As you may have heard, we at Alibaba (Ververica) are planning to donate CDC 
> Connectors for the Apache Flink project[1] to the Apache Flink community.
>
> CDC Connectors for Apache Flink comprise a collection of source connectors 
> designed specifically for Apache Flink. These connectors[2] enable the 
> ingestion of changes from various databases using Change Data Capture (CDC), 
> most of these CDC connectors are powered by Debezium[3]. They support both 
> the DataStream API and the Table/SQL API, facilitating the reading of 
> database snapshots and continuous reading of transaction logs with 
> exactly-once processing, even in the event of failures.
>
>
> Additionally, in the latest version 3.0, we have introduced many long-awaited 
> features. Starting from CDC version 3.0, we've built a Streaming ELT 
> Framework available for streaming data integration. This framework allows 
> users to write their data synchronization logic in a simple YAML file, which 
> will automatically be translated into a Flink DataStream job. It 
> emphasizes optimizing the task submission process and offers advanced 
> functionalities such as whole database synchronization, merging sharded 
> tables, and schema evolution[4].
>
>
> I believe this initiative is a perfect match for both sides. For the Flink 
> community, it presents an opportunity to enhance Flink's competitive 
> advantage in streaming data integration, promoting the healthy growth and 
> prosperity of the Apache Flink ecosystem. For the CDC Connectors project, 
> becoming a sub-project of Apache Flink means being part of a neutral 
> open-source community, which can attract a more diverse pool of contributors.
>
> Please note that the aforementioned points represent only some of our 
> motivations and vision for this donation. Specific future operations need to 
> be further discussed in this thread. For example, the sub-project name after 
> the donation; we hope to name it Flink-CDC, aiming at streaming data 
> integration through Apache Flink, following the naming convention of 
> Flink-ML. This project is managed by a total of 8 maintainers, including 
> 3 Flink PMC members and 1 Flink Committer. The remaining 4 maintainers are 
> also highly active contributors to the Flink community; donating this project 
> to the Flink community implies that their permissions might be reduced. 
> Therefore, we may need to bring up this topic for further discussion within 
> the Flink PMC. Additionally, we need to discuss how to migrate existing users 
> and documents. We have a user group of nearly 10,000 people and a 
> multi-version documentation site that need to be migrated. We also need to plan for 
> the migration of CI/CD processes and other specifics.
>
>
> While there are many intricate details that require implementation, we are 
> committed to progressing and finalizing this donation process.
>
>
> Besides being Flink's most active ecosystem project (as evaluated by GitHub 
> metrics), it also boasts a significant user base. However, I believe it's 
> essential to commence discussions on future operations only after the 
> community reaches a consensus on whether they desire this donation.
>
>
> Really looking forward to hear what you think!
>
>
> Best,
> Leonard (on behalf of the Flink CDC Connectors project maintainers)
>
> [1] https://github.com/ververica/flink-cdc-connectors
> [2] 
> https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-connectors.html
> [3] https://debezium.io
> [4] 
> https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-pipeline.html


Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-12-04 Thread Jingsong Li
+1 binding

On Mon, Dec 4, 2023 at 10:33 PM Etienne Chauchot  wrote:
>
> Correct,
>
> I forgot that in the bylaws, committer vote is binding for FLIPs thanks
> for the reminder.
>
> Best
>
> Etienne
>
> Le 30/11/2023 à 10:43, Leonard Xu a écrit :
> > +1(binding)
> >
> > Btw, @Etienne, IIRC, your vote should be a binding one.
> >
> >
> > Best,
> > Leonard
> >
> >> On Nov 30, 2023 at 17:03, Etienne Chauchot wrote:
> >>
> >> +1 (non-binding)
> >>
> >> Etienne
> >>
> >> Le 30/11/2023 à 09:13, Rui Fan a écrit :
> >>> +1(binding)
> >>>
> >>> Best,
> >>> Rui
> >>>
> >>> On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang   
> >>> wrote:
> >>>
>  +1 (binding)
> 
>  Best,
>  Lijie
> 
> On Thu, Nov 30, 2023 at 13:13, Zhu Zhu wrote:
> 
> > +1
> >
> > Thanks,
> > Zhu
> >
> On Thu, Nov 30, 2023 at 11:41, Xia Sun wrote:
> >
> >> Hi everyone,
> >>
> >> I'd like to start a vote on FLIP-379: Dynamic source parallelism
> > inference
> >> for batch jobs[1] which has been discussed in this thread [2].
> >>
> >> The vote will be open for at least 72 hours unless there is an
>  objection
> > or
> >> not enough votes.
> >>
> >>
> >> [1]
> >>
> >>
>  https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
> >> [2]https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8
> >>
> >>
> >> Best Regards,
> >> Xia


Re: [DISCUSS] Contribute Flink Doris Connector to the Flink community

2023-11-26 Thread Jingsong Li
+1 for this

On Mon, Nov 27, 2023 at 10:26 AM Yun Tang  wrote:
>
> Hi, Di.Wu
>
> Thanks for creating this discussion. The Apache Doris community might have 
> the most active contributors in ASF this year, and since I also contributed 
> to doris-flink-connector before, I'm very glad to see this contribution 
> bring the two communities closer together.
>
> I'm +1 for this proposal and please create a FLIP like [1] to kick off the 
> discussion officially.
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>
> Best
> Yun Tang
> 
> From: wudi <676366...@qq.com.INVALID>
> Sent: Sunday, November 26, 2023 18:22
> To: dev@flink.apache.org 
> Subject: [DISCUSS] Contribute Flink Doris Connector to the Flink community
>
> Hi all,
>
> At present, the Flink connectors have been decoupled from Flink's
> repository [1]. Meanwhile, the Flink-Doris-Connector [3] has been maintained
> within the Apache Doris [2] community.
> I think the Flink Doris Connector can be migrated to the Flink community,
> because it is part of the Flink connectors and can also expand the Flink
> connector ecosystem.
>
> I volunteer to move this forward if I can.
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
> [2] https://doris.apache.org/
> [3] https://github.com/apache/doris-flink-connector
>
> --
>
> Brs,
> di.wu


Re: Re:Re: [DISCUSS] Release 1.17.2

2023-11-07 Thread Jingsong Li
+1 thanks Yun

1.17.2 is really important.

Best,
Jingsong

On Tue, Nov 7, 2023 at 9:32 AM Danny Cranmer  wrote:
>
> +1, thanks for picking this up.
>
> I am happy to help out with the bits you need PMC support for.
>
> Thanks,
> Danny
>
> On Tue, Nov 7, 2023 at 4:03 AM Yun Tang  wrote:
>
> >  Hi @casel.chen
> >
> > It seems FLINK-33365 is more related to the JDBC connector than to the
> > Flink runtime, and this discussion will focus on the Flink 1.17.2 release.
> >
> >
> > Best
> > Yun Tang
> > 
> > From: casel.chen 
> > Sent: Tuesday, November 7, 2023 16:04
> > To: dev@flink.apache.org 
> > Cc: rui fan <1996fan...@gmail.com>; yuchen.e...@gmail.com <
> > yuchen.e...@gmail.com>
> > Subject: Re:Re: [DISCUSS] Release 1.17.2
> >
> >
> >
> >
> >
> >
> >
> > https://issues.apache.org/jira/browse/FLINK-33365 fixed or not in release
> > 1.17.2?
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2023-11-07 09:47:29, "liu ron"  wrote:
> > >+1
> > >
> > >Best,
> > >Ron
> > >
> > >On Mon, Nov 6, 2023 at 22:07, Jing Ge wrote:
> > >
> > >> +1
> > >> Thanks for your effort!
> > >>
> > >> Best regards,
> > >> Jing
> > >>
> > >> On Mon, Nov 6, 2023 at 1:15 AM Konstantin Knauf 
> > wrote:
> > >>
> > >> > Thank you for picking it up! +1
> > >> >
> > >> > Cheers,
> > >> >
> > >> > Konstantin
> > >> >
> > >> > On Mon, Nov 6, 2023 at 03:48, Yun Tang wrote:
> > >> >
> > >> > > Hi all,
> > >> > >
> > >> > > I would like to discuss creating a new 1.17 patch release (1.17.2).
> > >> > > The last 1.17 release is nearly half a year old, and since then, 79
> > >> > > tickets have been closed [1], of which 15 are blocker/critical [2].
> > >> > > Some of them are quite important, such as FLINK-32758 [3],
> > >> > > FLINK-32296 [4], FLINK-32548 [5], and FLINK-33010 [6].
> > >> > >
> > >> > > In addition, FLINK-33149 [7] is important for bumping snappy-java
> > >> > > to 1.1.10.4. Although FLINK-33149 itself is still unresolved, the
> > >> > > bump has already been done for 1.17.2.
> > >> > >
> > >> > > I am not aware of any unresolved blockers, and there are no
> > >> > > in-progress tickets [8]. Please let me know if there are any issues
> > >> > > you'd like to be included in this release that are still not merged.
> > >> > >
> > >> > > If the community agrees to create this new patch release, I could
> > >> > > volunteer as the release manager with Yu Chen.
> > >> > >
> > >> > > Since there will be another flink-1.16.3 release request during the
> > >> > > same period, we will work with Rui Fan, since many issues will be
> > >> > > fixed in both releases.
> > >> > >
> > >> > > [1]
> > >> > >
> > >> >
> > >>
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.2%20%20and%20resolution%20%20!%3D%20%20Unresolved%20order%20by%20priority%20DESC
> > >> > > [2]
> > >> > >
> > >> >
> > >>
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.2%20and%20resolution%20%20!%3D%20Unresolved%20%20and%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20by%20priority%20%20DESC
> > >> > > [3] https://issues.apache.org/jira/browse/FLINK-32758
> > >> > > [4] https://issues.apache.org/jira/browse/FLINK-32296
> > >> > > [5] https://issues.apache.org/jira/browse/FLINK-32548
> > >> > > [6] https://issues.apache.org/jira/browse/FLINK-33010
> > >> > > [7] https://issues.apache.org/jira/browse/FLINK-33149
> > >> > > [8] https://issues.apache.org/jira/projects/FLINK/versions/12353260
> > >> > >
> > >> > > Best
> > >> > > Yun Tang
> > >> > >
> > >> >
> > >> >
> > >> > --
> > >> > https://twitter.com/snntrable
> > >> > https://github.com/knaufk
> > >> >
> > >>
> >


Re: [VOTE] FLIP-376: Add DISTRIBUTED BY clause

2023-11-07 Thread Jingsong Li
+1

On Tue, Nov 7, 2023 at 5:56 AM Jim Hughes  wrote:
>
> Hi all,
>
> +1 (non-binding)
>
> Cheers,
>
> Jim
>
> On Mon, Nov 6, 2023 at 6:39 AM Timo Walther  wrote:
>
> > Hi everyone,
> >
> > I'd like to start a vote on FLIP-376: Add DISTRIBUTED BY clause[1] which
> > has been discussed in this thread [2].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> > or not enough votes.
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-376%3A+Add+DISTRIBUTED+BY+clause
> > [2] https://lists.apache.org/thread/z1p4f38x9ohv29y4nlq8g1wdxwfjwxd1
> >
> > Cheers,
> > Timo
> >


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Jingsong Li
Congratulations!

Thanks Jing and other release managers and all contributors.

Best,
Jingsong

On Fri, Oct 27, 2023 at 1:52 PM Zakelly Lan  wrote:
>
> Congratulations and thank you all!
>
>
> Best,
> Zakelly
>
> On Fri, Oct 27, 2023 at 12:39 PM Jark Wu  wrote:
> >
> > Congratulations and thanks release managers and everyone who has
> > contributed!
> >
> > Best,
> > Jark
> >
> > On Fri, 27 Oct 2023 at 12:25, Hang Ruan  wrote:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Hang
> > >
> > > > On Fri, Oct 27, 2023 at 11:50, Samrat Deb  wrote:
> > >
> > > > Congratulations on the great release
> > > >
> > > > Bests,
> > > > Samrat
> > > >
> > > > On Fri, 27 Oct 2023 at 7:59 AM, Yangze Guo  wrote:
> > > >
> > > > > Great work! Congratulations to everyone involved!
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Oct 27, 2023 at 10:23 AM Qingsheng Ren 
> > > wrote:
> > > > > >
> > > > > > Congratulations and big THANK YOU to everyone helping with this
> > > > release!
> > > > > >
> > > > > > Best,
> > > > > > Qingsheng
> > > > > >
> > > > > > On Fri, Oct 27, 2023 at 10:18 AM Benchao Li 
> > > > > wrote:
> > > > > >>
> > > > > >> Great work, thanks everyone involved!
> > > > > >>
> > > > > >> > On Fri, Oct 27, 2023 at 10:16, Rui Fan <1996fan...@gmail.com> wrote:
> > > > > >> >
> > > > > >> > Thanks for the great work!
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Rui
> > > > > >> >
> > > > > >> > On Fri, Oct 27, 2023 at 10:03 AM Paul Lam 
> > > > > wrote:
> > > > > >> >
> > > > > >> > > Finally! Thanks to all!
> > > > > >> > >
> > > > > >> > > Best,
> > > > > >> > > Paul Lam
> > > > > >> > >
> > > > > >> > > > On Oct 27, 2023, at 03:58, Alexander Fedulov <alexander.fedu...@gmail.com> wrote:
> > > > > >> > > >
> > > > > >> > > > Great work, thanks everyone!
> > > > > >> > > >
> > > > > >> > > > Best,
> > > > > >> > > > Alexander
> > > > > >> > > >
> > > > > >> > > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser <
> > > > > martijnvis...@apache.org>
> > > > > >> > > > wrote:
> > > > > >> > > >
> > > > > >> > > >> Thank you all who have contributed!
> > > > > >> > > >>
> > > > > >> > > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin <jinfeng1...@gmail.com> wrote:
> > > > > >> > > >>
> > > > > >> > > >>> Thanks for the great work! Congratulations
> > > > > >> > > >>>
> > > > > >> > > >>>
> > > > > >> > > >>> Best,
> > > > > >> > > >>> Feng Jin
> > > > > >> > > >>>
> > > > > >> > > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu <
> > > > xbjt...@gmail.com>
> > > > > wrote:
> > > > > >> > > >>>
> > > > > >> > >  Congratulations, Well done!
> > > > > >> > > 
> > > > > >> > >  Best,
> > > > > >> > >  Leonard
> > > > > >> > > 
> > > > > >> > >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee <
> > > > > lincoln.8...@gmail.com>
> > > > > >> > >  wrote:
> > > > > >> > > 
> > > > > >> > > > Thanks for the great work! Congrats all!
> > > > > >> > > >
> > > > > >> > > > Best,
> > > > > >> > > > Lincoln Lee
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > On Fri, Oct 27, 2023 at 00:16, Jing Ge  wrote:
> > > > > >> > > >
> > > > > >> > > >> The Apache Flink community is very happy to announce the
> > > > > release of
> > > > > >> > > > Apache
> > > > > >> > > >> Flink 1.18.0, which is the first release for the Apache
> > > > > Flink 1.18
> > > > > >> > > > series.
> > > > > >> > > >>
> > > > > >> > > >> Apache Flink® is an open-source unified stream and batch
> > > > data
> > > > > >> > >  processing
> > > > > >> > > >> framework for distributed, high-performing,
> > > > > always-available, and
> > > > > >> > > > accurate
> > > > > >> > > >> data applications.
> > > > > >> > > >>
> > > > > >> > > >> The release is available for download at:
> > > > > >> > > >> https://flink.apache.org/downloads.html
> > > > > >> > > >>
> > > > > >> > > >> Please check out the release blog post for an overview 
> > > > > >> > > >> of
> > > > the
> > > > > >> > > > improvements
> > > > > >> > > >> for this release:
> > > > > >> > > >>
> > > > > >> > > >>
> > > > > >> > > >
> > > > > >> > > 
> > > > > >> > > >>>
> > > > > >> > > >>
> > > > > >> > >
> > > > >
> > > >
> > > https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > > > > >> > > >>
> > > > > >> > > >> The full release notes are available in Jira:
> > > > > >> > > >>
> > > > > >> > > >>
> > > > > >> > > >
> > > > > >> > > 
> > > > > >> > > >>>
> > > > > >> > > >>
> > > > > >> > >
> > > > >
> > > >
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > > > > >> > > >>
> > > > > >> > > >> We would like to thank all contributors of the Apache
> > > Flink
> > > > > >> > > >> community
> > > > > >> > >  who
> > > > > >> > > >> made this release possible!
> > > > > >> > > >>
> > > > > >> > > >> Best regards,
> >

Re: flink-sql-connector-jdbc new release

2023-10-26 Thread Jingsong Li
Hi David,

Thanks for driving this.

I think https://issues.apache.org/jira/browse/FLINK-33365 should be a blocker.

Best,
Jingsong

On Thu, Oct 26, 2023 at 11:43 PM David Radley  wrote:
>
> Hi,
> I propose that we do a 3.2 release of flink-sql-connector-jdbc so that there
> is a version matching 1.18 that includes the new dialects. I am happy to
> drive this; some pointers to documentation on the process and the approach
> to testing the various dialects would be great.
>
>  Kind regards, David.
>
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [DISCUSS] FLIP-376: Add DISTRIBUTED BY clause

2023-10-26 Thread Jingsong Li
Very thanks Timo for starting this discussion.

Big +1 for this.

The design looks good to me!

We can add some documentation for connector developers. For example:
for sinks, if a keyBy is needed, the connector should perform the keyBy
itself. SupportsBucketing is just a marker interface.
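To make that contract concrete, here is a minimal sketch. Apart from the `SupportsBucketing` name from the FLIP, everything here (the toy sink class, the hash-mod bucket routing, the bucket count matching the `INTO 6 BUCKETS` example) is an illustrative assumption, not Flink's actual implementation:

```java
/** Marker interface only: it declares no methods; the planner merely detects it. */
interface SupportsBucketing {}

/** Toy sink showing the connector doing its own "keyBy" into buckets. */
class BucketedSink implements SupportsBucketing {
    private final int numBuckets;

    BucketedSink(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    /** The connector routes each key to a bucket itself: hash mod bucket count. */
    int bucketFor(String key) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }
}

public class BucketingSketch {
    public static void main(String[] args) {
        BucketedSink sink = new BucketedSink(6); // e.g. "INTO 6 BUCKETS"
        // Planner side: nothing to call, just an instanceof check.
        if (!(sink instanceof SupportsBucketing)) {
            throw new AssertionError("marker interface not detected");
        }
        // The connector's routing is deterministic and stays in range.
        int b = sink.bucketFor("uid-42");
        if (b != sink.bucketFor("uid-42") || b < 0 || b >= 6) {
            throw new AssertionError("bucketing not deterministic/in range");
        }
        System.out.println("bucket for uid-42: " + b);
    }
}
```

The point of the sketch: since the interface is only a marker, the framework never inserts a keyBy on the sink's behalf.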

Best,
Jingsong

On Thu, Oct 26, 2023 at 5:00 PM Timo Walther  wrote:
>
> Hi everyone,
>
> I would like to start a discussion on FLIP-376: Add DISTRIBUTED BY
> clause [1].
>
> Many SQL vendors expose the concepts of Partitioning, Bucketing, and
> Clustering. This FLIP continues the work of previous FLIPs and would
> like to introduce the concept of "Bucketing" to Flink.
>
> This is a pure connector characteristic and helps both Apache Kafka and
> Apache Paimon connectors in avoiding a complex WITH clause by providing
> improved syntax.
>
> Here is an example:
>
> CREATE TABLE MyTable
>(
>  uid BIGINT,
>  name STRING
>)
>DISTRIBUTED BY (uid) INTO 6 BUCKETS
>WITH (
>  'connector' = 'kafka'
>)
>
> The full syntax specification can be found in the document. The clause
> should be optional and fully backwards compatible.
>
> Regards,
> Timo
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-376%3A+Add+DISTRIBUTED+BY+clause


Re: [DISCUSS] FLIP-377: Support configuration to disable filter push down for Table/SQL Sources

2023-10-24 Thread Jingsong Li
+1 for this FLIP.

BTW, I think we can add an option for projection push down too.

Yes, we could do all of this in each connector, but a common
implementation helps a lot, and it lets us introduce a unified option!
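As a sketch of what honoring such an option could look like on the connector side: all names below are illustrative; `Result` mimics the accepted/remaining split of `SupportsFilterPushDown#applyFilters`, with plain strings standing in for resolved expressions:

```java
import java.util.Collections;
import java.util.List;

public class FilterPushDownOptionSketch {

    /** Stand-in for the (acceptedFilters, remainingFilters) result pair. */
    record Result(List<String> acceptedFilters, List<String> remainingFilters) {}

    /** When the option is off, reject everything: the planner keeps all filters. */
    static Result applyFilters(boolean scanFilterPushDownEnabled, List<String> filters) {
        if (!scanFilterPushDownEnabled) {
            return new Result(Collections.emptyList(), filters);
        }
        // Enabled: accept whatever the source can evaluate (all, in this toy).
        return new Result(filters, Collections.emptyList());
    }

    public static void main(String[] args) {
        List<String> filters = List.of("a > 1", "b = 'x'");

        Result disabled = applyFilters(false, filters);
        System.out.println("disabled -> remaining: " + disabled.remainingFilters());

        Result enabled = applyFilters(true, filters);
        System.out.println("enabled  -> accepted: " + enabled.acceptedFilters());
    }
}
```

A common implementation would put this "return everything as remaining" branch in one place instead of repeating it in every connector.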

Best,
Jingsong

On Wed, Oct 25, 2023 at 10:07 AM Jark Wu  wrote:
>
> Thank you for updating Jiabao,
>
> The FLIP looks good to me.
>
> Best,
> Jark
>
> On Wed, 25 Oct 2023 at 00:42, Jiabao Sun 
> wrote:
>
> > Thanks Jane for the feedback.
> >
> > The default value of "table.optimizer.source.predicate" is true, which
> > means that by default predicate pushdown is permitted for all sources.
> >
> > Therefore, disabling the pushdown filter for individual sources can take
> > effect.
> >
> >
> > Best,
> > Jiabao
> >
> >
> > > On Oct 24, 2023, at 23:52, Jane Chan  wrote:
> > >
> > >>
> > >> I believe that the configuration "table.optimizer.source.predicate" has
> > a
> > >> higher priority at the planner level than the configuration at the
> > source
> > >> level,
> > >> and it seems easy to implement now.
> > >>
> > >
> > > Correct me if I'm wrong, but I think the fine-grained configuration
> > > "scan.filter-push-down.enabled" should have a higher priority because the
> > > default value of "table.optimizer.source.predicate" is true. As a result,
> > > turning off filter push-down for a specific source will not take effect
> > > unless the default value of "table.optimizer.source.predicate" is changed
> > > to false, or, alternatively, let users manually set
> > > "table.optimizer.source.predicate" to false first and then selectively
> > > enable filter push-down for the desired sources, which is less intuitive.
> > > WDYT?
> > >
> > > Best,
> > > Jane
> > >
> > > On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun  > .invalid>
> > > wrote:
> > >
> > >> Thanks Jane,
> > >>
> > >> I believe that the configuration "table.optimizer.source.predicate" has
> > a
> > >> higher priority at the planner level than the configuration at the
> > source
> > >> level,
> > >> and it seems easy to implement now.
> > >>
> > >> Best,
> > >> Jiabao
> > >>
> > >>
> > >>> On Oct 24, 2023, at 17:36, Jane Chan  wrote:
> > >>>
> > >>> Hi Jiabao,
> > >>>
> > >>> Thanks for driving this discussion. I have a small question that will
> > >>> "scan.filter-push-down.enabled" take precedence over
> > >>> "table.optimizer.source.predicate" when the two parameters might
> > conflict
> > >>> each other?
> > >>>
> > >>> Best,
> > >>> Jane
> > >>>
> > >>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun  > >> .invalid>
> > >>> wrote:
> > >>>
> >  Thanks Jark,
> > 
> >  If we only add configuration without adding the enableFilterPushDown
> >  method in the SupportsFilterPushDown interface,
> >  each connector would have to handle the same logic in the applyFilters
> >  method to determine whether filter pushdown is needed.
> >  This would increase complexity and violate the original behavior of
> > the
> >  applyFilters method.
> > 
> >  On the contrary, we only need to pass the configuration parameter in
> > the
> >  newly added enableFilterPushDown method
> >  to decide whether to perform predicate pushdown.
> > 
> >  I think this approach would be clearer and simpler.
> >  WDYT?
> > 
> >  Best,
> >  Jiabao
> > 
> > 
> > > 2023年10月24日 16:58,Jark Wu  写道:
> > >
> > > Hi JIabao,
> > >
> > > I think the current interface can already satisfy your requirements.
> > > The connector can reject all the filters by returning the input
> > filters
> > > as `Result#remainingFilters`.
> > >
> > > So maybe we don't need to introduce a new method to disable
> > > pushdown, but just introduce an option for the specific connector.
> > >
> > > Best,
> > > Jark
> > >
> > > On Tue, 24 Oct 2023 at 16:38, Leonard Xu  wrote:
> > >
> > >> Thanks @Jiabao for kicking off this discussion.
> > >>
> > >> Could you add a section to explain the difference between proposed
> > >> connector level config `scan.filter-push-down.enabled` and existing
> >  query
> > >> level config `table.optimizer.source.predicate-pushdown-enabled` ?
> > >>
> > >> Best,
> > >> Leonard
> > >>
> > >>> On Oct 24, 2023, at 4:18 PM, Jiabao Sun  wrote:
> > >>>
> > >>> Hi Devs,
> > >>>
> > >>> I would like to start a discussion on FLIP-377: support
> > configuration
> >  to
> > >> disable filter pushdown for Table/SQL Sources[1].
> > >>>
> > >>> Currently, Flink Table/SQL does not expose fine-grained control for
> > >> users to enable or disable filter pushdown.
> > >>> However, filter pushdown has some side effects, such as additional
> > >> computational pressure on external systems.
> > >>> Moreover, Improper queries can lead to issues such as full table
> > >> scans,
> > >> which in turn can impact the stability of external systems.
> > >>>
> > >>> Suppose we have an SQL query with two sources: K

Re: [VOTE] Release 1.18.0, release candidate #3

2023-10-23 Thread Jingsong Li
+1 (binding)

- verified signatures & hash
- built from source code succeeded
- started SQL Client, used Paimon connector to write and read, the
result is expected

Best,
Jingsong

On Tue, Oct 24, 2023 at 12:15 PM Yuxin Tan  wrote:
>
> +1(non-binding)
>
> - Verified checksum
> - Build from source code
> - Verified signature
> - Started a local cluster and run Streaming & Batch wordcount job, the
> result is expected
> - Verified web PR
>
> Best,
> Yuxin
>
>
On Tue, Oct 24, 2023 at 11:19, Qingsheng Ren  wrote:
>
> > +1 (binding)
> >
> > - Verified checksums and signatures
> > - Built from source with Java 8
> > - Started a standalone cluster and submitted a Flink SQL job that read and
> > wrote with Kafka connector and CSV / JSON format
> > - Reviewed web PR and release note
> >
> > Best,
> > Qingsheng
> >
> > On Mon, Oct 23, 2023 at 10:40 PM Leonard Xu  wrote:
> >
> > > +1 (binding)
> > >
> > > - verified signatures
> > > - verified hashsums
> > > - built from source code succeeded
> > > - checked all dependency artifacts are 1.18
> > > - started SQL Client, used MySQL CDC connector to read changelog from
> > > database , the result is expected
> > > - reviewed the web PR, left minor comments
> > > - reviewed the release notes PR, left minor comments
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > > > On Oct 21, 2023, at 7:28 PM, Rui Fan <1996fan...@gmail.com> wrote:
> > > >
> > > > +1(non-binding)
> > > >
> > > > - Downloaded artifacts from dist[1]
> > > > - Verified SHA512 checksums
> > > > - Verified GPG signatures
> > > > - Build the source with java-1.8 and verified the licenses together
> > > > - Verified web PR
> > > >
> > > > [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.0-rc3/
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Fri, Oct 20, 2023 at 10:31 PM Martijn Visser <
> > > martijnvis...@apache.org>
> > > > wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> - Validated hashes
> > > >> - Verified signature
> > > >> - Verified that no binaries exist in the source archive
> > > >> - Build the source with Maven
> > > >> - Verified licenses
> > > >> - Verified web PR
> > > >> - Started a cluster and the Flink SQL client, successfully read and
> > > >> wrote with the Kafka connector to Confluent Cloud with AVRO and Schema
> > > >> Registry enabled
> > > >>
> > > >> On Fri, Oct 20, 2023 at 2:55 PM Matthias Pohl
> > > >>  wrote:
> > > >>>
> > > >>> +1 (binding)
> > > >>>
> > > >>> * Downloaded artifacts
> > > >>> * Built Flink from sources
> > > >>> * Verified SHA512 checksums GPG signatures
> > > >>> * Compared checkout with provided sources
> > > >>> * Verified pom file versions
> > > >>> * Verified that there are no pom/NOTICE file changes since RC1
> > > >>> * Deployed standalone session cluster and ran WordCount example in
> > > batch
> > > >>> and streaming: Nothing suspicious in log files found
> > > >>>
> > > >>> On Thu, Oct 19, 2023 at 3:00 PM Piotr Nowojski  > >
> > > >> wrote:
> > > >>>
> > >  +1 (binding)
> > > 
> > >  Best,
> > >  Piotrek
> > > 
> > >  On Thu, Oct 19, 2023 at 09:55, Yun Tang  wrote:
> > > 
> > > > +1 (non-binding)
> > > >
> > > >
> > > >  *   Build from source code
> > > >  *   Verify the pre-built jar packages were built with JDK8
> > > >  *   Verify FLIP-291 with a standalone cluster, and it works fine
> > > >> with
> > > > StateMachine example.
> > > >  *   Checked the signature
> > > >  *   Viewed the PRs.
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Cheng Pan 
> > > > Sent: Thursday, October 19, 2023 14:38
> > > > To: dev@flink.apache.org 
> > > > Subject: RE: [VOTE] Release 1.18.0, release candidate #3
> > > >
> > > > +1 (non-binding)
> > > >
> > > > We(the Apache Kyuubi community), verified that the Kyuubi Flink
> > > >> engine
> > > > works well[1] with Flink 1.18.0 RC3.
> > > >
> > > > [1] https://github.com/apache/kyuubi/pull/5465
> > > >
> > > > Thanks,
> > > > Cheng Pan
> > > >
> > > >
> > > > On 2023/10/19 00:26:24 Jing Ge wrote:
> > > >> Hi everyone,
> > > >>
> > > >> Please review and vote on the release candidate #3 for the version
> > > >> 1.18.0, as follows:
> > > >> [ ] +1, Approve the release
> > > >> [ ] -1, Do not approve the release (please provide specific
> > > >> comments)
> > > >>
> > > >> The complete staging area is available for your review, which
> > > >> includes:
> > > >>
> > > >> * JIRA release notes [1], and the pull request adding release note
> > > >> for
> > > >> users [2]
> > > >> * the official Apache source release and binary convenience
> > > >> releases to
> > > > be
> > > >> deployed to dist.apache.org [3], which are signed with the key
> > > >> with
> > > >> fingerprint 96AE0E32CBE6E0753CE6 [4],
> > > >> * all artifacts to be deployed to the Maven Central Repository
> > [5],
>

Re: [DISCUSS] Remove legacy Paimon (TableStore) doc link from Flink web navigation

2023-10-17 Thread Jingsong Li
Hi marton,

Thanks for driving. +1

There is a PR to remove legacy Paimon
https://github.com/apache/flink-web/pull/665 , but it hasn't been
updated for a long time.

Best,
Jingsong

On Tue, Oct 17, 2023 at 4:28 PM Márton Balassi  wrote:
>
> Hi Flink & Paimon devs,
>
> The Flink webpage documentation navigation section still lists the outdated 
> TableStore 0.3 and master docs as subproject docs (see attachment). I am all 
> for advertising Paimon as a sister project of Flink, but the current state is 
> misleading in multiple ways.
>
> I would like to remove these obsolete links if the communities agree.
>
> Best,
> Marton


Re: [VOTE] FLIP-356: Support Nested Fields Filter Pushdown

2023-09-05 Thread Jingsong Li
+1

On Wed, Sep 6, 2023 at 1:18 PM Becket Qin  wrote:
>
> Thanks for pushing the FLIP through.
>
> +1 on the updated FLIP wiki.
>
> Cheers,
>
> Jiangjie (Becket) Qin
>
> On Wed, Sep 6, 2023 at 1:12 PM Venkatakrishnan Sowrirajan 
> wrote:
>
> > Based on the recent discussions in the thread [DISCUSS] FLIP-356: Support
> > Nested Fields Filter Pushdown
> > , I made
> > some changes to the FLIP-356
> > <
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown
> > >.
> > Unless anyone else has any concerns, we can continue with this vote to
> > reach consensus.
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Tue, Sep 5, 2023 at 8:04 AM Sergey Nuyanzin 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > On Tue, Sep 5, 2023 at 4:55 PM Jiabao Sun  > > .invalid>
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Jiabao
> > > >
> > > >
> > > > > On Sep 5, 2023, at 10:33 PM, Martijn Visser  wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Tue, Sep 5, 2023 at 4:16 PM ConradJam 
> > wrote:
> > > > >
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> On Fri, Sep 1, 2023 at 15:43, Yuepeng Pan  wrote:
> > > > >>
> > > > >>> +1 (non-binding)
> > > > >>>
> > > > >>> Best,
> > > > >>> Yuepeng
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> At 2023-09-01 14:32:19, "Jark Wu"  wrote:
> > > >  +1 (binding)
> > > > 
> > > >  Best,
> > > >  Jark
> > > > 
> > > > > On Aug 30, 2023, at 02:40, Venkatakrishnan Sowrirajan  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thank you all for your feedback on FLIP-356. I'd like to start a
> > > > vote.
> > > > >
> > > > > Discussion thread:
> > > > >
> > >
> > https://urldefense.com/v3/__https://lists.apache.org/thread/686bhgwrrb4xmbfzlk60szwxos4z64t7__;!!IKRxdwAv5BmarQ!eNR1R48e8jbqDCSdXqWj6bjfmP1uMn-IUIgVX3uXlgzYp_9rcf-nZOaAZ7KzFo2JwMAJPGYv8wfRxuRMAA$
> > > > > FLIP:
> > > > >
> > > > >>>
> > > > >>
> > > >
> > >
> > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!eNR1R48e8jbqDCSdXqWj6bjfmP1uMn-IUIgVX3uXlgzYp_9rcf-nZOaAZ7KzFo2JwMAJPGYv8wdkI0waFw$
> > > > >
> > > > > Regards
> > > > > Venkata krishnan
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> > > --
> > > Best regards,
> > > Sergey
> > >
> >


Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-08-27 Thread Jingsong Li
So if NestedFieldReferenceExpression doesn't need inputIndex, is there
a need to introduce a base class `ReferenceExpression`?

Best,
Jingsong

On Mon, Aug 28, 2023 at 2:09 PM Jingsong Li  wrote:
>
> Hi thanks all for your discussion.
>
> What is inputIndex in NestedFieldReferenceExpression?
>
> I know inputIndex has special usage in FieldReferenceExpression, but
> it is only for Join operators, and it is only for SQL optimization. It
> looks like there is no requirement for Nested.
>
> Best,
> Jingsong
>
> On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan
>  wrote:
> >
> > Thanks for all the feedback and discussion everyone. Looks like we have
> > reached a consensus here.
> >
> > Just to summarize:
> >
> > 1. Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*)
> > abstract class which will be extended by both *FieldReferenceExpression*
> > and *NestedFieldReferenceExpression* (to be introduced as part of this FLIP)
> > 2. No need of a *supportsNestedFilters* check as the current
> > *SupportsFilterPushDown* should already ignore unknown expressions
> > (*NestedFieldReferenceExpression* for example) and return them as
> > *remainingFilters*. Maybe this should be clarified explicitly in the
> > Javadoc of *SupportsFilterPushDown*. I will file a separate JIRA to fix
> > the documentation.
> > 3. Refactor *SupportsProjectionPushDown* to use *ReferenceExpression*
> > instead of the existing 2-d arrays, to consolidate and be consistent
> > with the other Supports*PushDown APIs - *outside the scope of this FLIP*
> > 4. Similarly, *SupportsAggregatePushDown* should also be evolved to use
> > *ReferenceExpression* whenever nested fields support is added - *outside
> > the scope of this FLIP*
> >
> > Does this sound good? Please let me know if I have missed anything here. If
> > there are no concerns, I will start a vote tomorrow. I will also get the
> > FLIP-356 wiki updated. Thanks everyone once again!
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Thu, Aug 24, 2023 at 8:19 PM Becket Qin  wrote:
> >
> > > Hi Jark,
> > >
> > > How about having a separate NestedFieldReferenceExpression, and
> > > > abstracting a common base class "ReferenceExpression" for
> > > > NestedFieldReferenceExpression and FieldReferenceExpression? This makes
> > > > unifying expressions in
> > > > "SupportsProjectionPushdown#applyProjections(List
> > > > ...)"
> > > > possible.
> > >
> > >
> > > I'd be fine with this. It at least provides a consistent API style /
> > > formality.
> > >
> > >  Re: Yunhong,
> > >
> > > 3. Finally, I think we need to look at the costs and benefits of unifying
> > > > the SupportsFilterPushDown and SupportsProjectionPushDown (or others)
> > > from
> > > > the perspective of interface implementers. A stable API can reduce user
> > > > development and change costs, if the current API can fully meet the
> > > > functional requirements at the framework level, I personal suggest
> > > reducing
> > > > the impact on connector developers.
> > > >
> > >
> > > I agree that the cost and benefit should be measured. And the measurement
> > > should be in the long term instead of short term. That is why we always
> > > need to align on the ideal end state first.
> > > Meeting functionality requirements is the bare minimum bar for an API.
> > > Simplicity, intuitiveness, robustness and evolvability are also important.
> > > In addition, for projects with many APIs, such as Flink, a consistent API
> > > style is also critical for the user adoption as well as bug avoidance. It
> > > is very helpful for the community to agree on some API design conventions 
> > > /
> > > principles.
> > > For example, in this particular case, via our discussion, hopefully we 
> > > sort
> > > of established the following API design conventions / principles for all
> > > the Supports*PushDown interfaces.
> > >
> > > 1. By default, expressions should be used if applicable instead of other
> > > representations.
> > > 2. In general, the pushdown method should not assume all the pushdowns 
> > > will
> > > succeed. So the applyX() method should return a boolean or List, to
> > > handle the cases that some of the pushdowns cannot be fulfilled by the
> > > implementation.
> > >

Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-08-27 Thread Jingsong Li
Hi thanks all for your discussion.

What is inputIndex in NestedFieldReferenceExpression?

I know inputIndex has special usage in FieldReferenceExpression, but
it is only for Join operators, and it is only for SQL optimization. It
looks like there is no requirement for Nested.
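For context, the class shape under discussion would look roughly as follows. This is a sketch only: the class names come from the FLIP discussion, but the field layout, constructors, and `asSummaryString` method are my assumptions, not the final API:

```java
/** Assumed common base class for the two kinds of reference expressions. */
abstract class ReferenceExpression {
    abstract String asSummaryString();
}

/** Top-level column reference; inputIndex distinguishes join inputs. */
class FieldReferenceExpression extends ReferenceExpression {
    final String name;
    final int inputIndex; // which input (e.g. of a join) the column comes from
    final int fieldIndex;

    FieldReferenceExpression(String name, int inputIndex, int fieldIndex) {
        this.name = name;
        this.inputIndex = inputIndex;
        this.fieldIndex = fieldIndex;
    }

    @Override
    String asSummaryString() {
        return name;
    }
}

/** Nested reference: a path of field names, with no inputIndex needed. */
class NestedFieldReferenceExpression extends ReferenceExpression {
    final String[] fieldPath;

    NestedFieldReferenceExpression(String... fieldPath) {
        this.fieldPath = fieldPath;
    }

    @Override
    String asSummaryString() {
        return String.join(".", fieldPath);
    }
}

public class ReferenceExpressionSketch {
    public static void main(String[] args) {
        ReferenceExpression flat = new FieldReferenceExpression("uid", 0, 0);
        ReferenceExpression nested =
                new NestedFieldReferenceExpression("user", "address", "zip");
        System.out.println(flat.asSummaryString());   // uid
        System.out.println(nested.asSummaryString()); // user.address.zip
    }
}
```

The sketch shows why the question matters: the base class only pays off if both subclasses share state or behavior, and `inputIndex` would not be part of that shared state.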

Best,
Jingsong

On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan
 wrote:
>
> Thanks for all the feedback and discussion everyone. Looks like we have
> reached a consensus here.
>
> Just to summarize:
>
> 1. Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*)
> abstract class which will be extended by both *FieldReferenceExpression*
> and *NestedFieldReferenceExpression* (to be introduced as part of this FLIP)
> 2. No need of a *supportsNestedFilters* check as the current
> *SupportsFilterPushDown* should already ignore unknown expressions
> (*NestedFieldReferenceExpression* for example) and return them as
> *remainingFilters*. Maybe this should be clarified explicitly in the
> Javadoc of *SupportsFilterPushDown*. I will file a separate JIRA to fix
> the documentation.
> 3. Refactor *SupportsProjectionPushDown* to use *ReferenceExpression*
> instead of the existing 2-d arrays, to consolidate and be consistent
> with the other Supports*PushDown APIs - *outside the scope of this FLIP*
> 4. Similarly, *SupportsAggregatePushDown* should also be evolved to use
> *ReferenceExpression* whenever nested fields support is added - *outside
> the scope of this FLIP*
>
> Does this sound good? Please let me know if I have missed anything here. If
> there are no concerns, I will start a vote tomorrow. I will also get the
> FLIP-356 wiki updated. Thanks everyone once again!
>
> Regards
> Venkata krishnan
>
>
> On Thu, Aug 24, 2023 at 8:19 PM Becket Qin  wrote:
>
> > Hi Jark,
> >
> > How about having a separate NestedFieldReferenceExpression, and
> > > abstracting a common base class "ReferenceExpression" for
> > > NestedFieldReferenceExpression and FieldReferenceExpression? This makes
> > > unifying expressions in
> > > "SupportsProjectionPushdown#applyProjections(List
> > > ...)"
> > > possible.
> >
> >
> > I'd be fine with this. It at least provides a consistent API style /
> > formality.
> >
> >  Re: Yunhong,
> >
> > 3. Finally, I think we need to look at the costs and benefits of unifying
> > > the SupportsFilterPushDown and SupportsProjectionPushDown (or others)
> > from
> > > the perspective of interface implementers. A stable API can reduce user
> > > development and change costs, if the current API can fully meet the
> > > functional requirements at the framework level, I personal suggest
> > reducing
> > > the impact on connector developers.
> > >
> >
> > I agree that the cost and benefit should be measured. And the measurement
> > should be in the long term instead of short term. That is why we always
> > need to align on the ideal end state first.
> > Meeting functionality requirements is the bare minimum bar for an API.
> > Simplicity, intuitiveness, robustness and evolvability are also important.
> > In addition, for projects with many APIs, such as Flink, a consistent API
> > style is also critical for the user adoption as well as bug avoidance. It
> > is very helpful for the community to agree on some API design conventions /
> > principles.
> > For example, in this particular case, via our discussion, hopefully we sort
> > of established the following API design conventions / principles for all
> > the Supports*PushDown interfaces.
> >
> > 1. By default, expressions should be used if applicable instead of other
> > representations.
> > 2. In general, the pushdown method should not assume all the pushdowns will
> > succeed. So the applyX() method should return a boolean or List, to
> > handle the cases that some of the pushdowns cannot be fulfilled by the
> > implementation.
> >
> > Establishing such conventions and principles demands careful thinking for
> > the aspects I mentioned earlier in addition to the API functionalities.
> > This helps lower the bar of understanding, reduces the chance of having
> > loose ends in the API, and will benefit all the participants in the project
> > over time. I think this is the right way to achieve real API stability.
> > Otherwise, we may end up chasing our tails to find ways not to change the
> > existing non-ideal APIs.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Aug 25, 2023 at 9:33 AM yh z  wrote:
> >
> > > Hi, Venkat,
> > >
> > > Thanks for the FLIP, it sounds good to support nested fields filter
> > > pushdown. Based on the design of flip and the above options, I would like
> > > to make a few suggestions:
> > >
> > > 1.  At present, introducing NestedFieldReferenceExpression looks like a
> > > better solution, which can fully meet our requirements while reducing
> > > modifications to base class FieldReferenceExpression. In the long run, I
> > > tend to abstract a basic class for NestedFieldReferenceExpression and
> > > FieldReferenceExpression as 

Re: [VOTE] Release 2.0 must-have work items - Round 2

2023-07-26 Thread Jingsong Li
+1 binding

Thanks all for your work!

Best,
Jingsong

On Thu, Jul 27, 2023 at 10:52 AM Jark Wu  wrote:
>
> +1 (binding)
>
> Thanks Xintong for driving this. Thanks all for finalizing the
> SourceFunction conclusion.
>
> Best,
> Jark
>
> On Wed, 26 Jul 2023 at 22:28, Alexander Fedulov 
> wrote:
>
> > +1 (non-binding), assuming SourceFunction gets added back to the
> > doc as a "nice-to-have". I am glad we've reached a consensus here.
> > Extra thanks to Leonard for coordinating this discussion in particular.
> >
> > Best,
> > Alex
> >
> > On Wed, 26 Jul 2023 at 15:43, Jing Ge  wrote:
> >
> > > +1 (non-binding), glad to see we are now on the same page. Thank you all.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Wed, Jul 26, 2023 at 5:18 PM Yun Tang  wrote:
> > >
> > > > +1 (non-binding), thanks @xintong for driving this work.
> > > >
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Zhu Zhu 
> > > > Sent: Wednesday, July 26, 2023 16:35
> > > > To: dev@flink.apache.org 
> > > > Subject: Re: [VOTE] Release 2.0 must-have work items - Round 2
> > > >
> > > > +1 (binding)
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > > On Wed, Jul 26, 2023 at 15:40, Leonard Xu  wrote:
> > > > >
> > > > > Thanks @xingtong for driving the work.
> > > > >
> > > > > +1(binding)
> > > > >
> > > > > Best,
> > > > > Leonard
> > > > >
> > > > > > On Jul 26, 2023, at 3:18 PM, Konstantin Knauf <
> > > > knauf.konstan...@gmail.com> wrote:
> > > > > >
> > > > > > Hi Xingtong,
> > > > > >
> > > > > > yes, I am fine with the conclusion for SourceFunction. I chatted
> > with
> > > > > > Leonard a bit last night. Let's continue this vote.
> > > > > >
> > > > > > Thanks for the clarification,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 26, 2023 at 04:03, Xintong Song <tonysong...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi Konstantin,
> > > > > >>
> > > > > >> It seems the offline discussion has already taken place [1], and
> > > part
> > > > of
> > > > > >> the outcome is that removal of SourceFunction would be a
> > > > *nice-to-have*
> > > > > >> item for release 2.0 which may not block this *must-have* vote. Do
> > > > you have
> > > > > >> different opinions about the conclusions in [1]?
> > > > > >>
> > > > > >> If there are still concerns, and the discussion around this topic
> > > > needs to
> > > > > >> be continued, then I'd suggest (as I mentioned in [2]) not to
> > > further
> > > > block
> > > > > >> this vote (i.e. the decision on other must-have items). Release
> > 2.0
> > > > still
> > > > > >> has a long way to go, and it is likely we need to review and
> > update
> > > > the
> > > > > >> list every once in a while. We can update the list with another
> > vote
> > > > if
> > > > > >> later we decide to add the removal of SourceFunction to the
> > > must-have
> > > > list.
> > > > > >>
> > > > > >> WDYT?
> > > > > >>
> > > > > >> Best,
> > > > > >>
> > > > > >> Xintong
> > > > > >>
> > > > > >>
> > > > > >> [1]
> > > https://lists.apache.org/thread/yyw52k45x2sp1jszldtdx7hc98n72w7k
> > > > > >> [2]
> > > https://lists.apache.org/thread/j5d5022ky8k5t088ffm03727o5g9x9jr
> > > > > >>
> > > > > >> On Tue, Jul 25, 2023 at 8:49 PM Konstantin Knauf <
> > kna...@apache.org
> > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >>> I assume this vote includes a decision not to remove
> > > > > >>> SourceFunction/SinkFunction in Flink 2.0 (as it has been removed
> > > > from the
> > > > > >>> table). If this is the case, I don't think, this discussion has
> > > > > >> concluded.
> > > > > >>> There are multiple contributors like myself, Martijn, Alex
> > Fedulov
> > > > and
> > > > > >>> Maximilian Michels, who have indicated they would be in favor of
> > > > > >>> deprecating/dropping them. This Source/Sink Function discussion
> > > > seems to
> > > > > >> go
> > > > > >>> in circles in general. I am wondering if it makes sense to have a
> > > > call
> > > > > >>> about this instead of repeating mailing list discussions.
> > > > > >>>
> > > > > >>> On Tue, Jul 25, 2023 at 13:38, Yu Li <
> > car...@gmail.com
> > > > wrote:
> > > > > >>>
> > > > >  +1 (binding)
> > > > > 
> > > > >  Thanks for driving this, Xintong!
> > > > > 
> > > > >  Best Regards,
> > > > >  Yu
> > > > > 
> > > > > 
> > > > >  On Sun, 23 Jul 2023 at 18:28, Yuan Mei 
> > > > wrote:
> > > > > 
> > > > > > +1 (binding)
> > > > > >
> > > > > > Thanks for driving the discussion through and for all the
> > efforts
> > > > in
> > > > > > resolving the complexities :-)
> > > > > >
> > > > > > Best
> > > > > > Yuan
> > > > > >
> > > > > > On Thu, Jul 20, 2023 at 5:23 PM Xintong Song <
> > > > tonysong...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > >> I'd like to start another round of VOTE for the must-have work
> > > > > >> items

[ANNOUNCE] New Apache Flink Committer - Yong Fang

2023-07-23 Thread Jingsong Li
Hi, everyone

On behalf of the PMC, I'm very happy to announce Yong Fang (Shammon)
(zjur...@gmail.com) as a new Flink Committer.

Yong is a long-time Flink contributor; he has been contributing to Flink since 2017.

He actively participated in dev discussions and answered many
questions on the user mailing list.

He contributed the JDBC Driver for Flink SQL, and will continue to
contribute to Flink's lineage management and other features in the
future.

Please join me in congratulating Yong Fang for becoming a Flink Committer!

Best,
Jingsong Lee (on behalf of the Flink PMC)


Re: [DISCUSS] FLIP-346: Deprecate ManagedTable related APIs

2023-07-19 Thread Jingsong Li
+1

On Thu, Jul 20, 2023 at 12:31 PM Jane Chan  wrote:
>
> Hi, devs,
>
> I would like to start a discussion on FLIP-346: Deprecate ManagedTable
> related APIs[1].
>
> These APIs were initially designed for Flink Table Store, which has
> joined the Apache Incubator as a separate project called Apache
> Paimon(incubating).
>
> Since they are obsolete and not used by Paimon anymore, I propose to
> deprecate them in v1.18 and further remove them before v2.0.
>
> Looking forward to your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-346%3A+Deprecate+ManagedTable+related+APIs
>
> Best regards,
> Jane


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-07-18 Thread Jingsong Li
+1 binding

Thanks Dong for continuous driving.

Best,
Jingsong

On Tue, Jul 18, 2023 at 10:04 PM Jark Wu  wrote:
>
> +1 (binding)
>
> Best,
> Jark
>
> On Tue, 18 Jul 2023 at 20:30, Piotr Nowojski  wrote:
>
> > +1 (binding)
> >
> > Piotrek
> >
> > On Tue, Jul 18, 2023 at 08:51, Jing Ge  wrote:
> >
> > > +1(binding)
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Tue, Jul 18, 2023 at 8:31 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > +1(binding)
> > > >
> > > > Best,
> > > > Rui Fan
> > > >
> > > >
> > > > On Tue, Jul 18, 2023 at 12:04 PM Dong Lin  wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > We would like to start the vote for FLIP-309: Support using larger
> > > > > checkpointing interval when source is processing backlog [1]. This
> > FLIP
> > > > was
> > > > > discussed in this thread [2].
> > > > >
> > > > > The vote will be open until at least July 21st (at least 72 hours),
> > > > > following
> > > > > the consensus voting process.
> > > > >
> > > > > Cheers,
> > > > > Yunfeng and Dong
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309
> > > > >
> > > > >
> > > >
> > >
> > %3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> > > > > [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37
> > > > >
> > > >
> > >
> >


Re: [VOTE] Release 2.0 must-have work items

2023-07-11 Thread Jingsong Li
+1 to Leonard, Galen, and Jing.

About Source and Sink:
We're still missing quite a bit of work (functionality, ease of use, and
bug fixes), and I'm not sure we'll be completely done by 2.0.
Until that's done, we won't be in a position to clean up the old APIs.

Best,
Jingsong

On Wed, Jul 12, 2023 at 9:41 AM yuxia  wrote:
>
> Hi, Xintong.
> Sorry to disturb the voting. I just found an email [1] about the DataSet API from
> the flink-user-zh channel, and I don't think it's an isolated case, based on
> my observation.
>
> Removing DataSet is a must-have item in release 2.0. But as the user email
> asks, if we remove DataSet, how can users implement Sort/PartitionBy, etc., as
> they did with DataSet?
> Will we also provide a similar API in DataStream, or something else, before
> we remove DataSet?
> Btw, as far as I can see, with regard to replacing DataSet with DataStream,
> DataStream is missing many APIs. I think it may well take a lot of effort to
> fully cover the missing APIs.
>
> [1] https://lists.apache.org/thread/syjmt8f74gh8ok3z4lhgt95zl4dzn168
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Jing Ge" 
> To: "dev" 
> Sent: Wednesday, July 12, 2023, 1:23:40 AM
> Subject: Re: [VOTE] Release 2.0 must-have work items
>
> agree with what Leonard said. There are actually more issues wrt the new
> Source and SinkV2[1]
>
> Speaking of must-have vs nice-to-have, I think it depends on the priority.
> If removing them has higher priority, we should keep related tasks as
> must-have and make sure enough effort will be put to solve those issues and
> therefore be able to remove those APIs.
>
> Best regards,
> Jing
>
> [1] https://lists.apache.org/thread/90qc9nrlzf0vbvg92klzp9ftxxc43nbk
>
> On Tue, Jul 11, 2023 at 10:26 AM Leonard Xu  wrote:
>
> > Thanks Xintong for driving this great work! But I’ve to give my
> > -1(binding) here:
> >
> > -1 to marking the "deprecate SourceFunction/SinkFunction/SinkV1" item as
> > must-have for release 2.0.
> >
> > I do a lot of connector work in the community, and I have two insights
> > from past experience:
> >
> > 1. Many developers reported that it is very difficult to migrate from
> > SourceFunction to the new Source [1]. Migrating existing connectors
> > after SourceFunction is deprecated is very difficult. Some developers (Flavio
> > Pompermaier) reported that they gave up the migration because it was too
> > complicated, and I believe this is not an isolated case. This means that
> > deprecating the SourceFunction-related interfaces requires community
> > contributors to reduce the migration cost before starting the migration work.
> >
> > 2. IIRC, the functionality of SinkV2 cannot currently cover SinkFunction as
> > described in FLIP-287 [2], which means the migration path after deprecating
> > SinkFunction/SinkV1 does not exist; thus we cannot mark the related
> > interfaces of SinkFunction/SinkV1 as deprecated in 1.18.
> >
> > Based on these two observations, I think we should not mark these interfaces
> > as must-have in 2.0. Maintaining the two sets of source/sink interfaces
> > is not a concern for me; users can choose which interface to implement
> > according to their capacity and needs.
> >
> > Btw, some work items in 2.0 are marked as must-have, but no contributor
> > has claimed them yet. I think this is a risk and hope the Release Managers
> > could pay attention to it.
> >
> > Thank you all RMs for your work, sorry again for interrupting the vote
> >
> > Best,
> > Leonard
> >
> > [1] https://lists.apache.org/thread/sqq26s9rorynr4vx4nhxz3fmmxpgtdqp
> > [2]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> >
> > > On Jul 11, 2023, at 4:11 PM, Yuan Mei  wrote:
> > >
> > > As a second thought, I think "Eager State Declaration" is probably not a
> > > must-have.
> > >
> > > I was originally thinking it is a prerequisite for "state querying for
> > > disaggregated state management".
> > >
> > > Since disaggregated state management itself is not a must-have, "Eager
> > > State Declaration" is not as well. We can downgrade it to "nice to have"
> > if
> > > no objection.
> > >
> > > Best
> > >
> > > Yuan
> > >
> > > On Mon, Jul 10, 2023 at 7:02 PM Jing Ge 
> > wrote:
> > >
> > >> +1
> > >>
> > >> On Mon, Jul 10, 2023 at 12:52 PM Yu Li  wrote:
> > >>
> > >>> +1 (binding)
> > >>>
> > >>> Thanks for driving this and great to see us moving forward.
> > >>>
> > >>> Best Regards,
> > >>> Yu
> > >>>
> > >>>
> > >>> On Mon, 10 Jul 2023 at 11:59, Feng Wang  wrote:
> > >>>
> >  +1
> >  Thanks for driving this, looking forward to the next stage of flink.
> > 
> >  On Fri, Jul 7, 2023 at 5:31 PM Xintong Song 
> > >>> wrote:
> > 
> > > Hi all,
> > >
> > > I'd like to start the VOTE for the must-have work items for release
> > >> 2.0
> > > [1]. The corresponding discussion thread is [2].
> > >
> > > Please note that once the vote is approved, any changes to the
> > >>> must-have
> > > items (adding / removing must-have

Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website

2023-07-11 Thread Jingsong Li
It's exciting to finally have someone to refactor the Flink website.

Thanks Deepthi.

To Xintong,

> maintain two sets of website designs at the same time

If our website is not that complex, and the new web UI can achieve full
feature coverage, there would be no need to maintain two sets of websites.

To Jing

+1 to consider Flink doc too.

Best,
Jingsong

On Tue, Jul 11, 2023 at 8:16 PM Jing Ge  wrote:
>
> Hi,
>
> +1, the UI design looks good!
>
> Generally speaking, there are two parts to the whole website: the Flink web
> pages and the Flink doc. Will the dark mode also cover the Flink doc?
>
> Best regards,
> Jing
>
> On Tue, Jul 11, 2023 at 12:40 PM Matthias Pohl
>  wrote:
>
> > I also like the proposed designs. Considering that you want to touch
> > individual subpages, there are also some subpages of Flink's website not
> > being mentioned in the FLIP (e.g. roadmap [1]). What is the plan with
> > those? Are they covered by the "We recommend modifications only to the
> > design of the following pages" part but are not listed there?
> >
> > Additionally, it would be nice to get a bit more insight into the feedback
> > from users as Xintong pointed out. It would be interesting to understand
> > how the new design helps solve certain problems (besides having the
> > motivation to modernize the look and feel).
> >
> > I'm also wondering whether it's doable to do a discussion (FLIP?) per
> > subpage on the design (as proposed by Chesnay in the first discussion on
> > this topic [2] to have smaller changes rather than a single big one). But I
> > could imagine this being quite tedious because different people might
> > have different opinions on how something should be done.
> >
> > I don't have any experience with frontend design. I'm wondering how much
> > time such a redesign takes. Could it be linked to the 2.0 release?
> >
> > Thanks for the FLIP (you might want to add [2] to the FLIP's header for the
> > sake of transparency).
> > Matthias
> >
> > [1] https://flink.apache.org/roadmap/
> > [2] https://lists.apache.org/thread/c3pt00cf77lrtgt242p26lgp9l2z5yc8
> >
> > On Tue, Jul 11, 2023 at 11:39 AM Xintong Song 
> > wrote:
> >
> > > +1 in general.
> > >
> > > Thanks for proposing this contribution, Deepthi. The new design looks
> > very
> > > cool.
> > >
> > > I have a few questions, which might be entry-level given that I barely
> > know
> > > anything about the website design.
> > > - Do you think it's feasible to maintain two sets of website designs at
> > the
> > > same time? E.g., adding a button "back to previous version". I'm asking
> > > because, while the new UI might be more friendly to newcomers, the
> > original
> > > website might be more convenient for people who are already familiar with
> > > it to find things. It would be nice if we can offer both options to
> > users.
> > > - For the documentation, I wonder if it makes sense to offer the same
> > color
> > > theme as the website, to keep the experience consistent. How much effort
> > > does it require?
> > > - In the FLIP, you mentioned things like "there's a general consensus"
> > and
> > > "feedback from customers". I'm curious where these come from. Have you
> > > conducted some sort of survey? Would you mind sharing a bit more about
> > > that?
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Tue, Jul 11, 2023 at 4:57 PM Feifan Wang  wrote:
> > >
> > > > +1 , the new design looks more attractive and is well organized
> > > >
> > > > Feifan Wang
> > > > zoltar9...@163.com
> > > >
> > > >
> > > > ---- Replied Message ----
> > > > From: Leonard Xu
> > > > Date: 07/11/2023 16:34
> > > > To: dev
> > > > Subject: Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website
> > > > +1 for the redesigning, the new website looks cool.
> > > >
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > On Jul 11, 2023, at 7:55 AM, Mohan, Deepthi  > >
> > > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > I’m opening this thread to discuss a proposal to redesign the Apache
> > > Flink
> > > > website: https://flink.apache.org. The approach and a few initial
> > > mockups
> > > > are included in FLIP 333 - Redesign Apache Flink website.<
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-333%3A+Redesign+Apache+Flink+website
> > > > >
> > > >
> > > > The goal is to modernize the website design to help existing and new
> > > users
> > > > easily understand Flink’s value proposition and make Flink attractive
> > to
> > > > new users. As suggested in a previous thread, there are no proposed
> > > changes
> > > > to Flink documentation.
> > > >
> > > > I look forward to your feedback and the discussion.
> > > >
> > > > Thanks,
> > > > Deepthi
> > > >
> > > >
> > > >
> > >
> >


Re: [VOTE] Release 2.0 must-have work items

2023-07-09 Thread Jingsong Li
+1

On Mon, Jul 10, 2023 at 10:46 AM Yuan Mei  wrote:
>
> +1 (binding)
>
> Thanks for driving this!
>
> Best
> Yuan
>
> On Mon, Jul 10, 2023 at 10:26 AM Jark Wu  wrote:
>
> > +1  (binding)
> >
> > Thanks for driving this. Looking forward to starting the 2.0 works.
> >
> > Best,
> > Jark
> >
> > On Fri, 7 Jul 2023 at 17:31, Xintong Song  wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start the VOTE for the must-have work items for release 2.0
> > > [1]. The corresponding discussion thread is [2].
> > >
> > > Please note that once the vote is approved, any changes to the must-have
> > > items (adding / removing must-have items, changing the priority) requires
> > > another vote. Assigning contributors / reviewers, updating descriptions /
> > > progress, changes to nice-to-have items do not require another vote.
> > >
> > > The vote will be open until at least July 12, following the consensus
> > > voting process. Votes of PMC members are binding.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >
> > > [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
> > >
> >


Re: Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems Award

2023-07-03 Thread Jingsong Li
Congratulations!

Thank you! All of the Flink community!

Best,
Jingsong

On Tue, Jul 4, 2023 at 1:24 PM tison  wrote:
>
> Congrats and with honor :D
>
> Best,
> tison.
>
>
> Mang Zhang  wrote on Tue, Jul 4, 2023 at 11:08:
>
> > Congratulations!
> >
> > --
> >
> > Best regards,
> > Mang Zhang
> >
> >
> >
> >
> >
> > On 2023-07-04 at 01:53:46, "liu ron"  wrote:
> > >Congrats everyone
> > >
> > >Best,
> > >Ron
> > >
> > >Jark Wu  wrote on Mon, Jul 3, 2023 at 22:48:
> > >
> > >> Congrats everyone!
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> > On Jul 3, 2023, at 22:37, Yuval Itzchakov  wrote:
> > >> >
> > >> > Congrats team!
> > >> >
> > >> > On Mon, Jul 3, 2023, 17:28 Jing Ge via user  > >> > wrote:
> > >> >> Congratulations!
> > >> >>
> > >> >> Best regards,
> > >> >> Jing
> > >> >>
> > >> >>
> > >> >> On Mon, Jul 3, 2023 at 3:21 PM yuxia  > >> > wrote:
> > >> >>> Congratulations!
> > >> >>>
> > >> >>> Best regards,
> > >> >>> Yuxia
> > >> >>>
> > >> >>> From: "Pushpa Ramakrishnan"  > >> pushpa.ramakrish...@icloud.com>>
> > >> >>> To: "Xintong Song"  > >> tonysong...@gmail.com>>
> > >> >>> Cc: "dev" mailto:dev@flink.apache.org>>,
> > >> "User" mailto:u...@flink.apache.org>>
> > >> >>> Sent: Monday, July 3, 2023, 8:36:30 PM
> > >> >>> Subject: Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems
> > Award
> > >> >>>
> > >> >>> Congratulations 🥳
> > >> >>>
> > >> >>> On 03-Jul-2023, at 3:30 PM, Xintong Song  > >> > wrote:
> > >> >>>
> > >> >>> 
> > >> >>> Dear Community,
> > >> >>>
> > >> >>> I'm pleased to share this good news with everyone. As some of you
> > may
> > >> have already heard, Apache Flink has won the 2023 SIGMOD Systems Award
> > [1].
> > >> >>>
> > >> >>> "Apache Flink greatly expanded the use of stream data-processing."
> > --
> > >> SIGMOD Awards Committee
> > >> >>>
> > >> >>> SIGMOD is one of the most influential data management research
> > >> conferences in the world. The Systems Award is awarded to an individual
> > or
> > >> set of individuals to recognize the development of a software or
> > hardware
> > >> system whose technical contributions have had significant impact on the
> > >> theory or practice of large-scale data management systems. Winning of
> > the
> > >> award indicates the high recognition of Flink's technological
> > advancement
> > >> and industry influence from academia.
> > >> >>>
> > >> >>> As an open-source project, Flink wouldn't have come this far without
> > >> the wide, active and supportive community behind it. Kudos to all of us
> > who
> > >> helped make this happen, including the over 1,400 contributors and many
> > >> others who contributed in ways beyond code.
> > >> >>>
> > >> >>> Best,
> > >> >>> Xintong (on behalf of the Flink PMC)
> > >> >>>
> > >> >>> [1] https://sigmod.org/2023-sigmod-systems-award/
> > >> >>>
> > >>
> > >>
> >


Re: [VOTE] FLIP-321: introduce an API deprecation process

2023-07-03 Thread Jingsong Li
+1 binding

On Tue, Jul 4, 2023 at 10:40 AM Zhu Zhu  wrote:
>
> +1 (binding)
>
> Thanks,
> Zhu
>
> > ConradJam  wrote on Mon, Jul 3, 2023 at 22:39:
> >
> > +1 (no-binding)
> >
> > Matthias Pohl  于2023年7月3日周一 22:33写道:
> >
> > > Thanks, Becket
> > >
> > > +1 (binding)
> > >
> > > On Mon, Jul 3, 2023 at 10:44 AM Jing Ge 
> > > wrote:
> > >
> > > > +1(binding)
> > > >
> > > > On Mon, Jul 3, 2023 at 10:19 AM Stefan Richter
> > > >  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > >
> > > > > > On 3. Jul 2023, at 10:08, Martijn Visser 
> > > > > wrote:
> > > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Jul 3, 2023 at 10:03 AM Xintong Song  > > > > > wrote:
> > > > > >
> > > > > >> +1 (binding)
> > > > > >>
> > > > > >> Best,
> > > > > >>
> > > > > >> Xintong
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Sat, Jul 1, 2023 at 11:26 PM Dong Lin 
> > > wrote:
> > > > > >>
> > > > > >>> Thanks for the FLIP.
> > > > > >>>
> > > > > >>> +1 (binding)
> > > > > >>>
> > > > > >>> On Fri, Jun 30, 2023 at 5:39 PM Becket Qin 
> > > > > wrote:
> > > > > >>>
> > > > >  Hi folks,
> > > > > 
> > > > >  I'd like to start the VOTE for FLIP-321[1] which proposes to
> > > > introduce
> > > > > >> an
> > > > >  API deprecation process to Flink. The discussion thread for the
> > > FLIP
> > > > > >> can
> > > > > >>> be
> > > > >  found here[2].
> > > > > 
> > > > >  The vote will be open until at least July 4, following the
> > > consensus
> > > > > >>> voting
> > > > >  process.
> > > > > 
> > > > >  Thanks,
> > > > > 
> > > > >  Jiangjie (Becket) Qin
> > > > > 
> > > > >  [1]
> > > > > 
> > > > > 
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > > https://www.google.com/url?q=https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%253A%2BIntroduce%2Ban%2BAPI%2Bdeprecation%2Bprocess&source=gmail-imap&ust=168897655400&usg=AOvVaw24XYJrIcv_vYj1fJVQ7TNY
> > > > >  [2]
> > > > >
> > > >
> > > https://www.google.com/url?q=https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9&source=gmail-imap&ust=168897655400&usg=AOvVaw1yaMLBBkFfvbBhvyAbHYfX
> > > > >
> > > > >
> > > >
> > >


Re: [DISCUSS] Deprecate SourceFunction APIs

2023-07-03 Thread Jingsong Li
> do we have any plan to offer a lighter Source API to decrease the connector 
> development cost?

I remember mentioning it many times, but no contributor did it. ToT

Best,
Jingsong

On Tue, Jul 4, 2023 at 11:01 AM Leonard Xu  wrote:
>
> +1 to deprecate.
> +1 for David’s points.
>
> I’ve one related question, do we have any plan to offer a lighter Source API 
> to decrease the connector development cost?
>
> New Source API is good but too heavy for use cases like tests or even some 
> simple connectors.
>
> Best,
> Leonard
>
>
> > On Jun 6, 2022, at 9:51 PM, tison  wrote:
> >
> > One question from my side:
> >
> > As SourceFunction is a @Public interface, we cannot remove it before doing a
> > major version bump (Flink 2.0).
> >
> > Of course it's not a blocker to make such deprecation and let the new
> > interface step in. My question is whether we have a plan to finally remove
> > the deprecated interfaces, or postpone it until a clear plan of Flink 2.0?
> >
> > Best,
> > tison.
> >
> >
> > David Anderson  wrote on Mon, Jun 6, 2022 at 21:35:
> >
> >>>
> >>> David, can you elaborate why you need watermark generation in the source
> >>> for your data generators?
> >>
> >>
> >> The training exercises should strive to provide examples of best practices.
> >> If the exercises and their solutions use
> >>
> >> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
> >>  .map(...)
> >>  .assignTimestampsAndWatermarks(...)
> >>
> >> this will help establish this anti-pattern as the normal way of doing
> >> things.
> >>
> >> Most new Flink users are using a KafkaSource with a noWatermarks strategy
> >> and a SimpleStringSchema, followed by a map that does the real
> >> deserialization, followed by the real watermarking -- because they aren't
> >> seeing examples that teach how these interfaces are meant to be used.
> >>
> >> When we redo the sources used in training exercises, I want to avoid these
> >> pitfalls.
> >>
> >> David
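The difference between watermarking in the source (per partition/split) and watermarking after a merge, which David describes above, can be illustrated with a small toy model. This is plain Python, not Flink code; the out-of-orderness bound and the data are made-up assumptions for illustration only:

```python
# Toy model of watermarking semantics; plain Python, not Flink code.
# OUT_OF_ORDERNESS is an assumed bounded-out-of-orderness, in event-time units.
OUT_OF_ORDERNESS = 2

def per_partition_watermark(seen):
    """Source-style watermarking: track the max timestamp per partition
    and emit the minimum across partitions, minus the bound. An empty
    (not-yet-read) partition holds the watermark back entirely."""
    if any(not ts for ts in seen.values()):
        return float("-inf")
    return min(max(ts) for ts in seen.values()) - OUT_OF_ORDERNESS

def post_merge_watermark(merged):
    """Anti-pattern: a watermark assigned after the merge only sees the
    global max timestamp, so a fast partition drags the watermark past
    a slow one and the slow partition's records arrive 'late'."""
    return max(merged) - OUT_OF_ORDERNESS if merged else float("-inf")

# Partition p0 races ahead while p1 has not been read yet:
print(post_merge_watermark([1, 4, 7, 10]))                       # watermark = 8
print(per_partition_watermark({"p0": [1, 4, 7, 10], "p1": []}))  # -inf
# A record with timestamp 2 from p1 is late under the first scheme,
# but perfectly on time under the second.
```

This is why assigning watermarks downstream of the merge (after a deserializing `map`) drops data that per-split watermarking in the source would have handled correctly.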
> >>
> >> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf  wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> very interesting thread. The proposal for deprecation seems to have
> >> sparked
> >>> a very important discussion. Do we what users struggle with specifically?
> >>>
> >>> Speaking for myself, when I upgrade flink-faker to the new Source API an
> >>> unbounded version of the NumberSequenceSource would have been all I
> >> needed,
> >>> but that's just the data generator use case. I think, that one could be
> >>> solved quite easily. David, can you elaborate why you need watermark
> >>> generation in the source for your data generators?
> >>>
> >>> Cheers,
> >>>
> >>> Konstantin
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <
> >>> pnowoj...@apache.org> wrote:
> >>>
>  Also +1 to what David has written. But it doesn't mean we should be
> >>> waiting
>  indefinitely to deprecate SourceFunction.
> 
>  Best,
>  Piotrek
> 
>  On Sun, Jun 5, 2022 at 16:46, Jark Wu  wrote:
> 
> > +1 to David's point.
> >
> > Usually, when we deprecate some interfaces, we should point users to
> >>> use
> > the recommended alternatives.
> > However, implementing the new Source interface for some simple
> >>> scenarios
>  is
> > too challenging and complex.
> > We also found it isn't easy to push the internal connector to upgrade
> >>> to
> > the new Source because
> > "FLIP-27 are hard to understand, while SourceFunction is easy".
> >
> > +1 to make implementing a simple Source easier before deprecating
> > SourceFunction.
> >
> > Best,
> > Jark
> >
> >
> > On Sun, 5 Jun 2022 at 07:29, Jingsong Lee 
>  wrote:
> >
> >> +1 to David and Ingo.
> >>
> >> Before deprecate and remove SourceFunction, we should have some
> >>> easier
> > APIs
> >> to wrap new Source, the cost to write a new Source is too high now.
> >>
> >>
> >>
> >> Ingo Bürk wrote on Sun, Jun 5, 2022 at 05:32:
> >>
> >>> I +1 everything David said. The new Source API raised the
> >>> complexity
> >>> significantly. It's great to have such a rich, powerful API that
> >>> can
>  do
> >>> everything, but in the process we lost the ability to onboard
> >>> people
>  to
> >>> the APIs.
> >>>
> >>>
> >>> Best
> >>> Ingo
> >>>
> >>> On 04.06.22 21:21, David Anderson wrote:
>  I'm in favor of this, but I think we need to make it easier to
> >> implement
>  data generators and test sources. As things stand in 1.15,
> >> unless
>  you
> >> can
>  be satisfied with using a NumberSequenceSource followed by a
> >> map,
> >> things
>  get quite complicated. I looked into reworking the data
> >>> generators
> > used
> >>> in
>  the training exercises, and got discouraged by the amount of
> >> work
> >>> involved.
>  (The sources used in the training want to b

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-06-28 Thread Jingsong Li
+1 binding

On Thu, Jun 29, 2023 at 11:03 AM Dong Lin  wrote:
>
> Hi all,
>
> We would like to start the vote for FLIP-309: Support using larger
> checkpointing interval when source is processing backlog [1]. This FLIP was
> discussed in this thread [2].
>
> Flink 1.18 release will feature freeze on July 11. We hope to make this
> feature available in Flink 1.18.
>
> The vote will be open until at least July 4th (at least 72 hours), following
> the consensus voting process.
>
> Cheers,
> Yunfeng and Dong
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37


Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-27 Thread Jingsong Li
Looks good to me!

Thanks Dong, Yunfeng and all for your discussion and design.

Best,
Jingsong

On Tue, Jun 27, 2023 at 3:35 PM Jark Wu  wrote:
>
> Thank you Dong for driving this FLIP.
>
> The new design looks good to me!
>
> Best,
> Jark
>
> > > On Jun 27, 2023, at 14:38, Dong Lin  wrote:
> >
> > Thank you Leonard for the review!
> >
> > Hi Piotr, do you have any comments on the latest proposal?
> >
> > I am wondering if it is OK to start the voting thread this week.
> >
> > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu  wrote:
> >
> >> Thanks Dong for driving this FLIP forward!
> >>
> >> Introducing the `backlog status` concept for Flink jobs makes sense to me
> >> for the following reasons:
> >>
> >> From concept/API design perspective, it’s more general and natural than
> >> above proposals as it can be used in HybridSource for bounded records, CDC
> >> Source for history snapshot and general sources like KafkaSource for
> >> historical messages.
> >>
> >> From user cases/requirements, I’ve seen many users manually to set larger
> >> checkpoint interval during backfilling and then set a shorter checkpoint
> >> interval for real-time processing in their production environments as a
> >> Flink application optimization. Now, the Flink framework can apply this
> >> optimization without requiring the user to set the checkpoint interval and
> >> restart the job multiple times.
> >>
> >> Following supporting using larger checkpoint for job under backlog status
> >> in current FLIP, we can explore supporting larger parallelism/memory/cpu
> >> for job under backlog status in the future.
> >>
> >> In short, the updated FLIP looks good to me.
> >>
> >>
> >> Best,
> >> Leonard
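The backfilling optimization Leonard describes (a larger checkpointing interval while catching up on history, a shorter one once caught up) can be sketched as a tiny selection rule. The constants and function names below are illustrative assumptions only, not Flink configuration or API:

```python
# Illustrative sketch, in the spirit of FLIP-309; the names and values
# here are assumptions, not Flink's actual configuration or API.
REALTIME_INTERVAL_MS = 30_000    # short interval for the low-latency phase
BACKLOG_INTERVAL_MS = 300_000    # longer interval while backfilling

def effective_interval(backlog_flags):
    """If any source instance reports that it is processing backlog, the
    job uses the larger checkpointing interval; otherwise the short one."""
    return BACKLOG_INTERVAL_MS if any(backlog_flags) else REALTIME_INTERVAL_MS

# A HybridSource-style run: bounded (historical) phase, then streaming phase.
print(effective_interval([True, False]))   # 300000
print(effective_interval([False, False]))  # 30000
```

The point of the FLIP is that this switch happens inside the framework, so users no longer have to change the interval and restart the job by hand when backfilling ends.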
> >>
> >>
> >>> On Jun 22, 2023, at 12:07 PM, Dong Lin  wrote:
> >>>
> >>> Hi Piotr,
> >>>
> >>> Thanks again for proposing the isProcessingBacklog concept.
> >>>
> >>> After discussing with Becket Qin and thinking about this more, I agree it
> >>> is a better idea to add a top-level concept to all source operators to
> >>> address the target use-case.
> >>>
> >>> The main reason that changed my mind is that isProcessingBacklog can be
> >>> described as an inherent/nature attribute of every source instance and
> >> its
> >>> semantics does not need to depend on any specific checkpointing policy.
> >>> Also, we can hardcode the isProcessingBacklog behavior for the sources we
> >>> have considered so far (e.g. HybridSource and MySQL CDC source) without
> >>> asking users to explicitly configure the per-source behavior, which
> >> indeed
> >>> provides better user experience.
> >>>
> >>> I have updated the FLIP based on the latest suggestions. The latest FLIP
> >> no
> >>> longer introduces per-source config that can be used by end-users. While
> >> I
> >>> agree with you that CheckpointTrigger can be a useful feature to address
> >>> additional use-cases, I am not sure it is necessary for the use-case
> >>> targeted by FLIP-309. Maybe we can introduce CheckpointTrigger separately
> >>> in another FLIP?
> >>>
> >>> Can you help take another look at the updated FLIP?
> >>>
> >>> Best,
> >>> Dong
> >>>
> >>>
> >>>
> >>> On Fri, Jun 16, 2023 at 11:59 PM Piotr Nowojski 
> >>> wrote:
> >>>
>  Hi Dong,
> 
> > Suppose there are 1000 subtasks and each subtask has a 1% chance of being
> > "backpressured" at a given time (due to random traffic spikes). Then at
>  any
> > given time, the chance of the job
> > being considered not-backpressured = (1-0.01)^1000. Since we evaluate
> >> the
> > backpressure metric once a second, the estimated time for the job
> > to be considered not-backpressured is roughly 1 / ((1-0.01)^1000) =
> >> 23163
> > sec = 6.4 hours.
> >
> > This means that the job will effectively always use the longer
> > checkpointing interval. It looks like a real concern, right?
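The arithmetic in the estimate quoted above can be reproduced in a few lines; the 1% per-subtask probability and the once-per-second sampling are the assumptions stated in the quote, not measured figures:

```python
# Reproducing the quoted estimate: 1000 subtasks, each independently
# "backpressured" 1% of the time, sampled once per second.
num_subtasks = 1000
p_busy = 0.01

p_all_clear = (1 - p_busy) ** num_subtasks   # chance a sample sees no backpressure
expected_seconds = 1 / p_all_clear           # expected wait for such a sample

print(f"{p_all_clear:.3e}")                  # roughly 4.3e-05
print(f"{expected_seconds / 3600:.1f} h")    # roughly 6.4 hours
```

So under those assumptions the job would indeed almost never be observed as "not backpressured", which is the concern raised in the quote.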
> 
>  Sorry I don't understand where you are getting those numbers from.
>  Instead of trying to find loophole after loophole, could you try to
> >> think
>  how a given loophole could be improved/solved?
> 
> > Hmm... I honestly think it will be useful to know the APIs due to the
> > following reasons.
> 
>  Please propose something. I don't think it's needed.
> 
> > - For the use-case mentioned in FLIP-309 motivation section, would the
>  APIs
> > of this alternative approach be more or less usable?
> 
>  Everything that you originally wanted to achieve in FLIP-309, you could
> >> do
>  as well in my proposal.
>  Vide my many mentions of the "hacky solution".
> 
> > - Can these APIs reliably address the extra use-case (e.g. allow
> > checkpointing interval to change dynamically even during the unbounded
> > phase) as it claims?
> 
>  I don't see why not.
> 
> > - Can these APIs be decoupled from the APIs currently proposed in
>  FLIP-309?
> 
>  Yes
> 
> > For example, if the APIs of this alternative approach can be decoupled
>

Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-24 Thread Jingsong Li
Thanks Becket and all for your discussion.

> 1. We say this FLIP is enforced starting release 2.0. For current 1.x APIs,
we provide a migration period with best effort, while allowing exceptions
for immediate removal in 2.0. That means we will still try with best effort
to get the ProcessFunction API ready and deprecate the DataStream API in
1.x, but will also be allowed to remove DataStream API in 2.0 if it's not
deprecated 2 minor releases before the major version bump.

> 2. We strictly follow the process in this FLIP, and will quickly bump the
major version from 2.x to 3.0 once the migration period for DataStream API
is reached.

Sorry, I didn't read the previous detailed discussion because the
discussion list was so long.

I don't really like either of these options.

Considering that DataStream is such an important API, can we offer a third
option:

3. Maintain the DataStream API throughout 2.x and remove it only in 3.x.
There's no need to assume that 2.x is a short-lived version; it remains a
normal major version.

Best,
Jingsong

Becket Qin 于2023年6月22日 周四16:02写道:

> Thanks much for the input, John, Stefan and Jing.
>
> I think Xingtong has well summarized the pros and cons of the two options.
> Let's collect a few more opinions here and we can move forward with the one
> more people prefer.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Jun 21, 2023 at 3:20 AM Jing Ge 
> wrote:
>
> > Hi all,
> >
> > Thanks Xingtong for the summary. If I could only choose one of the given
> > two options, I would go with option 1. I understood that option 2 worked
> > great with Kafka. But the bridge release will still confuse users and my
> > gut feeling is that many users will skip 2.0 and be waiting for 3.0 or
> even
> > 3.x. And since fewer users will use Flink 2.x, the development focus will
> > be on Flink 3.0 with the fact that the current Flink release is 1.17 and
> we
> > are preparing 2.0 release. That is weird for me.
> >
> > TBH, I would not name the change from @Public to @Retired as a demotion.
> > The purpose of @Retired is to extend the API lifecycle with one more
> stage,
> > like in the real world, where people are born, study, graduate, work, and
> > retire. Afaiu from the previous discussion, there are two rules we'd
> like
> > to follow simultaneously:
> >
> > 1. Public APIs can only be changed between major releases.
> > 2. A smooth migration phase should be offered to users, i.e. at least 2
> > minor releases after APIs are marked as @deprecated. There should be new
> > APIs as the replacement.
> >
> > Agree, those rules are good for improving user friendliness. The issues we
> > discussed arise because we want to fulfill both of them. If we take
> > care of deprecation very seriously, APIs can be marked as @Deprecated
> only
> > when the new APIs that replace them provide all the functionality the
> > deprecated APIs have, ideally without critical bugs that might
> > stop users from adopting the new APIs; otherwise the expected "replacement"
> will
> > not happen. Users will still stick to the deprecated APIs, because the
> new
> > APIs can not be used. For big features, it will need at least 4 minor
> > releases(ideal case), i.e. 2+ years to remove deprecated APIs:
> >
> > - 1st minor release to build the new APIs as the replacement and waiting
> > for feedback. It might be difficult to mark the old API as deprecated in
> > this release, because we are not sure if the new APIs could cover 100%
> > functionalities.
> > -  In the lucky case,  mark all old APIs as deprecated in the 2nd minor
> > release. (I would even suggest having the new APIs released at least for
> > two minor releases before marking it as deprecated to make sure they can
> > really replace the old APIs, in case we care more about smooth migration)
> > - 3rd minor release for the migration period
> > -  In another lucky case, the 4th release is a major release, the
> > deprecated APIs could be removed.
> >
> > The above described scenario works only in an ideal case. In reality, it
> > might take longer to get the new APIs ready and mark the old API
> > deprecated. Furthermore, if the 4th release is not a major release, we
> will
> > have to maintain both APIs for many further minor releases. The question
> is
> > how to know the next major release in advance, especially 4 minor
> releases'
> > period, i.e. more than 2 years in advance? Given that Flink contains many
> > modules, it is difficult to ask devs to create a 2-3 years deprecation
> plan
> > for each case. In case we want to build major releases at a fast pace,
> > let's say every two years, it means devs must plan any API deprecation
> > right after each major release. Afaiac, it is quite difficult.
> >
> > The major issue is, afaiu, if we follow rule 2, we have to keep all
> @Public
> > APIs, e.g. DataStream, that are not marked as deprecated yet, to 2.0.
> Then
> > we have to follow rule 1 to keep it unchanged until we have 3.0. That is
> > why @Retired is usef

Re: Async I/O: preserve stream order for requests on key level

2023-06-20 Thread Jingsong Li
+1 for this.

Actually, this is a headache for Flink SQL too.

There is certainly a lot of updated data (CDC changelog) in real
stream processing. The semantics here require preserving the order of
records with the same key, while different keys can be handled out of order.

I'm very happy that the community has a similar need, and I think it's
worth refining it in Flink.

Best,
Jingsong

On Tue, Jun 20, 2023 at 10:20 PM Juho Autio
 wrote:
>
> Thank you very much! It seems like you have a quite similar goal. However,
> could you clarify: do you maintain the stream order on key level, or do you
> just limit the parallel requests per key to one without caring about the
> order?
>
> I'm not 100% sure how your implementation with futures is done. If you are
> able to share a code snippet that would be much appreciated!
>
> I'm also wondering what kind of memory implication that implementation has:
> would the futures be queued inside the operator without any limit? Would it
> be a problem if the same key has too many records within the same time
> window? But I suppose the function can be made blocking to protect against
> that.
>
> On Tue, Jun 20, 2023 at 3:34 PM Galen Warren
>  wrote:
>
> > Hi Juho -- I'm doing something similar. In my case, I want to execute async
> > requests concurrently for inputs associated with different keys but issue
> > them sequentially for any given key. The way I do it is to create a keyed
> > stream and use it as an input to an async function. In this arrangement,
> > all the inputs for a given key are handled by a single instance of the
> > async function; inside that function instance, I use a map to keep track of
> > any in-flight requests for a given key. When a new input comes in for a
> > key, if there is an existing in-flight request for that key, the future
> > that is constructed for the new request is constructed as [existing
> > request].then([new request]) so that the new one is only executed once the
> > in-flight request completes. The futures are constructed in such a way that
> > they maintain the map properly after completing.
> >
> >
> > On Mon, Jun 19, 2023 at 10:55 AM Juho Autio 
> > wrote:
> >
> > > I need to make some slower external requests in parallel, so Async I/O
> > > helps greatly with that. However, I also need to make the requests in a
> > > certain order per key. Is that possible with Async I/O?
> > >
> > > The documentation[1] talks about preserving the stream order of
> > > results, but it doesn't discuss the order of the async requests. I tried
> > to
> > > use AsyncDataStream.orderedWait, but the order of async requests seems to
> > > be random – the order of calls gets shuffled even if I
> > > use AsyncDataStream.orderedWait.
> > >
> > > If that is by design, would there be any suggestion how to work around
> > > that? I was thinking of collecting all events of the same key into a
> > > List, so that the async operator gets a list instead of individual
> > events.
> > > There are of course some downsides with using a List, so I would rather
> > > have something better.
> > >
> > > In a nutshell my code is:
> > >
> > > AsyncDataStream.orderedWait(stream.keyBy(key), asyncFunction)
> > >
> > > The asyncFunction extends RichAsyncFunction.
> > >
> > > Thanks!
> > >
> > > [1]
> > >
> > >
> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/#order-of-results
> > >
> > > (Sorry if it's not appropriate to post this type of question to the dev
> > > mailing list. I tried the Flink users list with no luck.)
> > >
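Galen's per-key chaining approach described above might be sketched roughly as follows. This is a minimal, illustrative sketch only: `PerKeySequencer` and `submit` are made-up names, and a real Flink AsyncFunction would additionally need to bound in-flight work and complete Flink's ResultFuture. The point is just the "[existing request].then([new request])" chaining per key:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: requests for the same key run sequentially (chained
// onto any in-flight request), while requests for different keys are
// independent and can run concurrently.
public class PerKeySequencer {
    private final Map<String, CompletableFuture<Void>> inFlight = new HashMap<>();

    public synchronized CompletableFuture<Void> submit(String key, Runnable request) {
        // Chain the new request after any in-flight request for the same key.
        CompletableFuture<Void> previous =
                inFlight.getOrDefault(key, CompletableFuture.completedFuture(null));
        CompletableFuture<Void> next = previous.thenRun(request);
        inFlight.put(key, next);
        // Maintain the map: drop the entry once the last chained request completes.
        next.whenComplete((v, t) -> {
            synchronized (this) {
                if (inFlight.get(key) == next) {
                    inFlight.remove(key);
                }
            }
        });
        return next;
    }

    public static void main(String[] args) throws Exception {
        PerKeySequencer seq = new PerKeySequencer();
        List<Integer> order = new ArrayList<>();
        seq.submit("k", () -> order.add(1));
        seq.submit("k", () -> order.add(2)).get();
        System.out.println(order); // prints [1, 2]: same-key requests ran in submission order
    }
}
```

Note that, as Juho points out, the chained futures are queued without any limit in this sketch; a production version would need backpressure (e.g. a bounded per-key queue or a blocking submit).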


Re: [VOTE] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-12 Thread Jingsong Li
+1

On Mon, Jun 12, 2023 at 10:25 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> +1 (binding)
>
> Best,
> Rui Fan
>
> On Mon, Jun 12, 2023 at 19:58 liu ron  wrote:
>
> > +1 (no-binding)
> >
> > Best,
> > Ron
> >
> > Jing Ge  于2023年6月12日周一 19:33写道:
> >
> > > +1(binding) Thanks!
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Mon, Jun 12, 2023 at 12:01 PM yuxia 
> > > wrote:
> > >
> > > > +1 (binding)
> > > > Thanks Mang driving it.
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> > > > - 原始邮件 -
> > > > 发件人: "zhangmang1" 
> > > > 收件人: "dev" 
> > > > 发送时间: 星期一, 2023年 6 月 12日 下午 5:31:10
> > > > 主题: [VOTE] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS)
> > > > statement
> > > >
> > > > Hi everyone,
> > > >
> > > > Thanks for all the feedback about FLIP-305: Support atomic for CREATE
> > > > TABLE AS SELECT(CTAS) statement[1].
> > > > [2] is the discussion thread.
> > > >
> > > > I'd like to start a vote for it. The vote will be open for at least 72
> > > > hours (until June 15th, 10:00AM GMT) unless there is an objection or an
> > > > insufficient number of votes.[1]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
> > > > [2]https://lists.apache.org/thread/n6nsvbwhs5kwlj5kjgv24by2tk5mh9xd
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best regards,
> > > > Mang Zhang
> > > >
> > >
> >


Re: [VOTE] FLIP-311: Support Call Stored Procedure

2023-06-12 Thread Jingsong Li
+1

On Mon, Jun 12, 2023 at 10:32 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> +1 (binding)
>
> Best,
> Rui Fan
>
> On Mon, Jun 12, 2023 at 22:20 Benchao Li  wrote:
>
> > +1 (binding)
> >
> > yuxia  于2023年6月12日周一 17:58写道:
> >
> > > Hi everyone,
> > > Thanks for all the feedback about FLIP-311: Support Call Stored
> > > Procedure[1]. Based on the discussion [2], we have come to a consensus,
> > so
> > > I would like to start a vote.
> > > The vote will be open for at least 72 hours (until June 15th, 10:00AM
> > GMT)
> > > unless there is an objection or an insufficient number of votes.
> > >
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> > > [2] https://lists.apache.org/thread/k6s50gcgznon9v1oylyh396gb5kgrwmd
> > >
> > > Best regards,
> > > Yuxia
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >


Re: Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-06-07 Thread Jingsong Li
Thanks Mang for updating!

Looks good to me!

Best,
Jingsong

On Wed, Jun 7, 2023 at 2:31 PM Mang Zhang  wrote:
>
> Hi Jingsong,
>
> >I have some doubts about the `TwoPhaseCatalogTable`. Generally, our
> >Flink design places execution in the TableFactory or directly in the
> >Catalog, so introducing an executable table makes me feel a bit
> >strange. (Spark is this style, but Flink may not be)
> On this issue, I agree that introducing the executable commit/abort logic
> on CatalogTable is a bit strange.
> After an offline discussion with yuxia, I tweaked the FLIP-305[1] scenario.
> The new solution is similar to the implementation of SupportsOverwrite,
> which introduces the SupportsStaging interface and infers whether 
> DynamicTableSink supports atomic CTAS based on whether it implements the 
> SupportsStaging interface,
> and if so, it will get the StagedTable object from DynamicTableSink.
>
> For more implementation details, please see the FLIP-305 document.
>
> This is my poc commits 
> https://github.com/Tartarus0zm/flink/commit/025b30ad8f1a03e7738e9bb534e6e491c31990fa
>
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
>
>
> --
>
> Best regards,
>
> Mang Zhang
>
>
>
> At 2023-05-12 13:02:14, "Jingsong Li"  wrote:
> >Hi Mang,
> >
> >Thanks for starting this FLIP.
> >
> >I have some doubts about the `TwoPhaseCatalogTable`. Generally, our
> >Flink design places execution in the TableFactory or directly in the
> >Catalog, so introducing an executable table makes me feel a bit
> >strange. (Spark is this style, but Flink may not be)
> >
> >And for `TwoPhase`, maybe `StagedXXX` like Spark is better?
> >
> >Best,
> >Jingsong
> >
> >On Wed, May 10, 2023 at 9:29 PM Mang Zhang  wrote:
> >>
> >> Hi Ron,
> >>
> >>
> >> First of all, thank you for your reply!
> >> After our offline communication, I understand your point mainly concerns the 
> >> compilePlan scenario, but currently compilePlanSql does not support non-INSERT 
> >> statements; otherwise it will throw an exception.
> >> >Unsupported SQL query! compilePlanSql() only accepts a single SQL 
> >> >statement of type INSERT
> >> But it's a good point that I will seriously consider.
> >> Non-atomic CTAS can be supported relatively easily;
> >> But atomic CTAS needs more adaptation work, so I'm going to leave it as is 
> >> and follow up with a separate issue to implement CTAS support for 
> >> compilePlanSql.
> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >>
> >> Best regards,
> >> Mang Zhang
> >>
> >>
> >>
> >>
> >>
> >> At 2023-04-23 17:52:07, "liu ron"  wrote:
> >> >Hi, Mang
> >> >
> >> >I have a question about the implementation details. For the atomicity 
> >> >case,
> >> >since the target table is not created before the JobGraph is generated, 
> >> >but
> >> >then the target table is required to exist when optimizing plan to 
> >> >generate
> >> >the JobGraph. So how do you solve this problem?
> >> >
> >> >Best,
> >> >Ron
> >> >
> >> >yuxia  于2023年4月20日周四 09:35写道:
> >> >
> >> >> Share some insights about the new TwoPhaseCatalogTable proposed after
> >> >> offline discussion with Mang.
> >> >> The main or important reason is that the TwoPhaseCatalogTable enables
> >> >> external connectors to implement their own logic for commit / abort.
> >> >> In FLIP-218, for atomic CTAS, the Catalog will then just drop the table
> >> >> when the job fails. That's not ideal, because it's too generic to work well.
> >> >> For example, some connectors will need to clean some temporary files in
> >> >> abort method. And the actual connector can know the specific logic for
> >> >> aborting.
> >> >>
> >> >> Best regards,
> >> >> Yuxia
> >> >>
> >> >>
> >> >> 发件人: "zhangmang1" 
> >> >> 收件人: "dev" , "Jing Ge" 
> >> >> 抄送: "ron9 liu" , "lincoln 86xy" <
> >> >> lincoln.8...@gmail.com>, luoyu...@alumni.sjtu.edu.cn
> >> >> 发送时间: 星期三, 2023年 4 月 19日 下午 3:13:36
> >> >> 主题: Re:R

Re: [VOTE] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-06-07 Thread Jingsong Li
+1

On Wed, Jun 7, 2023 at 3:03 PM Benchao Li  wrote:
>
> +1, binding
>
> Jark Wu  于2023年6月7日周三 14:44写道:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > > 2023年6月7日 14:20,liu ron  写道:
> > >
> > > Hi everyone,
> > >
> > > Thanks for all the feedback about FLIP-315: Support Operator Fusion
> > Codegen
> > > for Flink SQL[1].
> > > [2] is the discussion thread.
> > >
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours (until June 12th, 12:00AM GMT) unless there is an objection or an
> > > insufficient number of votes.
> > >
> > > [1]:
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> > > [2]: https://lists.apache.org/thread/9cnqhsld4nzdr77s2fwf00o9cb2g9fmw
> > >
> > > Best,
> > > Ron
> >
> >
>
> --
>
> Best,
> Benchao Li


Re: [VOTE] Release flink-connector-pulsar 3.0.1, release candidate #1

2023-06-06 Thread Jingsong Li
+1 (binding)

- checked NOTICE and LICENSE
- Verified signatures and hashes
- Build and compile the source code locally
- No unexpected binaries in the source release

Something can be improved:

- NOTICE file can be updated to 2014-2023 from 2014-2022.

Best,
Jingsong

On Tue, Jun 6, 2023 at 3:59 PM Jark Wu  wrote:
>
> +1 (binding)
>
> - Build and compile the source code locally: *OK*
> - Verified signatures and hashes: *OK*
> - Checked no missing artifacts in the staging area: *OK*
> - Reviewed the website release PR: *OK*
>
> Best,
> Jark
>
> On Thu, 1 Jun 2023 at 07:30, Jing Ge  wrote:
>
> > +1(non-binding)
> >
> > - verified sign
> > - verified hash
> > - checked repos
> > - checked tag. NIT: the tag link should be:
> > https://github.com/apache/flink-connector-pulsar/releases/tag/v3.0.1-rc1
> > - reviewed PR. NIT: left a comment.
> >
> > Best regards,
> > Jing
> >
> > On Wed, May 31, 2023 at 11:16 PM Neng Lu  wrote:
> >
> > > +1
> > >
> > > I verified
> > >
> > > + the release now can communicate with Pulsar using OAuth2 auth plugin
> > > + build from source and run unit tests with JDK 17 on macOS M1Max
> > >
> > >
> > > On Wed, May 31, 2023 at 4:24 AM Zili Chen  wrote:
> > >
> > > > +1
> > > >
> > > > I verified
> > > >
> > > > + LICENSE and NOTICE present
> > > > + Checksum and GPG sign matches
> > > > + No unexpected binaries in the source release
> > > > + Build from source and run unit tests with JDK 17 on macOS M1
> > > >
> > > > On 2023/05/25 16:18:51 Leonard Xu wrote:
> > > > > Hey all,
> > > > >
> > > > > Please review and vote on the release candidate #1 for the version
> > > 3.0.1
> > > > of the
> > > > > Apache Flink Pulsar Connector as follows:
> > > > >
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > > >
> > > > > The complete staging area is available for your review, which
> > includes:
> > > > > JIRA release notes [1],
> > > > > The official Apache source release to be deployed to dist.apache.org
> > > > [2], which are signed with the key with
> > > > fingerprint5B2F6608732389AEB67331F5B197E1F1108998AD [3],
> > > > > All artifacts to be deployed to the Maven Central Repository [4],
> > > > > Source code tag v3.0.1-rc1 [5],
> > > > > Website pull request listing the new release [6].
> > > > > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > > >
> > > > >
> > > > > Best,
> > > > > Leonard
> > > > >
> > > > > [1]
> > > >
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352640
> > > > > [2]
> > > >
> > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-pulsar-3.0.1-rc1/
> > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > [4]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1641/
> > > > > [5] https://github.com/apache/flink-connector-pulsar/tree/v3.0.1-rc1
> > > > > [6] https://github.com/apache/flink-web/pull/655
> > > > >
> > > > >
> > > >
> > >
> >


Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-06-04 Thread Jingsong Li
> For the state compatibility session, it seems that the checkpoint 
> compatibility would be broken just like [1] did. Could FLIP-190 [2] still be 
> helpful in this case for SQL version upgrades?

I guess this is only for batch processing. Streaming should be another story?

Best,
Jingsong

On Mon, Jun 5, 2023 at 2:07 PM Yun Tang  wrote:
>
> Hi Ron,
>
> I think this FLIP would help to improve the performance, looking forward to 
> its completion in Flink!
>
> For the state compatibility session, it seems that the checkpoint 
> compatibility would be broken just like [1] did. Could FLIP-190 [2] still be 
> helpful in this case for SQL version upgrades?
>
>
> [1] 
> https://docs.google.com/document/d/1qKVohV12qn-bM51cBZ8Hcgp31ntwClxjoiNBUOqVHsI/edit#heading=h.fri5rtpte0si
> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489
>
> Best
> Yun Tang
>
> 
> From: Lincoln Lee 
> Sent: Monday, June 5, 2023 10:56
> To: dev@flink.apache.org 
> Subject: Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL
>
> Hi Ron
>
> OFGC looks like an exciting optimization, looking forward to its completion
> in Flink!
> A small question, do we consider adding a benchmark for the operators to
> intuitively understand the improvement brought by each improvement?
> In addition, for the implementation plan, it is mentioned in the FLIP that 1.18
> will support Calc, HashJoin and HashAgg; what will be the next step,
> and which operators do we ultimately expect to cover (all or specific ones)?
>
> Best,
> Lincoln Lee
>
>
> liu ron  于2023年6月5日周一 09:40写道:
>
> > Hi, Jark
> >
> > Thanks for your feedback, according to my initial assessment, the work
> > effort is relatively large.
> >
> > Moreover, I will add a test result of all queries to the FLIP.
> >
> > Best,
> > Ron
> >
> > Jark Wu  于2023年6月1日周四 20:45写道:
> >
> > > Hi Ron,
> > >
> > > Thanks a lot for the great proposal. The FLIP looks good to me in
> > general.
> > > It looks like not an easy work but the performance sounds promising. So I
> > > think it's worth doing.
> > >
> > > Besides, if there is a complete test graph with all TPC-DS queries, the
> > > effect of this FLIP will be more intuitive.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >
> > > On Wed, 31 May 2023 at 14:27, liu ron  wrote:
> > >
> > > > Hi, Jinsong
> > > >
> > > > Thanks for your valuable suggestions.
> > > >
> > > > Best,
> > > > Ron
> > > >
> > > > Jingsong Li  于2023年5月30日周二 13:22写道:
> > > >
> > > > > Thanks Ron for your information.
> > > > >
> > > > > I suggest that it can be written in the Motivation of FLIP.
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > > > > On Tue, May 30, 2023 at 9:57 AM liu ron  wrote:
> > > > > >
> > > > > > Hi, Jingsong
> > > > > >
> > > > > > Thanks for your review. We have tested it in TPC-DS case, and got a
> > > 12%
> > > > > > gain overall when supporting only the Calc&HashJoin&HashAgg
> > > operator.
> > > > In
> > > > > > some queries, we even get more than 30% gain; it looks like an
> > > > effective
> > > > > > way.
> > > > > >
> > > > > > Best,
> > > > > > Ron
> > > > > >
> > > > > > Jingsong Li  于2023年5月29日周一 14:33写道:
> > > > > >
> > > > > > > Thanks Ron for the proposal.
> > > > > > >
> > > > > > > Do you have some benchmark results for the performance
> > > improvement? I
> > > > > > > am more concerned about the improvement on Flink than the data in
> > > > > > > other papers.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > > > On Mon, May 29, 2023 at 2:16 PM liu ron 
> > > wrote:
> > > > > > > >
> > > > > > > > Hi, dev
> > > > > > > >
> > > > > > > > I'd like to start a discussion about FLIP-315: Support Operator
> > > > > Fusion
>

Re: [DISCUSS] FLIP 295: Support persistence of Catalog configuration and asynchronous registration

2023-06-01 Thread Jingsong Li
Thanks Feng,

Just a naming suggestion: maybe `createCatalog` in TableEnv; I can see many
functions have been converted from registerXxx to createXxx.

On Fri, Jun 2, 2023 at 11:04 AM Feng Jin  wrote:
>
> Hi jark, thanks for your suggestion.
>
> > 1. How to register the CatalogStore for Table API? I think the
> CatalogStore should be immutable once TableEnv is created. Otherwise, there
> might be some data inconsistencies when CatalogStore is changed.
>
> Yes, We should initialize the CatalogStore when creating the TableEnv.
> Therefore, my current proposal is to add a method to configure the
> CatalogStore in EnvironmentSettings.
>
> // Initialize a catalog Store
> CatalogStore catalogStore = new FileCatalogStore("");
>
> // set up the Table API
> final EnvironmentSettings settings =
>         EnvironmentSettings.newInstance().inBatchMode()
>                 .withCatalogStore(catalogStore)
>                 .build();
>
> final TableEnvironment tableEnv = TableEnvironment.create(settings);
>
>
> > 2. Why does the CatalogStoreFactory interface only have a default method,
> not an interface method?
>
> Sorry, while I did refer to the Catalog interface, I agree that as a new
> interface, the CatalogStoreFactory should not have a default method but an
> interface method.  I have already modified the interface.
>
> > 3. Please mention the alternative API in Javadoc for the
> deprecated`registerCatalog`.
> > 4. In the "Compatibility" section, would be better to mention the changed
> behavior of CREATE CATALOG statement if FileCatalogStore (or other
> persisted catalog store) is used.
>
> Thanks for the suggestion, I have updated the FLIP [1].
>
>
> [1].
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>
>
> Best,
> Feng
>
> On Thu, Jun 1, 2023 at 9:22 PM Jark Wu  wrote:
>
> > Hi Feng,
> >
> > This is a useful FLIP. Thanks for starting this discussion.
> > The current design looks pretty good to me. I just have some minor
> > comments.
> >
> > 1. How to register the CatalogStore for Table API? I think the CatalogStore
> > should be immutable once TableEnv is created. Otherwise, there might be
> > some data inconsistencies when CatalogStore is changed.
> >
> > 2. Why does the CatalogStoreFactory interface only have a default method,
> > not an interface method?
> >
> > 3. Please mention the alternative API in Javadoc for the deprecated
> > `registerCatalog`.
> >
> > 4. In the "Compatibility" section, would be better to mention the changed
> > behavior of CREATE CATALOG statement if FileCatalogStore (or other
> > persisted catalog store) is used.
> >
> >
> > Best,
> > Jark
> >
> > On Thu, 1 Jun 2023 at 11:26, Feng Jin  wrote:
> >
> > > Hi ,  thanks all for reviewing the flip.
> > >
> > > @Ron
> > >
> > > >  Regarding the CatalogStoreFactory#createCatalogStore method, do we
> > need
> > > to provide a default implementation?
> > >
> > > Yes, we will provide a default InMemoryCatalogStoreFactory to create an
> > > InMemoryCatalogStore.
> > >
> > > >  If we get a Catalog from CatalogStore, after initializing it, whether
> > we
> > > put it in Map<String, Catalog> catalogs again?
> > >
> > > Yes, in the current design, catalogs are stored as snapshots, and once
> > > initialized, the Catalog will be placed in the Map<String, Catalog>
> > > catalogs.
> > > Subsequently, the Map<String, Catalog> catalogs will be the primary
> > source
> > > for obtaining the corresponding Catalog.
> > >
> > > >   how about renaming them to `catalog.store.type` and
> > > `catalog.store.path`?
> > >
> > > I think it is okay. Adding "sql" at the beginning may seem a bit
> > strange. I
> > > will update the FLIP.
> > >
> > >
> > >
> > > @Shammon
> > >
> > > Thank you for the review. I have made the necessary corrections.
> > > Regarding the modifications made to the Public Interface, I have also
> > > included the relevant changes to the `TableEnvironment`.
> > >
> > >
> > > Best,
> > > Feng
> > >
> > >
> > > On Wed, May 31, 2023 at 5:02 PM Shammon FY  wrote:
> > >
> > > > Hi feng,
> > > >
> > > > Thanks for updating, I have some minor comments
> > > >
> > > > 1. The modification of `CatalogManager` should not be in `Public
> > > > Interfaces`, it is not a public interface.
> > > >
> > > > 2. `@PublicEvolving` should be added for `CatalogStore` and
> > > > `CatalogStoreFactory`
> > > >
> > > > 3. The code `Optional optionalDescriptor =
> > > > catalogStore.get(catalogName);` in the `CatalogManager` should be
> > > > `Optional optionalDescriptor =
> > > > catalogStore.get(catalogName);`
> > > >
> > > > Best,
> > > > Shammon FY
> > > >
> > > >
> > > > On Wed, May 31, 2023 at 2:24 PM liu ron  wrote:
> > > >
> > > > > Hi, Feng
> > > > >
> > > > > Thanks for driving this FLIP, this proposal is very useful for
> > catalog
> > > > > management.
> > > > > I have some small questions:
> > > > >
> > > > > 1. Regarding the CatalogStoreFactory#createCatalogStore m

Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-05-30 Thread Jingsong Li
Thanks for your explanation.

We can support Iterable in future. Current design looks good to me.

Best,
Jingsong

On Tue, May 30, 2023 at 4:56 PM yuxia  wrote:
>
> Hi, Jingsong.
> Thanks for your feedback.
>
> > Does this need to be a function call? Do you have some example?
> I think it'll be useful to support function calls when users call a procedure.
> The following example is from iceberg:[1]
> CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 
> 'bar'));
>
> It allows user to use `map('foo', 'bar')` to pass a map data to procedure.
>
> Another case I can imagine is rolling back a table to the snapshot of 
> one week ago.
> Then, with a function call, the user may call `rollback(table_name, now() - 
> INTERVAL '7' DAY)` to achieve that purpose.
>
> Although an argument can be a function call, the parameter eventually received 
> by the procedure will always be the evaluated literal value.
>
>
> > Procedure looks like a TableFunction; did you consider using a Collector,
> like TableFunction does? (to support large amounts of data)
>
> Yes, I had considered it. But returning T[] is for simplicity.
>
> First, regarding how to return the calling result of a procedure, it looks 
> more intuitive to me to use the return value of the `call` method instead of 
> calling something like collector#collect.
> Introducing a collector would add unnecessary complexity.
>
> Second, regarding supporting large amounts of data, according to my 
> investigation, I haven't seen a requirement for returning large 
> amounts of data.
> Iceberg also returns an array.[2] If you do think we should support large 
> amounts of data, I think we can change the return type from T[] to Iterable.
>
> [1]: https://iceberg.apache.org/docs/latest/spark-procedures/#migrate
> [2]: 
> https://github.com/apache/iceberg/blob/601c5af9b6abded79dabeba177331310d5487f43/spark/v3.2/spark/src/main/java/org/apache/spark/sql/connector/iceberg/catalog/Procedure.java#L44
>
> Best regards,
> Yuxia
>
> - 原始邮件 -
> 发件人: "Jingsong Li" 
> 收件人: "dev" 
> 发送时间: 星期一, 2023年 5 月 29日 下午 2:42:04
> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>
> Thanks Yuxia for the proposal.
>
> > CALL [catalog_name.][database_name.]procedure_name ([ expression [, 
> > expression]* ] )
>
> The expression can be a function call. Does this need to be a function
> call? Do you have some example?
>
> > Procedure returns T[]
>
> Procedure looks like a TableFunction; did you consider using a Collector,
> like TableFunction does? (to support large amounts of data)
>
> Best,
> Jingsong
>
> On Mon, May 29, 2023 at 2:33 PM yuxia  wrote:
> >
> > Hi, everyone.
> >
> > I’d like to start a discussion about FLIP-311: Support Call Stored 
> > Procedure [1]
> >
> > Stored procedures provide a convenient way to encapsulate complex logic to 
> > perform data manipulation or administrative tasks in external storage 
> > systems. They're widely used in traditional databases and popular compute 
> > engines like Trino for their convenience. Therefore, we propose adding 
> > support for calling stored procedures in Flink to enable better integration 
> > with external storage systems.
> >
> > With this FLIP, Flink will allow connector developers to develop their own 
> > built-in stored procedures, and then enables users to call these predefiend 
> > stored procedures.
> >
> > Looking forward to your feedbacks.
> >
> > [1]: 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> >
> > Best regards,
> > Yuxia
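The trade-off discussed in this thread — returning T[] from `call` versus pushing rows through a collector — can be sketched as follows. This is a toy illustration, not the FLIP-311 API; all names (ArrayProcedure, CollectorProcedure, Collector, compactViaArray) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch contrasting the two result styles discussed above:
// returning an array directly vs. emitting rows through a collector.
class ProcedureResultStyles {

    /** Array-style procedure: the call result is simply the return value. */
    interface ArrayProcedure<T> {
        T[] call(String... args);
    }

    /** Collector-style procedure: results are emitted one by one,
     *  which would support arbitrarily large result sets. */
    interface Collector<T> {
        void collect(T record);
    }

    interface CollectorProcedure<T> {
        void call(Collector<T> out, String... args);
    }

    static String[] compactViaArray() {
        ArrayProcedure<String> p = args -> new String[] {"compacted 3 files"};
        return p.call();
    }

    static List<String> compactViaCollector() {
        CollectorProcedure<String> p = (out, args) -> out.collect("compacted 3 files");
        List<String> rows = new ArrayList<>();
        p.call(rows::add); // the collector just accumulates rows here
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compactViaArray()));
        System.out.println(compactViaCollector());
    }
}
```

The array style keeps the user-facing contract simple; the collector style trades that simplicity for streaming-sized results, which is the complexity Yuxia argues against above.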


Re: [DISCUSS] Hive dialect shouldn't fall back to Flink's default dialect

2023-05-30 Thread Jingsong Li
+1, the fallback looks weird now; it is outdated.

But it would be good to provide an option. I don't know if there are some
users who depend on this fallback.

Best,
Jingsong

On Tue, May 30, 2023 at 1:47 PM Rui Li  wrote:
>
> +1, the fallback was just intended as a temporary workaround to run 
> catalog/module related statements with hive dialect.
>
> On Mon, May 29, 2023 at 3:59 PM Benchao Li  wrote:
>>
>> Big +1 on this, thanks yuxia for driving this!
>>
>> On Mon, May 29, 2023 at 2:55 PM yuxia  wrote:
>>
>> > Hi, community.
>> >
>> > I want to start the discussion about Hive dialect shouldn't fall back to
>> > Flink's default dialect.
>> >
>> > Currently, when the HiveParser fails to parse a SQL statement in Hive
>> > dialect, it falls back to Flink's default parser[1] to handle
>> > Flink-specific statements like "CREATE CATALOG xx with (xx);".
>> >
>> > As I'm involved with the Hive dialect and have recently communicated with
>> > community users who use it, I'm thinking of throwing an exception directly
>> > instead of falling back to Flink's default dialect when parsing fails.
>> >
>> > Here're some reasons:
>> >
>> > First of all, the fallback hides some errors with the Hive dialect. For
>> > example, we found we could no longer use the Hive dialect with the Flink
>> > SQL client during the release validation phase[2]. We finally traced it to
>> > a modification in the Flink SQL client, but our test case couldn't catch
>> > it earlier: although HiveParser failed to parse the statement, it fell
>> > back to the default parser and the test case passed.
>> >
>> > Second, conceptually, the Hive dialect should have nothing to do with
>> > Flink's default dialect. They are two totally different dialects. If we do
>> > need a dialect mixing the Hive dialect and the default dialect, maybe we
>> > should propose a new hybrid dialect and announce the hybrid behavior to
>> > users.
>> > Also, the fallback behavior has confused some users; I know this from
>> > questions asked by community users. Throwing an exception directly when
>> > failing to parse a SQL statement in Hive dialect would be more intuitive.
>> >
>> > Last but not least, it's important to decouple Hive from the Flink
>> > planner[3] before we can externalize the Hive connector[4]. If we still
>> > fall back to the Flink default dialect, we will need to depend on
>> > `ParserImpl` in the Flink planner, which will block us from removing the
>> > provided dependency of the Hive dialect as well as externalizing the Hive
>> > connector.
>> >
>> > Although we have never announced the fallback behavior, some users may
>> > implicitly depend on it in their SQL jobs. So, I hereby open the
>> > discussion about abandoning the fallback behavior to make the Hive
>> > dialect clear and isolated.
>> > Please note it won't break Hive syntax, but Flink-specific syntax may
>> > fail afterwards. For the failed SQL, you can use `SET
>> > table.sql-dialect=default;` to switch to the Flink dialect.
>> > If there are some Flink-specific statements that we find should be
>> > included in the Hive dialect for ease of use, I think we can still add
>> > them as special cases to the Hive dialect.
>> >
>> > Looking forward to your feedback. I'd love to hear the community's
>> > feedback to take the next steps.
>> >
>> > [1]:
>> > https://github.com/apache/flink/blob/678370b18e1b6c4a23e5ce08f8efd05675a0cc17/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParser.java#L348
>> > [2]:https://issues.apache.org/jira/browse/FLINK-26681
>> > [3]:https://issues.apache.org/jira/browse/FLINK-31413
>> > [4]:https://issues.apache.org/jira/browse/FLINK-30064
>> >
>> >
>> >
>> > Best regards,
>> > Yuxia
>> >
>>
>>
>> --
>>
>> Best,
>> Benchao Li
>
>
>
> --
> Best regards!
> Rui Li
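The behavioral change discussed in this thread — fail fast instead of silently falling back to the default parser — can be sketched in a toy form. This is not Flink's actual parser code; the method names and the SELECT-only "parser" are purely illustrative.

```java
// A toy sketch of the proposal above: when the Hive dialect parser fails,
// surface the error instead of silently trying the default parser.
class DialectParsing {

    // Stand-in for HiveParser: pretend it only understands SELECT statements.
    static String parseHiveDialect(String sql) {
        if (sql.trim().toUpperCase().startsWith("SELECT")) {
            return "hive-plan";
        }
        throw new IllegalArgumentException("Cannot parse with Hive dialect: " + sql);
    }

    /** Old behavior: swallow the Hive parse error and use the default parser,
     *  which can hide genuine Hive dialect bugs from tests and users. */
    static String parseWithFallback(String sql) {
        try {
            return parseHiveDialect(sql);
        } catch (IllegalArgumentException e) {
            return "default-plan";
        }
    }

    /** Proposed behavior: let the Hive parse error propagate; the user can
     *  switch dialects explicitly with SET table.sql-dialect=default;. */
    static String parseFailFast(String sql) {
        return parseHiveDialect(sql);
    }
}
```

With fallback, the Flink-only statement "succeeds" via the default parser; with fail-fast, the same statement raises an error that points the user at the dialect switch.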


Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-05-29 Thread Jingsong Li
Thanks Ron for your information.

I suggest that it can be written in the Motivation of FLIP.

Best,
Jingsong

On Tue, May 30, 2023 at 9:57 AM liu ron  wrote:
>
> Hi, Jingsong
>
> Thanks for your review. We have tested it on the TPC-DS benchmark and got a 12%
> gain overall when supporting only the Calc, HashJoin and HashAgg operators. In
> some queries, we even got more than a 30% gain, so it looks like an effective
> way.
>
> Best,
> Ron
>
> On Mon, May 29, 2023 at 2:33 PM Jingsong Li  wrote:
>
> > Thanks Ron for the proposal.
> >
> > Do you have some benchmark results for the performance improvement? I
> > am more concerned about the improvement on Flink than the data in
> > other papers.
> >
> > Best,
> > Jingsong
> >
> > On Mon, May 29, 2023 at 2:16 PM liu ron  wrote:
> > >
> > > Hi, dev
> > >
> > > I'd like to start a discussion about FLIP-315: Support Operator Fusion
> > > Codegen for Flink SQL[1]
> > >
> > > As main memory grows, query performance is more and more determined by
> > > the raw CPU cost of query processing itself. This is because query
> > > processing techniques based on interpreted execution show poor
> > > performance on modern CPUs due to lack of locality and frequent
> > > instruction mis-prediction. Therefore, the industry is also researching
> > > how to improve engine performance by increasing operator execution
> > > efficiency. In addition, during the process of optimizing Flink's
> > > performance for TPC-DS queries, we found that a significant amount of
> > > CPU time was spent on virtual function calls, framework collector calls,
> > > and invalid calculations, which can be optimized to improve the overall
> > > engine performance. After some investigation, we found that Operator
> > > Fusion Codegen, which was proposed by Thomas Neumann in the paper[2],
> > > can address these problems. I have finished a PoC[3] to verify its
> > > feasibility and validity.
> > >
> > > Looking forward to your feedback.
> > >
> > > [1]:
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> > > [2]: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> > > [3]: https://github.com/lsyldliu/flink/tree/OFCG
> > >
> > > Best,
> > > Ron
> >
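The cost model discussed in this thread — per-record virtual calls through a collector chain versus one fused loop — can be illustrated with a toy example. This is not Flink's generated code; the pipeline (filter → map → aggregate) and all names are illustrative.

```java
// A toy illustration of why fusing operators helps: the "interpreted"
// pipeline pays a virtual collect() call per operator per record, while
// the fused version runs the same logic in one tight loop.
class FusionSketch {

    interface Collector { void collect(long v); }

    /** Interpreted style: filter -> map -> agg chained through collect() calls. */
    static long interpreted(long[] input) {
        long[] sum = {0};
        Collector agg = v -> sum[0] += v;
        Collector map = v -> agg.collect(v * 2);
        Collector filter = v -> { if (v % 2 == 0) map.collect(v); };
        for (long v : input) filter.collect(v);
        return sum[0];
    }

    /** Fused style: the same filter+map+agg logic collapsed into one loop,
     *  with no call-site indirection between operators. */
    static long fused(long[] input) {
        long sum = 0;
        for (long v : input) {
            if (v % 2 == 0) sum += v * 2;
        }
        return sum;
    }
}
```

Both methods compute the same result; operator fusion codegen aims to produce code shaped like `fused` from a plan that is logically shaped like `interpreted`.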


Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-05-28 Thread Jingsong Li
Thanks Yuxia for the proposal.

> CALL [catalog_name.][database_name.]procedure_name ([ expression [, 
> expression]* ] )

The expression could be a function call. Does this need to support function
calls? Do you have an example?

> Procedure returns T[]

A procedure looks like a TableFunction. Have you considered using a Collector,
as TableFunction does? (That would support a large amount of data.)

Best,
Jingsong

On Mon, May 29, 2023 at 2:33 PM yuxia  wrote:
>
> Hi, everyone.
>
> I’d like to start a discussion about FLIP-311: Support Call Stored Procedure 
> [1]
>
> Stored procedures provide a convenient way to encapsulate complex logic to 
> perform data manipulation or administrative tasks in external storage 
> systems. They are widely used in traditional databases and popular compute 
> engines like Trino for their convenience. Therefore, we propose adding support 
> for calling stored procedures in Flink to enable better integration with external 
> storage systems.
>
> With this FLIP, Flink will allow connector developers to develop their own 
> built-in stored procedures, and then enable users to call these predefined 
> stored procedures.
>
> Looking forward to your feedback.
>
> [1]: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
>
> Best regards,
> Yuxia


Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-05-28 Thread Jingsong Li
Thanks Ron for the proposal.

Do you have some benchmark results for the performance improvement? I
am more concerned about the improvement on Flink than the data in
other papers.

Best,
Jingsong

On Mon, May 29, 2023 at 2:16 PM liu ron  wrote:
>
> Hi, dev
>
> I'd like to start a discussion about FLIP-315: Support Operator Fusion
> Codegen for Flink SQL[1]
>
> As main memory grows, query performance is more and more determined by the
> raw CPU costs of query processing itself, this is due to the query
> processing techniques based on interpreted execution shows poor performance
> on modern CPUs due to lack of locality and frequent instruction
> mis-prediction. Therefore, the industry is also researching how to improve
> engine performance by increasing operator execution efficiency. In
> addition, during the process of optimizing Flink's performance for TPC-DS
> queries, we found that a significant amount of CPU time was spent on
> virtual function calls, framework collector calls, and invalid
> calculations, which can be optimized to improve the overall engine
> performance. After some investigation, we found Operator Fusion Codegen
> which is proposed by Thomas Neumann in the paper[2] can address these
> problems. I have finished a PoC[3] to verify its feasibility and validity.
>
> Looking forward to your feedback.
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> [2]: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> [3]: https://github.com/lsyldliu/flink/tree/OFCG
>
> Best,
> Ron


Re: Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) statement

2023-05-11 Thread Jingsong Li
Hi Mang,

Thanks for starting this FLIP.

I have some doubts about the `TwoPhaseCatalogTable`. Generally, our
Flink design places execution in the TableFactory or directly in the
Catalog, so introducing an executable table feels a bit strange to me.
(Spark uses this style, but Flink may not.)

And for `TwoPhase`, maybe `StagedXXX`, like in Spark, would be better?

Best,
Jingsong

On Wed, May 10, 2023 at 9:29 PM Mang Zhang  wrote:
>
> Hi Ron,
>
>
> First of all, thank you for your reply!
> After our offline communication, what you said mainly applies to the compilePlan 
> scenario, but currently compilePlanSql does not support non-INSERT 
> statements; otherwise it throws an exception:
> >Unsupported SQL query! compilePlanSql() only accepts a single SQL statement 
> >of type INSERT
> But it's a good point that I will seriously consider.
> Non-atomic CTAS can be supported relatively easily,
> but atomic CTAS needs more adaptation work, so I'm going to leave it as is 
> and follow up with a separate issue to implement CTAS support for 
> compilePlanSql.
>
>
>
>
>
>
> --
>
> Best regards,
> Mang Zhang
>
>
>
>
>
> At 2023-04-23 17:52:07, "liu ron"  wrote:
> >Hi, Mang
> >
> >I have a question about the implementation details. For the atomicity case,
> >since the target table is not created before the JobGraph is generated, but
> >then the target table is required to exist when optimizing plan to generate
> >the JobGraph. So how do you solve this problem?
> >
> >Best,
> >Ron
> >
> >On Thu, Apr 20, 2023 at 9:35 AM yuxia  wrote:
> >
> >> Share some insights about the new TwoPhaseCatalogTable proposed after
> >> offline discussion with Mang.
> >> The main and most important reason is that the TwoPhaseCatalogTable enables
> >> external connectors to implement their own logic for commit / abort.
> >> In FLIP-218, for atomic CTAS, the Catalog will just drop the table
> >> when the job fails. That's not ideal, as it's too generic to work well.
> >> For example, some connectors need to clean up temporary files in the
> >> abort method, and only the actual connector knows the specific logic for
> >> aborting.
> >>
> >> Best regards,
> >> Yuxia
> >>
> >>
> >> From: "zhangmang1" 
> >> To: "dev" , "Jing Ge" 
> >> Cc: "ron9 liu" , "lincoln 86xy" <
> >> lincoln.8...@gmail.com>, luoyu...@alumni.sjtu.edu.cn
> >> Sent: Wednesday, April 19, 2023 3:13:36 PM
> >> Subject: Re:Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS
> >> SELECT(CTAS) statement
> >>
> >> hi, Jing
> >> Thank you for your reply.
> >> >1. It looks like you found another way to design the atomic CTAS with new
> >> >serializable TwoPhaseCatalogTable instead of making Catalog serializable
> >> as
> >> >described in FLIP-218. Did I understand correctly?
> >> Yes, when I was implementing the FLIP-218 solution, I encountered problems
> >> with Catalog/CatalogTable serialization and deserialization; for example, after
> >> deserialization the CatalogTable could not be converted to a Hive Table. Also,
> >> Catalog serialization is a heavy operation, but it may not actually
> >> be necessary; we just need Create Table.
> >> Therefore, the TwoPhaseCatalogTable proposal was made, which also
> >> facilitates the implementation of subsequent features such as data lake
> >> integration and ReplaceTable.
> >>
> >> >2. I am a little bit confused about the isStreamingMode parameter of
> >> >Catalog#twoPhaseCreateTable(...), since it is the selector argument(code
> >> >smell) we should commonly avoid in the public interface. According to the
> >> >FLIP,  isStreamingMode will be used by the Catalog to determine whether to
> >> >support atomic or not. With this selector argument, there will be two
> >> >different logics built within one method and it is hard to follow without
> >> >reading the code or the doc carefully(another concern is to keep the doc
> >> >and code alway be consistent) i.e. sometimes there will be no difference
> >> by
> >> >using true/false isStreamingMode, sometimes they are quite different -
> >> >atomic vs. non-atomic. Another question is, before we call
> >> >Catalog#twoPhaseCreateTable(...), we have to know the value of
> >> >isStreamingMode. In case only non-atomic is supported for streaming mode,
> >> >we could just follow FLIP-218 instead of (twistedly) calling
> >> >Catalog#twoPhaseCreateTable(...) with a false isStreamingMode. Did I miss
> >> >anything here?
> >> Here's what I think about this issue: atomic CTAS should be the default
> >> behavior and only fall back to non-atomic CTAS if it's completely
> >> unattainable. Atomic CTAS will bring a better experience to users.
> >> Flink is already a unified stream-batch engine. In our company, Kwai, many
> >> users are also using Flink for batch data processing, but still running
> >> in stream mode.
> >> The boundary between stream and batch is gradually blurring; stream-mode
> >> jobs may also FINISH, so I added the isStreamingMode parameter, which
> >> provides different atomicity implementations in batch and stream modes.
> >> Not only to
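The commit/abort contract discussed in this thread can be sketched in a reduced form. The interface and names below (TwoPhaseTable, begin/commit/abort, runCtas) are illustrative, not the FLIP-305 API; the point is that the connector owns the commit and abort logic, e.g. cleaning up temporary files on abort.

```java
import java.util.ArrayList;
import java.util.List;

// A hedged sketch of the two-phase create-table idea: stage the table
// first, then either commit it atomically or let the connector abort.
class TwoPhaseCtasSketch {

    interface TwoPhaseTable {
        void begin();   // stage the table, e.g. write to a temp location
        void commit();  // make the table visible atomically
        void abort();   // connector-specific cleanup on job failure
    }

    /** Drives the two-phase protocol around a (simulated) job outcome. */
    static List<String> runCtas(TwoPhaseTable table, boolean jobSucceeds) {
        List<String> log = new ArrayList<>();
        table.begin();
        log.add("begin");
        if (jobSucceeds) {
            table.commit();
            log.add("commit");
        } else {
            table.abort(); // e.g. delete temp files -- the connector decides
            log.add("abort");
        }
        return log;
    }
}
```

Compared with FLIP-218's generic "drop the table on failure", this shape lets each connector implement its own cleanup in `abort`.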

Re: [DISCUSS] Release Flink 1.16.2

2023-05-11 Thread Jingsong Li
+1 for releasing 1.16.2

Best,
Jingsong

On Thu, May 11, 2023 at 1:28 PM Gyula Fóra  wrote:
>
> +1 for the release
>
> Gyula
>
> On Thu, 11 May 2023 at 05:08, weijie guo  wrote:
>
> > [1]
> >
> > https://issues.apache.org/jira/browse/FLINK-31092?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.16.2%20%20and%20resolution%20%20!%3D%20%20Unresolved%20order%20by%20priority%20DESC
> >
> > [2]
> >
> > https://issues.apache.org/jira/browse/FLINK-31092?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.16.2%20and%20resolution%20%20!%3D%20Unresolved%20%20and%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20by%20priority%20%20DESC
> >
> > [3] https://issues.apache.org/jira/browse/FLINK-31293
> >
> > [4] https://issues.apache.org/jira/browse/FLINK-32027
> >
> > [5] https://issues.apache.org/jira/projects/FLINK/versions/12352765
> >
> >
> >
> >
> > On Thu, May 11, 2023 at 11:06 AM weijie guo  wrote:
> >
> > > Hi all,
> > >
> > >
> > > I would like to discuss creating a new 1.16 patch release (1.16.2). The
> > > last 1.16 release is over three months old, and since then, 99 tickets
> > have
> > >  been closed [1], of which 30 are blocker/critical [2].  Some
> > > of them are quite important, such as FLINK-31293 [3] and FLINK-32027 [4].
> > >
> > >
> > >
> > > I am not aware of any unresolved blockers and there are no in-progress
> > tickets [5].
> > > Please let me know if there are any issues you'd like to be included in
> > > this release but still not merged.
> > >
> > >
> > >
> > > If the community agrees to create this new patch release, I could
> > volunteer as the release manager
> > >  and Xintong can help with actions that require a PMC role.
> > >
> > > Best regards,
> > >
> > > Weijie
> > >
> >


Re: [DISCUSS] Release Flink 1.17.1

2023-05-11 Thread Jingsong Li
+1 for releasing 1.17.1

Best,
Jingsong

On Thu, May 11, 2023 at 1:29 PM Gyula Fóra  wrote:
>
> +1 for the release
>
> Gyula
>
> On Thu, 11 May 2023 at 05:35, Yun Tang  wrote:
>
> > +1 for release flink-1.17.1
> >
> > The blocker issue might cause silently incorrect data; it's better to have a
> > fix release ASAP.
> >
> >
> > Best
> > Yun Tang
> > 
> > From: weijie guo 
> > Sent: Thursday, May 11, 2023 11:08
> > To: dev@flink.apache.org ; tonysong...@gmail.com <
> > tonysong...@gmail.com>
> > Subject: [DISCUSS] Release Flink 1.17.1
> >
> > Hi all,
> >
> >
> > I would like to discuss creating a new 1.17 patch release (1.17.1). The
> > last 1.17 release is nearly two months old, and since then, 66 tickets have
> > been closed [1], of which 14 are blocker/critical [2].  Some of them are
> > quite important, such as FLINK-31293 [3] and  FLINK-32027 [4].
> >
> >
> > I am not aware of any unresolved blockers and there are no in-progress
> > tickets [5].
> > Please let me know if there are any issues you'd like to be included in
> > this release but still not merged.
> >
> >
> > If the community agrees to create this new patch release, I could
> > volunteer as the release manager
> >  and Xintong can help with actions that require a PMC role.
> >
> >
> > Thanks,
> >
> > Weijie
> >
> >
> > [1]
> >
> > https://issues.apache.org/jira/browse/FLINK-32027?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.1%20%20and%20resolution%20%20!%3D%20%20Unresolved%20order%20by%20priority%20DESC
> >
> > [2]
> >
> > https://issues.apache.org/jira/browse/FLINK-31273?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.1%20and%20resolution%20%20!%3D%20Unresolved%20%20and%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20by%20priority%20%20DESC
> >
> > [3] https://issues.apache.org/jira/browse/FLINK-31293
> >
> > [4] https://issues.apache.org/jira/browse/FLINK-32027
> >
> > [5] https://issues.apache.org/jira/projects/FLINK/versions/12352886
> >


Re: Re: [VOTE] FLIP-302: Support TRUNCATE TABLE statement in batch mode

2023-04-17 Thread Jingsong Li
+1

On Tue, Apr 18, 2023 at 9:39 AM Aitozi  wrote:
>
> +1
>
> Best,
> Aitozi
>
> On Tue, Apr 18, 2023 at 9:18 AM ron  wrote:
> >
> > +1
> >
> >
> > > ----- Original Message -----
> > > From: "Lincoln Lee" 
> > > Sent: 2023-04-18 09:08:08 (Tuesday)
> > > To: dev@flink.apache.org
> > > Cc:
> > > Subject: Re: [VOTE] FLIP-302: Support TRUNCATE TABLE statement in batch mode
> > >
> > > +1 (binding)
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > On Mon, Apr 17, 2023 at 11:54 PM yuxia  wrote:
> > >
> > > > Hi all.
> > > >
> > > > Thanks for all the feedback on FLIP-302: Support TRUNCATE TABLE 
> > > > statement
> > > > in batch mode [1].
> > > > Based on the discussion [2], we have come to a consensus, so I would 
> > > > like
> > > > to start a vote.
> > > >
> > > > The vote will last for at least 72 hours unless there is an objection or
> > > > insufficient votes.
> > > >
> > > > [1]:
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement+in+batch+mode
> > > > [2]: [ https://lists.apache.org/thread/m4r3wrd7p96wdst3nz3ncqzog6kf51cf 
> > > > |
> > > > https://lists.apache.org/thread/m4r3wrd7p96wdst3nz3ncqzog6kf51cf ]
> > > >
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> >
> >
> > --
> > Best,
> > Ron


Re: [DISCUSS] Release Paimon 0.4

2023-04-12 Thread Jingsong Li
Oh, sorry to the Flink devs.

This email should have been sent to the Paimon dev list; please ignore it.

Best,
Jingsong

On Thu, Apr 13, 2023 at 10:56 AM Jingsong Li  wrote:
>
> Hi everyone,
>
> I'm going to check out the 0.4 branch out next Monday, and won't merge
> major refactoring-related PRs into master branch until next Monday.
>
> Blockers:
> - Entire Database Sync CC @Caizhi Weng
> - CDC Ingestion mysql DATETIME(6) cast error [1]
> - MySqlSyncTableAction should support case ignore mode [2]
>
> If you have other blockers, please let us know.
>
> [1] https://github.com/apache/incubator-paimon/issues/860
> [2] https://github.com/apache/incubator-paimon/issues/890
>
> Best,
> Jingsong


[DISCUSS] Release Paimon 0.4

2023-04-12 Thread Jingsong Li
Hi everyone,

I'm going to check out the 0.4 branch out next Monday, and won't merge
major refactoring-related PRs into master branch until next Monday.

Blockers:
- Entire Database Sync CC @Caizhi Weng
- CDC Ingestion mysql DATETIME(6) cast error [1]
- MySqlSyncTableAction should support case ignore mode [2]

If you have other blockers, please let us know.

[1] https://github.com/apache/incubator-paimon/issues/860
[2] https://github.com/apache/incubator-paimon/issues/890

Best,
Jingsong


Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Jingsong Li
Hi Yun,

It looks like this doc needs permission to read? [1]

[1] 
https://docs.google.com/document/d/1NJJQ30P27BmUvD7oa4FChvkYxMEgjRPTVdO1dHLl_9I/edit#

Best,
Jingsong

On Fri, Apr 7, 2023 at 4:34 PM Piotr Nowojski  wrote:
>
> Hi,
>
> +1 To what Yun Tang wrote. We don't seem to have access to the design doc.
> Could you make it publicly visible or copy out its content to another
> document?
>
> Thanks for your answers Zakelly.
>
> (1)
> Yes, the current mechanism introduced in FLINK-24611 allows for checkpoint
> N, to only re-use shared state handles that have been already referenced by
> checkpoint N-1. But why do we need to break this assumption? In your step,
> "d.", TM could adhere to that assumption, and instead of reusing File-2, it
> could either re-use File-1, File-3 or create a new file.
>
> (2)
> Can you elaborate a bit more on this? As far as I recall, the purpose of
> the `RecoverableWriter` is to support exactly the things described in this
> FLIP, so what's the difference? If you are saying that for this FLIP you
> can implement something more efficiently for a given FileSystem, then why
> can it not be done the same way for the `RecoverableWriter`?
>
> Best,
> Piotrek
>
> On Thu, Apr 6, 2023 at 5:24 PM Yun Tang  wrote:
>
> > Hi Zakelly,
> >
> > Thanks for driving this work!
> >
> > I'm not sure if you ever read the discussion between Stephan, Roman,
> > Piotr, Yuan and me in the design doc [1] from nearly two years ago.
> >
> > From my understanding, your proposal is also a mixed state ownership: some
> > states are owned by the TM while some are owned by the JM. If my memory is
> > correct, we did not take option-3 or option-5 in the design doc [1] due to
> > the code complexity when implementing the 1st version of the changelog
> > state-backend.
> >
> > Could you also compare the current FLIP with the proposals in the design
> > doc[1]? From my understanding, we should at least consider comparing with
> > option-3 and option-5 as they are both mixed solutions.
> >
> >
> > [1]
> > https://docs.google.com/document/d/1NJJQ30P27BmUvD7oa4FChvkYxMEgjRPTVdO1dHLl_9I/edit#
> >
> > Best
> > Yun Tang
> >
> > --
> > *From:* Zakelly Lan 
> > *Sent:* Thursday, April 6, 2023 16:38
> > *To:* dev@flink.apache.org 
> > *Subject:* Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for
> > Checkpoints
> >
> > Hi Piotr,
> >
> > Thanks for all the feedback.
> >
> > (1) Thanks for the reminder. I have just seen the FLINK-24611, the delayed
> > deletion by JM resolves some sync problems between JM and TM, but I'm
> > afraid it is still not feasible for the file sharing in this FLIP.
> > Considering a concurrent checkpoint scenario as follows:
> >a. Checkpoint 1 finishes. 1.sst, 2.sst and 3.sst are written in file 1,
> > and 4.sst is written in file 2.
> >b. Checkpoint 2 starts based on checkpoint 1, including 1.sst, 2.sst
> > and 5.sst.
> >c. Checkpoint 3 starts based on checkpoint 1, including 1.sst, 2.sst
> > and 5.sst as well.
> >d. Checkpoint 3 reuses the file 2, TM writes 5.sst on it.
> >e. Checkpoint 2 creates a new file 3, TM writes 5.sst on it.
> >f. Checkpoint 2 finishes, checkpoint 1 is subsumed and the file 2 is
> > deleted, while checkpoint 3 still needs file 2.
> >
> > I attached a diagram to describe the scenario.
> > [image: concurrent cp.jpg]
> > The core issue is that this FLIP introduces a mechanism that allows
> > physical files to be potentially used by the next several checkpoints. JM
> > is uncertain whether there will be a TM continuing to write to a specific
> > file. So in this FLIP, TMs take the responsibility to delete the physical
> > files.
> >
> > (2) IIUC, the RecoverableWriter was introduced to persist data in the "in
> > progress" files after each checkpoint, and the implementation may be based
> > on file sync in some file systems. However, since sync is a heavy
> > operation for a DFS, this FLIP wants to use flush instead of sync on a
> > best-effort basis. This only fits the case where the DFS is considered
> > reliable. The problems they want to solve are different.
> >
> > (3) Yes, if files are managed by JM via the shared registry, this problem
> > is solved. And as I mentioned in (1), there are some other corner cases
> > hard to resolve via the shared registry.
> >
> > The goal of this FLIP is to have a common way of merging files in all use
> > cases. For shared state it merges at subtask level, while for private state
> > (and changelog files, as I replied to Yanfei), files are merged at TM
> > level. So it is not contrary to the current plan for the unaligned
> > checkpoint state (FLINK-26803). You are right that the unaligned checkpoint
> > state would be merged with the operator's state file, so overall, it is
> > slightly better than what's currently done.
> >
> >
> > Thanks again for the valuable comments!
> >
> > Best regards,
> > Zakelly
> >
> >
> >
> > On Wed, Apr 5, 2023 at 8:43 PM Piotr Nowojski 
> > wrot
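The ownership argument in this thread — only the TM can know when a shared physical file is truly unreferenced — can be sketched as simple reference counting. The class and method names below are illustrative, not from FLIP-306.

```java
import java.util.HashMap;
import java.util.Map;

// A simplified sketch: a physical file is only safe to delete once no
// live checkpoint references any logical state (sst) stored in it.
class SharedFileRegistrySketch {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** A checkpoint starts referencing state inside this physical file. */
    void register(String physicalFile) {
        refCounts.merge(physicalFile, 1, Integer::sum);
    }

    /** Returns true when the last reference is released, i.e. the TM may
     *  now physically delete the file. */
    boolean release(String physicalFile) {
        int remaining = refCounts.merge(physicalFile, -1, Integer::sum);
        if (remaining == 0) {
            refCounts.remove(physicalFile);
            return true;
        }
        return false;
    }
}
```

In the concurrent-checkpoint scenario above, subsuming checkpoint 1 releases one reference to file 2 but must not delete it while checkpoint 3 still holds a reference — which is why the FLIP puts deletion on the TM rather than the JM.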

Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement

2023-04-06 Thread Jingsong Li
+1 for voting.

Best,
Jingsong

On Thu, Apr 6, 2023 at 4:52 PM yuxia  wrote:
>
> Hi everyone.
>
> If there are no other questions or concerns for the FLIP[1], I'd like to 
> start the vote next Monday (4.10).
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "yuxia" 
> To: "dev" 
> Sent: Friday, March 24, 2023 11:27:42 AM
> Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
>
> Thanks all for your feedback.
>
> @Shammon FY
> My gut feeling is that end users shouldn't care about whether the TRUNCATE 
> TABLE statement deletes the directory or moves it to a Trash directory. They 
> only need to know it will delete all rows from a table.
> To me, deleting the directory or moving it to trash is more likely a behavior 
> at the external-storage level rather than at the SQL-statement level. In Hive, 
> if users configure Trash, files are moved to trash for the DROP statement.
> Also, I have hardly seen such usage of the TRUNCATE TABLE statement in other 
> engines. What's more, to support it, we would have to extend the TRUNCATE TABLE 
> syntax, which would then not be compliant with the SQL standard. I really don't 
> want to do that, and I believe it would confuse users if we did.
>
> @Hang
> `TRUNCATE TABLE` is meant to delete all rows of a base table, so it makes no 
> sense for a table source to implement it.
> If a user uses the TRUNCATE TABLE statement to truncate a table, the planner will 
> only try to
> find the DynamicTableSink for the corresponding table.
>
> @Ran Tao
> 1: Thanks for your reminder. I said in the FLIP that it won't support views, but 
> forgot to say that temporary tables are also not supported. I have now added this 
> part to the FLIP.
>
> 2: Yes, I also considered including it in this FLIP. But as far as I can see, I 
> haven't seen much usage of truncating a table with a partition. It's not as 
> useful as truncating a whole table. So, I tend to keep this FLIP simple here 
> without supporting truncate table with partition.
> Also, it seems different engines have different syntaxes for `truncate table 
> with partition`;
> Hive[1]/Spark[2] use the following syntax:
> TRUNCATE TABLE table_name [PARTITION partition_spec]
>
> SqlServer[3] uses the following syntax:
> TRUNCATE TABLE { database_name.schema_name.table_name | 
> schema_name.table_name | table_name } [ WITH ( PARTITIONS ( { 
>  |  }
> So, I tend to be cautious about it.
>
> But I'm open to this. If there's any feedback or a strong requirement, I don't 
> mind adding it in this FLIP.
> If we do need it some day, I can propose it in a new FLIP. It won't break 
> the current design.
>
> As for the concrete syntax in the FLIP, I think the current one is the concrete 
> syntax; we don't allow the TABLE keyword to be optional.
>
> 3: Thanks for your reminder. I have updated the FLIP accordingly.
>
>
> [1]https://cwiki.apache.org/confluence/display/hive/languagemanual+ddl#LanguageManualDDL-TruncateTable
> [2]https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-ddl-truncate-table.html
> [3]https://learn.microsoft.com/en-us/sql/t-sql/statements/truncate-table-transact-sql?view=sql-server-ver16
>
>
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Ran Tao" 
> To: "dev" 
> Sent: Thursday, March 23, 2023 6:28:17 PM
> Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
>
> Hi, yuxia.
>
> Thanks for starting the discussion.
> I think it's a nice improvement to support the TRUNCATE TABLE statement because
> many other mature engines support it.
>
> I have some questions.
> 1. Because tables have different types, will we support views or
> temporary tables?
>
> 2. Some other engines such as Spark and Hive support TRUNCATE TABLE with
> a partition. Will we support that?
> Btw, I think you need to give the concrete TRUNCATE TABLE syntax in the FLIP
> because some engines have different syntaxes.
> For example, Hive allows TRUNCATE TABLE to be written as TRUNCATE [TABLE], which
> means the TABLE keyword can be optional.
>
> 3. The Proposed Changes try to use SqlToOperationConverter and run in
> TableEnvironmentImpl#executeInternal.
> I think that's out of date; the community is refactoring the conversion logic
> from SqlNode to Operation[1] and executions in TableEnvironmentImpl[2].
> I suggest using the new way to support it.
>
> [1] https://issues.apache.org/jira/browse/FLINK-31464
> [2] https://issues.apache.org/jira/browse/FLINK-31368
>
> Best Regards,
> Ran Tao
> https://github.com/chucheng92
>
>
> On Wed, Mar 22, 2023 at 9:13 PM yuxia  wrote:
>
> > Hi, devs.
> >
> > I'd like to start a discussion about FLIP-302: Support TRUNCATE TABLE
> > statement [1].
> >
> > The TRUNCATE TABLE statement is a SQL command that allows users to quickly
> > and efficiently delete all rows from a table without dropping the table
> > itself. This statement is commonly used in data warehouses, where large data
> > sets are frequently loaded and unloaded from tables.
> > So, this FLIP is meant to support the TRUNCATE TABLE statement. 
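The plan sketched in this thread — the planner finds the table's sink and asks it to truncate — can be illustrated with a toy ability interface. This is a hedged sketch modeled on the ability-interface pattern the FLIP proposes; the names below (SupportsTruncate, executeTruncation, TruncateSketch) are illustrative, so check the FLIP for the actual API.

```java
// A toy sketch of how a DynamicTableSink could expose truncation as an
// ability interface, with a clear error for sinks that don't support it.
class TruncateSketch {

    /** Hypothetical ability a sink could implement to support TRUNCATE TABLE. */
    interface SupportsTruncate {
        void executeTruncation(); // delete all rows, keep the table itself
    }

    /** Stand-in for the planner: dispatch TRUNCATE TABLE to the sink. */
    static String truncate(Object sink, String tableName) {
        if (sink instanceof SupportsTruncate) {
            ((SupportsTruncate) sink).executeTruncation();
            return "OK";
        }
        throw new UnsupportedOperationException(
            "Table " + tableName + " does not support TRUNCATE TABLE");
    }
}
```

This mirrors the point made above that only the sink side is involved: a source-only table would simply hit the unsupported-operation path.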

Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Jingsong Li
Congratulations!

I believe Paimon will work with Flink to build the best streaming data
warehouse.

Best,
Jingsong

On Tue, Mar 28, 2023 at 8:27 AM Shammon FY  wrote:
>
> Congratulations!
>
>
> Best,
> Shammon FY
>
> On Mon, Mar 27, 2023 at 11:37 PM Samrat Deb  wrote:
>
> > congratulations
> >
> > Bests,
> > Samrat
> > On Mon, 27 Mar 2023 at 8:24 PM, Alexander Fedulov <
> > alexander.fedu...@gmail.com> wrote:
> >
> > > Great to see this, congratulations!
> > >
> > > Best,
> > > Alex
> > >
> > > On Mon, 27 Mar 2023 at 11:24, Yu Li  wrote:
> > >
> > > > Dear Flinkers,
> > > >
> > > >
> > > >
> > > > As you may have noticed, we are pleased to announce that Flink Table
> > > Store has joined the Apache Incubator as a separate project called Apache
> > > Paimon(incubating) [1] [2] [3]. The new project still aims at building a
> > > streaming data lake platform for high-speed data ingestion, change data
> > > tracking and efficient real-time analytics, with the vision of
> > supporting a
> > > larger ecosystem and establishing a vibrant and neutral open source
> > > community.
> > > >
> > > >
> > > >
> > > > We would like to thank everyone for their great support and efforts for
> > > the Flink Table Store project, and warmly welcome everyone to join the
> > > development and activities of the new project. Apache Flink will continue
> > > to be one of the first-class citizens supported by Paimon, and we believe
> > > that the Flink and Paimon communities will maintain close cooperation.
> > > >
> > > >
> > > > Best Regards,
> > > >
> > > > Yu (on behalf of the Apache Flink PMC and Apache Paimon PPMC)
> > > >
> > > >
> > > > [1] https://paimon.apache.org/
> > > >
> > > > [2] https://github.com/apache/incubator-paimon
> > > >
> > > > [3]
> > https://cwiki.apache.org/confluence/display/INCUBATOR/PaimonProposal
> > > >
> > >
> >


Re: [VOTE] FLIP-296: Extend watermark-related features for SQL

2023-03-20 Thread Jingsong Li
+1 binding

Jark Wu wrote on Mon, Mar 20, 2023 at 21:19:

> +1 (binding)
>
> Best,
> Jark
>
> > On Mar 15, 2023, at 17:42, Yun Tang wrote:
> >
> > +1 (binding)
> >
> > Thanks Kui for driving this work.
> >
> >
> > Best
> > Yun Tang
> > 
> > From: Kui Yuan 
> > Sent: Wednesday, March 15, 2023 16:45
> > To: dev@flink.apache.org 
> > Subject: [VOTE] FLIP-296: Extend watermark-related features for SQL
> >
> > Hi all,
> >
> > I want to start the vote of FLIP-296: Extend watermark-related features
> for
> > SQL [1]. The FLIP was discussed in this thread [2]. The goal of the FLIP
> is
> > to extend, for SQL users, those watermark-related features already
> > implemented in the DataStream API.
> >
> > The vote will last for at least 72 hours unless there is an objection or
> > insufficient votes. Thank you all. [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-296%3A+Extend+watermark-related+features+for+SQL
> > [2] https://lists.apache.org/thread/d681bx4t935c30zl750gy6d41tfypbph
> >  Best,
> > Kui Yuan
>
>


Re: [VOTE] FLIP-300: Add targetColumns to DynamicTableSink#Context to solve the null overwrite problem of partial-insert

2023-03-14 Thread Jingsong Li
+1 binding

On Tue, Mar 14, 2023 at 10:54 AM Samrat Deb  wrote:
>
> +1 (non binding)
> Thanks for driving it .
>
> Bests,
> Samrat
>
> On Tue, 14 Mar 2023 at 7:41 AM, Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > > On Mar 13, 2023, at 23:25, Aitozi wrote:
> > >
> > > +1 (non-binding)
> > >
> > > Best,
> > > Aitozi
> > >
> > > Jing Ge wrote on Mon, Mar 13, 2023 at 22:10:
> > >
> > >> +1 (binding)
> > >>
> > >> Best Regards,
> > >> Jing
> > >>
> > >> On Mon, Mar 13, 2023 at 1:57 PM Hang Ruan 
> > wrote:
> > >>
> > >>> +1 (non-binding)
> > >>>
> > >>> Best,
> > >>> Hang
> > >>>
> > >>> yuxia wrote on Mon, Mar 13, 2023 at 20:52:
> > >>>
> >  +1 (binding)
> >  Thanks Lincoln Lee for driving it.
> > 
> >  Best regards,
> >  Yuxia
> > 
> >  - Original Message -
> >  From: "Lincoln Lee" 
> >  To: "dev" 
> >  Sent: Monday, Mar 13, 2023, 8:17:52 PM
> >  Subject: [VOTE] FLIP-300: Add targetColumns to DynamicTableSink#Context to
> >  solve the null overwrite problem of partial-insert
> > 
> >  Dear Flink developers,
> > 
> >  Thanks for all your feedback for FLIP-300: Add targetColumns to
> >  DynamicTableSink#Context to solve the null overwrite problem of
> >  partial-insert[1] on the discussion thread[2].
> > 
> >  I'd like to start a vote for it. The vote will be open for at least 72
> >  hours unless there is an objection or not enough votes.
> > 
> >  [1]
> > 
> > >>>
> > >>
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
> >  [2] https://lists.apache.org/thread/bk8x0nqg4oc62jqryj9ntzzlpj062wd9
> > 
> > 
> >  Best,
> >  Lincoln Lee
> > 
> > >>>
> > >>
> >
> >


Re: [VOTE] FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-03-13 Thread Jingsong Li
+1 (binding)

This does not block my vote.
But one comment about the `Public Interface` section in the FLIP: these
classes may not really be public interfaces? I think they are just internal
implementations.

Best,
Jingsong

On Mon, Mar 13, 2023 at 2:10 PM Benchao Li  wrote:
>
> +1 (binding)
>
> Shammon FY wrote on Mon, Mar 13, 2023 at 13:47:
>
> > Hi Devs,
> >
> > I'd like to start the vote on FLIP-293: Introduce Flink Jdbc Driver For Sql
> > Gateway [1].
> >
> > The FLIP was discussed in thread [2], and it aims to introduce Flink Jdbc
> > Driver module in Flink.
> >
> > The vote will last for at least 72 hours (03/16, 15:00 UTC+8) unless there
> > is an objection or insufficient votes. Thank you all.
> >
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway
> > [2] https://lists.apache.org/thread/d1owrg8zh77v0xygcpb93fxt0jpjdkb3
> >
> >
> > Best,
> > Shammon.FY
> >
>
>
> --
>
> Best,
> Benchao Li


Re: Re: [ANNOUNCE] New Apache Flink Committer - Yuxia Luo

2023-03-12 Thread Jingsong Li
Congratulations, Yuxia!

On Mon, Mar 13, 2023 at 11:49 AM Juntao Hu  wrote:
>
> Congratulations, Yuxia!
>
> Best,
> Juntao
>
>
> Wencong Liu wrote on Mon, Mar 13, 2023 at 11:33:
>
> > Congratulations, Yuxia!
> >
> > Best,
> > Wencong Liu
> >
> >
> > At 2023-03-13 11:20:21, "Qingsheng Ren"  wrote:
> > >Congratulations, Yuxia!
> > >
> > >Best,
> > >Qingsheng
> > >
> > >On Mon, Mar 13, 2023 at 10:27 AM Jark Wu  wrote:
> > >
> > >> Hi, everyone
> > >>
> > >> On behalf of the PMC, I'm very happy to announce Yuxia Luo as a new
> > Flink
> > >> Committer.
> > >>
> > >> Yuxia has been continuously contributing to the Flink project for almost
> > >> two
> > >> years, authored and reviewed hundreds of PRs over this time. He is
> > >> currently
> > >> the core maintainer of the Hive component, where he contributed many
> > >> valuable
> > >> features, including the Hive dialect with 95% compatibility and small
> > file
> > >> compaction.
> > >> In addition, Yuxia drove FLIP-282 (DELETE & UPDATE API) to better
> > >> integrate
> > >> Flink with data lakes. He actively participated in dev discussions and
> > >> answered
> > >> many questions on the user mailing list.
> > >>
> > >> Please join me in congratulating Yuxia Luo for becoming a Flink
> > Committer!
> > >>
> > >> Best,
> > >> Jark Wu (on behalf of the Flink PMC)
> > >>
> >


Re: [DISCUSS] FLIP-300: Add targetColumns to DynamicTableSink#Context to solve the null overwrite problem of partial-insert

2023-03-06 Thread Jingsong Li
Wow, we have 300 FLIPs...

Thanks Lincoln,

Have you considered returning an Optional?

Empty array looks a little weird to me.

Best,
Jingsong

On Tue, Mar 7, 2023 at 10:32 AM Aitozi  wrote:
>
> Hi Lincoln,
> Thank you for sharing this FLIP. Overall, it looks good to me. I have
> one question: with the introduction of this interface,
> will any existing Flink connectors need to be updated in order to take
> advantage of its capabilities? For example, HBase.
>
> yuxia wrote on Tue, Mar 7, 2023 at 10:01:
>
> > Thanks. It makes sense to me.
> >
> > Best regards,
> > Yuxia
> >
> > - Original Message -
> > From: "Lincoln Lee" 
> > To: "dev" 
> > Sent: Monday, Mar 6, 2023, 10:26:26 PM
> > Subject: Re: [DISCUSS] FLIP-300: Add targetColumns to DynamicTableSink#Context
> > to solve the null overwrite problem of partial-insert
> >
> > hi yuxia,
> >
> > Thanks for your feedback and tracking the issue of update statement! I've
> > updated the FLIP[1] and also the poc[2].
> > Since the bug and flip are orthogonal, we can focus on finalizing the api
> > changes first, and then work on the flip implementation and bugfix
> > separately, WDYT?
> >
> > [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
> > [2] https://github.com/apache/flink/pull/22041
> >
> > Best,
> > Lincoln Lee
> >
> >
> > yuxia wrote on Mon, Mar 6, 2023 at 21:21:
> >
> > > Hi, Lincoln.
> > > Thanks for bringing this up. +1 for this FLIP, it's helpful for external
> > > storage system to implement partial update.
> > > The FLIP looks good to me. I only want to add one comment, update
> > > statement also doesn't support updating nested column, I have created
> > > FLINK-31344[1] to track it.
> > > Maybe we also need to explain it in this FLIP.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-31344
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > - Original Message -
> > > From: "Lincoln Lee" 
> > > To: "dev" 
> > > Sent: Friday, Mar 3, 2023, 12:22:19 PM
> > > Subject: [DISCUSS] FLIP-300: Add targetColumns to DynamicTableSink#Context to
> > > solve the null overwrite problem of partial-insert
> > >
> > > Hi everyone,
> > >
> > > This FLIP[1] aims to support connectors in avoiding overwriting
> > non-target
> > > columns with null values when processing partial column updates, we
> > propose
> > > adding information on the target column list to DynamicTableSink#Context.
> > >
> > > FLINK-18726[2] supports inserting statements with specified column list,
> > it
> > > fills null values (or potentially declared default values in the future)
> > > for columns not appearing in the column list of insert statement to the
> > > target table.
> > > But this behavior does not satisfy some partial column update
> > requirements
> > > of some storage systems which allow storing null values. The problem is
> > > that connectors cannot distinguish whether the null value of a column is
> > > really from the user's data or whether it is a null value populated
> > because
> > > of partial insert behavior.
> > >
> > > Looking forward to your comments or feedback.
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
> > > [2] https://issues.apache.org/jira/browse/FLINK-18726
> > >
> > > Best,
> > > Lincoln Lee
> > >
> >
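The null-overwrite problem discussed in this thread can be sketched in a few lines. This is a simplified model with hypothetical names, not the actual FLIP-300 API (the proposal passes nested column index paths; a flat index array stands in for the target column list here): a sink that knows which columns the INSERT actually targeted can apply only those columns, so nulls padded into non-target columns never overwrite stored values.

```java
import java.util.Arrays;

// Sketch of how a sink could use target-column information (as proposed in
// FLIP-300) to implement partial update. Simplified model, not the Flink API.
public class PartialUpdateDemo {
    // Apply an incoming row to a stored row, touching only targetColumns.
    // Columns outside targetColumns keep their stored values, even though the
    // incoming row carries padded nulls for them.
    static Object[] merge(Object[] stored, Object[] incoming, int[] targetColumns) {
        Object[] result = Arrays.copyOf(stored, stored.length);
        for (int col : targetColumns) {
            result[col] = incoming[col]; // may legitimately be null
        }
        return result;
    }

    public static void main(String[] args) {
        Object[] stored = {1, "Alice", 30};
        // INSERT INTO t (id, age) ... : column 1 (name) is padded with null.
        Object[] incoming = {1, null, 31};
        Object[] merged = merge(stored, incoming, new int[] {0, 2});
        System.out.println(Arrays.toString(merged)); // [1, Alice, 31]
    }
}
```

Without the target-column list, the sink cannot tell the padded null in column 1 apart from a user-supplied null, which is exactly the ambiguity the FLIP addresses.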


Re: [DISCUSS] FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-03-02 Thread Jingsong Li
Hi, Shammon,

I took a look at JDBC `ResultSet` and `Statement`.  They are
complicated and have many interfaces. Some of the interfaces may not
be very suitable for streaming.

I think maybe we can just implement JDBC for batch/olap only. It is
hard to have an integration for JDBC and streaming...

Do you need to use JDBC in streaming mode? Or do we just implement
JDBC for batch only first?

Best,
Jingsong


On Thu, Mar 2, 2023 at 6:22 PM Shammon FY  wrote:
>
> Hi
>
> Thanks for the feedback from Jingsong and Benchao.
>
> For @Jingsong
> > If the user does not cast into a FlinkResultSet, will there be
> serious consequences here (RowKind is ignored)?
>
> I agree with you that it's indeed a big deal if users ignore the row kind
> when they must know it. One idea that comes to my mind is we can add an
> option such as `table.result.changelog-mode`, users can set it through
> connection properties or set dynamic parameters. The option value can be
> `insert-only`, `upsert`, or `all`, and the default value is `insert-only`.
>
> If the result does not conform to the changelog mode, the jdbc driver
> throws an exception. What do you think?
>
>
> For @Benchao
> > Besides `java.sql.Driver`, have you considered also adding support for
> `javax.sql.DataSource` interface?
>
> I missed the `javax.sql.DataSource` and I have added it to the FLIP, thanks
> Benchao
>
>
> Best,
> Shammon
>
> On Wed, Mar 1, 2023 at 7:57 PM Benchao Li  wrote:
>
> > +1 for the FLIP, thanks Shammon for driving this.
> >
> > JDBC is quite useful in OLAP scenarios, supporting JDBC would enable Flink
> > to be used with existing tools, such as Tableau.
> >
> > Regarding the JDBC interfaces listed in the FLIP, I think they looks good
> > already. Besides `java.sql.Driver`, have you considered also adding support
> > for `javax.sql.DataSource` interface?
> >
> > Jingsong Li  于2023年3月1日周三 17:53写道:
> >
> > > Thanks Shammon for driving.
> > >
> > > Big +1 for this.
> > >
> > > I heard that many users want to use FlinkGateway + JDBC to do some
> > > queries, but at present, only Hive JDBC can be used. It is Hive
> > > dialect by default, and the experience is also different from
> > > FlinkSQL. We need to have our own JDBC.
> > >
> > > I took a look at your `Public Interface` part, only
> > > `FlinkResultSet.getRowKind` is a true new interface, others are just
> > > implementations.
> > >
> > > If the user does not cast into a FlinkResultSet, will there be serious
> > > consequences here (RowKind is ignored)?
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Wed, Mar 1, 2023 at 4:59 PM Shammon FY  wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > I'd like to start a discussion about FLIP-293: Introduce Flink Jdbc
> > > Driver
> > > > For Sql Gateway[1].
> > > >
> > > > FLIP-275[2] supports remote sql client based on gateway, users can
> > > interact
> > > > with gateway by flink console. However, for users who create session
> > > > clusters with Flink, they'd like to use Jdbc Driver to interact with
> > the
> > > > gateway in their applications, such as olap queries..
> > > >
> > > > I have discussed this proposal with @shengkaifang and @jinsonglee. In
> > > this
> > > > FLIP, we'd like to introduce Jdbc Driver for gateway. Users can use
> > Jdbc
> > > > Driver to submit their queries and get results like a database in their
> > > > applications.
> > > >
> > > > Looking forward to your feedback, thanks.
> > > >
> > > >
> > > > [1]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway
> > > > [2]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-275%3A+Support+Remote+SQL+Client+Based+on+SQL+Gateway
> > > >
> > > >
> > > > Best,
> > > > Shammon
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >


Re: [VOTE] Flink minor version support policy for old releases

2023-03-02 Thread Jingsong Li
+1 (binding)

On Thu, Mar 2, 2023 at 1:30 PM Yu Li  wrote:
>
> +1 (binding)
>
> Best Regards,
> Yu
>
>
> On Thu, 2 Mar 2023 at 09:53, Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > > On Mar 2, 2023, at 05:03, Gyula Fóra wrote:
> > >
> > > +1 (binding)
> > >
> > > Gyula
> > >
> > > On Wed, Mar 1, 2023 at 9:57 PM Thomas Weise  wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> Thanks,
> > >> Thomas
> > >>
> > >> On Tue, Feb 28, 2023 at 6:53 AM Sergey Nuyanzin 
> > >> wrote:
> > >>
> > >>> +1 (non-binding)
> > >>>
> > >>> Thanks for driving this Danny.
> > >>>
> > >>> On Tue, Feb 28, 2023 at 9:41 AM Samrat Deb 
> > >> wrote:
> > >>>
> >  +1 (non binding)
> > 
> >  Thanks for driving it
> > 
> >  Bests,
> >  Samrat
> > 
> >  On Tue, 28 Feb 2023 at 1:36 PM, Junrui Lee 
> > >> wrote:
> > 
> > > Thanks Danny for driving it.
> > >
> > > +1 (non-binding)
> > >
> > > Best regards,
> > > Junrui
> > >
> > > yuxia wrote on Tue, Feb 28, 2023 at 14:04:
> > >
> > >> Thanks Danny for driving it.
> > >>
> > >> +1 (non-binding)
> > >>
> > >> Best regards,
> > >> Yuxia
> > >>
> > >> - Original Message -
> > >> From: "Weihua Hu" 
> > >> To: "dev" 
> > >> Sent: Tuesday, Feb 28, 2023, 12:48:09 PM
> > >> Subject: Re: [VOTE] Flink minor version support policy for old releases
> > >>
> > >> Thanks, Danny.
> > >>
> > >> +1 (non-binding)
> > >>
> > >> Best,
> > >> Weihua
> > >>
> > >>
> > >> On Tue, Feb 28, 2023 at 12:38 PM weijie guo <
> > >>> guoweijieres...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Thanks Danny for bring this.
> > >>>
> > >>> +1 (non-binding)
> > >>>
> > >>> Best regards,
> > >>>
> > >>> Weijie
> > >>>
> > >>>
> >  Jing Ge wrote on Mon, Feb 27, 2023 at 20:23:
> > >>>
> >  +1 (non-binding)
> > 
> >  BTW, should we follow the content style [1] to describe the new
> >  rule
> > >>> using
> >  1.2.x, 1.1.y, 1.1.z?
> > 
> >  [1]
> > > https://flink.apache.org/downloads/#update-policy-for-old-releases
> > 
> >  Best regards,
> >  Jing
> > 
> >  On Mon, Feb 27, 2023 at 1:06 PM Matthias Pohl
> >   wrote:
> > 
> > > Thanks, Danny. Sounds good to me.
> > >
> > > +1 (non-binding)
> > >
> > > On Wed, Feb 22, 2023 at 10:11 AM Danny Cranmer <
> > >>> dannycran...@apache.org>
> > > wrote:
> > >
> > >> I am starting a vote to update the "Update Policy for old
> > > releases"
> > >>> [1]
> > > to
> > >> include additional bugfix support for end of life versions.
> > >>
> > >> As per the discussion thread [2], the change we are voting
> > >> on
> >  is:
> > >> - Support policy: updated to include: "Upon release of a
> > >> new
> > > Flink
> >  minor
> > >> version, the community will perform one final bugfix
> > >> release
> >  for
> >  resolved
> > >> critical/blocker issues in the Flink minor version losing
> > > support."
> > >> - Release process: add a step to start the discussion
> > >> thread
> >  for
> > >> the
> > > final
> > >> patch version, if there are resolved critical/blocking
> > >> issues
> >  to
> > >>> flush.
> > >>
> > >> Voting schema: since our bylaws [3] do not cover this
> >  particular
> > > scenario,
> > >> and releases require PMC involvement, we will use a
> > >> consensus
> > > vote
> > >>> with
> > > PMC
> > >> binding votes.
> > >>
> > >> Thanks,
> > >> Danny
> > >>
> > >> [1]
> > >
> > >>
> > >>> https://flink.apache.org/downloads.html#update-policy-for-old-releases
> > >> [2]
> > >> https://lists.apache.org/thread/szq23kr3rlkm80rw7k9n95js5vqpsnbv
> > >> [3]
> > > https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws
> > >>
> > >
> > 
> > >>>
> > >>
> > >
> > 
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>> Sergey
> > >>>
> > >>
> >
> >


Re: [DISCUSS] FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-03-01 Thread Jingsong Li
Thanks Shammon for driving.

Big +1 for this.

I heard that many users want to use FlinkGateway + JDBC to do some
queries, but at present, only Hive JDBC can be used. It is Hive
dialect by default, and the experience is also different from
FlinkSQL. We need to have our own JDBC.

I took a look at your `Public Interface` part; only
`FlinkResultSet.getRowKind` is a truly new interface, the others are just
implementations.

If the user does not cast into a FlinkResultSet, will there be serious
consequences here (RowKind is ignored)?

Best,
Jingsong

On Wed, Mar 1, 2023 at 4:59 PM Shammon FY  wrote:
>
> Hi devs,
>
> I'd like to start a discussion about FLIP-293: Introduce Flink Jdbc Driver
> For Sql Gateway[1].
>
> FLIP-275[2] supports a remote SQL client based on the gateway; users can
> interact with the gateway via the Flink console. However, users who create
> session clusters with Flink would like to use a JDBC driver to interact with
> the gateway in their applications, such as OLAP queries.
>
> I have discussed this proposal with @shengkaifang and @jinsonglee. In this
> FLIP, we'd like to introduce Jdbc Driver for gateway. Users can use Jdbc
> Driver to submit their queries and get results like a database in their
> applications.
>
> Looking forward to your feedback, thanks.
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-275%3A+Support+Remote+SQL+Client+Based+on+SQL+Gateway
>
>
> Best,
> Shammon


Re: [DISCUSS] Deprecate deserialize method in DeserializationSchema

2023-02-28 Thread Jingsong Li
- `T deserialize(byte[] message)` is widely used and it is a public
api. It is very friendly for single record deserializers.
- `void deserialize(byte[] message, Collector out)` supports
multiple records.

I think we can just keep them as they are.

Best,
Jingsong


On Tue, Feb 28, 2023 at 3:08 PM Hang Ruan  wrote:
>
> Hi, Shammon,
>
> I think the method `void deserialize(byte[] message, Collector out)`
> with a default implementation encapsulates how to deal with null for
> developers. If we remove `T deserialize(byte[] message)`, developers
> have to remember to handle null themselves, and we may end up with
> duplicated code.
> Also, I find that only 5 implementations override the method `void
> deserialize(byte[] message, Collector out)`; the other implementations
> reuse the same null-handling code.
> I don't see the benefit of removing this method. Looking forward to other
> people's opinions.
>
> Best,
> Hang
>
>
>
> Shammon FY wrote on Tue, Feb 28, 2023 at 14:14:
>
> > Hi devs
> >
> > Currently there are two deserialization methods in `DeserializationSchema`
> > 1. `T deserialize(byte[] message)`, only deserialize one record from
> > binary, if there is no record it should return null.
> > 2. `void deserialize(byte[] message, Collector out)`, supports
> > deserializing none, one or multiple records gracefully, it can completely
> > replace method `T deserialize(byte[] message)`.
> >
> > The deserialization logic in the above two methods largely overlaps, and
> > we recommend users use the second method to deserialize data. To improve
> > code maintainability, I'd like to mark the first method as `@Deprecated`
> > and remove it once it is no longer used.
> >
> > I have created an issue[1] to track it, looking forward to your feedback,
> > thanks
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-31251
> >
> >
> > Best,
> > Shammon
> >


Re: [Discuss] Some questions on flink-table-store micro benchmark

2023-02-27 Thread Jingsong Li
Hi Yun and Shammon,

> track the performance changes of the micro benchmark

I think we can create a GitHub Action for this and print results every day.

Best,
Jingsong

On Mon, Feb 27, 2023 at 5:59 PM Shammon FY  wrote:
>
> Hi jingsong
>
> Getting rid of JMH is a good idea. For the second point, how can we track
> the performance changes of the micro benchmark? What do you think?
>
> Best,
> Shammon
>
> On Mon, Feb 27, 2023 at 10:57 AM Jingsong Li  wrote:
>
> > Thanks Yun.
> >
> > Another way is we can get rid of JMH, something like Spark
> > `org.apache.spark.benchmark.Benchmark` can replace JMH.
> >
> > Best,
> > Jingsong
> >
> > On Mon, Feb 27, 2023 at 1:24 AM Yun Tang  wrote:
> > >
> > > Hi dev,
> > >
> > > I just noticed that flink-table-store had introduced the micro benchmark
> > module [1] to test the basic performance. And I have two questions here.
> > > First of all, we might not be able to keep the micro benchmark, which is
> > based on JMH, in the main repo of flink-table-store. This is because JMH is
> > under GPL license, which is not compliant with Apache-2 license. That's why
> > Flink moved the flink-benchmark out [2].
> > >
> > > Secondly, I tried to run the micro benchmark locally but it seems to
> > fail. I just wonder whether we can make flink-table-store's micro benchmark
> > run periodically, just as flink-benchmarks does on
> > http://codespeed.dak8s.net:8080? Please correct me if such a daily
> > benchmark has already been set up.
> > > Moreover, maybe we can also consider integrating micro benchmark
> > notifications into the Slack channel, just as flink-benchmarks does.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-29636
> > > [2] https://issues.apache.org/jira/browse/FLINK-2973
> > >
> > > Best
> > > Yun Tang
> >


Re: [Discuss] Some questions on flink-table-store micro benchmark

2023-02-26 Thread Jingsong Li
Thanks Yun.

Another way is we can get rid of JMH, something like Spark
`org.apache.spark.benchmark.Benchmark` can replace JMH.

Best,
Jingsong

On Mon, Feb 27, 2023 at 1:24 AM Yun Tang  wrote:
>
> Hi dev,
>
> I just noticed that flink-table-store had introduced the micro benchmark 
> module [1] to test the basic performance. And I have two questions here.
> First of all, we might not be able to keep the micro benchmark, which is 
> based on JMH, in the main repo of flink-table-store. This is because JMH is 
> under GPL license, which is not compliant with Apache-2 license. That's why 
> Flink moved the flink-benchmark out [2].
>
> Secondly, I tried to run the micro benchmark locally but it seems to fail. I
> just wonder whether we can make flink-table-store's micro benchmark run
> periodically, just as flink-benchmarks does on
> http://codespeed.dak8s.net:8080? Please correct me if such a daily benchmark
> has already been set up.
> Moreover, maybe we can also consider integrating micro benchmark
> notifications into the Slack channel, just as flink-benchmarks does.
>
> [1] https://issues.apache.org/jira/browse/FLINK-29636
> [2] https://issues.apache.org/jira/browse/FLINK-2973
>
> Best
> Yun Tang


Re: [ANNOUNCE] Flink project website is now powered by Hugo

2023-02-23 Thread Jingsong Li
Thanks Martijn!

On Fri, Feb 24, 2023 at 9:46 AM yuxia  wrote:
>
> Thanks Martijn for your work.
>
> Best regards,
> Yuxia
>
> - Original Message -
> From: "Jing Ge" 
> To: "dev" 
> Sent: Friday, Feb 24, 2023, 5:20:30 AM
> Subject: Re: [ANNOUNCE] Flink project website is now powered by Hugo
>
> Congrats Martijn! You have made great progress. Thanks for your effort!
>
> Best regards,
> Jing
>
> On Thu, Feb 23, 2023 at 8:47 PM Konstantin Knauf  wrote:
>
> > Thanks, Martijn. That was a lot of work.
> >
> > On Thu, Feb 23, 2023 at 16:33, Maximilian Michels <
> > m...@apache.org>:
> >
> > > Congrats! Great work. This was a long time in the making!
> > >
> > > -Max
> > >
> > > On Thu, Feb 23, 2023 at 3:28 PM Martijn Visser  > >
> > > wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > The project website at https://flink.apache.org is now powered by Hugo
> > > [1]
> > > > which is the same system as the documentation.
> > > >
> > > > The theme is the same as the documentation website, so there's no
> > > redesign
> > > > involved.
> > > >
> > > > If you encounter any issues, please create a Jira ticket and feel free
> > to
> > > > ping me in it.
> > > >
> > > > Thanks to all that have been involved with testing and reviewing!
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-22922
> > >
> >
> >
> > --
> > https://twitter.com/snntrable
> > https://github.com/knaufk
> >


Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements

2023-02-23 Thread Jingsong Li
Hi Jing Ge,

First, flink-table-common contains all the common classes of Flink Table;
I think it is hard to bypass that dependency.

Secondly, almost all methods in Catalog look useful to me, so if we
were strictly following LoD, we would have to add all of them again to
TableEnvironment. I think that is redundant.

Also, this API chain does not look deep:
- "tEnv.getCatalog(tEnv.getCurrentCatalog()).get().listDatabases()"
looks a little complicated, but the complex part is at the front.
- If we had a method to get the Catalog directly, it could be simplified
to "tEnv.catalog().listDatabases()", which is simple.

Best,
Jingsong

On Thu, Feb 23, 2023 at 4:47 PM Jing Ge  wrote:
>
> Hi Jingson,
>
> Thanks for the knowledge sharing. IMHO, it looks more like a design
> guideline question than just avoiding public API change. Please correct me
> if I'm wrong.
>
> Catalog is in flink-table-common module and TableEnvironment is in
> flink-table-api-java. Depending on how and where those features proposed in
> this FLIP will be used, we'd better reduce the dependency chain and follow
> the Law of Demeter(LoD, clean code) [1]. Adding a new method in
> TableEnvironment is therefore better than calling an API chain. It is also
> more user friendly for the caller, because there is no need to understand
> the internal structure of the called API. The downside of doing this is
> that we might have another issue with the current TableEnvironment design -
> the TableEnvironment interface got enlarged with more wrapper methods. This
> is a different issue that could be solved with improved abstraction design
> in the future. After considering pros and cons, if we want to add those
> features now, I would prefer following LoD than API chain calls. WDYT?
>
> Best regards,
> Jing
>
> [1]
> https://hackernoon.com/object-oriented-tricks-2-law-of-demeter-4ecc9becad85
>
> On Thu, Feb 23, 2023 at 6:26 AM Ran Tao  wrote:
>
> > Hi Jingsong, thanks. I got it.
> > In this way, there is no need to introduce new API changes.
> >
> > Best Regards,
> > Ran Tao
> >
> >
> > Jingsong Li wrote on Thu, Feb 23, 2023 at 12:26:
> >
> > > Hi Ran,
> > >
> > > I mean we can just use
> > > TableEnvironment.getCatalog(getCurrentCatalog).get().listDatabases().
> > >
> > > We don't need to provide new apis just for utils.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Thu, Feb 23, 2023 at 12:11 PM Ran Tao  wrote:
> > > >
> > > > Hi Jingsong, thanks.
> > > >
> > > > The implementation of these statements in TableEnvironmentImpl is
> > called
> > > > through the catalog api.
> > > >
> > > > but it does support some new override methods on the catalog api side,
> > > and
> > > > I will update it later. Thank you.
> > > >
> > > > e.g.
> > > > // TableEnvironmentImpl
> > > > @Override
> > > > public String[] listDatabases() {
> > > >     return catalogManager
> > > >             .getCatalog(catalogManager.getCurrentCatalog())
> > > >             .get()
> > > >             .listDatabases()
> > > >             .toArray(new String[0]);
> > > > }
> > > >
> > > > Best Regards,
> > > > Ran Tao
> > > >
> > > >
> > > > Jingsong Li wrote on Thu, Feb 23, 2023 at 11:47:
> > > >
> > > > > Thanks for the proposal.
> > > > >
> > > > > +1 for the proposal.
> > > > >
> > > > > I am confused about "Proposed TableEnvironment SQL API Changes", can
> > > > > we just use catalog api for this requirement?
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > > > > On Thu, Feb 23, 2023 at 10:48 AM Jacky Lau 
> > > wrote:
> > > > > >
> > > > > > Hi Ran:
> > > > > > Thanks for driving the FLIP. The Google doc looks really good. It
> > > > > > is important to improve the user's interactive experience. +1 to
> > > > > > support this feature.
> > > > > >
> > > > > > Jing Ge wrote on Thu, Feb 23, 2023 at 00:51:
> > > > > >
> > > > > > > Hi Ran,
> > > > > > >
> > > > > > > Thanks for driving the FLIP.  It looks overall good. Would you
> > > like to
> > > > > add
> > > > > > > a description of useLike and 

Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements

2023-02-22 Thread Jingsong Li
Hi Ran,

I mean we can just use
TableEnvironment.getCatalog(getCurrentCatalog).get().listDatabases().

We don't need to provide new apis just for utils.

Best,
Jingsong

On Thu, Feb 23, 2023 at 12:11 PM Ran Tao  wrote:
>
> Hi Jingsong, thanks.
>
> The implementation of these statements in TableEnvironmentImpl is called
> through the catalog api.
>
> but it does support some new override methods on the catalog api side, and
> I will update it later. Thank you.
>
> e.g.
> // TableEnvironmentImpl
> @Override
> public String[] listDatabases() {
>     return catalogManager
>             .getCatalog(catalogManager.getCurrentCatalog())
>             .get()
>             .listDatabases()
>             .toArray(new String[0]);
> }
>
> Best Regards,
> Ran Tao
>
>
> Jingsong Li wrote on Thu, Feb 23, 2023 at 11:47:
>
> > Thanks for the proposal.
> >
> > +1 for the proposal.
> >
> > I am confused about "Proposed TableEnvironment SQL API Changes", can
> > we just use catalog api for this requirement?
> >
> > Best,
> > Jingsong
> >
> > On Thu, Feb 23, 2023 at 10:48 AM Jacky Lau  wrote:
> > >
> > > Hi Ran:
> > > Thanks for driving the FLIP. The Google doc looks really good. It is
> > > important to improve the user's interactive experience. +1 to support
> > > this feature.
> > >
> > > Jing Ge wrote on Thu, Feb 23, 2023 at 00:51:
> > >
> > > > Hi Ran,
> > > >
> > > > Thanks for driving the FLIP.  It looks overall good. Would you like to
> > add
> > > > a description of useLike and notLike? I guess useLike true is for
> > "LIKE"
> > > > and notLike true is for "NOT LIKE" but I am not sure if I understood it
> > > > correctly. Furthermore, does it make sense to support "ILIKE" too?
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Wed, Feb 22, 2023 at 1:17 PM Ran Tao  wrote:
> > > >
> > > > > Currently, Flink SQL auxiliary statements support some good
> > > > > features such as catalog/database/table support.
> > > > >
> > > > > But these features are not very complete compared with other popular
> > > > > engines such as Spark, Presto, Hive and commercial engines such as
> > > > > Snowflake.
> > > > >
> > > > > For example, many engines except Flink support the SHOW operation
> > > > > with filtering, and support DESCRIBE for objects other than tables
> > > > > (Flink only supports DESCRIBE TABLE).
> > > > >
> > > > > I wonder, can we add these useful features to Flink?
> > > > > You can find details in this doc [1] or the FLIP [2].
> > > > >
> > > > > Also, please let me know if there is a mistake. Looking forward to
> > > > > your reply.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > https://docs.google.com/document/d/1hAiOfPx14VTBTOlpyxG7FA2mB1k5M31VnKYad2XpJ1I/
> > > > > [2]
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
> > > > >
> > > > > Best Regards,
> > > > > Ran Tao
> > > > >
> > > >
> >
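To make the gap discussed in this thread concrete, the kind of filtered SHOW and extended DESCRIBE statements under proposal look roughly as follows. The exact Flink syntax is specified in the FLIP; this sketch follows the common form found in engines such as Spark and Snowflake, and all object names are hypothetical:

```sql
-- SHOW with filtering: list only objects whose names match a pattern.
SHOW TABLES LIKE 'orders%';
SHOW TABLES NOT LIKE 'tmp%';
-- ILIKE (case-insensitive LIKE, raised in the thread) as in Snowflake.
SHOW TABLES ILIKE 'ORDERS%';

-- SHOW scoped to another catalog/database.
SHOW TABLES FROM my_catalog.my_db LIKE 'orders%';

-- DESCRIBE for objects other than tables.
DESCRIBE CATALOG my_catalog;
DESCRIBE DATABASE my_db;
```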


Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements

2023-02-22 Thread Jingsong Li
Thanks for the proposal.

+1 for the proposal.

I am confused about "Proposed TableEnvironment SQL API Changes", can
we just use catalog api for this requirement?

Best,
Jingsong

On Thu, Feb 23, 2023 at 10:48 AM Jacky Lau  wrote:
>
> Hi Ran:
> Thanks for driving the FLIP. The Google doc looks really good. It is
> important to improve the user interactive experience. +1 to support this
> feature.
>
> On Thu, Feb 23, 2023 at 00:51, Jing Ge wrote:
>
> > Hi Ran,
> >
> > Thanks for driving the FLIP.  It looks overall good. Would you like to add
> > a description of useLike and notLike? I guess useLike true is for "LIKE"
> > and notLike true is for "NOT LIKE" but I am not sure if I understood it
> > correctly. Furthermore, does it make sense to support "ILIKE" too?
> >
> > Best regards,
> > Jing
> >
> > On Wed, Feb 22, 2023 at 1:17 PM Ran Tao  wrote:
> >
> > > Currently, Flink SQL auxiliary statements support some good features
> > > such as catalog/database/table support.
> > >
> > > But these features are not very complete compared with other popular
> > > engines such as Spark, Presto, Hive and commercial engines such as
> > > Snowflake.
> > >
> > > For example, many engines except Flink support the SHOW operation with
> > > filtering, and support DESCRIBE for objects other than tables (Flink
> > > only supports DESCRIBE TABLE).
> > >
> > > I wonder, can we add these useful features to Flink?
> > > You can find details in this doc [1] or the FLIP [2].
> > >
> > > Also, please let me know if there is a mistake. Looking forward to your
> > > reply.
> > >
> > > [1]
> > >
> > >
> > https://docs.google.com/document/d/1hAiOfPx14VTBTOlpyxG7FA2mB1k5M31VnKYad2XpJ1I/
> > > [2]
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
> > >
> > > Best Regards,
> > > Ran Tao
> > >
> >


Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL

2023-02-22 Thread Jingsong Li
Thanks for your proposal.

+1 to yuxia, consider watermark-related hints as option hints.

Personally, I am cautious about adding SQL syntax, WATERMARK_PARAMS is
also SQL syntax to some extent.

We can use OPTIONS to meet this requirement if possible.

Best,
Jingsong

On Thu, Feb 23, 2023 at 10:41 AM yuxia  wrote:
>
> Hi, Yuan Kui.
> Thanks for driving it.
> IMO, the 'OPTIONS' hint may not be specific to connector options only.
> Just as a reference, we also have `sink.parallelism`[1] as a connector
> option. It enables
> users to specify the writer's parallelism dynamically per query.
>
> Personally, I prefer to consider watermark-related hints as option hints. So,
> users can define a default watermark strategy for the table, and if a user
> doesn't need to change it, they need to do nothing in their query instead
> of specifying it every time.
>
> [1] 
> https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism
>
> Best regards,
> Yuxia
>
> - Original Message -
> From: "kui yuan" 
> To: "dev" 
> Cc: "Jark Wu" 
> Sent: Wednesday, Feb 22, 2023, 10:08:11 PM
> Subject: Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
>
> Hi all,
>
> Thanks for the lively discussion and I will respond to these questions one
> by one. However, there are also some common questions and I will answer
> together.
>
> @郑 Thanks for your reply. The features mentioned in this flip are only for
> those source connectors that implement the SupportsWatermarkPushDown
> interface, generating watermarks in other graph locations is not in the
> scope of this discussion. Perhaps another flip can be proposed later to
> implement this feature.
>
> @Shammon Thanks for your reply. In FLIP-296, a rejected alternative is
> adding watermark-related options to the connector options, since we believe
> that we should not bind the watermark-related options to a connector, to
> ensure semantic clarity.
>
> > What will happen if we add watermark related options in `the connector
> > options`? Will the connector ignore these options or throw an exception?
> > How can we support this?
>
> If a user defines different watermark configurations for one table in two
> places, I tend to prefer that the first place prevail, but we can also
> throw an exception or just print logs to prompt the user; these are
> implementation details.
>
> > If one table is used by two operators with different watermark params,
> > what will happen?
>
> @Martijn Thanks for your reply. I'm sorry that we were not particularly
> accurate; this hint is mainly for SQL, not the Table API.
>
> > While the FLIP talks about watermark options for Table API & SQL, I only
> > see proposed syntax for SQL, not for the Table API. What is your proposal
> > for the Table API
>
> @Jane Thanks for your reply. For the first question, if the user uses this
> hint on a source that does not implement the SupportsWatermarkPushDown
> interface, it will simply have no effect. The task will run as normal, as if
> the hint had not been used.
>
> > What's the behavior if there are multiple table sources, among which
> > some do not support `SupportsWatermarkPushDown`?
>
> @Jane gave feedback that 'WATERMARK_PARAMS' is difficult to remember. Perhaps
> the naming issue can be put at the end of the discussion, because more
> people like @Martijn @Shuo are considering whether these configurations
> should be put into the DDL or the 'OPTIONS' hint. Here's what I
> think: putting these configs into DDL or putting them into the 'OPTIONS' hint
> is actually the same thing, because the 'OPTIONS' hint is mainly used to
> configure the properties of the connector. The reason why I want to use a new
> hint is to keep the semantics clear; in my opinion the configuration
> of the watermark should not be mixed up with the connector. However, a new hint
> does make it more difficult to use to some extent, for example, when a user
> uses both the 'OPTIONS' hint and the 'WATERMARK_PARAMS' hint. For this point,
> maybe it is more appropriate to use a uniform 'OPTIONS' hint.
> On the other hand, if we enrich more watermark option keys in 'OPTIONS'
> hints, the question will be how we treat the definition of the 'OPTIONS'
> hint: is it only specific to the connector options, or could it be more?
> Maybe @Jark could share more insights here. In my opinion, 'OPTIONS' is only
> related to the connector options, which is not like the general watermark
> options.
>
>
>
> On Wed, Feb 22, 2023 at 19:17, Shuo Cheng wrote:
>
> > Hi Kui,
> >
> > Thanks for driving the discussion. It's quite useful to introduce Watermark
> > options. I have some questions:
> >
> > What kind of hints is "WATERMARK_PARAMS"?
> > Currently, we have two kinds of hints in Flink: Dynamic Table Options &
> > Query Hints. As described in the Flip, "WATERMARK_PARAMS" is more like
> > Dynamic Table Options. So two questions arise here:
> >
> > 1) Are these watermark options to be exposed as connector WITH options? As
> > described in SQL Hin
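The two hint styles weighed in this thread can be sketched as follows. The option keys below are hypothetical and the WATERMARK_PARAMS name was still under debate at this point, so treat this purely as an illustration of the alternatives rather than final syntax:

```sql
-- Alternative 1: reuse the existing dynamic-table OPTIONS hint and enrich it
-- with watermark-related option keys (the key name here is hypothetical).
SELECT * FROM orders /*+ OPTIONS('scan.watermark.emit-strategy' = 'on-periodic') */;

-- Alternative 2: a dedicated hint, kept separate from connector options for
-- semantic clarity; WATERMARK_PARAMS is the name proposed in the FLIP.
SELECT * FROM orders /*+ WATERMARK_PARAMS('emit-strategy' = 'on-periodic') */;
```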

Re: [ANNOUNCE] New Apache Flink Committer - Jing Ge

2023-02-13 Thread Jingsong Li
Congratulations Jing!

Best,
Jingsong

On Tue, Feb 14, 2023 at 3:50 PM godfrey he  wrote:
>
> Hi everyone,
>
> On behalf of the PMC, I'm very happy to announce Jing Ge as a new Flink
> committer.
>
> Jing has been consistently contributing to the project for over 1 year.
> He has authored more than 50 PRs and reviewed more than 40 PRs,
> with a main focus on the connector, test, and documentation modules.
> He was very active on the mailing list (more than 90 threads) last year,
> which includes participating in a lot of dev discussions (30+),
> providing many effective suggestions for FLIPs and answering
> many user questions. He was a Flink Forward 2022 keynote speaker,
> helping to promote Flink, and a trainer for the Flink troubleshooting and
> performance tuning course of the Flink Forward 2022 training program.
>
> Please join me in congratulating Jing for becoming a Flink committer!
>
> Best,
> Godfrey


[ANNOUNCE] Apache Flink Table Store 0.3.0 released

2023-01-13 Thread Jingsong Li
The Apache Flink community is very happy to announce the release of
Apache Flink Table Store 0.3.0.

Apache Flink Table Store is a unified storage to build dynamic tables
for both streaming and batch processing in Flink, supporting
high-speed data ingestion and timely data query.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/news/2023/01/13/release-table-store-0.3.0.html

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink Table Store can be found at:
https://central.sonatype.dev/search?q=flink-table-store

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352111

We would like to thank all contributors of the Apache Flink community
who made this release possible!

Best,
Jingsong Lee


[RESULT][VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-13 Thread Jingsong Li
I'm happy to announce that we have unanimously approved this release.

There are 3 approving votes, 3 of which are binding:
* Yu Li (binding)
* Jark Wu (binding)
* Jingsong Lee (binding)

There are no disapproving votes.

Thank you for verifying the release candidate. I will now proceed to
finalize the release and announce it once everything is published.

Best,
Jingsong


Re: [VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-13 Thread Jingsong Li
+1 (binding)

Best,
Jingsong

On Fri, Jan 13, 2023 at 5:16 PM Jark Wu  wrote:
>
> +1 (binding)
>
> - Build and compile the source code locally: *OK*
> - Verified signatures and hashes: *OK*
> - Checked no missing artifacts in the staging area: *OK*
> - Reviewed the website release PR: *OK*
> - Checked the licenses: *OK*
> - Went through the quick start: *OK*
>   * Verified with both flink 1.14.5 and 1.15.1 using
>   * Verified web UI and log output, nothing unexpected
>
> Best,
> Jark
>
> On Thu, 12 Jan 2023 at 20:52, Jingsong Li  wrote:
>
> > Thanks Yu, I have created table-store-0.3.1 and moved these JIRAs to 0.3.1.
> >
> > Best,
> > Jingsong
> >
> > On Thu, Jan 12, 2023 at 7:41 PM Yu Li  wrote:
> > >
> > > Thanks for the quick action Jingsong! Here is my vote with the new
> > staging
> > > directory:
> > >
> > > +1 (binding)
> > >
> > >
> > > - Checked release notes: *Action Required*
> > >
> > >   * The fix versions of FLINK-30620 and FLINK-30628 are 0.3.0 but the
> > > issues are still open; please confirm whether they should be included or
> > > we should move them out of 0.3.0
> > >
> > > - Checked sums and signatures: *OK*
> > >
> > > - Checked the jars in the staging repo: *OK*
> > >
> > > - Checked source distribution doesn't include binaries: *OK*
> > >
> > > - Maven clean install from source: *OK*
> > >
> > > - Checked version consistency in pom files: *OK*
> > >
> > > - Went through the quick start: *OK*
> > >
> > >   * Verified with flink 1.14.6, 1.15.3 and 1.16.0
> > >
> > > - Checked the website updates: *OK*
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Thu, 12 Jan 2023 at 15:36, Jingsong Li 
> > wrote:
> > >
> > > > Thanks Yu for your validation.
> > > >
> > > > I created a new staging directory [1]
> > > >
> > > > [1]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1577/
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Thu, Jan 12, 2023 at 3:07 PM Yu Li  wrote:
> > > > >
> > > > > Hi Jingsong,
> > > > >
> > > > > It seems the given staging directory [1] is not exposed, could you
> > double
> > > > > check and republish if necessary? Thanks.
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > > [1]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1576/
> > > > >
> > > > >
> > > > > On Tue, 10 Jan 2023 at 16:53, Jingsong Li 
> > > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > Please review and vote on the release candidate #1 for the version
> > > > > > 0.3.0 of Apache Flink Table Store, as follows:
> > > > > >
> > > > > > [ ] +1, Approve the release
> > > > > > [ ] -1, Do not approve the release (please provide specific
> > comments)
> > > > > >
> > > > > > **Release Overview**
> > > > > >
> > > > > > As an overview, the release consists of the following:
> > > > > > a) Table Store canonical source distribution to be deployed to the
> > > > > > release repository at dist.apache.org
> > > > > > b) Table Store binary convenience releases to be deployed to the
> > > > > > release repository at dist.apache.org
> > > > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > > > >
> > > > > > **Staging Areas to Review**
> > > > > >
> > > > > > The staging areas containing the above mentioned artifacts are as
> > > > follows,
> > > > > > for your review:
> > > > > > * All artifacts for a) and b) can be found in the corresponding dev
> > > > > > repository at dist.apache.org [2]
> > > > > > * All artifacts for c) can be found at the Apache Nexus Repository
> > [3]
> > > > > >
> > > > > > All artifacts are signed with the key
> > > > > > 2C2B6A653B07086B65E4369F7C76245E0A318150 [4]
> > > > > >
> > > > > > Other links for your review:
> > > > > > * JIRA release notes [5]
> > > > > > * source code tag "release-0.3.0-rc1" [6]
> > > > > > * PR to update the website Downloads page to include Table Store
> > links
> > > > [7]
> > > > > >
> > > > > > **Vote Duration**
> > > > > >
> > > > > > The voting time will run for at least 72 hours.
> > > > > > It is adopted by majority approval, with at least 3 PMC affirmative
> > > > votes.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong Lee
> > > > > >
> > > > > > [1]
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Table+Store+Release
> > > > > > [2]
> > > > > >
> > > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-table-store-0.3.0-rc1/
> > > > > > [3]
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1576/
> > > > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > > > [5]
> > > > > >
> > > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352111
> > > > > > [6]
> > https://github.com/apache/flink-table-store/tree/release-0.3.0-rc1
> > > > > > [7] https://github.com/apache/flink-web/pull/601
> > > > > >
> > > >
> >


Re: [VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-12 Thread Jingsong Li
Thanks Yu, I have created table-store-0.3.1 and moved these JIRAs to 0.3.1.

Best,
Jingsong

On Thu, Jan 12, 2023 at 7:41 PM Yu Li  wrote:
>
> Thanks for the quick action Jingsong! Here is my vote with the new staging
> directory:
>
> +1 (binding)
>
>
> - Checked release notes: *Action Required*
>
>   * The fix versions of FLINK-30620 and FLINK-30628 are 0.3.0 but the
> issues are still open; please confirm whether they should be included or
> we should move them out of 0.3.0
>
> - Checked sums and signatures: *OK*
>
> - Checked the jars in the staging repo: *OK*
>
> - Checked source distribution doesn't include binaries: *OK*
>
> - Maven clean install from source: *OK*
>
> - Checked version consistency in pom files: *OK*
>
> - Went through the quick start: *OK*
>
>   * Verified with flink 1.14.6, 1.15.3 and 1.16.0
>
> - Checked the website updates: *OK*
>
> Best Regards,
> Yu
>
>
> On Thu, 12 Jan 2023 at 15:36, Jingsong Li  wrote:
>
> > Thanks Yu for your validation.
> >
> > I created a new staging directory [1]
> >
> > [1]
> > https://repository.apache.org/content/repositories/orgapacheflink-1577/
> >
> > Best,
> > Jingsong
> >
> > On Thu, Jan 12, 2023 at 3:07 PM Yu Li  wrote:
> > >
> > > Hi Jingsong,
> > >
> > > It seems the given staging directory [1] is not exposed, could you double
> > > check and republish if necessary? Thanks.
> > >
> > > Best Regards,
> > > Yu
> > >
> > > [1]
> > https://repository.apache.org/content/repositories/orgapacheflink-1576/
> > >
> > >
> > > On Tue, 10 Jan 2023 at 16:53, Jingsong Li 
> > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Please review and vote on the release candidate #1 for the version
> > > > 0.3.0 of Apache Flink Table Store, as follows:
> > > >
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > **Release Overview**
> > > >
> > > > As an overview, the release consists of the following:
> > > > a) Table Store canonical source distribution to be deployed to the
> > > > release repository at dist.apache.org
> > > > b) Table Store binary convenience releases to be deployed to the
> > > > release repository at dist.apache.org
> > > > c) Maven artifacts to be deployed to the Maven Central Repository
> > > >
> > > > **Staging Areas to Review**
> > > >
> > > > The staging areas containing the above mentioned artifacts are as
> > follows,
> > > > for your review:
> > > > * All artifacts for a) and b) can be found in the corresponding dev
> > > > repository at dist.apache.org [2]
> > > > * All artifacts for c) can be found at the Apache Nexus Repository [3]
> > > >
> > > > All artifacts are signed with the key
> > > > 2C2B6A653B07086B65E4369F7C76245E0A318150 [4]
> > > >
> > > > Other links for your review:
> > > > * JIRA release notes [5]
> > > > * source code tag "release-0.3.0-rc1" [6]
> > > > * PR to update the website Downloads page to include Table Store links
> > [7]
> > > >
> > > > **Vote Duration**
> > > >
> > > > The voting time will run for at least 72 hours.
> > > > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > [1]
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Table+Store+Release
> > > > [2]
> > > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-table-store-0.3.0-rc1/
> > > > [3]
> > > >
> > https://repository.apache.org/content/repositories/orgapacheflink-1576/
> > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [5]
> > > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352111
> > > > [6] https://github.com/apache/flink-table-store/tree/release-0.3.0-rc1
> > > > [7] https://github.com/apache/flink-web/pull/601
> > > >
> >


Re: [VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-11 Thread Jingsong Li
Thanks Yu for your validation.

I created a new staging directory [1]

[1] https://repository.apache.org/content/repositories/orgapacheflink-1577/

Best,
Jingsong

On Thu, Jan 12, 2023 at 3:07 PM Yu Li  wrote:
>
> Hi Jingsong,
>
> It seems the given staging directory [1] is not exposed, could you double
> check and republish if necessary? Thanks.
>
> Best Regards,
> Yu
>
> [1] https://repository.apache.org/content/repositories/orgapacheflink-1576/
>
>
> On Tue, 10 Jan 2023 at 16:53, Jingsong Li  wrote:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version
> > 0.3.0 of Apache Flink Table Store, as follows:
> >
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > **Release Overview**
> >
> > As an overview, the release consists of the following:
> > a) Table Store canonical source distribution to be deployed to the
> > release repository at dist.apache.org
> > b) Table Store binary convenience releases to be deployed to the
> > release repository at dist.apache.org
> > c) Maven artifacts to be deployed to the Maven Central Repository
> >
> > **Staging Areas to Review**
> >
> > The staging areas containing the above mentioned artifacts are as follows,
> > for your review:
> > * All artifacts for a) and b) can be found in the corresponding dev
> > repository at dist.apache.org [2]
> > * All artifacts for c) can be found at the Apache Nexus Repository [3]
> >
> > All artifacts are signed with the key
> > 2C2B6A653B07086B65E4369F7C76245E0A318150 [4]
> >
> > Other links for your review:
> > * JIRA release notes [5]
> > * source code tag "release-0.3.0-rc1" [6]
> > * PR to update the website Downloads page to include Table Store links [7]
> >
> > **Vote Duration**
> >
> > The voting time will run for at least 72 hours.
> > It is adopted by majority approval, with at least 3 PMC affirmative votes.
> >
> > Best,
> > Jingsong Lee
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Table+Store+Release
> > [2]
> > https://dist.apache.org/repos/dist/dev/flink/flink-table-store-0.3.0-rc1/
> > [3]
> > https://repository.apache.org/content/repositories/orgapacheflink-1576/
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5]
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352111
> > [6] https://github.com/apache/flink-table-store/tree/release-0.3.0-rc1
> > [7] https://github.com/apache/flink-web/pull/601
> >


[VOTE] Apache Flink Table Store 0.3.0, release candidate #1

2023-01-10 Thread Jingsong Li
Hi everyone,

Please review and vote on the release candidate #1 for the version
0.3.0 of Apache Flink Table Store, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Release Overview**

As an overview, the release consists of the following:
a) Table Store canonical source distribution to be deployed to the
release repository at dist.apache.org
b) Table Store binary convenience releases to be deployed to the
release repository at dist.apache.org
c) Maven artifacts to be deployed to the Maven Central Repository

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as follows,
for your review:
* All artifacts for a) and b) can be found in the corresponding dev
repository at dist.apache.org [2]
* All artifacts for c) can be found at the Apache Nexus Repository [3]

All artifacts are signed with the key
2C2B6A653B07086B65E4369F7C76245E0A318150 [4]

Other links for your review:
* JIRA release notes [5]
* source code tag "release-0.3.0-rc1" [6]
* PR to update the website Downloads page to include Table Store links [7]

**Vote Duration**

The voting time will run for at least 72 hours.
It is adopted by majority approval, with at least 3 PMC affirmative votes.

Best,
Jingsong Lee

[1] 
https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Table+Store+Release
[2] https://dist.apache.org/repos/dist/dev/flink/flink-table-store-0.3.0-rc1/
[3] https://repository.apache.org/content/repositories/orgapacheflink-1576/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352111
[6] https://github.com/apache/flink-table-store/tree/release-0.3.0-rc1
[7] https://github.com/apache/flink-web/pull/601


Re: [ANNOUNCE] New Apache Flink Committer - Lincoln Lee

2023-01-09 Thread Jingsong Li
Congratulations, Lincoln!

Best,
Jingsong

On Tue, Jan 10, 2023 at 11:56 AM Leonard Xu  wrote:
>
> Congratulations, Lincoln!
>
> Impressive work in streaming semantics, well deserved!
>
>
> Best,
> Leonard
>
>
> > On Jan 10, 2023, at 11:52 AM, Jark Wu  wrote:
> >
> > Hi everyone,
> >
> > On behalf of the PMC, I'm very happy to announce Lincoln Lee as a new Flink
> > committer.
> >
> > Lincoln Lee has been a long-term Flink contributor since 2017. He mainly
> > works on Flink
> > SQL parts and drives several important FLIPs, e.g., FLIP-232 (Retry Async
> > I/O), FLIP-234 (
> > Retryable Lookup Join), FLIP-260 (TableFunction Finish). Besides, he also
> > contributed
> > much to streaming semantics, including the non-determinism problem and the
> > message
> > ordering problem.
> >
> > Please join me in congratulating Lincoln for becoming a Flink committer!
> >
> > Cheers,
> > Jark Wu
>


Re: [VOTE] FLIP-282: Introduce Delete & Update API

2023-01-09 Thread Jingsong Li
+1 binding

On Mon, Jan 9, 2023 at 6:14 PM Samrat Deb  wrote:
>
> +1 (non binding )
>
> thank you for driving
>
>
>
> On Mon, 9 Jan 2023 at 3:36 PM, yuxia  wrote:
>
> > Hi, all.
> >
> > I'd like to start a vote on FLIP-282: Introduce Delete & Update API[1].
> > You can find the discussion on it in here[2].
> >
> > The vote will last for at least 72 hours (Jan 12th at 10:00 AM GMT )
> > unless there is objection or insufficient votes.
> >
> > [1] [
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061
> > |
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061
> > ]
> > [2] [ https://lists.apache.org/thread/6h64v7v6gj916pkvmc3ql3vxxccr46r3 |
> > https://lists.apache.org/thread/6h64v7v6gj916pkvmc3ql3vxxccr46r3 ]
> >
> > Best regards,
> > Yuxia
> >
> >
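For readers following the vote, the user-facing statements that the Delete & Update API enables batch connectors to support can be sketched as follows (table and column names are hypothetical; the FLIP itself defines only the connector-side API, not new SQL syntax):

```sql
-- Row-level delete, pushed down to connectors implementing the new API.
DELETE FROM orders WHERE order_status = 'CANCELLED';

-- Row-level update, likewise delegated to the connector.
UPDATE orders SET order_status = 'ARCHIVED'
WHERE order_date < DATE '2022-01-01';
```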


Re: [VOTE] FLIP-280: Introduce EXPLAIN PLAN_ADVICE to provide SQL advice

2023-01-09 Thread Jingsong Li
+1 (binding)

On Mon, Jan 9, 2023 at 6:19 PM Jane Chan  wrote:
>
> Hi all,
>
> Thanks for all the feedback so far.
> Based on the discussion[1], we have come to a consensus, so I would like to
> start a vote on FLIP-280: Introduce EXPLAIN PLAN_ADVICE to provide SQL
> advice[2].
>
> The vote will last for at least 72 hours (Jan 12th at 10:00 GMT)
> unless there is an objection or insufficient votes.
>
> [1] https://lists.apache.org/thread/5xywxv7g43byoh0jbx1b6qo6gx6wjkcz
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-280%3A+Introduce+EXPLAIN+PLAN_ADVICE+to+provide+SQL+advice
>
> Best,
> Jane Chan


Re: [DISCUSS] FLIP-280: Introduce a new explain mode to provide SQL advice

2023-01-06 Thread Jingsong Li
re another advisor?
>
>
>  After reconsideration, I would like to let PlanAdvisor be an internal
> interface, which is different from implementing a custom connector/format.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-280%3A+Introduce+EXPLAIN+PLAN_ADVICE+to+provide+SQL+advice
> [2] https://dev.mysql.com/doc/refman/8.0/en/explain.html#explain-analyze
> [3] https://www.postgresql.org/docs/current/sql-explain.html
> [4] https://prestodb.io/docs/current/sql/explain-analyze.html
> [5] https://docs.pingcap.com/tidb/dev/sql-statement-explain-analyze
>
> Best regards,
> Jane
>
> On Tue, Jan 3, 2023 at 6:20 PM Jingsong Li  wrote:
>
> > Thanks Jane for the FLIP! It looks very nice!
> >
> > Can you give examples of other systems for the syntax?
> > In other systems, is EXPLAIN ANALYZE already PHYSICAL_PLAN?
> >
> > `EXPLAIN ANALYZED_PHYSICAL_PLAN ` looks a bit strange, and even
> > stranger that it contains `advice`.
> >
> > The purpose of FLIP seems to be a bit more to `advice`, so can we just
> > introduce a syntax for `advice`?
> >
> > Best,
> > Jingsong
> >
> > On Tue, Jan 3, 2023 at 3:40 PM godfrey he  wrote:
> > >
> > > Thanks for driving this discussion.
> > >
> > > Do we really need to expose `PlanAnalyzerFactory` as a public interface?
> > > I prefer we only expose ExplainDetail#ANALYZED_PHYSICAL_PLAN and the
> > > analyzed result,
> > > which is enough for users and consistent with the results of the
> > > `explain` method.
> > >
> > > The classes about the plan analyzer are in the table planner module,
> > > which is not public API
> > > (public interfaces should be defined in the flink-table-api-java module).
> > > And PlanAnalyzer depends on RelNode, which is an internal class of the
> > > planner and is not exposed to users.
> > >
> > > Bests,
> > > Godfrey
> > >
> > >
> > > On Tue, Jan 3, 2023 at 13:43, Shengkai Fang wrote:
> > > >
> > > > Sorry for the missing answer about the configuration of the Analyzer.
> > > > Users may not need to configure this with SQL statements. In the SQL
> > > > Gateway, users can configure the endpoints with the option
> > > > `sql-gateway.endpoint.type` in the flink-conf.
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> > > > On Tue, Jan 3, 2023 at 12:26, Shengkai Fang wrote:
> > > >
> > > > > Hi, Jane.
> > > > >
> > > > > Thanks for bringing this to the discussion. I have some questions
> > about
> > > > > the FLIP:
> > > > >
> > > > > 1. `PlanAnalyzer#analyze` uses the FlinkRelNode as the input. Could
> > you
> > > > > share some thoughts about the motivation? In my experience, users
> > mainly
> > > > > care about 2 things when they develop their job:
> > > > >
> > > > > a. Why does their SQL not work? For example, their streaming SQL
> > > > > contains an OVER window but their ORDER key is not ROWTIME. In this
> > > > > case, we may not have a physical node or logical node because, during
> > > > > the optimization, the planner has already thrown the exception.
> > > > >
> > > > > b. Many users care about whether their state is compatible after
> > > > > upgrading their Flink version. In this case, I think the old exec
> > > > > plan and the SQL statement are the user's input.
> > > > >
> > > > > So, I think we should introduce methods like
> > > > > `PlanAnalyzer#analyze(String sql)` and
> > > > > `PlanAnalyzer#analyze(String sql, ExecNodeGraph)` here.
> > > > >
> > > > > 2. I am just curious how other people add rules to the Advisor.
> > > > > When the number of rules increases, should all these rules be added
> > > > > to the Flink codebase?
> > > > > 3. How do users configure another advisor?
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Dec 28, 2022 at 12:30, Jane Chan wrote:
> > > > >
> > > > >> Hi @yuxia, Thank you for reviewing the FLIP and raising questions.
> > > > >>
> > > > >> 1: Is the PlanAnalyzerFactory also expected to be implemented by
> > users
> > 
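The syntax the thread above converged on can be sketched briefly. PLAN_ADVICE is the explain detail named in the FLIP title; the query itself is illustrative:

```sql
-- FLIP-280 adds PLAN_ADVICE as an EXPLAIN detail that attaches advice
-- from the plan analyzers to the optimized physical plan output.
EXPLAIN PLAN_ADVICE
SELECT a, COUNT(*) FROM t GROUP BY a;
```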

Re: [DISCUSS] FLIP-282: Introduce Delete & Update API

2023-01-05 Thread Jingsong Li
Thanks yuxia for your explanation.

But what I mean is that this may lead to confusion for implementers
and users. You can use comments to explain it. However, a good
interface can make the mechanism clearer through code design.

So here, I still think an independent SupportsXX interface can make
the behavior more clear.

Best,
Jingsong

On Wed, Jan 4, 2023 at 10:56 AM yuxia  wrote:
>
> Hi, Jingsong, thanks for your comments.
>
> ## About RowLevelDeleteMode
> That's really a good suggestion; now I have updated the FLIP to make
> RowLevelDeleteMode a higher-level class.
>
> ## About scope of addContextParameter
> Sorry for the confusion; now I have updated the FLIP to add more comments for
> it. The scope of the parameters is limited to the phase
> in which Flink translates physical RelNodes to ExecNodes.
> It's possible to see all the other sources and sinks in a topology. As for the
> order: if there is only one sink, the sink will be the last one to see the
> parameters, and the order for the sources is consistent with the order in
> which the table source nodes are translated to ExecNodes.
> If there are multiple sinks, as in the case of StatementSet, a sink may also
> see the parameters added by the table sources that belong to statements
> added earlier.
>
> ## About scope of getScanPurpose
> Yes, all sources will see this method. But it won't bring any compatibility
> issues, because here we just tell the source scan
> what the scan purpose is, without touching any other logic. If a source ignores
> this method, it just works as before. So I think there's
> no need to add a new interface like SupportsXX.
>
> Best regards,
> Yuxia
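
The backward-compatibility argument above (sources that ignore the scan purpose keep working unchanged) relies on Java default methods. A minimal sketch, with purely illustrative names that stand in for, rather than reproduce, Flink's actual API:

```java
// Sketch of how a default method keeps existing sources compatible:
// a source written against the old interface never overrides
// getScanPurpose() and behaves exactly as before. ScanPurpose and
// ScanContext are illustrative stand-ins, not Flink's actual API.
public class ScanPurposeSketch {

    public enum ScanPurpose { REGULAR_SCAN, DELETE, UPDATE }

    public interface ScanContext {
        // New method with a default: old implementations need no change.
        default ScanPurpose getScanPurpose() {
            return ScanPurpose.REGULAR_SCAN;
        }
    }

    // An "old" source context that predates getScanPurpose().
    public static class LegacyContext implements ScanContext { }

    public static void main(String[] args) {
        // The legacy implementation compiles and still works as before.
        System.out.println(new LegacyContext().getScanPurpose());
        // prints REGULAR_SCAN
    }
}
```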
>
> - Original Message -
> From: "Jingsong Li" 
> To: "dev" 
> Sent: Tuesday, Jan 3, 2023, 12:12:12 PM
> Subject: Re: [DISCUSS] FLIP-282: Introduce Delete & Update API
>
> Thanks yuxia for the FLIP! It looks really good!
>
> I have three comments:
>
> ## RowLevelDeleteMode
>
> Can RowLevelDeleteMode be a higher level?
> `SupportsRowLevelDelete.RowLevelDeleteMode` is better than
> `SupportsRowLevelDelete.RowLevelDeleteInfo.RowLevelDeleteMode`.
> Same as `RowLevelUpdateMode`.
>
> ## Scope of addContextParameter
>
> I see that some of your comments are for sink, but can you make it
> clearer here? What exactly is its scope? For example, is it possible
> to see all the other sources and sinks in a topo? What is the order of
> seeing?
>
> ## Scope of getScanPurpose
>
> Will all sources see this method? Will there be compatibility issues?
> If sources ignore this method, will this cause strange phenomena?
>
> What I mean is: should another SupportsXX be created here to provide
> delete and update.
>
> Best,
> Jingsong
>
> On Thu, Dec 29, 2022 at 6:23 PM yuxia  wrote:
> >
> > Hi, Lincoln Lee;
> > 1: Yes,  it's a typo; Thanks for pointing out. I have fixed the typo.
> > 2: For stream users,  assuming for delete, they will receive 
> > TableException("DELETE TABLE is not supported for streaming mode now"); 
> > Update is similar. I also update them to the FLIP.
> >
> > Best regards,
> > Yuxia
> >
> > - Original Message -
> > From: "Lincoln Lee" 
> > To: "dev" 
> > Sent: Wednesday, Dec 28, 2022, 9:50:50 AM
> > Subject: Re: [DISCUSS] FLIP-282: Introduce Delete & Update API
> >
> > Hi yuxia,
> >
> > Thanks for the proposal! I think it'll be very useful for users in batch
> > scenarios to cooperate with external systems.
> >
> > For the flip I have two questions:
> > 1. Is the default method 'default ScanPurpose getScanPurpose();' declared
> > without an implementation in the ScanContext interface a typo?
> > 2. For stream users, what exceptions will be received for these unsupported
> > operations?
> >
> > Best,
> > Lincoln Lee
> >
> >
> > yuxia  wrote on Mon, Dec 26, 2022 at 20:24:
> >
> > > Hi, devs.
> > >
> > > I'd like to start a discussion about FLIP-282: Introduce Delete & Update
> > > API[1].
> > >
> > > Row-Level SQL Delete & Update are becoming more and more important in
> > > modern big data workflow. The use cases include deleting a set of rows for
> > > regulatory compliance, updating a set of rows for data correction, etc.
> > > So, in this FLIP, I want to introduce Delete & Update API to Flink in
> > > batch mode. With these interfaces, the external connectors will have the
> > > ability to delete & update existing data in the corresponding storages.
> > >
> > > Looking forward to your feedback.
> > >
> > > [1]:
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061
> > >
> > >
> > > Best regards,
> > > Yuxia
> > >
> > >


Re: [DISCUSS] FLIP-280: Introduce a new explain mode to provide SQL advice

2023-01-03 Thread Jingsong Li
Thanks Jane for the FLIP! It looks very nice!

Can you give examples of other systems for the syntax?
In other systems, is EXPLAIN ANALYZE already PHYSICAL_PLAN?

`EXPLAIN ANALYZED_PHYSICAL_PLAN ` looks a bit strange, and it is even
stranger that it contains `advice`.

The purpose of the FLIP seems to be mostly about `advice`, so can we just
introduce a syntax for `advice`?

Best,
Jingsong
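
Downthread, Shengkai suggests `PlanAnalyzer#analyze(String sql)` overloads so that (a) SQL that fails during optimization and (b) state-compatibility checks against an old exec plan can both be analyzed. As a rough illustration only, with hypothetical names and signatures that are not the FLIP-280 API:

```java
import java.util.List;

// Rough sketch of the analyze(String sql) overloads suggested downthread;
// all names and signatures are hypothetical, not the FLIP-280 API.
public class PlanAnalyzerSketch {

    // Stand-in for the planner's serialized exec-node graph.
    public interface ExecNodeGraph { }

    public interface PlanAnalyzer {
        // Case (a): SQL that fails during optimization can still be analyzed,
        // even though no physical/logical plan node exists yet.
        List<String> analyze(String sql);

        // Case (b): compare new SQL against an old exec plan to advise on
        // state compatibility after a version upgrade.
        default List<String> analyze(String sql, ExecNodeGraph oldPlan) {
            return analyze(sql);
        }
    }

    public static void main(String[] args) {
        // Toy advisor: warn about OVER windows in streaming SQL.
        PlanAnalyzer advisor = sql ->
                sql.toUpperCase().contains("OVER (")
                        ? List.of("streaming OVER windows must ORDER BY a time attribute")
                        : List.of();
        System.out.println(advisor.analyze("SELECT * FROM t"));
        // prints []
    }
}
```

Taking the raw SQL (plus, optionally, an old exec-node graph) as input sidesteps the problem that a RelNode-based signature cannot represent queries that never reach a plan.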

On Tue, Jan 3, 2023 at 3:40 PM godfrey he  wrote:
>
> Thanks for driving this discussion.
>
> Do we really need to expose `PlanAnalyzerFactory` as a public interface?
> I prefer that we only expose ExplainDetail#ANALYZED_PHYSICAL_PLAN and the
> analyzed result,
> which is enough for users and consistent with the results of the `explain` method.
>
> The classes related to the plan analyzer are in the table planner module,
> which is not a public API module
> (public interfaces should be defined in the flink-table-api-java module).
> And PlanAnalyzer depends on RelNode, which is an internal class of the
> planner and is not exposed to users.
>
> Bests,
> Godfrey
>
>
> Shengkai Fang  wrote on Tue, Jan 3, 2023 at 13:43:
> >
> > Sorry for the missing answer about the configuration of the Analyzer. Users
> > may not need to configure this with SQL statements. In the SQL Gateway,
> > users can configure the endpoints with the option
> > `sql-gateway.endpoint.type`
> > in the flink-conf.
> >
> > Best,
> > Shengkai
> >
> > Shengkai Fang  wrote on Tue, Jan 3, 2023 at 12:26:
> >
> > > Hi, Jane.
> > >
> > > Thanks for bringing this to the discussion. I have some questions about
> > > the FLIP:
> > >
> > > 1. `PlanAnalyzer#analyze` uses the FlinkRelNode as the input. Could you
> > > share some thoughts about the motivation? In my experience, users mainly
> > > care about 2 things when they develop their job:
> > >
> > > a. Why does their SQL not work? For example, their streaming SQL contains
> > > an OVER window but the ORDER key is not ROWTIME. In this case, we may
> > > not have a physical node or logical node because, during the
> > > optimization, the planner has already thrown the exception.
> > >
> > > b. Many users care about whether their state is compatible after upgrading
> > > their Flink version. In this case, I think the old exec plan and the SQL
> > > statement are the user's input.
> > >
> > > So, I think we should introduce methods like `PlanAnalyzer#analyze(String
> > > sql)` and `PlanAnalyzer#analyze(String sql, ExecnodeGraph)` here.
> > >
> > > 2. I am just curious how other people add rules to the Advisor. When
> > > the number of rules increases, must all these rules be added to the Flink
> > > codebase?
> > > 3. How do users configure another advisor?
> > >
> > > Best,
> > > Shengkai
> > >
> > >
> > >
> > > Jane Chan  wrote on Wed, Dec 28, 2022 at 12:30:
> > >
> > >> Hi @yuxia, Thank you for reviewing the FLIP and raising questions.
> > >>
> > >> 1: Is the PlanAnalyzerFactory also expected to be implemented by users
> > >> just
> > >> > like DynamicTableSourceFactory or other factories? If so, I notice that
> > >> in
> > >> > the code of PlanAnalyzerManager#registerAnalyzers, the code is as
> > >> follows:
> > >> > FactoryUtil.discoverFactory(classLoader, PlanAnalyzerFactory.class,
> > >> > StreamPlanAnalyzerFactory.STREAM_IDENTIFIER)); IIUC, it'll always find
> > >> the
> > >> > factory with the name StreamPlanAnalyzerFactory.STREAM_IDENTIFIER; is
> > >> > it a typo or by design?
> > >>
> > >>
> > >> This is a really good open question. For the short answer, yes, it is by
> > >> design. I'll explain the consideration in more detail.
> > >>
> > >> The standard procedure to create a custom table source/sink is to
> > >> implement
> > >> the factory and the source/sink class. There is a strong 1v1 relationship
> > >> between the factory and the source/sink.
> > >>
> > >> SQL                                        | DynamicTableSourceFactory        | Source
> > >> create table … with (‘connector’ = ‘foo’)  | #factoryIdentifier.equals(“foo”) | FooTableSource
> > >>
> > >>
> > >> *Apart from that, the custom function module is another kind of
> > >> implementation. The factory creates a collection of functions. This is a
> > >> 1vN relationship between the factory and the functions.*
> > >>
> > >> SQL                | ModuleFactory                    | Function
> > >> load module ‘bar’  | #factoryIdentifier.equals(“bar”) | A collection of functions
> > >>
> > >> Back to the plan analyzers, if we choose the first style, we also need to
> > >> expose a new SQL syntax to users, like "CREATE ANALYZER foo WITH ..." to
> > >> specify the factory identifier. But I think it is too heavy because an
> > >> analyzer is an auxiliary tool to help users write better queries, and 
> > >> thus
> > >> it should be exposed at the API level other than the user syntax level.
> > >>
> > >> As a result, I propose to follow the second style. Then we don't need to
> > >> introduce new syntax to create analyzers. Let StreamPlanAnalyzerFactory 
> > >> be
> > >> the default factory to create analyzers under

Re: [DISCUSS] Extending the feature freezing date of Flink 1.17

2023-01-03 Thread Jingsong Li
+1

On Tue, Jan 3, 2023 at 4:40 PM Matthias Pohl
 wrote:
>
> +1 for extending to Jan 31
>
> On Tue, Jan 3, 2023 at 8:33 AM Yu Li  wrote:
>
> > +1 for the proposal (extending the 1.17 feature freeze date to Jan 31st).
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 3 Jan 2023 at 15:11, Zhu Zhu  wrote:
> >
> > > +1 to extend the feature freeze date to Jan 31st.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > > David Anderson  wrote on Tue, Jan 3, 2023 at 11:41:
> > > >
> > > > I'm also in favor of extending the feature freeze to Jan 31st.
> > > >
> > > > David
> > > >
> > > > On Thu, Dec 29, 2022 at 9:01 AM Leonard Xu  wrote:
> > > >
> > > > > Thanks Qingsheng for the proposal, the pandemic has really impacted
> > > > > development schedules.
> > > > >
> > > > > Jan 31st makes sense to me.
> > > > >
> > > > >
> > > > > Best,
> > > > > Leonard
> > > > >
> > > > >
> > >
> >

