[jira] [Created] (FLINK-36242) Fix unstable test in MaterializedTableITCase

2024-09-09 Thread Feng Jin (Jira)
Feng Jin created FLINK-36242:


 Summary: Fix unstable test in MaterializedTableITCase
 Key: FLINK-36242
 URL: https://issues.apache.org/jira/browse/FLINK-36242
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway, Tests
Reporter: Feng Jin


h3. Error message:
{code:java}
Aug 21 09:32:20 09:32:20.322 [ERROR] Tests run: 18, Failures: 0, Errors: 1, 
Skipped: 0, Time elapsed: 69.61 s <<< FAILURE! – in 
org.apache.flink.table.gateway.service.MaterializedTableStatementITCase
Aug 21 09:32:20 09:32:20.322 [ERROR] 
org.apache.flink.table.gateway.service.MaterializedTableStatementITCase.testDropMaterializedTableWithDeletedRefreshWorkflowInFullMode
 – Time elapsed: 0.415 s <<< ERROR!
Aug 21 09:32:20 org.apache.flink.table.gateway.api.utils.SqlGatewayException: 
Failed to getTable.
Aug 21 09:32:20 at 
org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.getTable(SqlGatewayServiceImpl.java:300)
Aug 21 09:32:20 at 
org.apache.flink.table.gateway.AbstractMaterializedTableStatementITCase.after(AbstractMaterializedTableStatementITCase.java:196)
Aug 21 09:32:20 at java.lang.reflect.Method.invoke(Method.java:498)
Aug 21 09:32:20 at 
java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
Aug 21 09:32:20 at 
java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
Aug 21 09:32:20 at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
Aug 21 09:32:20 at 
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
Aug 21 09:32:20 at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Aug 21 09:32:20 Caused by: org.apache.flink.table.api.TableException: Cannot 
find table '`test_catalog12`.`test_db`.`users_shops`' in any of the catalogs 
[test_catalog6, test_catalog7, test_catalog4, test_catalog5, test_catalog8, 
test_catalog9, test_catalog12, test_catalog11, test_catalog2, test_catalog10, 
test_catalog3, test_catalog1, default_catalog], nor as a temporary table.
Aug 21 09:32:20 at 
org.apache.flink.table.catalog.CatalogManager.lambda$getTableOrError$4(CatalogManager.java:673)
Aug 21 09:32:20 at java.util.Optional.orElseThrow(Optional.java:290)
Aug 21 09:32:20 at 
org.apache.flink.table.catalog.CatalogManager.getTableOrError(CatalogManager.java:670)
Aug 21 09:32:20 at 
org.apache.flink.table.gateway.service.operation.OperationExecutor.getTable(OperationExecutor.java:297)
Aug 21 09:32:20 at 
org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.getTable(SqlGatewayServiceImpl.java:297)
Aug 21 09:32:20 ... 7 more

{code}
 

Corresponding test link:

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=61530&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4&s=ae4f8708-9994-57d3-c2d7-b892156e7812]
h3. Problem:

 
As shown in the error message above, the test's afterEach method lists all
materialized tables and then drops them, so that no refresh task is left
behind.
However, the materialized table had already been dropped inside the test
itself, so looking it up again during the cleanup fails with the error above.

* Why does listing the tables report the table as existing, while looking it
up or dropping it fails because it does not exist?
1. The test testDropMaterializedTableWithDeletedRefreshWorkflowInFullMode
drops the materialized table manually.
2. Despite the manual drop, background refresh tasks are still being
submitted, so data keeps being written into the table's data directory.
3. TestFileSystemCatalog implements listTable by listing the directories of
existing tables: if a table directory exists, the table is reported as
existing. dropTable, however, checks that both the table data and the schema
file exist. This inconsistency causes the issue above.

h3. Solution:

1. Fix the TestFileSystemCatalog listTable logic: report a table as existing
only if both its directory and its schema file exist.
2. To further avoid this problem, change MaterializedTableITCase so that all
tables are dropped manually in the tests instead of relying only on the
afterEach cleanup.
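
For illustration, the sketch below shows the kind of existence check described
in point 1. It is a simplified, self-contained example that assumes one
directory per table containing a schema file; it is not the actual
TestFileSystemCatalog code.
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/**
 * Sketch of the listTable fix described in point 1. The directory layout
 * (one directory per table containing a "schema" file) is an assumption made
 * purely for illustration.
 */
public final class ListTablesSketch {

    private static final String SCHEMA_FILE_NAME = "schema";

    static List<String> listTables(Path databaseDir) throws IOException {
        List<String> tables = new ArrayList<>();
        try (Stream<Path> children = Files.list(databaseDir)) {
            children.filter(Files::isDirectory)
                    // Old behavior: every directory counted as a table, even after
                    // DROP TABLE removed the schema file while background refresh
                    // jobs kept writing data files into the directory.
                    // Fixed behavior: only report a table whose schema file also
                    // exists, matching the check performed when dropping the table.
                    .filter(dir -> Files.exists(dir.resolve(SCHEMA_FILE_NAME)))
                    .forEach(dir -> tables.add(dir.getFileName().toString()));
        }
        return tables;
    }
}
{code}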
 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: .ClassNotFoundException: org.apache.flink.table.data.util.DataFormatConverters$RowConverter

2024-09-06 Thread Feng Jin
The class `org.apache.flink.table.data.util.DataFormatConverters$RowConverter`
lives in the flink-table-runtime jar. Can you confirm whether this jar exists
on your cluster's classpath?
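
For reference, one way to provide that class at build time is to add the
flink-table-runtime artifact to the job's pom. The version and scope below are
assumptions for a Flink 1.18 setup, not a confirmed fix for this job.

<!-- Assumption: Flink 1.18.x. Keep the default (compile) scope for local runs;
     switch to "provided" if the cluster's lib/ already ships this jar. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>1.18.1</version>
</dependency>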


Best,
Feng


On Fri, Sep 6, 2024 at 11:47 AM Taher Koitawala  wrote:

> Thank you, I tried with the new libraries you suggested. I am using Flink
> 1.18 and Java 11.
>
>   <dependency>
>     <groupId>org.apache.iceberg</groupId>
>     <artifactId>iceberg-flink</artifactId>
>     <version>1.6.1</version>
>   </dependency>
>
>   <dependency>
>     <groupId>org.apache.iceberg</groupId>
>     <artifactId>iceberg-flink-runtime-1.18</artifactId>
>     <version>1.6.1</version>
>   </dependency>
>
> I get the same error.
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/flink/table/data/util/DataFormatConverters$RowConverter
> at org.apache.iceberg.flink.sink.FlinkSink.forRow(FlinkSink.java:115)
> at org.example.Main.flinkProcessing(Main.java:96)
> at org.example.Main.main(Main.java:35)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.flink.table.data.util.DataFormatConverters$RowConverter
> at
>
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> at
>
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
> ... 3 more
>
> On Thu, Sep 5, 2024 at 10:46 PM Feng Jin  wrote:
>
> > Hi, Taher
> >
> > The version of iceberg connector you are using is not compatible with
> Flink
> > 1.18, you should use
> >
> >
> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-flink-runtime-1.18
> >
> >
> > Best,
> > Feng
> >
> >
> >
> > On Thu, Sep 5, 2024 at 9:21 PM Taher Koitawala 
> wrote:
> >
> > > Hi All,
> > >  I am using flink 1.18.1 with iceberg.
> > >
> > > I get the following errors
> > >
> > > Exception in thread "main" java.lang.NoClassDefFoundError:
> > > org/apache/flink/table/data/util/DataFormatConverters$RowConverter
> > > at org.apache.iceberg.flink.sink.FlinkSink.forRow(FlinkSink.java:105)
> > > at org.example.Main.flinkProcessing(Main.java:95)
> > > at org.example.Main.main(Main.java:35)
> > > Caused by: java.lang.ClassNotFoundException:
> > > org.apache.flink.table.data.util.DataFormatConverters$RowConverter
> > > at
> > >
> > >
> >
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> > > at
> > >
> > >
> >
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> > > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
> > > ... 3 more
> > >
> > >
> > >
> > > I have already added these to the pom. What am I doing wrong?
> > >
> > >   <dependency>
> > >     <groupId>org.apache.iceberg</groupId>
> > >     <artifactId>iceberg-flink</artifactId>
> > >     <version>1.6.1</version>
> > >   </dependency>
> > >
> > >   <dependency>
> > >     <groupId>org.apache.iceberg</groupId>
> > >     <artifactId>iceberg-flink-runtime</artifactId>
> > >     <version>0.12.1</version>
> > >   </dependency>
> > >
> > >   <dependency>
> > >     <groupId>org.apache.flink</groupId>
> > >     <artifactId>flink-table-common</artifactId>
> > >     <version>1.18.1</version>
> > >   </dependency>
> > >
> >
>


Re: [VOTE] FLIP-473: Introduce New SQL Operators Based on Asynchronous State APIs

2024-08-21 Thread Feng Jin
+1 (non-binding)

Best,
Feng


On Thu, Aug 22, 2024 at 9:42 AM Ron Liu  wrote:

> +1(binding)
>
> Best,
> Ron
>
> Zakelly Lan  于2024年8月21日周三 11:16写道:
>
> > +1 (binding)
> >
> > Best,
> > Zakelly
> >
> > On Wed, Aug 21, 2024 at 10:33 AM Xuyang  wrote:
> >
> > > Hi, everyone.
> > >
> > > I would like to start a vote on FLIP-473: Introduce New SQL Operators
> > Based
> > >
> > > on Asynchronous State APIs [1]. The discussion thread can be found here
> > > [2].
> > >
> > > The vote will be open for at least 72 hours unless there are any
> > objections
> > >
> > > or insufficient votes.
> > >
> > >
> > >
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-473+Introduce+New+SQL+Operators+Based+on+Asynchronous+State+APIs
> > >
> > > [2] https://lists.apache.org/thread/k6x03x7vjtn3gl1vrknkx8zvyn319bk9
> > >
> > >
> > >
> > >
> > > --
> > >
> > > Best!
> > > Xuyang
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Feng Jin
Congratulations, Xuannan!

Best,
Feng


On Sun, Aug 18, 2024 at 3:08 PM Rui Fan <1996fan...@gmail.com> wrote:

> Congratulations, Xuannan!
>
> Best,
> Rui
>
> On Sun, Aug 18, 2024 at 2:20 PM Leonard Xu  wrote:
>
> > Congratulations!  Xuannan
> >
> >
> > Best,
> > Leonard
> >
> >
>


Re: Re: Re:Re: [DISCUSS] FLIP-473: Introduce New SQL Operators Based on Asynchronous State APIs

2024-08-13 Thread Feng Jin
Hi, Xuyang

Thank you for initiating this FLIP. I believe this is a significant feature
for the future of Flink SQL, but I also share concerns about the
maintenance costs related to the correctness, state compatibility, and
performance of the two implementations.
I fully support ensuring state compatibility, functional correctness, and
performance regression detection through HarnessTests and IT Tests.

1. I think relevant testing is a critical part. Is there a more detailed
design plan for the HarnessTests and performance regression test?

2. Regarding the final performance comparison between thenCombine and
combineAll, is there a more specific conclusion? Which one should we use, or
are both options viable?

Best,
Feng.

On Mon, Aug 12, 2024 at 11:23 AM Xuyang  wrote:

> Hi, David. Thank you for your review. Let me address your questions:
>
> >1. Is there a way to enforce this as an invariant during build time,
> >perhaps through a generic test framework that switches between the sync
> and
> >async versions of all operators and verifies checkpoint compatibility?
>
> Yes, we need to incorporate a corresponding testing framework to verify
> the state compatibility during transitions between sync and async state
> operators. In my proposal, it resembles the existing RestoreTestBase,
> with the testing logic structured as follows:
>   a. Start with the sync state operator to consume data;
>   b. Execute a checkpoint;
>   c. Restart with the async state operator and recover data from the
>      checkpoint for re-consumption;
>   d. Validate the correctness of the results.
> Additionally, we could also consider scenarios where the async state
> operator starts consuming data initially, followed by the restart with the
> sync state operator. I have updated this part in the `TEST PLAN` section of
> the FLIP.
>
> >2. If the only difference between them is state handling, could they
> >potentially be implemented as the same operator with two different
> >interfaces? My main concern is code reuse—it’s crucial to avoid
> duplicating
> >code to ensure both implementations stay aligned. Additionally, could
> >feature parity be verified at the test suite level (similar to the first
> >question)? Perhaps we could create a single parameterized test suite that
> >runs against both versions?
>
> IIUC, your focus aligns with the roadmap’s mention of “Refactoring the
> sync and async state operators, leveraging shared logical calculations
> while abstracting the state access details.” Due to the intricate details
> of the code implementation, this was not elaborated in the FLIP. I share
> your vision of designing reusable business logic classes (such as
> JoinHelper) alongside different operator interfaces (SyncStateJoinOperator
> and AsyncStateJoinOperator), consolidating the reusable logic within the
> class JoinHelper.
>
> For synchronous operators, we already have harness tests to validate data
> correctness, and there will also be dedicated harness tests for
> asynchronous operators. Using a parameterized test suite for both
> harnesses is indeed feasible.
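
To make the shared-logic idea above concrete, here is a rough, purely
illustrative sketch: a helper class owns the join-style business logic and is
parameterized over an abstract state-access interface, so a sync and an async
operator can reuse the same logic. None of these types are the FLIP's actual
classes or Flink APIs.

import java.util.concurrent.CompletableFuture;

// Illustrative only: sketches the "shared business logic + thin sync/async
// operator wrappers" idea, not real Flink classes.
interface StateAccess {
    // The sync implementation completes the future immediately; the async
    // implementation completes it once the state backend responds.
    CompletableFuture<String> get(String key);
    CompletableFuture<Void> put(String key, String value);
}

// Business logic shared by both operators; how state is accessed is abstracted away.
final class JoinHelperSketch {
    private final StateAccess state;

    JoinHelperSketch(StateAccess state) {
        this.state = state;
    }

    CompletableFuture<String> processRecord(String key, String value) {
        // Identical logic for the sync and async state operators; only the
        // StateAccess implementation they plug in differs.
        return state.get(key)
                .thenCompose(existing -> state.put(key, value)
                        .thenApply(ignored -> existing == null ? value : existing + "," + value));
    }
}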
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> On 2024-08-09 at 20:49:49, "David Morávek" wrote:
> >Hi Xuyang,
> >
> >Thank you for looking into this—great work! The overall direction seems
> >solid. I have two minor questions:
> >
> >In theory, the implementation of AsyncStateOperator and SyncStateOperator
> >> differs only in their state handling. Their state schemas, business
> logic,
> >> and other aspects remain the same. Therefore, within the same Flink
> >> version, when the SQL and other Flink configurations remain unchanged,
> or
> >> when using the same compiled plan, users can freely switch between
> >> AsyncStateOperator and SyncStateOperator by toggling the configuration
> >> table.exec.async-state.enabled, as they are fully compatible.
> >
> >
> >1. Is there a way to enforce this as an invariant during build time,
> >perhaps through a generic test framework that switches between the sync
> and
> >async versions of all operators and verifies checkpoint compatibility?
> >
> >2. If the only difference between them is state handling, could they
> >potentially be implemented as the same operator with two different
> >interfaces? My main concern is code reuse—it’s crucial to avoid
> duplicating
> >code to ensure both implementations stay aligned. Additionally, could
> >feature parity be verified at the test suite level (similar to the first
> >question)? Perhaps we could create a single parameterized test suite that
> >runs against both versions?
> >
> >Best,
> >D.
> >
> >On Thu, Aug 8, 2024 at 2:23 PM Xuyang  wrote:
> >
> >> Hi, everyone.
> >>
> >> I have updated the FLIP. Here are the newly added sections:
> >>
> >>
> >>  In theory, the implementation of AsyncStateOperator and
> SyncStateOperator
> >> differs only in their state handling.
> >>
> >> Their state schemas, business logic, and others are the same. Therefore,
> >> within the same Flink versi

Re: [ANNOUNCE] Apache Flink 1.20.0 released

2024-08-02 Thread Feng Jin
Congratulations!

Thanks to release managers and everyone involved!

Best,
Feng Jin

On Fri, Aug 2, 2024 at 5:45 PM Yubin Li  wrote:

> Congrats!
> Thanks to release managers and everyone involved for the excellent work !
>
> Best,
> Yubin Li
>
> On Fri, Aug 2, 2024 at 5:41 PM Paul Lam  wrote:
> >
> > Congrats!
> >
> > Best,
> > Paul Lam
> >
> > > 2024年8月2日 17:26,Ahmed Hamdy  写道:
> > >
> > > Congratulations!
> > > Thanks Weijie and the release managers for the huge efforts.
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Fri, 2 Aug 2024 at 10:15, Zakelly Lan 
> wrote:
> > >
> > >> Congratulations! Thanks to release managers and everyone involved!
> > >>
> > >>
> > >> Best,
> > >> Zakelly
> > >>
> > >> On Fri, Aug 2, 2024 at 5:05 PM weijie guo 
> > >> wrote:
> > >>
> > >>> The Apache Flink community is very happy to announce the release of
> > >> Apache
> > >>>
> > >>> Flink 1.20.0, which is the first release for the Apache Flink 1.20
> > >> series.
> > >>>
> > >>>
> > >>> Apache Flink® is an open-source stream processing framework for
> > >>>
> > >>> distributed, high-performing, always-available, and accurate data
> > >> streaming
> > >>>
> > >>> applications.
> > >>>
> > >>>
> > >>> The release is available for download at:
> > >>>
> > >>> https://flink.apache.org/downloads.html
> > >>>
> > >>>
> > >>> Please check out the release blog post for an overview of the
> > >> improvements
> > >>> for this release:
> > >>>
> > >>>
> > >>>
> > >>
> https://flink.apache.org/2024/08/02/announcing-the-release-of-apache-flink-1.20/
> > >>>
> > >>>
> > >>> The full release notes are available in Jira:
> > >>>
> > >>>
> > >>>
> > >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354210
> > >>>
> > >>>
> > >>> We would like to thank all contributors of the Apache Flink
> community who
> > >>>
> > >>> made this release possible!
> > >>>
> > >>>
> > >>> Best,
> > >>>
> > >>> Robert, Rui, Ufuk, Weijie
> > >>>
> > >>
> >
>


Re: [VOTE] Release 1.20.0, release candidate #1

2024-07-21 Thread Feng Jin
Hi, weijie

-1 (non-binding)

During our testing process, we discovered a critical bug that impacts the
correctness of the materialized table.
A fix pr [1] is now prepared and will be merged within the next two days.

I apologize for any inconvenience during the release process.


[1]. https://issues.apache.org/jira/browse/FLINK-35872


Best,

Feng


On Fri, Jul 19, 2024 at 5:45 PM Xintong Song  wrote:

> +1 (binding)
>
> - reviewed flink-web PR
> - verified checksum and signature
> - verified source archives don't contain binaries
> - built from source
> - tried example jobs on a standalone cluster, and everything looks fine
>
> Best,
>
> Xintong
>
>
>
> On Thu, Jul 18, 2024 at 4:25 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > +1(binding)
> >
> > - Reviewed the flink-web PR (Left some comments)
> > - Checked Github release tag
> > - Verified signatures
> > - Verified sha512 (hashsums)
> > - The source archives don't contain any binaries
> > - Build the source with Maven 3 and java8 (Checked the license as well)
> > - Start the cluster locally with jdk8, and run the StateMachineExample
> job,
> > it works fine.
> >
> > Best,
> > Rui
> >
> > On Mon, Jul 15, 2024 at 10:59 PM weijie guo 
> > wrote:
> >
> > > Hi everyone,
> > >
> > >
> > > Please review and vote on the release candidate #1 for the version
> > 1.20.0,
> > >
> > > as follows:
> > >
> > >
> > > [ ] +1, Approve the release
> > >
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > >
> > > * JIRA release notes [1], and the pull request adding release note for
> > >
> > > users [2]
> > >
> > > * the official Apache source release and binary convenience releases to
> > be
> > >
> > > deployed to dist.apache.org [3], which are signed with the key with
> > >
> > > fingerprint 8D56AE6E7082699A4870750EA4E8C4C05EE6861F  [4],
> > >
> > > * all artifacts to be deployed to the Maven Central Repository [5],
> > >
> > > * source code tag "release-1.20.0-rc1" [6],
> > >
> > > * website pull request listing the new release and adding announcement
> > blog
> > >
> > > post [7].
> > >
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > >
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354210
> > >
> > > [2] https://github.com/apache/flink/pull/25091
> > >
> > > [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.20.0-rc1/
> > >
> > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >
> > > [5]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1744/
> > >
> > > [6] https://github.com/apache/flink/releases/tag/release-1.20.0-rc1
> > >
> > > [7] https://github.com/apache/flink-web/pull/751
> > >
> > >
> > > Best,
> > >
> > > Robert, Rui, Ufuk, Weijie
> > >
> >
>


[jira] [Created] (FLINK-35872) Fix the incorrect partition generation during period refresh in Full Mode

2024-07-21 Thread Feng Jin (Jira)
Feng Jin created FLINK-35872:


 Summary: Fix the incorrect partition generation during period 
refresh in Full Mode
 Key: FLINK-35872
 URL: https://issues.apache.org/jira/browse/FLINK-35872
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Gateway
Affects Versions: 1.20.0
Reporter: Feng Jin


During a periodic refresh, we currently default to using the scheduler time to
generate the target partition, but we should use the T - 1 partition instead.
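
As a concrete illustration, assuming a daily refresh with a yyyy-MM-dd
partition value (the real partition spec and freshness handling are not shown
here): a refresh scheduled at 2024-07-21 00:05 should target partition
2024-07-20, i.e. the T - 1 partition, rather than 2024-07-21.

{code:java}
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Sketch only: assumes a daily refresh and a "yyyy-MM-dd" partition value.
final class TargetPartitionSketch {

    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    static String targetPartition(LocalDateTime scheduleTime) {
        // Use the T - 1 partition: the period that just finished,
        // not the period the scheduler time itself falls into.
        return scheduleTime.toLocalDate().minusDays(1).format(FMT);
    }

    public static void main(String[] args) {
        // A refresh triggered at 2024-07-21 00:05 targets partition "2024-07-20".
        System.out.println(targetPartition(LocalDateTime.of(2024, 7, 21, 0, 5)));
    }
}
{code}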



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-465: Introduce DESCRIBE FUNCTION

2024-07-04 Thread Feng Jin
+1 (non-binding)

Best,
Feng Jin

On Thu, Jul 4, 2024 at 6:02 PM Martijn Visser 
wrote:

> +1 (binding)
>
> On Thu, Jul 4, 2024 at 5:39 AM Yanquan Lv  wrote:
>
> > Hi Natea, thanks for driving it.
> > +1 (non-binding).
> >
> > Jim Hughes  于2024年7月4日周四 04:41写道:
> >
> > > Hi Natea,
> > >
> > > Looks good to me!
> > >
> > > +1 (non-binding).
> > >
> > > Cheers,
> > >
> > > Jim
> > >
> > > On Wed, Jul 3, 2024 at 3:16 PM Natea Eshetu Beshada
> > >  wrote:
> > >
> > > > Sorry I forgot to include the FLIP [1] and the mailing thread
> > discussion
> > > > link [2] in my previous email.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-465%3A+Introduce+DESCRIBE+FUNCTION
> > > > [2] https://lists.apache.org/thread/s46ftnmz4ggmmssgyx6vfhqjttsk9lph
> > > >
> > > > Thanks,
> > > > Natea
> > > >
> > > > On Wed, Jul 3, 2024 at 12:06 PM Natea Eshetu Beshada <
> > > > nbesh...@confluent.io>
> > > > wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > I would like to start a vote on FLIP-465 [1]. It proposes adding
> SQL
> > > > > syntax that would allow users to describe the metadata of a given
> > > > function.
> > > > >
> > > > > The vote will be open for at least 72 hours (Saturday, July 6th of
> > July
> > > > > 2024,
> > > > > 12:30 PST) unless there is an objection or insufficient votes.
> > > > >
> > > > > Thanks,
> > > > > Natea
> > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-35734) Not override the user-defined checkpoint interval in continuous mode.

2024-06-30 Thread Feng Jin (Jira)
Feng Jin created FLINK-35734:


 Summary: Not override the user-defined checkpoint interval in 
continuous mode.
 Key: FLINK-35734
 URL: https://issues.apache.org/jira/browse/FLINK-35734
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin


Currently, in continuous mode, the checkpoint interval is set based on
freshness by default. However, if the user explicitly sets a checkpoint
interval, we should follow the user's setting.
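
A minimal sketch of the intended precedence (illustrative only; it does not
use the real Flink option keys or the actual materialized-table code path):

{code:java}
import java.time.Duration;
import java.util.Optional;

// Illustrative only: if the user explicitly configured a checkpoint interval,
// keep it; otherwise derive the interval from the declared freshness.
final class CheckpointIntervalSketch {

    static Duration resolveCheckpointInterval(Optional<Duration> userConfigured, Duration freshness) {
        return userConfigured.orElse(freshness);
    }
}
{code}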



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-456: Introduce DESCRIBE FUNCTION

2024-06-30 Thread Feng Jin
Hi, Natea,


Thank you for initiating this FLIP.

Have you considered displaying parameter names? This would make it easier
for users to use named parameters.


Best,
Feng


On Sun, Jun 30, 2024 at 3:49 PM Yanquan Lv  wrote:

> Hi, Natea. This FLIP looks good from my side.
> I also look forward to showing the return type, as well as auxiliary
> information such as comments or usage, although the current implementation
> does not include this part.
>
>
> Natea Eshetu Beshada  于2024年6月26日周三
> 03:01写道:
>
> > Oh no haha, yes thanks for pointing that out Jim! Of course I make a typo
> > on an editable format like email :)
> >
> > On Tue, Jun 25, 2024 at 11:20 AM Jim Hughes  >
> > wrote:
> >
> > > Hi Natea,
> > >
> > > Looks good.  As a note, in the title of this email, a number got
> > switched!
> > > FLIP-456 is about compiled plans for batch operators. :)
> > >
> > > The link below is correct.
> > >
> > > Cheers,
> > >
> > > Jim
> > >
> > > On Tue, Jun 25, 2024 at 1:29 PM Natea Eshetu Beshada
> > >  wrote:
> > >
> > > > Hello all,
> > > >
> > > > I would like to kickstart the discussion of FLIP-465: Introduce
> > DESCRIBE
> > > > FUNCTION [1].
> > > >
> > > > The proposal is to add SQL syntax that would allow users to describe
> > the
> > > > metadata of a given function.
> > > >
> > > > I look forward to hearing feedback from the community.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-465%3A+Introduce+DESCRIBE+FUNCTION
> > > >
> > > > Thanks,
> > > > Natea
> > > >
> > >
> >
>


Re: [VOTE] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-17 Thread Feng Jin
+1 (non-binding)


Best,
Feng Jin


On Tue, Jun 18, 2024 at 10:24 AM Lincoln Lee  wrote:

> +1 (binding)
>
>
> Best,
> Lincoln Lee
>
>
> Xintong Song  于2024年6月17日周一 13:39写道:
>
> > +1 (binding)
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Mon, Jun 17, 2024 at 11:41 AM Zhanghao Chen <
> zhanghao.c...@outlook.com>
> > wrote:
> >
> > > +1 (unbinding)
> > >
> > > Best,
> > > Zhanghao Chen
> > > 
> > > From: weijie guo 
> > > Sent: Monday, June 17, 2024 10:13
> > > To: dev 
> > > Subject: [VOTE] FLIP-462: Support Custom Data Distribution for Input
> > > Stream of Lookup Join
> > >
> > > Hi everyone,
> > >
> > >
> > > Thanks for all the feedback about the FLIP-462: Support Custom Data
> > > Distribution for Input Stream of Lookup Join [1]. The discussion
> > > thread is here [2].
> > >
> > >
> > > The vote will be open for at least 72 hours unless there is an
> > > objection or insufficient votes.
> > >
> > >
> > > Best,
> > >
> > > Weijie
> > >
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join
> > >
> > >
> > > [2] https://lists.apache.org/thread/kds2zrcdmykrz5lmn0hf9m4phdl60nfb
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Zhongqiang Gong

2024-06-17 Thread Feng Jin
Congratulations Zhongqiang !!!

Best regards
Feng Jin

On Mon, Jun 17, 2024 at 7:01 PM Feifan Wang  wrote:

> Congratulations Zhongqiang !
>
>
> ——
>
> Best regards,
>
> Feifan Wang
>
>
>
>
> At 2024-06-17 11:20:30, "Leonard Xu"  wrote:
> >Hi everyone,
> >On behalf of the PMC, I'm happy to announce that Zhongqiang Gong has
> become a new Flink Committer!
> >
> >Zhongqiang has been an active Flink community member since November 2021,
> contributing numerous PRs to both the Flink and Flink CDC repositories. As
> a core contributor to Flink CDC, he developed the Oracle and SQL Server CDC
> Connectors and managed essential website and CI migrations during the
> donation of Flink CDC to Apache Flink.
> >
> >Beyond his technical contributions, Zhongqiang actively participates in
> discussions on the Flink dev mailing list and responds to threads on the
> user and user-zh mailing lists. As an Apache StreamPark (incubating)
> Committer, he promotes Flink SQL and Flink CDC technologies at meetups and
> within the StreamPark community.
> >
> >Please join me in congratulating Zhongqiang Gong for becoming an Apache
> Flink committer!
> >
> >Best,
> >Leonard (on behalf of the Flink PMC)
>


Re: [ANNOUNCE] New Apache Flink Committer - Hang Ruan

2024-06-17 Thread Feng Jin
Congratulations Hang!!!.

Best Regards
Feng Jin

On Mon, Jun 17, 2024 at 7:08 PM Ahmed Hamdy  wrote:

> Congratulations Hang.
> Regards Ahmed
>
> On Mon, 17 Jun 2024, 14:01 Feifan Wang,  wrote:
>
> > Congratulations Hang !
> >
> >
> > ——
> >
> > Best regards,
> >
> > Feifan Wang
> >
> >
> >
> >
> > At 2024-06-17 11:17:13, "Leonard Xu"  wrote:
> > >Hi everyone,
> > >On behalf of the PMC, I'm happy to let you know that Hang Ruan has
> become
> > a new Flink Committer !
> > >
> > >Hang Ruan has been continuously contributing to the Flink project since
> > August 2021. Since then, he has continuously contributed to Flink, Flink
> > CDC, and various Flink connector repositories, including
> > flink-connector-kafka, flink-connector-elasticsearch,
> flink-connector-aws,
> > flink-connector-rabbitmq, flink-connector-pulsar, and
> > flink-connector-mongodb. Hang Ruan focuses on the improvements related to
> > connectors and catalogs and initiated FLIP-274. He is most recognized as
> a
> > core contributor and maintainer for the Flink CDC project, contributing
> > many features such as MySQL CDC newly table addition and the Schema
> > Evolution feature.
> > >
> > >Beyond his technical contributions, Hang Ruan is an active member of the
> > Flink community. He regularly engages in discussions on the Flink dev
> > mailing list and the user-zh and user mailing lists, participates in FLIP
> > discussions, assists with user Q&A, and consistently volunteers for
> release
> > verifications.
> > >
> > >Please join me in congratulating Hang Ruan for becoming an Apache Flink
> > committer!
> > >
> > >Best,
> > >Leonard (on behalf of the Flink PMC)
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-05 Thread Feng Jin
Congratulations, Rui!


Best,
Feng Jin

On Wed, Jun 5, 2024 at 8:23 PM Yanfei Lei  wrote:

> Congratulations, Rui!
>
> Best,
> Yanfei
>
> Luke Chen  于2024年6月5日周三 20:08写道:
> >
> > Congrats, Rui!
> >
> > Luke
> >
> > On Wed, Jun 5, 2024 at 8:02 PM Jiabao Sun  wrote:
> >
> > > Congrats, Rui. Well-deserved!
> > >
> > > Best,
> > > Jiabao
> > >
> > > Zhanghao Chen  于2024年6月5日周三 19:29写道:
> > >
> > > > Congrats, Rui!
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: Piotr Nowojski 
> > > > Sent: Wednesday, June 5, 2024 18:01
> > > > To: dev ; rui fan <1996fan...@gmail.com>
> > > > Subject: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui
> > > >
> > > > Hi everyone,
> > > >
> > > > On behalf of the PMC, I'm very happy to announce another new Apache
> Flink
> > > > PMC Member - Fan Rui.
> > > >
> > > > Rui has been active in the community since August 2019. During this
> time
> > > he
> > > > has contributed a lot of new features. Among others:
> > > >   - Decoupling Autoscaler from Kubernetes Operator, and supporting
> > > > Standalone Autoscaler
> > > >   - Improvements to checkpointing, flamegraphs, restart strategies,
> > > > watermark alignment, network shuffles
> > > >   - Optimizing the memory and CPU usage of large operators, greatly
> > > > reducing the risk and probability of TaskManager OOM
> > > >
> > > > He reviewed a significant amount of PRs and has been active both on
> the
> > > > mailing lists and in Jira helping to both maintain and grow Apache
> > > Flink's
> > > > community. He is also our current Flink 1.20 release manager.
> > > >
> > > > In the last 12 months, Rui has been the most active contributor in
> the
> > > > Flink Kubernetes Operator project, while being the 2nd most active
> Flink
> > > > contributor at the same time.
> > > >
> > > > Please join me in welcoming and congratulating Fan Rui!
> > > >
> > > > Best,
> > > > Piotrek (on behalf of the Flink PMC)
> > > >
> > >
>


Re: Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie Guo

2024-06-04 Thread Feng Jin
Congratulations, Weijie!


Best,
Feng Jin

On Tue, Jun 4, 2024 at 9:03 PM Wencong Liu  wrote:

> Congratulations, Weijie!
>
>
>
>
> Best,
>
> Wencong
>
>
>
>
> At 2024-06-04 20:46:52, "Lijie Wang"  wrote:
> >Congratulations, Weijie!
> >
> >Best,
> >Lijie
> >
> >Zakelly Lan  于2024年6月4日周二 20:45写道:
> >
> >> Congratulations, Weijie!
> >>
> >> Best,
> >> Zakelly
> >>
> >> On Tue, Jun 4, 2024 at 7:49 PM Sergey Nuyanzin 
> >> wrote:
> >>
> >> > Congratulations Weijio Guo!
> >> >
> >> > On Tue, Jun 4, 2024, 13:45 Jark Wu  wrote:
> >> >
> >> > > Congratulations, Weijie!
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > On Tue, 4 Jun 2024 at 19:10, spoon_lz  wrote:
> >> > >
> >> > > > Congratulations, Weijie!
> >> > > >
> >> > > >
> >> > > >
> >> > > > Regards,
> >> > > > Zhuo.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >  Replied Message 
> >> > > > | From | Aleksandr Pilipenko |
> >> > > > | Date | 06/4/2024 18:59 |
> >> > > > | To |  |
> >> > > > | Subject | Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie
> Guo |
> >> > > > Congratulations, Weijie!
> >> > > >
> >> > > > Best,
> >> > > > Aleksandr
> >> > > >
> >> > > > On Tue, 4 Jun 2024 at 11:42, Abdulquddus Babatunde Ekemode <
> >> > > > abdulqud...@aligence.io> wrote:
> >> > > >
> >> > > > Congratulations! I wish you all the best.
> >> > > >
> >> > > > Best Regards,
> >> > > > Abdulquddus
> >> > > >
> >> > > > On Tue, 4 Jun 2024 at 13:14, Ahmed Hamdy 
> >> wrote:
> >> > > >
> >> > > > Congratulations Weijie
> >> > > > Best Regards
> >> > > > Ahmed Hamdy
> >> > > >
> >> > > >
> >> > > > On Tue, 4 Jun 2024 at 10:51, Matthias Pohl 
> >> wrote:
> >> > > >
> >> > > > Congratulations, Weijie!
> >> > > >
> >> > > > Matthias
> >> > > >
> >> > > > On Tue, Jun 4, 2024 at 11:12 AM Guowei Ma 
> >> > > > wrote:
> >> > > >
> >> > > > Congratulations!
> >> > > >
> >> > > > Best,
> >> > > > Guowei
> >> > > >
> >> > > >
> >> > > > On Tue, Jun 4, 2024 at 4:55 PM gongzhongqiang <
> >> > > > gongzhongqi...@apache.org
> >> > > >
> >> > > > wrote:
> >> > > >
> >> > > > Congratulations Weijie! Best,
> >> > > > Zhongqiang Gong
> >> > > >
> >> > > > Xintong Song  于2024年6月4日周二 14:46写道:
> >> > > >
> >> > > > Hi everyone,
> >> > > >
> >> > > > On behalf of the PMC, I'm very happy to announce that Weijie Guo
> >> > > > has
> >> > > > joined
> >> > > > the Flink PMC!
> >> > > >
> >> > > > Weijie has been an active member of the Apache Flink community
> >> > > > for
> >> > > > many
> >> > > > years. He has made significant contributions in many components,
> >> > > > including
> >> > > > runtime, shuffle, sdk, connectors, etc. He has driven /
> >> > > > participated
> >> > > > in
> >> > > > many FLIPs, authored and reviewed hundreds of PRs, been
> >> > > > consistently
> >> > > > active
> >> > > > on mailing lists, and also helped with release management of 1.20
> >> > > > and
> >> > > > several other bugfix releases.
> >> > > >
> >> > > > Congratulations and welcome Weijie!
> >> > > >
> >> > > > Best,
> >> > > >
> >> > > > Xintong (on behalf of the Flink PMC)
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
>


[jira] [Created] (FLINK-35479) Add end-to-end test for materialized table

2024-05-29 Thread Feng Jin (Jira)
Feng Jin created FLINK-35479:


 Summary: Add end-to-end test for materialized table
 Key: FLINK-35479
 URL: https://issues.apache.org/jira/browse/FLINK-35479
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Gateway, Tests
Reporter: Feng Jin
 Fix For: 1.20.0


Add end-to-end test cases related to materialized tables, including the 
processes of dropping, refreshing, and dropping materialized tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-26 Thread Feng Jin
+1 (non-binding)

Best,
Feng Jin

On Mon, May 27, 2024 at 1:05 PM Benchao Li  wrote:

> +1 (binding)
>
> Yubin Li  于2024年5月25日周六 12:26写道:
> >
> > +1 (non-binding)
> >
> > Best,
> > Yubin
> >
> > On Fri, May 24, 2024 at 2:04 PM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > +1(binding)
> > >
> > > Best,
> > > Rui
> > >
> > > On Fri, May 24, 2024 at 1:45 PM Leonard Xu  wrote:
> > >
> > > > +1
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > 2024年5月24日 下午1:27,weijie guo  写道:
> > > > >
> > > > > +1(binding)
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Weijie
> > > > >
> > > > >
> > > > > Lincoln Lee  于2024年5月24日周五 12:20写道:
> > > > >
> > > > >> +1(binding)
> > > > >>
> > > > >> Best,
> > > > >> Lincoln Lee
> > > > >>
> > > > >>
> > > > >> Jane Chan  于2024年5月24日周五 09:52写道:
> > > > >>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> I'd like to start a vote on FLIP-457[1] after reaching a
> consensus
> > > > >> through
> > > > >>> the discussion thread[2].
> > > > >>>
> > > > >>> The vote will be open for at least 72 hours unless there is an
> > > > objection
> > > > >> or
> > > > >>> insufficient votes.
> > > > >>>
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > > >>
> > > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=307136992
> > > > >>> [2]
> https://lists.apache.org/thread/1sthbv6q00sq52pp04n2p26d70w4fqj1
> > > > >>>
> > > > >>> Best,
> > > > >>> Jane
> > > > >>>
> > > > >>
> > > >
> > > >
>
>
>
> --
>
> Best,
> Benchao Li
>


Re: Re: [VOTE] FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-05-09 Thread Feng Jin
+1 (non-binding)


Best,
Feng


On Thu, May 9, 2024 at 7:37 PM Xuyang  wrote:

> +1 (non-binding)
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2024-05-09 13:57:07, "Ron Liu"  wrote:
> >Sorry for the re-post, just to format this email content.
> >
> >Hi Dev
> >
> >Thank you to everyone for the feedback on FLIP-448: Introduce Pluggable
> >Workflow Scheduler Interface for Materialized Table[1][2].
> >I'd like to start a vote for it. The vote will be open for at least 72
> >hours unless there is an objection or not enough votes.
> >
> >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> >
> >[2] https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> >
> >Best,
> >Ron
> >
> >Ron Liu  于2024年5月9日周四 13:52写道:
> >
> >> Hi Dev, Thank you to everyone for the feedback on FLIP-448: Introduce
> >> Pluggable Workflow Scheduler Interface for Materialized Table[1][2]. I'd
> >> like to start a vote for it. The vote will be open for at least 72 hours
> >> unless there is an objection or not enough votes. [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> >>
> >> [2] https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> >> Best, Ron
> >>
>


Re: [DISCUSS] FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-04-24 Thread Feng Jin
Hi Ron

Thank you for initiating this FLIP.

My current questions are as follows:

1. From my current understanding, the workflow handle should not be bound
to the Dynamic Table. Therefore, if the workflow is modified, does it mean
that the scheduling information corresponding to the Dynamic Table will be
lost?

2. Regarding the status information of the workflow, I am wondering if it
is necessary to provide an interface to display the backend scheduling
information? This would make it more convenient to view the execution
status of backend jobs.


Best,
Feng


On Wed, Apr 24, 2024 at 3:24 PM 
wrote:

> Hello Ron Liu! Thank you for your FLIP!
>
> Here are my considerations:
>
> 1.
> About the Operations interfaces, how can they be empty?
> Should not they provide at least a `run` or `execute` method (similar to
> the command pattern)?
> In this way, their implementation can wrap all the implementations details
> of particular schedulers, and the scheduler can simply execute the command.
> In general, I think a simple sequence diagram showcasing the interaction
> between the interfaces would be awesome to better understand the concept.
>
> 2.
> What about the RefreshHandler, I cannot find a definition of its interface
> here.
> Is it out of scope for this FLIP?
>
> 3.
> For the SqlGatewayService arguments:
>
> boolean isPeriodic,
> @Nullable String scheduleTime,
> @Nullable String scheduleTimeFormat,
>
> If it is periodic, where is the period?
> For the scheduleTime and format, why not simply pass an instance of
> LocalDateTime or similar? The gateway should not have the responsibility to
> parse the time.
>
> 4.
> For the REST API:
> wouldn't it be better (more REST) to move the `mt_identifier` to the URL?
> E.g.: v3/materialized_tables//refresh
>
> Thank you!
> On Apr 22, 2024 at 08:42 +0200, Ron Liu , wrote:
> > Hi, Dev
> >
> > I would like to start a discussion about FLIP-448: Introduce Pluggable
> > Workflow Scheduler Interface for Materialized Table.
> >
> > In FLIP-435[1], we proposed Materialized Table, which has two types of
> data
> > refresh modes: Full Refresh & Continuous Refresh Mode. In Full Refresh
> > mode, the Materialized Table relies on a workflow scheduler to perform
> > periodic refresh operation to achieve the desired data freshness.
> >
> > There are numerous open-source workflow schedulers available, with
> popular
> > ones including Airflow and DolphinScheduler. To enable Materialized Table
> > to work with different workflow schedulers, we propose a pluggable
> workflow
> > scheduler interface for Materialized Table in this FLIP.
> >
> > For more details, see FLIP-448 [2]. Looking forward to your feedback.
> >
> > [1] https://lists.apache.org/thread/c1gnn3bvbfs8v1trlf975t327s4rsffs
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> >
> > Best,
> > Ron
>


[jira] [Created] (FLINK-35230) Split FlinkSqlParserImplTest to reduce the code lines.

2024-04-24 Thread Feng Jin (Jira)
Feng Jin created FLINK-35230:


 Summary: Split FlinkSqlParserImplTest to reduce the code lines.
 Key: FLINK-35230
 URL: https://issues.apache.org/jira/browse/FLINK-35230
 Project: Flink
  Issue Type: Technical Debt
  Components: Table SQL / Planner
Reporter: Feng Jin


As the Calcite syntax extensions keep growing, FlinkSqlParserImplTest has
grown to more than 3,000 lines of code.

Once it exceeds the configured limit, the code style check fails with an
error:

{code:log}
08:33:19.679 [ERROR] 
src/test/java/org/apache/flink/sql/parser/FlinkSqlParserImplTest.java:[1] 
(sizes) FileLength: File length is 3,166 lines (max allowed is 3,100).
{code}

To make future syntax extensions easier, I suggest we split
FlinkSqlParserImplTest and put each kind of syntax into its own test class,
so that the original test class stops growing indefinitely.

My current idea:
Since *FlinkSqlParserImplTest* currently inherits from *SqlParserTest*, and
*SqlParserTest* itself contains many unit tests, we should introduce a basic
*ParserTestBase* that inherits from *SqlParserTest* and disables the inherited
*SqlParserTest* test cases.

This makes it faster to write the relevant unit tests during the subsequent
split, without repeatedly executing the test cases inside SqlParserTest.
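
A rough sketch of the proposed structure (illustrative only; the real
SqlParserTest hooks and the final class layout may differ):

{code:java}
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;

// Illustrative sketch: ParserTestBase inherits the parser test utilities but
// disables the inherited Calcite test cases, so each split Flink test class
// only runs its own syntax tests. The method shown is a placeholder, not a
// real SqlParserTest method.
public abstract class ParserTestBaseSketch /* extends SqlParserTest */ {

    @Disabled("Already covered by Calcite; not re-run for every Flink parser test class")
    @Test
    void testInheritedCalciteCase() {
        // intentionally disabled in the Flink parser test hierarchy
    }
}

// A split test class then only contains one category of Flink syntax tests.
class MaterializedTableSyntaxTestSketch extends ParserTestBaseSketch {

    @Test
    void testCreateMaterializedTable() {
        // sql("CREATE MATERIALIZED TABLE ...").ok(...);  // placeholder check
    }
}
{code}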






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-04-16 Thread Feng Jin
+1 (non-binding)

Best,
Feng Jin

On Wed, Apr 17, 2024 at 2:28 PM Ron liu  wrote:

> +1(binding)
>
> Best,
> Ron
>
> Ron liu  于2024年4月17日周三 14:27写道:
>
> > Hi Dev,
> >
> > Thank you to everyone for the feedback on FLIP-435: Introduce a New
> > Materialized Table for Simplifying Data Pipelines[1][2].
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours unless there is an objection or not enough votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
> > [2] https://lists.apache.org/thread/c1gnn3bvbfs8v1trlf975t327s4rsffs
> >
> > Best,
> > Ron
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Zakelly Lan

2024-04-16 Thread Feng Jin
Congratulations!

Best,
Feng

On Tue, Apr 16, 2024 at 10:43 PM Ferenc Csaky 
wrote:

> Congratulations!
>
> Best,
> Ferenc
>
>
>
>
> On Tuesday, April 16th, 2024 at 16:28, Jeyhun Karimov <
> je.kari...@gmail.com> wrote:
>
> >
> >
> > Congratulations Zakelly!
> >
> > Regards,
> > Jeyhun
> >
> > On Tue, Apr 16, 2024 at 6:35 AM Feifan Wang zoltar9...@163.com wrote:
> >
> > > Congratulations, Zakelly!——
> > >
> > > Best regards,
> > >
> > > Feifan Wang
> > >
> > > At 2024-04-15 10:50:06, "Yuan Mei" yuanmei.w...@gmail.com wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > On behalf of the PMC, I'm happy to let you know that Zakelly Lan has
> > > > become
> > > > a new Flink Committer!
> > > >
> > > > Zakelly has been continuously contributing to the Flink project since
> > > > 2020,
> > > > with a focus area on Checkpointing, State as well as frocksdb (the
> default
> > > > on-disk state db).
> > > >
> > > > He leads several FLIPs to improve checkpoints and state APIs,
> including
> > > > File Merging for Checkpoints and configuration/API reorganizations.
> He is
> > > > also one of the main contributors to the recent efforts of
> "disaggregated
> > > > state management for Flink 2.0" and drives the entire discussion in
> the
> > > > mailing thread, demonstrating outstanding technical depth and
> breadth of
> > > > knowledge.
> > > >
> > > > Beyond his technical contributions, Zakelly is passionate about
> helping
> > > > the
> > > > community in numerous ways. He spent quite some time setting up the
> Flink
> > > > Speed Center and rebuilding the benchmark pipeline after the
> original one
> > > > was out of lease. He helps build frocksdb and tests for the upcoming
> > > > frocksdb release (bump rocksdb from 6.20.3->8.10).
> > > >
> > > > Please join me in congratulating Zakelly for becoming an Apache Flink
> > > > committer!
> > > >
> > > > Best,
> > > > Yuan (on behalf of the Flink PMC)
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Jing Ge

2024-04-12 Thread Feng Jin
Congratulations, Jing

Best,
Feng Jin

On Fri, Apr 12, 2024 at 4:46 PM Samrat Deb  wrote:

> Congratulations, Jing!
>
>
> On Fri, 12 Apr 2024 at 2:15 PM, Jiabao Sun  wrote:
>
> > Congratulations, Jing!
> >
> > Best,
> > Jiabao
> >
> > Sergey Nuyanzin  于2024年4月12日周五 16:41写道:
> >
> > > Congratulations, Jing!
> > >
> > > On Fri, Apr 12, 2024 at 10:29 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > Congratulations, Jing~
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Fri, Apr 12, 2024 at 4:28 PM Yun Tang  wrote:
> > > >
> > > > > Congratulations, Jing!
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Jark Wu 
> > > > > Sent: Friday, April 12, 2024 16:02
> > > > > To: dev 
> > > > > Cc: gej...@gmail.com 
> > > > > Subject: [ANNOUNCE] New Apache Flink PMC Member - Jing Ge
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > On behalf of the PMC, I'm very happy to announce that Jing Ge has
> > > > > joined the Flink PMC!
> > > > >
> > > > > Jing has been contributing to Apache Flink for a long time. He
> > > > continuously
> > > > > works on SQL, connectors, Source, and Sink APIs, test, and document
> > > > > modules while contributing lots of code and insightful discussions.
> > He
> > > is
> > > > > one of the maintainers of Flink CI infra. He is also willing to
> help
> > a
> > > > lot
> > > > > in the
> > > > > community work, such as being the release manager for both 1.18 and
> > > 1.19,
> > > > > verifying releases, and answering questions on the mailing list.
> > > Besides
> > > > > that,
> > > > > he is continuously helping with the expansion of the Flink
> community
> > > and
> > > > > has
> > > > > given several talks about Flink at many conferences, such as Flink
> > > > Forward
> > > > > 2022 and 2023.
> > > > >
> > > > > Congratulations and welcome Jing!
> > > > >
> > > > > Best,
> > > > > Jark (on behalf of the Flink PMC)
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Sergey
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee

2024-04-12 Thread Feng Jin
Congratulations, Lincoln!

Best,
Feng Jin

On Fri, Apr 12, 2024 at 5:20 PM xiangyu feng  wrote:

> Congratulations, Lincoln!
>
> Best,
> Xiangyu Feng
>
> Feifan Wang  于2024年4月12日周五 17:19写道:
>
> > Congratulations, Lincoln!
> >
> >
> > ——
> >
> > Best regards,
> >
> > Feifan Wang
> >
> >
> >
> >
> > At 2024-04-12 15:59:00, "Jark Wu"  wrote:
> > >Hi everyone,
> > >
> > >On behalf of the PMC, I'm very happy to announce that Lincoln Lee has
> > >joined the Flink PMC!
> > >
> > >Lincoln has been an active member of the Apache Flink community for
> > >many years. He mainly works on Flink SQL component and has driven
> > >/pushed many FLIPs around SQL, including FLIP-282/373/415/435 in
> > >the recent versions. He has a great technical vision of Flink SQL and
> > >participated in plenty of discussions in the dev mailing list. Besides
> > >that,
> > >he is community-minded, such as being the release manager of 1.19,
> > >verifying releases, managing release syncs, writing the release
> > >announcement etc.
> > >
> > >Congratulations and welcome Lincoln!
> > >
> > >Best,
> > >Jark (on behalf of the Flink PMC)
> >
>


Re: [VOTE] FLIP-399: Flink Connector Doris

2024-04-09 Thread Feng Jin
+1 (non-binding)

Best,
Feng

On Tue, Apr 9, 2024 at 5:56 PM gongzhongqiang 
wrote:

> +1 (non-binding)
>
> Best,
>
> Zhongqiang Gong
>
> wudi <676366...@qq.com.invalid> 于2024年4月9日周二 10:48写道:
>
> > Hi devs,
> >
> > I would like to start a vote about FLIP-399 [1]. The FLIP is about
> > contributing the Flink Doris Connector[2] to the Flink community.
> > Discussion thread [3].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > insufficient votes.
> >
> >
> > Thanks,
> > Di.Wu
> >
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > [2] https://github.com/apache/doris-flink-connector
> > [3] https://lists.apache.org/thread/p3z4wsw3ftdyfs9p2wd7bbr2gfyl3xnh
> >
> >
>


Re: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level Project

2024-03-31 Thread Feng Jin
Congratulations!

Best,
Feng Jin

On Mon, Apr 1, 2024 at 10:51 AM weijie guo 
wrote:

> Congratulations!
>
> Best regards,
>
> Weijie
>
>
> Hang Ruan  于2024年4月1日周一 09:49写道:
>
> > Congratulations!
> >
> > Best,
> > Hang
> >
> > Lincoln Lee  于2024年3月31日周日 00:10写道:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Jark Wu  于2024年3月30日周六 22:13写道:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Fri, 29 Mar 2024 at 12:08, Yun Tang  wrote:
> > > >
> > > > > Congratulations to all Paimon guys!
> > > > >
> > > > > Glad to see a Flink sub-project has been graduated to an Apache
> > > top-level
> > > > > project.
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > >
> > > > > 
> > > > > From: Hangxiang Yu 
> > > > > Sent: Friday, March 29, 2024 10:32
> > > > > To: dev@flink.apache.org 
> > > > > Subject: Re: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level
> > > > Project
> > > > >
> > > > > Congratulations!
> > > > >
> > > > > On Fri, Mar 29, 2024 at 10:27 AM Benchao Li 
> > > > wrote:
> > > > >
> > > > > > Congratulations!
> > > > > >
> > > > > > Zakelly Lan  于2024年3月29日周五 10:25写道:
> > > > > > >
> > > > > > > Congratulations!
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Zakelly
> > > > > > >
> > > > > > > On Thu, Mar 28, 2024 at 10:13 PM Jing Ge
> > >  > > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Congrats!
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Jing
> > > > > > > >
> > > > > > > > On Thu, Mar 28, 2024 at 1:27 PM Feifan Wang <
> > zoltar9...@163.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations!——
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > >
> > > > > > > > > Feifan Wang
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > At 2024-03-28 20:02:43, "Yanfei Lei" 
> > > > wrote:
> > > > > > > > > >Congratulations!
> > > > > > > > > >
> > > > > > > > > >Best,
> > > > > > > > > >Yanfei
> > > > > > > > > >
> > > > > > > > > >Zhanghao Chen  于2024年3月28日周四
> > > > 19:59写道:
> > > > > > > > > >>
> > > > > > > > > >> Congratulations!
> > > > > > > > > >>
> > > > > > > > > >> Best,
> > > > > > > > > >> Zhanghao Chen
> > > > > > > > > >> 
> > > > > > > > > >> From: Yu Li 
> > > > > > > > > >> Sent: Thursday, March 28, 2024 15:55
> > > > > > > > > >> To: d...@paimon.apache.org 
> > > > > > > > > >> Cc: dev ; user <
> > u...@flink.apache.org
> > > >
> > > > > > > > > >> Subject: Re: [ANNOUNCE] Apache Paimon is graduated to
> Top
> > > > Level
> > > > > > > > Project
> > > > > > > > > >>
> > > > > > > > > >> CC the Flink user and dev mailing list.
> > > > > > > > > >>
> > > > > > > > > >> Paimon originated within the Flink community, initially
> > > known
> > > > as
> > 

Re: [DISCUSS] FLIP-399: Flink Connector Doris

2024-03-25 Thread Feng Jin
Hi Di

Thank you for your patience and explanation.

If this is a server-side configuration, we currently cannot modify it in
the client configuration. If Doris supports client-side configuration in
the future, we can reconsider whether to support it.

I currently have no other questions regarding this FLIP.  LGTM.


Best,
Feng

On Mon, Mar 25, 2024 at 3:42 PM wudi <676366...@qq.com.invalid> wrote:

> Hi, Feng
>
> Yes, if the StreamLoad transaction timeout is very short, you may
> encounter this situation.
>
> The timeout for StreamLoad transactions is controlled by the
> streaming_label_keep_max_second parameter [1] in FE (Frontend), and the
> default value is 12 hours. Currently, it is a global transaction
> configuration and cannot be set separately for a specific transaction.
>
> However, I understand the default 12-hour timeout should cover most cases
> unless you are restarting from a checkpoint that occurred a long time ago.
> What do you think?
>
>
> [1]
> https://github.com/apache/doris/blob/master/fe/fe-common/src/main/java/org/apache/doris/common/Config.java#L163-L168
>
>
> Brs
> di.wu
>
> > On Mar 25, 2024, at 11:45, Feng Jin wrote:
> >
> > Hi Di
> >
> > Thanks for your reply.
> >
> > The timeout I'm referring to here is not the commit timeout, but rather
> the
> > timeout for a single streamLoad transaction.
> >
> > Let's say we have set the transaction timeout for StreamLoad to be 10
> > minutes. Now, imagine there is a Flink job with two subtasks. Due to
> > significant data skew and backpressure issues, subtask 0 and subtask 1
> are
> > processing at different speeds. Subtask 0 finishes processing this
> > checkpoint first, while subtask 1 takes another 10 minutes to complete
> its
> > processing. At this point, the job's checkpoint is done. However, since
> > subtask 0 has been waiting for subtask 1 all along, its corresponding
> > streamLoad transaction closes after more than 10 minutes have passed - by
> > which time the server has already cleaned up this transaction, leading
> to a
> > failed commit.
> > Therefore, I would like to know if in such situations we can avoid this
> > problem by setting a longer lifespan for transactions.
> >
> >
> > Best,
> > Feng
> >
> >
> > On Fri, Mar 22, 2024 at 10:24 PM wudi <676366...@qq.com.invalid> wrote:
> >
> >> Hi, Feng,
> >>
> >> 1. Are you suggesting that when a commit gets stuck, we can interrupt
> the
> >> commit request using a timeout parameter? Currently, there is no such
> >> parameter. In my understanding, in a two-phase commit, checkpoint must
> be
> >> enabled, so the commit timeout is essentially the checkpoint timeout.
> >> Therefore, it seems unnecessary to add an additional parameter here.
> What
> >> do you think?
> >>
> >> 2. In addition to deleting checkpoints to re-consume data again, the
> >> Connector also provides an option to ignore commit errors[1]. However,
> this
> >> option is only used for error recovery scenarios, such as when a
> >> transaction is cleared by the server but you want to reuse the upstream
> >> offset from the checkpoint.
> >>
> >> 3. Also, thank you for pointing out the issue with the parameter. It has
> >> already been addressed[2], but the FLIP changes were overlooked. It has
> >> been updated.
> >>
> >> [1]
> >>
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/committer/DorisCommitter.java#L150-L160
> >> [2]
> >>
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java#L89-L98
> >>
> >> Brs
> >> di.wu
> >>
> >>
> >>
> >>> On Mar 22, 2024, at 18:28, Feng Jin wrote:
> >>>
> >>> Hi Di,
> >>>
> >>> Thank you for the update, as well as quickly implementing corresponding
> >>> capabilities including filter push down and project push down.
> >>>
> >>> Regarding the transaction timeout, I still have some doubts. I would
> like
> >>> to confirm if we can control this timeout parameter in the connector,
> >> such
> >>> as setting it to 10 minutes or 1 hour.
> >>> Also, when a transaction is cleared by the server, the commit operation
> >> of
> >>> the connector will fail, leading to job failure. In this case, can
> users
> >>> only

Re: [DISCUSS] FLIP-399: Flink Connector Doris

2024-03-24 Thread Feng Jin
Hi Di

Thanks for your reply.

The timeout I'm referring to here is not the commit timeout, but rather the
timeout for a single streamLoad transaction.

Let's say we have set the transaction timeout for StreamLoad to be 10
minutes. Now, imagine there is a Flink job with two subtasks. Due to
significant data skew and backpressure issues, subtask 0 and subtask 1 are
processing at different speeds. Subtask 0 finishes processing this
checkpoint first, while subtask 1 takes another 10 minutes to complete its
processing. At this point, the job's checkpoint is done. However, since
subtask 0 has been waiting for subtask 1 all along, its corresponding
streamLoad transaction closes after more than 10 minutes have passed - by
which time the server has already cleaned up this transaction, leading to a
failed commit.
Therefore, I would like to know if in such situations we can avoid this
problem by setting a longer lifespan for transactions.


Best,
Feng


On Fri, Mar 22, 2024 at 10:24 PM wudi <676366...@qq.com.invalid> wrote:

> Hi, Feng,
>
> 1. Are you suggesting that when a commit gets stuck, we can interrupt the
> commit request using a timeout parameter? Currently, there is no such
> parameter. In my understanding, in a two-phase commit, checkpoint must be
> enabled, so the commit timeout is essentially the checkpoint timeout.
> Therefore, it seems unnecessary to add an additional parameter here. What
> do you think?
>
> 2. In addition to deleting checkpoints to re-consume data again, the
> Connector also provides an option to ignore commit errors[1]. However, this
> option is only used for error recovery scenarios, such as when a
> transaction is cleared by the server but you want to reuse the upstream
> offset from the checkpoint.
>
> 3. Also, thank you for pointing out the issue with the parameter. It has
> already been addressed[2], but the FLIP changes were overlooked. It has
> been updated.
>
> [1]
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/committer/DorisCommitter.java#L150-L160
> [2]
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java#L89-L98
>
> Brs
> di.wu
>
>
>
> > 2024年3月22日 18:28,Feng Jin  写道:
> >
> > Hi Di,
> >
> > Thank you for the update, as well as quickly implementing corresponding
> > capabilities including filter push down and project push down.
> >
> > Regarding the transaction timeout, I still have some doubts. I would like
> > to confirm if we can control this timeout parameter in the connector,
> such
> > as setting it to 10 minutes or 1 hour.
> > Also, when a transaction is cleared by the server, the commit operation
> of
> > the connector will fail, leading to job failure. In this case, can users
> > only choose to delete the checkpoint and re-consume historical data?
> >
> > There is also a small question regarding the parameters
> > `doris.request.connect.timeout.ms` and `doris.request.read.timeout.ms`:
> > can we change them to Duration type and remove the "ms" suffix?
> > This way, all time parameters can be kept uniform as Duration.
> >
> >
> > Best,
> > Feng
> >
> > On Fri, Mar 22, 2024 at 4:46 PM wudi <676366...@qq.com.invalid> wrote:
> >
> >> Hi, Feng,
> >> Thank you, that's a great suggestion !
> >>
> >> I have already implemented FilterPushDown and removed that parameter on
> >> DorisDynamicTableSource[1], and also updated FLIP.
> >>
> >> Regarding the mention of [Doris also aborts transactions], it may not
> have
> >> been described accurately. It mainly refers to the automatic expiration
> of
> >> long-running transactions in Doris that have not been committed for a
> >> prolonged period.
> >>
> >> As for two-phase commit, when a commit fails, the checkpoint will also
> >> fail, and the job will be continuously retried.
> >>
> >> [1]
> >>
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisDynamicTableSource.java#L58
> >>
> >> Brs
> >> di.wu
> >>
> >>
> >>> 2024年3月15日 14:53,Feng Jin  写道:
> >>>
> >>> Hi Di
> >>>
> >>> Thank you for initiating this FLIP, +1 for this.
> >>>
> >>> Regarding the option `doris.filter.query` of doris source 

Re: [DISCUSS] FLIP-399: Flink Connector Doris

2024-03-22 Thread Feng Jin
Hi Di,

Thank you for the update, as well as quickly implementing corresponding
capabilities including filter push down and project push down.

Regarding the transaction timeout, I still have some doubts. I would like
to confirm if we can control this timeout parameter in the connector, such
as setting it to 10 minutes or 1 hour.
Also, when a transaction is cleared by the server, the commit operation of
the connector will fail, leading to job failure. In this case, can users
only choose to delete the checkpoint and re-consume historical data?

There is also a small question regarding the parameters
`doris.request.connect.timeout.ms` and `doris.request.read.timeout.ms`:
can we change them to Duration type and remove the "ms" suffix?
This way, all time parameters can be kept uniform as Duration.
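
As a rough sketch of what the Duration-typed options could look like using
Flink's ConfigOptions builder (the option names without the "ms" suffix and
the default values below are assumptions, not the connector's current API):

import java.time.Duration;

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public class DorisRequestTimeoutOptionsSketch {

    // hypothetical replacement for doris.request.connect.timeout.ms
    public static final ConfigOption<Duration> REQUEST_CONNECT_TIMEOUT =
            ConfigOptions.key("doris.request.connect.timeout")
                    .durationType()
                    .defaultValue(Duration.ofSeconds(30))
                    .withDescription("Connect timeout for requests sent to Doris.");

    // hypothetical replacement for doris.request.read.timeout.ms
    public static final ConfigOption<Duration> REQUEST_READ_TIMEOUT =
            ConfigOptions.key("doris.request.read.timeout")
                    .durationType()
                    .defaultValue(Duration.ofSeconds(30))
                    .withDescription("Read timeout for requests sent to Doris.");
}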


Best,
Feng

On Fri, Mar 22, 2024 at 4:46 PM wudi <676366...@qq.com.invalid> wrote:

> Hi, Feng,
> Thank you, that's a great suggestion !
>
> I have already implemented FilterPushDown and removed that parameter on
> DorisDynamicTableSource[1], and also updated FLIP.
>
> Regarding the mention of [Doris also aborts transactions], it may not have
> been described accurately. It mainly refers to the automatic expiration of
> long-running transactions in Doris that have not been committed for a
> prolonged period.
>
> As for two-phase commit, when a commit fails, the checkpoint will also
> fail, and the job will be continuously retried.
>
> [1]
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisDynamicTableSource.java#L58
>
> Brs
> di.wu
>
>
> > 2024年3月15日 14:53,Feng Jin  写道:
> >
> > Hi Di
> >
> > Thank you for initiating this FLIP, +1 for this.
> >
> > Regarding the option `doris.filter.query` of doris source table
> >
> > Can we directly implement the FilterPushDown capability of Flink Source
> > like Jdbc Source [1] instead of introducing an option?
> >
> >
> > Regarding two-phase commit,
> >
> >> At the same time, Doris will also abort transactions that have not been
> > committed for a long time
> >
> > Can we control the transaction timeout in the connector?
> > And control the behavior when timeout occurs, whether to discard by
> default
> > or trigger job failure?
> >
> >
> > [1]. https://issues.apache.org/jira/browse/FLINK-16024
> >
> > Best,
> > Feng
> >
> >
> > On Tue, Mar 12, 2024 at 12:12 AM Ferenc Csaky  >
> > wrote:
> >
> >> Hi,
> >>
> >> Thanks for driving this, +1 for the FLIP.
> >>
> >> Best,
> >> Ferenc
> >>
> >>
> >>
> >>
> >> On Monday, March 11th, 2024 at 15:17, Ahmed Hamdy  >
> >> wrote:
> >>
> >>>
> >>>
> >>> Hello,
> >>> Thanks for the proposal, +1 for the FLIP.
> >>>
> >>> Best Regards
> >>> Ahmed Hamdy
> >>>
> >>>
> >>> On Mon, 11 Mar 2024 at 15:12, wudi 676366...@qq.com.invalid wrote:
> >>>
> >>>> Hi, Leonard
> >>>> Thank you for your suggestion.
> >>>> I referred to other Connectors[1], modified the naming and types of
> >>>> relevant parameters[2], and also updated FLIP.
> >>>>
> >>>> [1]
> >>>>
> >>
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> >>>> [1]
> >>>>
> >>
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
> >>>>
> >>>> Brs,
> >>>> di.wu
> >>>>
> >>>>> 2024年3月7日 14:33,Leonard Xu xbjt...@gmail.com 写道:
> >>>>>
> >>>>> Thanks wudi for the updating, the FLIP generally looks good to me, I
> >>>>> only left two minor suggestions:
> >>>>>
> >>>>> (1) The suffix `.s` in configoption doris.request.query.timeout.s
> >> looks
> >>>>> strange to me, could we change all time interval related option
> >> value type
> >>>>> to Duration ?
> >>>>>
> >>>>> (2) Could you check and improve all config options like
> >>>>> `doris.exec.mem.limit` to make them to follow flink config option
> >> naming
> >&g

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Feng Jin
Congratulations!


Best,
Feng


On Thu, Mar 21, 2024 at 11:37 AM Ron liu  wrote:

> Congratulations!
>
> Best,
> Ron
>
> Jark Wu  于2024年3月21日周四 10:46写道:
>
> > Congratulations and welcome!
> >
> > Best,
> > Jark
> >
> > On Thu, 21 Mar 2024 at 10:35, Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Rui
> > >
> > > On Thu, Mar 21, 2024 at 10:25 AM Hang Ruan 
> > wrote:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Hang
> > > >
> > > > Lincoln Lee  于2024年3月21日周四 09:54写道:
> > > >
> > > >>
> > > >> Congrats, thanks for the great work!
> > > >>
> > > >>
> > > >> Best,
> > > >> Lincoln Lee
> > > >>
> > > >>
> > > >> Peter Huang  于2024年3月20日周三 22:48写道:
> > > >>
> > > >>> Congratulations
> > > >>>
> > > >>>
> > > >>> Best Regards
> > > >>> Peter Huang
> > > >>>
> > > >>> On Wed, Mar 20, 2024 at 6:56 AM Huajie Wang 
> > > wrote:
> > > >>>
> > > 
> > >  Congratulations
> > > 
> > > 
> > > 
> > >  Best,
> > >  Huajie Wang
> > > 
> > > 
> > > 
> > >  Leonard Xu  于2024年3月20日周三 21:36写道:
> > > 
> > > > Hi devs and users,
> > > >
> > > > We are thrilled to announce that the donation of Flink CDC as a
> > > > sub-project of Apache Flink has completed. We invite you to
> explore
> > > the new
> > > > resources available:
> > > >
> > > > - GitHub Repository: https://github.com/apache/flink-cdc
> > > > - Flink CDC Documentation:
> > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable
> > > >
> > > > After Flink community accepted this donation[1], we have
> completed
> > > > software copyright signing, code repo migration, code cleanup,
> > > website
> > > > migration, CI migration and github issues migration etc.
> > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong,
> > > > Qingsheng Ren, Jiabao Sun, LvYanquan, loserwang1024 and other
> > > contributors
> > > > for their contributions and help during this process!
> > > >
> > > >
> > > > For all previous contributors: The contribution process has
> > slightly
> > > > changed to align with the main Flink project. To report bugs or
> > > suggest new
> > > > features, please open tickets
> > > > Apache Jira (https://issues.apache.org/jira).  Note that we will
> > no
> > > > longer accept GitHub issues for these purposes.
> > > >
> > > >
> > > > Welcome to explore the new repository and documentation. Your
> > > feedback
> > > > and contributions are invaluable as we continue to improve Flink
> > CDC.
> > > >
> > > > Thanks everyone for your support and happy exploring Flink CDC!
> > > >
> > > > Best,
> > > > Leonard
> > > > [1]
> > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-20 Thread Feng Jin
Hi Ron and Lincoln

Thanks for driving this discussion. I believe it will greatly improve the
convenience of managing users' real-time pipelines.

I have some questions.

*Regarding Limitations of Dynamic Table:*

> Does not support modifying the select statement after the dynamic table
is created.

Although currently we restrict users from modifying the query, I wonder if
we can provide a better way to help users rebuild it without affecting
downstream OLAP queries.


*Regarding the management of background jobs:*

1. From the documentation, the SQL definition and job information are
stored in the Catalog. Does this mean that if a system needs to adapt to
Dynamic Tables, it also needs to store Flink's job information in the
corresponding system?
For example, does MySQL's Catalog need to store Flink job information as
well?


2. Users still need to consider how much memory is used, how high the
parallelism is, which type of state backend is used, and may need to
configure a state TTL.


*Regarding the Refresh Part:*

> If the refresh mode is continuous and a background job is running,
caution should be taken with the refresh command as it can lead to
inconsistent data.

When we submit a refresh command, can we help users detect whether any
background job is running and automatically stop it before executing the
refresh command, then wait for the refresh to complete before restarting
the background streaming job?
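
Purely as a sketch of the flow being asked about (the statements below are
assumptions for illustration, not necessarily the syntax proposed in this
FLIP):

-- 1. suspend the continuous refresh job backing the dynamic table
ALTER DYNAMIC TABLE my_dynamic_table SUSPEND;
-- 2. run the one-off refresh
ALTER DYNAMIC TABLE my_dynamic_table REFRESH;
-- 3. resume the continuous refresh once the manual refresh has finished
ALTER DYNAMIC TABLE my_dynamic_table RESUME;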

Best,
Feng

On Tue, Mar 19, 2024 at 9:40 PM Lincoln Lee  wrote:

> Hi Yun,
>
> Thank you very much for your valuable input!
>
> Incremental mode is indeed an attractive idea, we have also discussed
> this, but in the current design,
>
> we first provided two refresh modes: CONTINUOUS and
> FULL. Incremental mode can be introduced
>
> once the execution layer has the capability.
>
> My answer for the two questions:
>
> 1.
> Yes, cascading is a good question. The current proposal provides a
> freshness attribute that defines a dynamic table's lag relative to its
> base tables. If users need to consider the end-to-end freshness of
> multiple cascaded dynamic tables, they can manually split them for now.
> Of course, how to let multiple cascaded or dependent dynamic tables
> complete the freshness definition in a simpler way can be extended in
> the future.
>
> 2.
> Cascading refresh is also a part we focused on in the discussion. In this
> FLIP, we hope to concentrate as much as possible on the core features (as
> it already involves a lot of things), so we did not directly introduce
> related syntax. However, based on the current design, combined with the
> catalog and lineage, users can theoretically also accomplish cascading
> refresh.
>
>
> Best,
> Lincoln Lee
>
>
> Yun Tang  于2024年3月19日周二 13:45写道:
>
> > Hi Lincoln,
> >
> > Thanks for driving this discussion, and I am so excited to see this topic
> > being discussed in the Flink community!
> >
> > From my point of view, instead of the work of unifying streaming and
> batch
> > in DataStream API [1], this FLIP actually could make users benefit from
> one
> > engine to rule batch & streaming.
> >
> > If we treat this FLIP as an open-source implementation of Snowflake's
> > dynamic tables [2], we still lack an incremental refresh mode to make the
> > ETL near real-time with a much cheaper computation cost. However, I think
> > this could be done under the current design by introducing another
> refresh
> > mode in the future. Although the extra work of incremental view
> maintenance
> > would be much larger.
> >
> > For the FLIP itself, I have several questions below:
> >
> > 1. It seems this FLIP does not consider the lag of refreshes across ETL
> > layers from ODS ---> DWD ---> APP [3]. We currently only consider the
> > scheduler interval, which means we cannot use lag to automatically
> schedule
> > the upfront micro-batch jobs to do the work.
> > 2. To support the automagical refreshes, we should consider the lineage
> in
> > the catalog or somewhere else.
> >
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Batch+execution+for+the+DataStream+API
> > [2] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
> > [3] https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh
> >
> > Best
> > Yun Tang
> >
> >
> > 
> > From: Lincoln Lee 
> > Sent: Thursday, March 14, 2024 14:35
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for
> > Simplifying Data Pipelines
> >
> > Hi Jing,
> >
> > Thanks for your attention to this flip! I'll try to answer the following
> > questions.
> >
> > > 1. How to define query of dynamic table?
> > > Use flink sql or introducing new syntax?
> > > If use flink sql, how to handle the difference in SQL between streaming
> > and
> > > batch processing?
> > > For example, a query including window aggregate based on processing
> time?
> > > or a query including global order by?
> >
> > Similar to `CREATE TABLE AS query`, he

Re: [VOTE] FLIP-436: Introduce Catalog-related Syntax

2024-03-19 Thread Feng Jin
+1 (non-binding)

Best,
Feng

On Tue, Mar 19, 2024 at 7:46 PM Ferenc Csaky 
wrote:

> +1 (non-binding).
>
> Best,
> Ferenc
>
>
>
>
> On Tuesday, March 19th, 2024 at 12:39, Jark Wu  wrote:
>
> >
> >
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Tue, 19 Mar 2024 at 19:05, Yuepeng Pan panyuep...@apache.org wrote:
> >
> > > Hi, Yubin
> > >
> > > Thanks for driving it !
> > >
> > > +1 non-binding.
> > >
> > > Best,
> > > Yuepeng Pan.
> > >
> > > At 2024-03-19 17:56:42, "Yubin Li" lyb5...@gmail.com wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Thanks for all the feedback, I'd like to start a vote on the
> FLIP-436:
> > > > Introduce Catalog-related Syntax [1]. The discussion thread is here
> > > > [2].
> > > >
> > > > The vote will be open for at least 72 hours unless there is an
> > > > objection or insufficient votes.
> > > >
> > > > [1]
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> > > > [2] https://lists.apache.org/thread/10k1bjb4sngyjwhmfqfky28lyoo7sv0z
> > > >
> > > > Best regards,
> > > > Yubin
>


Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Feng Jin
Congratulations!

Best,
Feng

On Mon, Mar 18, 2024 at 6:18 PM Yuepeng Pan  wrote:

> Congratulations!
>
>
> Thanks to release managers and everyone involved.
>
>
>
>
> Best,
> Yuepeng Pan
>
>
>
>
>
>
>
>
> At 2024-03-18 18:09:45, "Yubin Li"  wrote:
> >Congratulations!
> >
> >Thanks to release managers and everyone involved.
> >
> >On Mon, Mar 18, 2024 at 5:55 PM Hangxiang Yu  wrote:
> >>
> >> Congratulations!
> >> Thanks release managers and all involved!
> >>
> >> On Mon, Mar 18, 2024 at 5:23 PM Hang Ruan 
> wrote:
> >>
> >> > Congratulations!
> >> >
> >> > Best,
> >> > Hang
> >> >
> >> > Paul Lam  于2024年3月18日周一 17:18写道:
> >> >
> >> > > Congrats! Thanks to everyone involved!
> >> > >
> >> > > Best,
> >> > > Paul Lam
> >> > >
> >> > > > 2024年3月18日 16:37,Samrat Deb  写道:
> >> > > >
> >> > > > Congratulations !
> >> > > >
> >> > > > On Mon, 18 Mar 2024 at 2:07 PM, Jingsong Li <
> jingsongl...@gmail.com>
> >> > > wrote:
> >> > > >
> >> > > >> Congratulations!
> >> > > >>
> >> > > >> On Mon, Mar 18, 2024 at 4:30 PM Rui Fan <1996fan...@gmail.com>
> wrote:
> >> > > >>>
> >> > > >>> Congratulations, thanks for the great work!
> >> > > >>>
> >> > > >>> Best,
> >> > > >>> Rui
> >> > > >>>
> >> > > >>> On Mon, Mar 18, 2024 at 4:26 PM Lincoln Lee <
> lincoln.8...@gmail.com>
> >> > > >> wrote:
> >> > > 
> >> > >  The Apache Flink community is very happy to announce the
> release of
> >> > > >> Apache Flink 1.19.0, which is the fisrt release for the Apache
> Flink
> >> > > 1.19
> >> > > >> series.
> >> > > 
> >> > >  Apache Flink® is an open-source stream processing framework for
> >> > > >> distributed, high-performing, always-available, and accurate data
> >> > > streaming
> >> > > >> applications.
> >> > > 
> >> > >  The release is available for download at:
> >> > >  https://flink.apache.org/downloads.html
> >> > > 
> >> > >  Please check out the release blog post for an overview of the
> >> > > >> improvements for this bugfix release:
> >> > > 
> >> > > >>
> >> > >
> >> >
> https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
> >> > > 
> >> > >  The full release notes are available in Jira:
> >> > > 
> >> > > >>
> >> > >
> >> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> >> > > 
> >> > >  We would like to thank all contributors of the Apache Flink
> >> > community
> >> > > >> who made this release possible!
> >> > > 
> >> > > 
> >> > >  Best,
> >> > >  Yun, Jing, Martijn and Lincoln
> >> > > >>
> >> > >
> >> > >
> >> >
> >>
> >>
> >> --
> >> Best,
> >> Hangxiang.
>


Re: [DISCUSS] FLIP-399: Flink Connector Doris

2024-03-14 Thread Feng Jin
Hi Di

Thank you for initiating this FLIP, +1 for this.

Regarding the option `doris.filter.query` of doris source table

Can we directly implement the FilterPushDown capability of Flink Source
like Jdbc Source [1] instead of introducing an option?
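
For reference, a minimal sketch of what implementing filter push down on the
table source could look like; this is not the actual DorisDynamicTableSource
code, and the translation of accepted expressions into a Doris-side predicate
is only hinted at in the comments:

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.table.connector.ChangelogMode;
import org.apache.flink.table.connector.source.DynamicTableSource;
import org.apache.flink.table.connector.source.ScanTableSource;
import org.apache.flink.table.connector.source.abilities.SupportsFilterPushDown;
import org.apache.flink.table.expressions.ResolvedExpression;

public class FilterPushDownSourceSketch implements ScanTableSource, SupportsFilterPushDown {

    private final List<ResolvedExpression> pushedFilters = new ArrayList<>();

    @Override
    public Result applyFilters(List<ResolvedExpression> filters) {
        // A real implementation would translate the supported expressions into
        // the scan request sent to Doris and return only the unsupported ones
        // as "remaining"; keeping all filters as remaining is the safe default,
        // since the planner will then still apply them after the scan.
        pushedFilters.addAll(filters);
        return Result.of(new ArrayList<>(filters), new ArrayList<>(filters));
    }

    @Override
    public ChangelogMode getChangelogMode() {
        return ChangelogMode.insertOnly();
    }

    @Override
    public ScanRuntimeProvider getScanRuntimeProvider(ScanContext runtimeProviderContext) {
        throw new UnsupportedOperationException("Scan runtime omitted in this sketch");
    }

    @Override
    public DynamicTableSource copy() {
        return new FilterPushDownSourceSketch();
    }

    @Override
    public String asSummaryString() {
        return "FilterPushDownSourceSketch";
    }
}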


Regarding two-phase commit,

> At the same time, Doris will also abort transactions that have not been
committed for a long time

Can we control the transaction timeout in the connector?
And control the behavior when timeout occurs, whether to discard by default
or trigger job failure?


[1]. https://issues.apache.org/jira/browse/FLINK-16024

Best,
Feng


On Tue, Mar 12, 2024 at 12:12 AM Ferenc Csaky 
wrote:

> Hi,
>
> Thanks for driving this, +1 for the FLIP.
>
> Best,
> Ferenc
>
>
>
>
> On Monday, March 11th, 2024 at 15:17, Ahmed Hamdy 
> wrote:
>
> >
> >
> > Hello,
> > Thanks for the proposal, +1 for the FLIP.
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 11 Mar 2024 at 15:12, wudi 676366...@qq.com.invalid wrote:
> >
> > > Hi, Leonard
> > > Thank you for your suggestion.
> > > I referred to other Connectors[1], modified the naming and types of
> > > relevant parameters[2], and also updated FLIP.
> > >
> > > [1]
> > >
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> > > [1]
> > >
> https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
> > >
> > > Brs,
> > > di.wu
> > >
> > > > 2024年3月7日 14:33,Leonard Xu xbjt...@gmail.com 写道:
> > > >
> > > > Thanks wudi for the updating, the FLIP generally looks good to me, I
> > > > only left two minor suggestions:
> > > >
> > > > (1) The suffix `.s` in configoption doris.request.query.timeout.s
> looks
> > > > strange to me, could we change all time interval related option
> value type
> > > > to Duration ?
> > > >
> > > > (2) Could you check and improve all config options like
> > > > `doris.exec.mem.limit` to make them to follow flink config option
> naming
> > > > and value type?
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > > 2024年3月6日 06:12,Jing Ge j...@ververica.com.INVALID 写道:
> > > > > >
> > > > > > Hi Di,
> > > > > >
> > > > > > Thanks for your proposal. +1 for the contribution. I'd like to
> know
> > > > > > your
> > > > > > thoughts about the following questions:
> > > > > >
> > > > > > 1. According to your clarification of the exactly-once, thanks
> for it
> > > > > > BTW,
> > > > > > no PreCommitTopology is required. Does it make sense to let
> > > > > > DorisSink[1]
> > > > > > implement SupportsCommitter, since the TwoPhaseCommittingSink is
> > > > > > deprecated[2] before turning the Doris connector into a Flink
> > > > > > connector?
> > > > > > 2. OLAP engines are commonly used as the tail/downstream of a
> data
> > > > > > pipeline
> > > > > > to support further e.g. ad-hoc query or cube with feasible
> > > > > > pre-aggregation.
> > > > > > Just out of curiosity, would you like to share some real use
> cases that
> > > > > > will use OLAP engines as the source of a streaming data
> pipeline? Or it
> > > > > > will only be used as the source for the batch?
> > > > > > 3. The E2E test only covered sink[3], if I am not mistaken.
> Would you
> > > > > > like
> > > > > > to test the source in E2E too?
> > > > > >
> > > > > > [1]
> > >
> > >
> https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/DorisSink.java#L55
> > >
> > > > > > [2]
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> > >
> > > > > > [3]
> > >
> > >
> https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/test/java/org/apache/doris/flink/tools/cdc/MySQLDorisE2ECase.java#L96
> > >
> > > > > > Best regards,
> > > > > > Jing
> > > > > >
> > > > > > On Tue, Mar 5, 2024 at 11:18 AM wudi 676366...@qq.com.invalid
> wrote:
> > > > > >
> > > > > > > Hi, Jeyhun Karimov.
> > > > > > > Thanks for your question.
> > > > > > >
> > > > > > > - How to ensure Exactly-Once?
> > > > > > > 1. When the Checkpoint Barrier arrives, DorisSink will trigger
> the
> > > > > > > precommit api of StreamLoad to complete the persistence of
> data in
> > > > > > > Doris
> > > > > > > (the data will not be visible at this time), and will also
> pass this
> > > > > > > TxnID
> > > > > > > to the Committer.
> > > > > > > 2. When this Checkpoint of the entire Job is completed, the
> Committer
> > > > > > > will
> > > > > > > call the commit api of StreamLoad and commit TxnID to complete
> the
> > > > > > > visibility of the transaction.
> > > > > > > 3. When the task is restarted, the Txn with successful
> precommit and
> > > > > > > failed commit will be aborted based on the label-prefix, and
> Doris'
> > > > > > > abort
> > > > > > > API will be called. (At the same t

Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-13 Thread Feng Jin
Hi Yubin

Thank you for initiating this FLIP.

I have just one minor question:

I noticed that we added a new function `getCatalogStore` to expose
CatalogStore, and it seems fine.
However, could we add a new method `getCatalogDescriptor()` to
CatalogManager instead of directly exposing CatalogStore?
By only providing the `getCatalogDescriptor()` interface, it may be easier
for us to implement audit tracking in CatalogManager in the future. WDYT?
Currently we only collect some modification events [1].


[1].
https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
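
For illustration, a rough sketch of the accessor being suggested; the method
name and its placement on CatalogManager are assumptions, and it only relies
on the CatalogStore interface from FLIP-295 exposing getCatalog(String):

import java.util.Optional;

import org.apache.flink.table.catalog.CatalogDescriptor;
import org.apache.flink.table.catalog.CatalogStore;

public final class CatalogManagerSketch {

    private final CatalogStore catalogStore;

    public CatalogManagerSketch(CatalogStore catalogStore) {
        this.catalogStore = catalogStore;
    }

    // Expose only the descriptor instead of the whole store, so that audit or
    // lineage hooks can later be added in this single place.
    public Optional<CatalogDescriptor> getCatalogDescriptor(String catalogName) {
        return catalogStore.getCatalog(catalogName);
    }
}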

Best,
Feng

On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li  wrote:

> +1 for this.
>
> We are missing a series of catalog related syntaxes.
> Especially after the introduction of catalog store. [1]
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>
> Best,
> Jingsong
>
> On Wed, Mar 13, 2024 at 5:09 PM Yubin Li  wrote:
> >
> > Hi devs,
> >
> > I'd like to start a discussion about FLIP-436: Introduce "SHOW CREATE
> > CATALOG" Syntax [1].
> >
> > At present, the `SHOW CREATE TABLE` statement provides strong support for
> > users to easily reuse created tables. However, despite the increasing
> > importance of the `Catalog` in users' business, there is no similar
> > statement for users to use.
> >
> > According to the online discussion in FLINK-24939 [2] with Jark Wu and
> > Feng Jin, since `CatalogStore` has been introduced in FLIP-295 [3], we
> > could use this component to implement such a long-awaited feature.
> > Please refer to the document [1] for implementation details.
> >
> > examples as follows:
> >
> > Flink SQL> create catalog cat2 WITH ('type'='generic_in_memory',
> > > 'default-database'='db');
> > > [INFO] Execute statement succeeded.
> > > Flink SQL> show create catalog cat2;
> > >
> > >
> ++
> > > | result |
> > >
> > >
> ++
> > > | CREATE CATALOG `cat2` WITH (
> > >   'default-database' = 'db',
> > >   'type' = 'generic_in_memory'
> > > )
> > >  |
> > >
> > >
> ++
> > > 1 row in set
> >
> >
> >
> > Looking forward to hearing from you, thanks!
> >
> > Best regards,
> > Yubin
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> > [2] https://issues.apache.org/jira/browse/FLINK-24939
> > [3]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-11 Thread Feng Jin
+1 (non-binding)

- Verified signatures and checksums
- Verified that source does not contain binaries
- Build source code successfully
- Run a simple sql query successfully

Best,
Feng Jin


On Tue, Mar 12, 2024 at 11:09 AM Ron liu  wrote:

> +1 (non binding)
>
> quickly verified:
> - verified that source distribution does not contain binaries
> - verified checksums
> - built source code successfully
>
>
> Best,
> Ron
>
> Jeyhun Karimov  于2024年3月12日周二 01:00写道:
>
> > +1 (non binding)
> >
> > - verified that source distribution does not contain binaries
> > - verified signatures and checksums
> > - built source code successfully
> >
> > Regards,
> > Jeyhun
> >
> >
> > On Mon, Mar 11, 2024 at 3:08 PM Samrat Deb 
> wrote:
> >
> > > +1 (non binding)
> > >
> > > - verified signatures and checksums
> > > - ASF headers are present in all expected file
> > > - No unexpected binaries files found in the source
> > > - Build successful locally
> > > - tested basic word count example
> > >
> > >
> > >
> > >
> > > Bests,
> > > Samrat
> > >
> > > On Mon, 11 Mar 2024 at 7:33 PM, Ahmed Hamdy 
> > wrote:
> > >
> > > > Hi Lincoln
> > > > +1 (non-binding) from me
> > > >
> > > > - Verified Checksums & Signatures
> > > > - Verified Source dists don't contain binaries
> > > > - Built source successfully
> > > > - reviewed web PR
> > > >
> > > >
> > > > Best Regards
> > > > Ahmed Hamdy
> > > >
> > > >
> > > > On Mon, 11 Mar 2024 at 15:18, Lincoln Lee 
> > > wrote:
> > > >
> > > > > Hi Robin,
> > > > >
> > > > > Thanks for helping verifying the release note[1], FLINK-14879
> should
> > > not
> > > > > have been included, after confirming this
> > > > > I moved all unresolved non-blocker issues left over from 1.19.0 to
> > > 1.20.0
> > > > > and reconfigured the release note [1].
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> > > > >
> > > > >
> > > > > Robin Moffatt  于2024年3月11日周一 19:36写道:
> > > > >
> > > > > > Looking at the release notes [1] it lists `DESCRIBE DATABASE`
> > > > > (FLINK-14879)
> > > > > > and `DESCRIBE CATALOG` (FLINK-14690).
> > > > > > When I try these in 1.19 RC2 the behaviour is as in 1.18.1, i.e.
> it
> > > is
> > > > > not
> > > > > > supported:
> > > > > >
> > > > > > ```
> > > > > > [INFO] Execute statement succeed.
> > > > > >
> > > > > > Flink SQL> show catalogs;
> > > > > > +-+
> > > > > > |catalog name |
> > > > > > +-+
> > > > > > |   c_new |
> > > > > > | default_catalog |
> > > > > > +-+
> > > > > > 2 rows in set
> > > > > >
> > > > > > Flink SQL> DESCRIBE CATALOG c_new;
> > > > > > [ERROR] Could not execute SQL statement. Reason:
> > > > > > org.apache.calcite.sql.validate.SqlValidatorException: Column
> > 'c_new'
> > > > not
> > > > > > found in any table
> > > > > >
> > > > > > Flink SQL> show databases;
> > > > > > +--+
> > > > > > |database name |
> > > > > > +--+
> > > > > > | default_database |
> > > > > > +--+
> > > > > > 1 row in set
> > > > > >
> > > > > > Flink SQL> DESCRIBE DATABASE default_database;
> > > > > > [ERROR] Could not execute SQL statement. Reason:
> > > > > > org.apache.calcite.sql.validate.SqlValidatorException: Column
> > > > > > 'default_database' not found in
> > > > > > any table
> > > > > >

Re: [VOTE] FLIP-314: Support Customized Job Lineage Listener

2024-02-28 Thread Feng Jin
+1 (non-binding)

Best,
Feng Jin

On Thu, Feb 29, 2024 at 4:41 AM Márton Balassi 
wrote:

> +1 (binding)
>
> Marton
>
> On Wed, Feb 28, 2024 at 5:14 PM Gyula Fóra  wrote:
>
> > +1 (binding)
> >
> > Gyula
> >
> > On Wed, Feb 28, 2024 at 11:10 AM Maciej Obuchowski <
> mobuchow...@apache.org
> > >
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > Best,
> > > Maciej Obuchowski
> > >
> > > śr., 28 lut 2024 o 10:29 Zhanghao Chen 
> > > napisał(a):
> > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: Yong Fang 
> > > > Sent: Wednesday, February 28, 2024 10:12
> > > > To: dev 
> > > > Subject: [VOTE] FLIP-314: Support Customized Job Lineage Listener
> > > >
> > > > Hi devs,
> > > >
> > > > I would like to restart a vote about FLIP-314: Support Customized Job
> > > > Lineage Listener[1].
> > > >
> > > > Previously, we added lineage related interfaces in FLIP-314. Before
> the
> > > > interfaces were developed and merged into the master, @Maciej and
> > > > @Zhenqiu provided valuable suggestions for the interface from the
> > > > perspective of the lineage system. So we updated the interfaces of
> > > FLIP-314
> > > > and discussed them again in the discussion thread [2].
> > > >
> > > > So I am here to initiate a new vote on FLIP-314, the vote will be
> open
> > > for
> > > > at least 72 hours unless there is an objection or insufficient votes
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener
> > > > [2] https://lists.apache.org/thread/wopprvp3ww243mtw23nj59p57cghh7mc
> > > >
> > > > Best,
> > > > Fang Yong
> > > >
> > >
> >
>


Re: Temporal join on rolling aggregate

2024-02-23 Thread Feng Jin
+1 for supporting this feature. There are currently many limitations to
using time window aggregations, and being able to declare a watermark and
time attribute on a view would make time windows much easier to use.
Similarly, it would be very useful if a primary key could be declared on a
view.

Therefore, I believe we need a FLIP to detail the design of this feature.
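
Purely as an illustration of the idea, a hypothetical sketch of what declaring
these attributes on a view could look like (this syntax does not exist in
Flink today; the table and column names are made up for the example):

CREATE TEMPORARY VIEW max_rates (
    PRIMARY KEY (currency_id) NOT ENFORCED,
    WATERMARK FOR rate_time AS rate_time - INTERVAL '5' SECOND
) AS
SELECT
    currency_id,
    MAX(conversion_rate) AS max_rate,
    MAX(rate_time) AS rate_time
FROM versioned_rates
GROUP BY currency_id;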


Best,
Feng

On Fri, Feb 23, 2024 at 2:39 PM  wrote:

> +1 for supporting defining time attributes on views.
>
> I once encountered the same problem as yours. I did some regular joins and
> lost time attribute, and hence I could no longer do window operations in
> subsequent logics. I had to output the joined view to Kafka, read from it
> again, and define watermark on the new source - a cubersome workaround.
>
> It would be more flexible if we could control time attribute / watermark
> on views, just as if it's some kind of special source.
>
> Thanks,
> Yaming
> 在 Feb 22, 2024, 7:46 PM +0800,Gyula Fóra ,写道:
> > Posting this to dev as well as it potentially has some implications on
> development effort.
> >
> > What seems to be the problem here is that we cannot control/override
> Timestamps/Watermarks/Primary key on VIEWs. It's understandable that you
> cannot create a PRIMARY KEY on the view but I think the temporal join also
> should not require the PK, should we remove this limitation?
> >
> > The general problem is the inflexibility of the timestamp/watermark
> handling on query outputs, which makes this again impossible.
> >
> > The workaround here can be to write the rolling aggregate to Kafka, read
> it back again and join with that. The fact that this workaround is possible
> actually highlights the need for more flexibility on the query/view side in
> my opinion.
> >
> > Has anyone else run into this issue and considered the proper solution
> to the problem? Feels like it must be pretty common :)
> >
> > Cheers,
> > Gyula
> >
> >
> >
> >
> > > On Wed, Feb 21, 2024 at 10:29 PM Sébastien Chevalley 
> wrote:
> > > > Hi,
> > > >
> > > > I have been trying to write a temporal join in SQL done on a rolling
> aggregate view. However it does not work and throws :
> > > >
> > > > org.apache.flink.table.api.ValidationException: Event-Time Temporal
> Table Join requires both primary key and row time attribute in versioned
> table, but no row time attribute can be found.
> > > >
> > > > It seems that after the aggregation, the table looses the watermark
> and it's not possible to add one with the SQL API as it's a view.
> > > >
> > > > CREATE TABLE orders (
> > > > order_id INT,
> > > > price DECIMAL(6, 2),
> > > > currency_id INT,
> > > > order_time AS NOW(),
> > > > WATERMARK FOR order_time AS order_time - INTERVAL '2' SECOND
> > > > )
> > > > WITH (
> > > > 'connector' = 'datagen',
> > > > 'rows-per-second' = '10',
> > > > 'fields.order_id.kind' = 'sequence',
> > > > 'fields.order_id.start' = '1',
> > > > 'fields.order_id.end' = '10',
> > > > 'fields.currency_id.min' = '1',
> > > > 'fields.currency_id.max' = '20'
> > > > );
> > > >
> > > > CREATE TABLE currency_rates (
> > > > currency_id INT,
> > > > conversion_rate DECIMAL(4, 3),
> > > > PRIMARY KEY (currency_id) NOT ENFORCED
> > > > )
> > > > WITH (
> > > > 'connector' = 'datagen',
> > > > 'rows-per-second' = '10',
> > > > 'fields.currency_id.min' = '1',
> > > > 'fields.currency_id.max' = '20'
> > > > );
> > > >
> > > > CREATE TEMPORARY VIEW max_rates AS (
> > > > SELECT
> > > > currency_id,
> > > > MAX(conversion_rate) AS max_rate
> > > > FROM currency_rates
> > > > GROUP BY currency_id
> > > > );
> > > >
> > > > CREATE TEMPORARY VIEW temporal_join AS (
> > > > SELECT
> > > > order_id,
> > > > max_rates.max_rate
> > > > FROM orders
> > > >  LEFT JOIN max_rates FOR SYSTEM_TIME AS OF orders.order_time
> > > >  ON orders.currency_id = max_rates.currency_id
> > > > );
> > > >
> > > > SELECT * FROM temporal_join;
> > > >
> > > > Am I missing something? What would be a good starting point to
> address this?
> > > >
> > > > Thanks in advance,
> > > > Sébastien Chevalley
>


Re: [ANNOUNCE] New Apache Flink Committer - Jiabao Sun

2024-02-19 Thread Feng Jin
Congratulations, Jiabao!


Best,
Feng

On Mon, Feb 19, 2024 at 7:35 PM Sergey Nuyanzin  wrote:

> Congratulations, Jiabao!
>
> On Mon, Feb 19, 2024 at 12:26 PM Yanquan Lv  wrote:
>
> > Congratulations, Jiabao.
> >
> > He Wang  于2024年2月19日周一 19:21写道:
> >
> > > Congrats, Jiabao!
> > >
> > > On Mon, Feb 19, 2024 at 7:19 PM Benchao Li 
> wrote:
> > >
> > > > Congrats, Jiabao!
> > > >
> > > > Zhanghao Chen  于2024年2月19日周一 18:42写道:
> > > > >
> > > > > Congrats, Jiaba!
> > > > >
> > > > > Best,
> > > > > Zhanghao Chen
> > > > > 
> > > > > From: Qingsheng Ren 
> > > > > Sent: Monday, February 19, 2024 17:53
> > > > > To: dev ; jiabao...@apache.org <
> > > > jiabao...@apache.org>
> > > > > Subject: [ANNOUNCE] New Apache Flink Committer - Jiabao Sun
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > On behalf of the PMC, I'm happy to announce Jiabao Sun as a new
> Flink
> > > > > Committer.
> > > > >
> > > > > Jiabao began contributing in August 2022 and has contributed 60+
> > > commits
> > > > > for Flink main repo and various connectors. His most notable
> > > contribution
> > > > > is being the core author and maintainer of MongoDB connector, which
> > is
> > > > > fully functional in DataStream and Table/SQL APIs. Jiabao is also
> the
> > > > > author of FLIP-377 and the main contributor of JUnit 5 migration in
> > > > runtime
> > > > > and table planner modules.
> > > > >
> > > > > Beyond his technical contributions, Jiabao is an active member of
> our
> > > > > community, participating in the mailing list and consistently
> > > > volunteering
> > > > > for release verifications and code reviews with enthusiasm.
> > > > >
> > > > > Please join me in congratulating Jiabao for becoming an Apache
> Flink
> > > > > committer!
> > > > >
> > > > > Best,
> > > > > Qingsheng (on behalf of the Flink PMC)
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best,
> > > > Benchao Li
> > > >
> > >
> >
>
>
> --
> Best regards,
> Sergey
>


[jira] [Created] (FLINK-34312) Improve the handling of default node types when using named parameters.

2024-01-30 Thread Feng Jin (Jira)
Feng Jin created FLINK-34312:


 Summary: Improve the handling of default node types when using 
named parameters.
 Key: FLINK-34312
 URL: https://issues.apache.org/jira/browse/FLINK-34312
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin


Currently, we support the use of named parameters with optional arguments.

By adapting Calcite's interfaces, we fill in the DEFAULT operator when a 
parameter is missing. Both during the validation phase and during the 
SqlToRel conversion phase, we need special handling to modify the return type 
of the DEFAULT operator based on the argument type of the called operator. 
Multiple places need to handle the type of the DEFAULT operator, including 
SqlCallBinding, SqlOperatorBinding, and SqlToRelConverter.


The improved solution is as follows: 

Before SqlToRel, we can construct a DEFAULT node with a return type that 
matches the argument type. This way, during the SqlToRel phase, there is no 
need for special handling of the DEFAULT node's type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34265) Add the doc of named parameters

2024-01-29 Thread Feng Jin (Jira)
Feng Jin created FLINK-34265:


 Summary: Add the doc of named parameters
 Key: FLINK-34265
 URL: https://issues.apache.org/jira/browse/FLINK-34265
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Table SQL / Planner
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [SUMMARY] Flink 1.19 Release Sync 01/23/2024

2024-01-24 Thread Feng Jin
Hi  everyone,

Xuyang and I are currently working on FLIP-387[1], which aims to support
named parameters for functions and procedures.
This will make it more convenient for users to utilize functions and
procedures with multiple parameters.

We have divided the task into four sub-tasks, and we are currently working
on them:
https://issues.apache.org/jira/browse/FLINK-34055
https://issues.apache.org/jira/browse/FLINK-34056
https://issues.apache.org/jira/browse/FLINK-34057

These tasks have already been developed and reviewed, and we expect them to
be merged today (Jan 25th).

However, there is still one remaining task:
https://issues.apache.org/jira/browse/FLINK-34058.
I have already completed the necessary development work for this task. It
may still require 2-3 rounds of review before it is finalized.
I anticipate that it will take another 2-3 days to complete.


Therefore, I kindly request that we merge the pull request next Monday (Jan
29th). I apologize if this affects your related schedule.

[1].  https://issues.apache.org/jira/browse/FLINK-34054


Best regards,
Feng Jin

On Wed, Jan 24, 2024 at 10:00 PM Lincoln Lee  wrote:

> Hi devs,
>
> I'd like to share some highlights from the release sync on 01/23/2024
>
>
> *- Feature freeze*  *We plan to freeze the feature on Jan 26th. If there's
> specific need for an extension, please confirm with RMs by replying this
> mail.*
>
>
> *- Features & issues tracking*  So far we've had 15 flips been marked
> done(some documentation is still in progress), we also ask responsible
> contributors to help update the status of the remaining items on the 1.19
> wiki page [1], including *documentation and cross-team testing
> requirements*,
> this will help the release process.
>
>
> *- Blockers*  There're performance regression and blocker issues are being
> worked on:
>   https://issues.apache.org/jira/browse/FLINK-34148
>   https://issues.apache.org/jira/browse/FLINK-34007
>   https://issues.apache.org/jira/browse/FLINK-34225
>   Note that test instabilities will be upgraded to blocker if it is newly
> introduced.
>
> *- Sync meeting* (https://meet.google.com/vcx-arzs-trv)
>   The next release sync is *Jan 30th, 2024*. We'll switch to weekly release
> sync.
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/1.19+Release
>
> Best,
> Yun, Jing, Martijn and Lincoln
>


Re: Re: [VOTE] FLIP-377: Support fine-grained configuration to control filter push down for Table/SQL Sources

2024-01-23 Thread Feng Jin
+1 (non-binding)


Best,
Feng Jin


On Tue, Jan 23, 2024 at 5:46 PM Qingsheng Ren  wrote:

> +1 (binding)
>
> Thanks for driving this, Jiabao!
>
> Best,
> Qingsheng
>
> On Fri, Jan 19, 2024 at 2:11 PM Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Tue, 16 Jan 2024 at 18:01, Xuyang  wrote:
> >
> > > +1 (non-binding)
> > >
> > >
> > > --
> > >
> > > Best!
> > > Xuyang
> > >
> > >
> > >
> > >
> > >
> > > 在 2024-01-16 17:52:38,"Leonard Xu"  写道:
> > > >+1 (binding)
> > > >
> > > >Best,
> > > >Leonard
> > > >
> > > >> 2024年1月16日 下午5:40,Hang Ruan  写道:
> > > >>
> > > >> +1 (non-binding)
> > > >>
> > > >> Best,
> > > >> Hang
> > > >>
> > > >> Jiabao Sun  于2024年1月9日周二 19:39写道:
> > > >>
> > > >>> Hi Devs,
> > > >>>
> > > >>> I'd like to start a vote on FLIP-377: Support fine-grained
> > > configuration
> > > >>> to control filter push down for Table/SQL Sources[1]
> > > >>> which has been discussed in this thread[2].
> > > >>>
> > > >>> The vote will be open for at least 72 hours unless there is an
> > > objection
> > > >>> or not enough votes.
> > > >>>
> > > >>> [1]
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
> > > >>> [2]
> https://lists.apache.org/thread/nvxx8sp9jm009yywm075hoffr632tm7j
> > > >>>
> > > >>> Best,
> > > >>> Jiabao
> > >
> >
>


[jira] [Created] (FLINK-34058) Support optional parameters for named parameters

2024-01-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-34058:


 Summary: Support optional parameters for named parameters
 Key: FLINK-34058
 URL: https://issues.apache.org/jira/browse/FLINK-34058
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Feng Jin
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34057) Support named parameters for functions

2024-01-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-34057:


 Summary: Support named parameters for functions
 Key: FLINK-34057
 URL: https://issues.apache.org/jira/browse/FLINK-34057
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Feng Jin
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34056) Support named parameters for procedures

2024-01-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-34056:


 Summary: Support named parameters for procedures
 Key: FLINK-34056
 URL: https://issues.apache.org/jira/browse/FLINK-34056
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Feng Jin
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34055) Introduce a new annotation for named parameters.

2024-01-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-34055:


 Summary: Introduce a new annotation for named parameters.
 Key: FLINK-34055
 URL: https://issues.apache.org/jira/browse/FLINK-34055
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Feng Jin
 Fix For: 1.19.0


Introduce a new annotation to specify the parameter name, indicate if it is 
optional, and potentially support specifying default values in the future.

Deprecate the argumentNames method in FunctionHint, as it is not user-friendly 
for specifying argument names together with their optional configuration.

 
{code:java}
public @interface ArgumentHint {
/**
 * The name of the parameter, default is an empty string.
 */
String name() default "";
 
/**
 * Whether the parameter is optional, default is false.
 */
boolean isOptional() default false;
 
/**
 * The data type hint for the parameter.
 */
DataTypeHint type() default @DataTypeHint();
}
{code}



{code:java}
public @interface FunctionHint {
  
/**
 * Deprecated attribute for specifying the names of the arguments.
 * It is no longer recommended to use this attribute.
 */
@Deprecated
String[] argumentNames() default {""};
  
/**
 * Attribute for specifying the hints and additional information for 
function arguments.
 */
ArgumentHint[] arguments() default {};
}
{code}
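
A hypothetical usage sketch based on the annotations above (the package of 
ArgumentHint and the named-argument call syntax are assumptions, not part of 
this ticket):

{code:java}
import org.apache.flink.table.annotation.ArgumentHint; // assumed package
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.functions.ScalarFunction;

public class TruncateFunction extends ScalarFunction {

    @FunctionHint(arguments = {
            @ArgumentHint(name = "input", type = @DataTypeHint("STRING")),
            @ArgumentHint(name = "max_length", isOptional = true, type = @DataTypeHint("INT"))
    })
    public String eval(String input, Integer maxLength) {
        // An omitted optional argument is expected to arrive as NULL.
        if (input == null) {
            return null;
        }
        return maxLength == null
                ? input
                : input.substring(0, Math.min(maxLength, input.length()));
    }
}

// A caller could then pass arguments by name, e.g.:
// SELECT TruncateFunction(input => name, max_length => 5) FROM t;
{code}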





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34054) FLIP-387: Support named parameters for functions and call procedures

2024-01-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-34054:


 Summary: FLIP-387: Support named parameters for functions and call 
procedures
 Key: FLINK-34054
 URL: https://issues.apache.org/jira/browse/FLINK-34054
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Planner
Reporter: Feng Jin
 Fix For: 1.19.0


Umbrella issue for 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT][VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-09 Thread Feng Jin
Hi devs,

I'm happy to announce that FLIP-387: Support named parameters for functions
and call procedures [1] [2]
 has been accepted with 6 approving votes (4 binding)

 - Benchao Li (binding)
 - Lincoln Lee (binding)
 - Timo Walther (binding)
 - Xuyang (non-binding)
 - Jingsong Li (binding)
 - Hang Ruan (non-binding)

There are no disapproving votes.

Thanks again to everyone who participated in the discussion and voting.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
[2] https://lists.apache.org/thread/m3hw5cc1ovtvvplv9hpd56fzzl44xxcr


Best,
Feng Jin


Re: Re: [VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-09 Thread Feng Jin
Thank you all! Closing the vote. The result will be sent in a separate
email.

Best,
Feng Jin

On Wed, Jan 10, 2024 at 2:04 PM Hang Ruan  wrote:

> +1 (non-binding)
>
> Best,
> Hang
>
> Jingsong Li  于2024年1月10日周三 12:03写道:
>
> > +1
> >
> > On Wed, Jan 10, 2024 at 11:24 AM Xuyang  wrote:
> > >
> > > +1(non-binding)--
> > >
> > > Best!
> > > Xuyang
> > >
> > >
> > >
> > >
> > >
> > > 在 2024-01-08 00:34:55,"Feng Jin"  写道:
> > > >Hi Alexey
> > > >
> > > >Thank you for the reminder, the link has been updated.
> > > >
> > > >Best,
> > > >Feng Jin
> > > >
> > > >On Sat, Jan 6, 2024 at 12:55 AM Alexey Leonov-Vendrovskiy <
> > > >vendrov...@gmail.com> wrote:
> > > >
> > > >> Thanks for starting the vote!
> > > >> Do you mind adding a link from the FLIP to this thread?
> > > >>
> > > >> Thanks,
> > > >> Alexey
> > > >>
> > > >> On Thu, Jan 4, 2024 at 6:48 PM Feng Jin 
> > wrote:
> > > >>
> > > >> > Hi everyone
> > > >> >
> > > >> > Thanks for all the feedback about the FLIP-387: Support named
> > parameters
> > > >> > for functions and call procedures [1] [2] .
> > > >> >
> > > >> > I'd like to start a vote for it. The vote will be open for at
> least
> > 72
> > > >> > hours(excluding weekends,until Jan 10, 12:00AM GMT) unless there
> is
> > an
> > > >> > objection or an insufficient number of votes.
> > > >> >
> > > >> >
> > > >> >
> > > >> > [1]
> > > >> >
> > > >> >
> > > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> > > >> > [2]
> > https://lists.apache.org/thread/bto7mpjvcx7d7k86owb00dwrm65jx8cn
> > > >> >
> > > >> >
> > > >> > Best,
> > > >> > Feng Jin
> > > >> >
> > > >>
> >
>


Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-09 Thread Feng Jin
+1 (non-binding)

Best,
Feng Jin

On Tue, Jan 9, 2024 at 5:29 PM Yuxin Tan  wrote:

> +1 (non-binding)
>
> Best,
> Yuxin
>
>
> Márton Balassi  于2024年1月9日周二 17:25写道:
>
> > +1 (binding)
> >
> > On Tue, Jan 9, 2024 at 10:15 AM Leonard Xu  wrote:
> >
> > > +1(binding)
> > >
> > > Best,
> > > Leonard
> > >
> > > > 2024年1月9日 下午5:08,Yangze Guo  写道:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Jan 9, 2024 at 5:06 PM Robert Metzger 
> > > wrote:
> > > >>
> > > >> +1 (binding)
> > > >>
> > > >>
> > > >> On Tue, Jan 9, 2024 at 9:54 AM Guowei Ma 
> > wrote:
> > > >>
> > > >>> +1 (binding)
> > > >>> Best,
> > > >>> Guowei
> > > >>>
> > > >>>
> > > >>> On Tue, Jan 9, 2024 at 4:49 PM Rui Fan <1996fan...@gmail.com>
> wrote:
> > > >>>
> > > >>>> +1 (non-binding)
> > > >>>>
> > > >>>> Best,
> > > >>>> Rui
> > > >>>>
> > > >>>> On Tue, Jan 9, 2024 at 4:41 PM Hang Ruan 
> > > wrote:
> > > >>>>
> > > >>>>> +1 (non-binding)
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Hang
> > > >>>>>
> > > >>>>> gongzhongqiang  于2024年1月9日周二 16:25写道:
> > > >>>>>
> > > >>>>>> +1 non-binding
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Zhongqiang
> > > >>>>>>
> > > >>>>>> Leonard Xu  于2024年1月9日周二 15:05写道:
> > > >>>>>>
> > > >>>>>>> Hello all,
> > > >>>>>>>
> > > >>>>>>> This is the official vote whether to accept the Flink CDC code
> > > >>>>>> contribution
> > > >>>>>>> to Apache Flink.
> > > >>>>>>>
> > > >>>>>>> The current Flink CDC code, documentation, and website can be
> > > >>>>>>> found here:
> > > >>>>>>> code: https://github.com/ververica/flink-cdc-connectors <
> > > >>>>>>> https://github.com/ververica/flink-cdc-connectors>
> > > >>>>>>> docs: https://ververica.github.io/flink-cdc-connectors/ <
> > > >>>>>>> https://ververica.github.io/flink-cdc-connectors/>
> > > >>>>>>>
> > > >>>>>>> This vote should capture whether the Apache Flink community is
> > > >>>>> interested
> > > >>>>>>> in accepting, maintaining, and evolving Flink CDC.
> > > >>>>>>>
> > > >>>>>>> Regarding my original proposal[1] in the dev mailing list, I
> > firmly
> > > >>>>>> believe
> > > >>>>>>> that this initiative aligns perfectly with Flink. For the Flink
> > > >>>>>> community,
> > > >>>>>>> it represents an opportunity to bolster Flink's competitive
> edge
> > in
> > > >>>>>>> streaming
> > > >>>>>>> data integration, fostering the robust growth and prosperity of
> > the
> > > >>>>>> Apache
> > > >>>>>>> Flink
> > > >>>>>>> ecosystem. For the Flink CDC project, becoming a sub-project of
> > > >>>> Apache
> > > >>>>>>> Flink
> > > >>>>>>> means becoming an integral part of a neutral open-source
> > community,
> > > >>>>>>> capable of
> > > >>>>>>> attracting a more diverse pool of contributors.
> > > >>>>>>>
> > > >>>>>>> All Flink CDC maintainers are dedicated to continuously
> > > >>> contributing
> > > >>>> to
> > > >>>>>>> achieve
> > > >>>>>>> seamless integration with Flink. Additionally, PMC members like
> > > >>> Jark,
> > > >>>>>>> Qingsheng,
> > > >>>>>>> and I are willing to infacilitate the expansion of contributors
> > and
> > > >>>>>>> committers to
> > > >>>>>>> effectively maintain this new sub-project.
> > > >>>>>>>
> > > >>>>>>> This is a "Adoption of a new Codebase" vote as per the Flink
> > bylaws
> > > >>>>> [2].
> > > >>>>>>> Only PMC votes are binding. The vote will be open at least 7
> days
> > > >>>>>>> (excluding weekends), meaning until Thursday January 18 12:00
> > UTC,
> > > >>> or
> > > >>>>>>> until we
> > > >>>>>>> achieve the 2/3rd majority. We will follow the instructions in
> > the
> > > >>>>> Flink
> > > >>>>>>> Bylaws
> > > >>>>>>> in the case of insufficient active binding voters:
> > > >>>>>>>
> > > >>>>>>>> 1. Wait until the minimum length of the voting passes.
> > > >>>>>>>> 2. Publicly reach out via personal email to the remaining
> > binding
> > > >>>>>> voters
> > > >>>>>>> in the
> > > >>>>>>> voting mail thread for at least 2 attempts with at least 7 days
> > > >>>> between
> > > >>>>>>> two attempts.
> > > >>>>>>>> 3. If the binding voter being contacted still failed to
> respond
> > > >>>> after
> > > >>>>>>> all the attempts,
> > > >>>>>>> the binding voter will be considered as inactive for the
> purpose
> > of
> > > >>>>> this
> > > >>>>>>> particular voting.
> > > >>>>>>>
> > > >>>>>>> Welcome voting !
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Leonard
> > > >>>>>>> [1]
> > > >>> https://lists.apache.org/thread/o7klnbsotmmql999bnwmdgo56b6kxx9l
> > > >>>>>>> [2]
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > >
> > >
> >
>


Re: [VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-07 Thread Feng Jin
Hi Alexey

Thank you for the reminder, the link has been updated.

Best,
Feng Jin

On Sat, Jan 6, 2024 at 12:55 AM Alexey Leonov-Vendrovskiy <
vendrov...@gmail.com> wrote:

> Thanks for starting the vote!
> Do you mind adding a link from the FLIP to this thread?
>
> Thanks,
> Alexey
>
> On Thu, Jan 4, 2024 at 6:48 PM Feng Jin  wrote:
>
> > Hi everyone
> >
> > Thanks for all the feedback about the FLIP-387: Support named parameters
> > for functions and call procedures [1] [2] .
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours(excluding weekends,until Jan 10, 12:00AM GMT) unless there is an
> > objection or an insufficient number of votes.
> >
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> > [2] https://lists.apache.org/thread/bto7mpjvcx7d7k86owb00dwrm65jx8cn
> >
> >
> > Best,
> > Feng Jin
> >
>


Re: [VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-05 Thread Feng Jin
Hi Timo,

Thank you for the suggestion. Previously, I thought most parameters were
optional, so the default value was set to true.

Your concern is reasonable. We should declare it as false by default, and
developers should explicitly state whether a parameter is optional instead
of relying on our default value.

I have already updated this part in the document.


Best,
Feng


On Fri, Jan 5, 2024 at 3:38 PM Timo Walther  wrote:

> Thanks, for starting the VOTE thread and thanks for considering my
> feedback. One last comment before I'm also happy to give my +1 to this:
>
> Why is ArgumentHint's default isOptinal=true? Shouldn't it be false by
> default? Many function implementers will forget to set this to false and
> suddenly get NULLs passed to their functions. Marking an argument as
> optional should be an explicit decision of an implementer.
>
> Regards,
> Timo
>
>
> On 05.01.24 05:06, Lincoln Lee wrote:
> > +1 (binding)
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Benchao Li  于2024年1月5日周五 11:46写道:
> >
> >> +1 (binding)
> >>
> >> Feng Jin  于2024年1月5日周五 10:49写道:
> >>>
> >>> Hi everyone
> >>>
> >>> Thanks for all the feedback about the FLIP-387: Support named
> parameters
> >>> for functions and call procedures [1] [2] .
> >>>
> >>> I'd like to start a vote for it. The vote will be open for at least 72
> >>> hours(excluding weekends,until Jan 10, 12:00AM GMT) unless there is an
> >>> objection or an insufficient number of votes.
> >>>
> >>>
> >>>
> >>> [1]
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> >>> [2] https://lists.apache.org/thread/bto7mpjvcx7d7k86owb00dwrm65jx8cn
> >>>
> >>>
> >>> Best,
> >>> Feng Jin
> >>
> >>
> >>
> >> --
> >>
> >> Best,
> >> Benchao Li
> >>
> >
>
>


[VOTE] FLIP-387: Support named parameters for functions and call procedures

2024-01-04 Thread Feng Jin
Hi everyone

Thanks for all the feedback about the FLIP-387: Support named parameters
for functions and call procedures [1] [2] .

I'd like to start a vote for it. The vote will be open for at least 72
hours (excluding weekends, until Jan 10, 12:00 AM GMT) unless there is an
objection or an insufficient number of votes.



[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
[2] https://lists.apache.org/thread/bto7mpjvcx7d7k86owb00dwrm65jx8cn


Best,
Feng Jin


[jira] [Created] (FLINK-33996) Support disabling project rewrite when multiple exprs in the project reference the same project.

2024-01-04 Thread Feng Jin (Jira)
Feng Jin created FLINK-33996:


 Summary: Support disabling project rewrite when multiple exprs in 
the project reference the same project.
 Key: FLINK-33996
 URL: https://issues.apache.org/jira/browse/FLINK-33996
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Affects Versions: 1.18.0
Reporter: Feng Jin


When multiple expressions in the top project reference the same bottom project, 
the project-rewrite rules may cause the complex expressions to be computed multiple times.

Take the following SQL as an example:

{code:sql}
create table test_source(a varchar) with ('connector'='datagen');

explain plan for select a || 'a' as a, a || 'b' as b FROM (select REGEXP_REPLACE(a, 'aaa', 'bbb') as a FROM test_source);
{code}


The final SQL plan is as follows:


{code:sql}
== Abstract Syntax Tree ==
LogicalProject(a=[||($0, _UTF-16LE'a')], b=[||($0, _UTF-16LE'b')])
+- LogicalProject(a=[REGEXP_REPLACE($0, _UTF-16LE'aaa', _UTF-16LE'bbb')])
   +- LogicalTableScan(table=[[default_catalog, default_database, test_source]])

== Optimized Physical Plan ==
Calc(select=[||(REGEXP_REPLACE(a, _UTF-16LE'aaa', _UTF-16LE'bbb'), _UTF-16LE'a') AS a, ||(REGEXP_REPLACE(a, _UTF-16LE'aaa', _UTF-16LE'bbb'), _UTF-16LE'b') AS b])
+- TableSourceScan(table=[[default_catalog, default_database, test_source]], fields=[a])

== Optimized Execution Plan ==
Calc(select=[||(REGEXP_REPLACE(a, 'aaa', 'bbb'), 'a') AS a, ||(REGEXP_REPLACE(a, 'aaa', 'bbb'), 'b') AS b])
+- TableSourceScan(table=[[default_catalog, default_database, test_source]], fields=[a])
{code}

It can be observed that after the project rewrite, REGEXP_REPLACE is calculated twice. 
Regular expression matching is generally a time-consuming operation, and we usually 
do not want it to be evaluated multiple times. Therefore, for this scenario, we can 
support disabling the project rewrite.

After disabling some rules, the final plan we obtained is as follows:


{code:sql}
== Abstract Syntax Tree ==
LogicalProject(a=[||($0, _UTF-16LE'a')], b=[||($0, _UTF-16LE'b')])
+- LogicalProject(a=[REGEXP_REPLACE($0, _UTF-16LE'aaa', _UTF-16LE'bbb')])
   +- LogicalTableScan(table=[[default_catalog, default_database, test_source]])

== Optimized Physical Plan ==
Calc(select=[||(a, _UTF-16LE'a') AS a, ||(a, _UTF-16LE'b') AS b])
+- Calc(select=[REGEXP_REPLACE(a, _UTF-16LE'aaa', _UTF-16LE'bbb') AS a])
   +- TableSourceScan(table=[[default_catalog, default_database, test_source]], fields=[a])

== Optimized Execution Plan ==
Calc(select=[||(a, 'a') AS a, ||(a, 'b') AS b])
+- Calc(select=[REGEXP_REPLACE(a, 'aaa', 'bbb') AS a])
   +- TableSourceScan(table=[[default_catalog, default_database, test_source]], fields=[a])
{code}


After testing, we probably need to modify the following rules:

org.apache.flink.table.planner.plan.rules.logical.FlinkProjectMergeRule

org.apache.flink.table.planner.plan.rules.logical.FlinkCalcMergeRule
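
For reference, a minimal snippet to reproduce the plans above (just a sketch using the standard Table API; the table and query are the ones from this issue):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ProjectRewriteRepro {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "create table test_source(a varchar) with ('connector'='datagen')");
        // Print the optimized plan; with the current merge rules enabled,
        // REGEXP_REPLACE appears twice in the resulting Calc node.
        System.out.println(
                tEnv.explainSql(
                        "select a || 'a' as a, a || 'b' as b FROM "
                                + "(select REGEXP_REPLACE(a, 'aaa', 'bbb') as a FROM test_source)"));
    }
}
{code}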








--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

2024-01-02 Thread Feng Jin
Hi all,

Thank you for the valuable input.
Since there are no new objections or suggestions, I will open a voting
thread in two days.


Best,
Feng


On Thu, Dec 21, 2023 at 6:58 PM Benchao Li  wrote:

> I'm glad to hear that this is in your plan. Sorry that I overlooked
> the PoC link in the FLIP previously, I'll go over the code of PoC, and
> post here if there are any more concerns.
>
> Xuyang  于2023年12月21日周四 10:39写道:
>
> >
> > Hi, Benchao.
> >
> >
> > When Feng Jin and I tried the poc together, we found that when using
> udaf, Calcite directly using the function's input parameters from
> SqlCall#getOperandList. But in fact, these input parameters may use named
> arguments, the order of parameters may be wrong, and they may not include
> optional parameters that need to set default values. Actually, we should
> use new SqlCallBinding(this, scope, call).operands() to let this method
> correct the order and add default values. (You can see the modification in
> SqlToRelConverter in poc branch[1])
> >
> >
> > We have not reported this bug to the calcite community yet. Our original
> plan was to report this bug to the calcite community during the process of
> doing this flip, and fix it separately in flink's own calcite file. Because
> the time for Calcite to release the version is uncertain. And the time to
> upgrade flink to the latest calcite version is also unknown.
> >
> >
> > The link to the poc code is at the bottom of the flip[2]. I'm post it
> here again[1].
> >
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
> > [2]
> https://github.com/apache/flink/compare/master...hackergin:flink:poc_named_argument
> >
> >
> >
> > --
> >
> > Best!
> > Xuyang
> >
> >
> >
> >
> >
> > 在 2023-12-20 13:31:26,"Benchao Li"  写道:
> > >I didn't see your POC code, so I assumed that you'll need to add
> > >SqlStdOperatorTable#DEFAULT and
> > >SqlStdOperatorTable#ARGUMENT_ASSIGNMENT to FlinkSqlOperatorTable, am I
> > >right?
> > >
> > >If yes, this would enable many builtin functions to allow default and
> > >optional arguments, for example, `select md5(DEFAULT)`, I guess this
> > >is not what we want to support right? If so, I would suggest to throw
> > >proper errors for these unexpected usages.
> > >
> > >Benchao Li  于2023年12月20日周三 13:16写道:
> > >>
> > >> Thanks Feng for driving this, it's a very useful feature.
> > >>
> > >> In the FLIP, you mentioned that
> > >> > During POC verification, bugs were discovered in Calcite that
> caused issues during the validation phase. We need to modify the
> SqlValidatorImpl and SqlToRelConverter to address these problems.
> > >>
> > >> Could you log bugs in Calcite, and reference the corresponding Jira
> > >> number in your code. We want to upgrade Calcite version to latest as
> > >> much as possible, and maintaining many bugfixes in Flink will add
> > >> additional burdens for upgrading Calcite. By adding corresponding
> > >> issue numbers, we can easily make sure that we can remove these Flink
> > >> hosted bugfixes when we upgrade to a version that already contains the
> > >> fix.
> > >>
> > >> Feng Jin  于2023年12月14日周四 19:30写道:
> > >> >
> > >> > Hi Timo
> > >> > Thanks for your reply.
> > >> >
> > >> > >  1) ArgumentNames annotation
> > >> >
> > >> > I'm sorry for my incorrect expression. argumentNames is a method of
> > >> > FunctionHints. We should introduce a new arguments method to
> replace this
> > >> > method and return Argument[].
> > >> > I updated the FLIP doc about this part.
> > >> >
> > >> > >  2) Evolution of FunctionHint
> > >> >
> > >> > +1 define DataTypeHint as part of ArgumentHint. I'll update the
> FLIP doc.
> > >> >
> > >> > > 3)  Semantical correctness
> > >> >
> > >> > I realized that I forgot to submit the latest modification of the
> FLIP
> > >> > document. Xuyang and I had a prior discussion before starting this
> discuss.
> > >> > Let's restrict it to supporting only one eval() function, which will
> > >> > simplify the overall design.
> > &

Re: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-02 Thread Feng Jin
Congratulations, Alex!

Best,
Feng

On Tue, Jan 2, 2024 at 11:04 PM Chen Yu  wrote:

> Congratulations, Alex!
>
> Best,
> Yu Chen
>
> 
> 发件人: Zhanghao Chen 
> 发送时间: 2024年1月2日 22:44
> 收件人: dev
> 抄送: Alexander Fedulov
> 主题: Re: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov
>
> Congrats, Alex!
>
> Best,
> Zhanghao Chen
> 
> From: Maximilian Michels 
> Sent: Tuesday, January 2, 2024 20:15
> To: dev 
> Cc: Alexander Fedulov 
> Subject: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov
>
> Happy New Year everyone,
>
> I'd like to start the year off by announcing Alexander Fedulov as a
> new Flink committer.
>
> Alex has been active in the Flink community since 2019. He has
> contributed more than 100 commits to Flink, its Kubernetes operator,
> and various connectors [1][2].
>
> Especially noteworthy are his contributions on deprecating and
> migrating the old Source API functions and test harnesses, the
> enhancement to flame graphs, the dynamic rescale time computation in
> Flink Autoscaling, as well as all the small enhancements Alex has
> contributed which make a huge difference.
>
> Beyond code contributions, Alex has been an active community member
> with his activity on the mailing lists [3][4], as well as various
> talks and blog posts about Apache Flink [5][6].
>
> Congratulations Alex! The Flink community is proud to have you.
>
> Best,
> The Flink PMC
>
> [1]
> https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> [2]
> https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> [3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> [4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> [5]
> https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> [6]
> https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
>


[jira] [Created] (FLINK-33936) Mini-batch should output the result when the result is the same as the last one if TTL is set.

2023-12-25 Thread Feng Jin (Jira)
Feng Jin created FLINK-33936:


 Summary: Mini-batch should output the result when the result is the same as the last one if TTL is set.
 Key: FLINK-33936
 URL: https://issues.apache.org/jira/browse/FLINK-33936
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.18.0
Reporter: Feng Jin


Currently, if mini-batch is enabled and the aggregated result is the same as the 
previous output, the new aggregation result will not be sent downstream (see the 
logic below). As a result, downstream nodes do not receive updated data, and if a 
state TTL is configured, the TTL of the downstream state will not be refreshed either.

https://github.com/hackergin/flink/blob/a18c0cd3f0cdfd7e0acb53283f40cd2033a86472/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/aggregate/MiniBatchGroupAggFunction.java#L224

{code:java}
if (!equaliser.equals(prevAggValue, newAggValue)) {
    // new row is not same with prev row
    if (generateUpdateBefore) {
        // prepare UPDATE_BEFORE message for previous row
        resultRow
                .replace(currentKey, prevAggValue)
                .setRowKind(RowKind.UPDATE_BEFORE);
        out.collect(resultRow);
    }
    // prepare UPDATE_AFTER message for new row
    resultRow.replace(currentKey, newAggValue).setRowKind(RowKind.UPDATE_AFTER);
    out.collect(resultRow);
}
// new row is same with prev row, no need to output
{code}



When mini-batch is not enabled, even if the current aggregation result is the same 
as the previous one, a new result is still emitted if a TTL is set.

https://github.com/hackergin/flink/blob/e9353319ad625baa5b2c20fa709ab5b23f83c0f4/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/aggregate/GroupAggFunction.java#L170

{code:java}

if (stateRetentionTime <= 0 && equaliser.equals(prevAggValue, newAggValue)) {
    // newRow is the same as before and state cleaning is not enabled.
    // We do not emit retraction and acc message.
    // If state cleaning is enabled, we have to emit messages to prevent too early
    // state eviction of downstream operators.
    return;
} else {
    // retract previous result
    if (generateUpdateBefore) {
        // prepare UPDATE_BEFORE message for previous row
        resultRow
                .replace(currentKey, prevAggValue)
                .setRowKind(RowKind.UPDATE_BEFORE);
        out.collect(resultRow);
    }
    // prepare UPDATE_AFTER message for new row
    resultRow.replace(currentKey, newAggValue).setRowKind(RowKind.UPDATE_AFTER);
}
{code}


Therefore, considering TTL scenarios, I believe that when mini-batch aggregation is 
enabled and a state TTL is configured, new results should also be emitted even when 
the aggregated result is the same as the previous one.
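
A minimal sketch of the adjusted condition in MiniBatchGroupAggFunction (assuming the state retention time is made available to the function, mirroring the GroupAggFunction logic above):

{code:java}
// Sketch only: skip the emission only when the result is unchanged AND state
// cleaning (TTL) is disabled; otherwise emit to refresh downstream state TTL.
if (stateRetentionTime <= 0 && equaliser.equals(prevAggValue, newAggValue)) {
    // result unchanged and no TTL configured: nothing to emit
} else {
    if (generateUpdateBefore) {
        // prepare UPDATE_BEFORE message for previous row
        resultRow.replace(currentKey, prevAggValue).setRowKind(RowKind.UPDATE_BEFORE);
        out.collect(resultRow);
    }
    // prepare UPDATE_AFTER message for new row
    resultRow.replace(currentKey, newAggValue).setRowKind(RowKind.UPDATE_AFTER);
    out.collect(resultRow);
}
{code}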



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

2023-12-14 Thread Feng Jin
Hi Timo
Thanks for your reply.

>  1) ArgumentNames annotation

I'm sorry for my imprecise wording. argumentNames is a method of
the FunctionHint annotation. We should introduce a new arguments method to replace
this method and return Argument[].
I updated the FLIP doc about this part.

>  2) Evolution of FunctionHint

+1 define DataTypeHint as part of ArgumentHint. I'll update the FLIP doc.

> 3)  Semantical correctness

I realized that I forgot to submit the latest modification of the FLIP
document. Xuyang and I had a prior discussion before starting this thread.
Let's restrict it to supporting only one eval() function, which will
simplify the overall design.

Therefore, I also concur with not permitting overloaded named parameters.


Best,
Feng

On Thu, Dec 14, 2023 at 6:15 PM Timo Walther  wrote:

> Hi Feng,
>
> thank you for proposing this FLIP. This nicely completes FLIP-65 which
> is great for usability.
>
> I read the FLIP and have some feedback:
>
>
> 1) ArgumentNames annotation
>
>  > Deprecate the ArgumentNames annotation as it is not user-friendly for
> specifying argument names with optional configuration.
>
> Which annotation does the FLIP reference here? I cannot find it in the
> Flink code base.
>
> 2) Evolution of FunctionHint
>
> Introducing @ArgumentHint makes a lot of sense to me. However, using it
> within @FunctionHint looks complex, because there is both `input=` and
> `arguments=`. Ideally, the @DataTypeHint can be defined inline as part
> of the @ArgumentHint. It could even be the `value` such that
> `@ArgumentHint(@DataTypeHint("INT"))` is valid on its own.
>
> We could deprecate `input=`. Or let both `input` and `arguments=`
> coexist but never be defined at the same time.
>
> 3) Semantical correctness
>
> As you can see in the `TypeInference` class, named parameters are
> prepared in the stack already. However, we need to watch out between
> helpful explanation (see `InputTypeStrategy#getExpectedSignatures`) and
> named parameters (see `TypeInference.Builder#namedArguments`) that can
> be used in SQL.
>
> If I remember correctly, named parameters can be reordered and don't
> allow overloading of signatures. Thus, only a single eval() should have
> named parameters. Looking at the FLIP it seems you would like to support
> multiple parameter lists. What changes are you planning to TypeInference
> (which is also declared as @PublicEvoving)? This should also be
> documented as the annotations should compile into this class.
>
> In general, I would prefer to keep it simple and don't allow overloading
> named parameters. With the optionality, users can add an arbitrary
> number of parameters to the signature of the same eval method.
>
> Regards,
> Timo
>
> On 14.12.23 10:02, Feng Jin wrote:
> > Hi all,
> >
> >
> > Xuyang and I would like to start a discussion of FLIP-387: Support
> > named parameters for functions and call procedures [1]
> >
> > Currently, when users call a function or call a procedure, they must
> > specify all fields in order. When there are a large number of
> > parameters, it is easy to make mistakes and cannot omit specifying
> > non-mandatory fields.
> >
> > By using named parameters, you can selectively specify the required
> > parameters, reducing the probability of errors and making it more
> > convenient to use.
> >
> > Here is an example of using Named Procedure.
> > ```
> > -- for scalar function
> > SELECT my_scalar_function(param1 => ‘value1’, param2 => ‘value2’’) FROM
> []
> >
> > -- for table function
> > SELECT  *  FROM TABLE(my_table_function(param1 => 'value1', param2 =>
> 'value2'))
> >
> > -- for agg function
> > SELECT my_agg_function(param1 => 'value1', param2 => 'value2') FROM []
> >
> > -- for call procedure
> > CALL  procedure_name(param1 => ‘value1’, param2 => ‘value2’)
> > ```
> >
> > For UDX and Call procedure developers, we introduce a new annotation
> > to specify the parameter name, indicate if it is optional, and
> > potentially support specifying default values in the future
> >
> > ```
> > public @interface ArgumentHint {
> >  /**
> >   * The name of the parameter, default is an empty string.
> >   */
> >  String name() default "";
> >
> >  /**
> >   * Whether the parameter is optional, default is true.
> >   */
> >  boolean isOptional() default true;
> > }}
> > ```
> >
> > ```
> > // Call Procedure Development
&g

[DISCUSS] FLIP-387: Support named parameters for functions and call procedures

2023-12-14 Thread Feng Jin
Hi all,


Xuyang and I would like to start a discussion of FLIP-387: Support
named parameters for functions and call procedures [1]

Currently, when users call a function or a procedure, they must
specify all fields in order. When there are a large number of
parameters, it is easy to make mistakes, and non-mandatory fields
cannot be omitted.

By using named parameters, you can selectively specify the required
parameters, reducing the probability of errors and making it more
convenient to use.

Here are some examples of using named parameters.
```
-- for scalar function
SELECT my_scalar_function(param1 => 'value1', param2 => 'value2') FROM []

-- for table function
SELECT  *  FROM TABLE(my_table_function(param1 => 'value1', param2 => 'value2'))

-- for agg function
SELECT my_agg_function(param1 => 'value1', param2 => 'value2') FROM []

-- for call procedure
CALL procedure_name(param1 => 'value1', param2 => 'value2')
```

For UDX and Call procedure developers, we introduce a new annotation
to specify the parameter name, indicate if it is optional, and
potentially support specifying default values in the future

```
public @interface ArgumentHint {
/**
 * The name of the parameter, default is an empty string.
 */
String name() default "";

/**
 * Whether the parameter is optional, default is true.
 */
boolean isOptional() default true;
}
```

```
// Call Procedure Development

public static class NamedArgumentsProcedure implements Procedure {

    // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 => 'value2')
    // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 => 'value2', in3 => 'value3')

    @ProcedureHint(
        input = {@DataTypeHint(value = "STRING"), @DataTypeHint(value = "STRING"), @DataTypeHint(value = "STRING")},
        output = @DataTypeHint("STRING"),
        arguments = {
            @ArgumentHint(name = "in1", isOptional = false),
            @ArgumentHint(name = "in2", isOptional = true),
            @ArgumentHint(name = "in3", isOptional = true)})
    public String[] call(ProcedureContext procedureContext, String arg1, String arg2, String arg3) {
        return new String[]{arg1 + ", " + arg2 + "," + arg3};
    }
}
```
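
For completeness, a scalar function could use the same annotation in a similar way. This is only a sketch based on the annotation shape proposed above (the final API may differ):

```
// Sketch: named/optional parameters on a scalar UDF, based on the proposal above.
public static class MyScalarFunction extends ScalarFunction {

    // Example usage: SELECT my_scalar_function(param1 => 'value1') FROM t
    @FunctionHint(
        input = {@DataTypeHint("STRING"), @DataTypeHint("STRING")},
        output = @DataTypeHint("STRING"),
        arguments = {
            @ArgumentHint(name = "param1", isOptional = false),
            @ArgumentHint(name = "param2", isOptional = true)})
    public String eval(String param1, String param2) {
        // param2 is NULL when omitted by the caller
        return param1 + ", " + param2;
    }
}
```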


Currently, we offer support for two scenarios when calling a function
or procedure:

1. The corresponding parameters can be specified using the parameter
name, without a specific order.
2. Unnecessary parameters can be omitted.
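
For example, with the NamedArgumentsProcedure defined above, both scenarios look like this:

```
-- parameters specified by name, in any order
CALL myNamedProcedure(in2 => 'value2', in1 => 'value1')

-- optional parameters omitted (they currently default to NULL)
CALL myNamedProcedure(in1 => 'value1')
```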


There are still some limitations when using Named parameters:
1. Named parameters do not support variable arguments.
2. UDX or procedure classes that support named parameters can only
have one eval method.
3. Due to the current limitation tracked in CALCITE-947 [2], we cannot specify
a default value for omitted parameters; they default to NULL.



Also, thanks very much for the suggestions and help provided by Zelin
and Lincoln.




1. 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures.

2. https://issues.apache.org/jira/browse/CALCITE-947



Best,

Feng


Re: [PROPOSAL] Contribute Flink CDC Connectors project to Apache Flink

2023-12-06 Thread Feng Jin
This is incredibly exciting news, a big +1 for this.

Thank you for the fantastic work on Flink CDC. We have created thousands of
real-time integration jobs using Flink CDC connectors.


Best,
Feng

On Thu, Dec 7, 2023 at 1:45 PM gongzhongqiang 
wrote:

> It's very exciting to hear the news.
> +1 for adding CDC Connectors  to Apache Flink !
>
>
> Best,
> Zhongqiang
>
> Leonard Xu  于2023年12月7日周四 11:25写道:
>
> > Dear Flink devs,
> >
> >
> > As you may have heard, we at Alibaba (Ververica) are planning to donate
> CDC Connectors for the Apache Flink project
> > *[1]* to the Apache Flink community.
> >
> >
> >
> > CDC Connectors for Apache Flink comprise a collection of source
> connectors designed specifically for Apache Flink. These connectors
> > *[2]*
> >  enable the ingestion of changes from various databases using Change
> Data Capture (CDC), most of these CDC connectors are powered by Debezium
> > *[3]*
> > . They support both the DataStream API and the Table/SQL API,
> facilitating the reading of database snapshots and continuous reading of
> transaction logs with exactly-once processing, even in the event of
> failures.
> >
> >
> >
> > Additionally, in the latest version 3.0, we have introduced many
> long-awaited features. Starting from CDC version 3.0, we've built a
> Streaming ELT Framework available for streaming data integration. This
> framework allows users to write their data synchronization logic in a
> simple YAML file, which will automatically be translated into a Flink
> DataStreaming job. It emphasizes optimizing the task submission process and
> offers advanced functionalities such as whole database synchronization,
> merging sharded tables, and schema evolution
> > *[4]*.
> >
> >
> >
> >
> > I believe this initiative is a perfect match for both sides. For the
> Flink community, it presents an opportunity to enhance Flink's competitive
> advantage in streaming data integration, promoting the healthy growth and
> prosperity of the Apache Flink ecosystem. For the CDC Connectors project,
> becoming a sub-project of Apache Flink means being part of a neutral
> open-source community, which can attract a more diverse pool of
> contributors.
> >
> >
> > Please note that the aforementioned points represent only some of our
> motivations and vision for this donation. Specific future operations need
> to be further discussed in this thread. For example, the sub-project name
> after the donation; we hope to name it Flink-CDC
> > aiming to streaming data intergration through Apache Flink,
> > following the naming convention of Flink-ML; And this project is managed
> by a total of 8 maintainers, including 3 Flink PMC members and 1 Flink
> Committer. The remaining 4 maintainers are also highly active contributors
> to the Flink community, donating this project to the Flink community
> implies that their permissions might be reduced. Therefore, we may need to
> bring up this topic for further discussion within the Flink PMC.
> Additionally, we need to discuss how to migrate existing users and
> documents. We have a user group of nearly 10,000 people and a multi-version
> documentation site need to migrate. We also need to plan for the migration
> of CI/CD processes and other specifics.
> >
> >
> >
> > While there are many intricate details that require implementation, we
> are committed to progressing and finalizing this donation process.
> >
> >
> >
> > Despite being Flink’s most active ecological project (as evaluated by
> GitHub metrics), it also boasts a significant user base. However, I believe
> it's essential to commence discussions on future operations only after the
> community reaches a consensus on whether they desire this donation.
> >
> >
> > Really looking forward to hear what you think!
> >
> >
> >
> > Best,
> > Leonard (on behalf of the Flink CDC Connectors project maintainers)
> >
> > [1] https://github.com/ververica/flink-cdc-connectors
> > [2]
> >
> https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-connectors.html
> > [3] https://debezium.io
> > [4]
> >
> https://ververica.github.io/flink-cdc-connectors/master/content/overview/cdc-pipeline.html
> >
>


Re: [DISCUSS] FLIP-392: Deprecate the Legacy Group Window Aggregation

2023-12-04 Thread Feng Jin
Hi xuyang,

Thank you for initiating this proposal.

I'm glad to see that the Window TVF functionality will be fully supported.

Regarding the early fire, late fire, and allow lateness features, how will
they be provided to users? The documentation doesn't seem to provide a
detailed description of this part.

Since this FLIP will also involve a lot of feature development, I am more
than willing to help, including development and code review.

Best,
Feng

On Tue, Nov 28, 2023 at 8:31 PM Xuyang  wrote:

> Hi all.
> I'd like to start a discussion of FLIP-392: Deprecate the Legacy Group
> Window Aggregation.
>
>
> Although the current Flink SQL Window Aggregation documentation[1]
> indicates that the legacy Group Window Aggregation
> syntax has been deprecated, the new Window TVF Aggregation syntax has not
> fully covered all of the features of the legacy one.
>
>
> Compared to Group Window Aggergation, Window TVF Aggergation has several
> advantages, such as two-stage optimization,
> support for standard GROUPING SET syntax, and so on. However, it needs to
> supplement and enrich the following features.
>
>
> 1. Support for SESSION Window TVF Aggregation
> 2. Support for consuming CDC stream
> 3. Support for HOP window size with non-integer step length
> 4. Support for configurations such as early fire, late fire and allow
> lateness
> (which are internal experimental configurations in Group Window
> Aggregation and not public to users yet.)
> 5. Unification of the Window TVF Aggregation operator in runtime at the
> implementation layer
> (In the long term, the cost to maintain the operators about Window TVF
> Aggregation and Group Window Aggregation is too expensive.)
>
>
> This flip aims to continue the unfinished work in FLIP-145[2], which is to
> fully enable the capabilities of Window TVF Aggregation
>  and officially deprecate the legacy syntax Group Window Aggregation, to
> prepare for the removal of the legacy one in Flink 2.0.
>
>
> I have already done some preliminary POC to validate the feasibility of
> the related work in this flip as follows.
> 1. POC for SESSION Window TVF Aggregation [3]
> 2. POC for CUMULATE in Group Window Aggregation operator [4]
> 3. POC for consuming CDC stream in Window Aggregation operator [5]
>
>
> Looking forward to your feedback and thoughts!
>
>
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-agg/
>
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-SessionWindows
> [3] https://github.com/xuyangzhong/flink/tree/FLINK-24024
> [4]
> https://github.com/xuyangzhong/flink/tree/poc_legacy_group_window_agg_cumulate
> [5]
> https://github.com/xuyangzhong/flink/tree/poc_window_agg_consumes_cdc_stream
>
>
>
> --
>
> Best!
> Xuyang


Re: Streaming data from AWS S3 Source

2023-11-23 Thread Feng Jin
Hi Neelabh

You can use the FileSystem connector: see [1] for the DataStream API and [2] for
the Table API / SQL. You also need to add the necessary filesystem dependencies
(e.g. the S3 plugin) to your Flink environment [3].

For Flink SQL setup, you can refer to the SQL getting-started guide [4]. A minimal
DDL sketch is shown after the links below.

[1].
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/#file-source
[2].
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/filesystem/
[3].
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
[4].
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/gettingstarted/#prerequisites
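
Here is that minimal sketch (bucket, path, schema, and format are placeholders; please
check the option names against the docs for your Flink version):

```
-- Sketch: read JSON files from an S3 path with the filesystem connector
CREATE TABLE s3_source (
  user_id STRING,
  event_time TIMESTAMP(3)
) WITH (
  'connector' = 'filesystem',
  'path' = 's3://my-bucket/input/',          -- placeholder path
  'format' = 'json',
  'source.monitor-interval' = '10 s'         -- continuous discovery of new files (newer Flink versions)
);

SELECT * FROM s3_source;
```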


Best,
Feng

On Fri, Nov 24, 2023 at 1:52 PM Neelabh Shukla 
wrote:

> Hey Team,
> I want to stream data from AWS S3 source as they are being generated by an
> event to Apache Flink Stream for a data transformation job.
>
> I found out about FileSystem SQL Connector
> <
> https://stackoverflow.com/questions/62618381/apache-flink-with-s3-as-source-and-s3-as-sink
> >
> but
> need some reference on how to set this up.
>
> Feel free to share any learnings you have with this problem or any resource
> I can refer to.
>
> Based on slack recommendation I am reaching out to this email list but let
> me know if this is not the correct place to reach out to the community.
>
> Thanks,
> Neelabh
>


Re: [VOTE] Release 1.16.3, release candidate #1

2023-11-23 Thread Feng Jin
+1(non-binding)

- verified signatures and hashsums
- Verified there are no binaries in the source archive
- Built Flink from sources
- Started local standalone cluster, submitted sql job using sql-client

Best,
Feng

On Thu, Nov 23, 2023 at 11:00 PM Hang Ruan  wrote:

> +1 (non-binding)
>
> - verified signatures
> - verified hashsums
> - Verified there are no binaries in the source archive
> - reviewed the Release Note
> - reviewed the web PR
>
> Best,
> Hang
>
> Jiabao Sun  于2023年11月23日周四 22:15写道:
>
> > Thanks for driving this release.
> >
> > +1 (non-binding)
> >
> > - Checked the tag in git
> > - Verified signatures and hashsums
> > - Verified there are no binaries in the source archive
> > - Built Flink from sources
> >
> > Best,
> > Jiabao
> >
> > > 2023年11月23日 21:55,Sergey Nuyanzin  写道:
> > >
> > > +1 (non-binding)
> > >
> > > - Downloaded artifacts
> > > - Built Flink from sources
> > > - Verified checksums & signatures
> > > - Verified pom/NOTICE files
> > > - reviewed the web PR
> > >
> > > On Thu, Nov 23, 2023 at 1:28 PM Leonard Xu  wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> - verified signatures
> > >> - verified hashsums
> > >> - checked that all POM files point to the same version 1.16.3
> > >> - started SQL Client, used MySQL CDC connector to read changelog from
> > >> database , the result is expected
> > >> - reviewed the web PR, left minor comment
> > >>
> > >> Best,
> > >> Leonard
> > >>
> > >>> 2023年11月21日 下午8:56,Matthias Pohl 
> 写道:
> > >>>
> > >>> +1 (binding)
> > >>>
> > >>> * Downloaded artifacts
> > >>> * Built Flink from sources
> > >>> * Verified SHA512 checksums & GPG signatures
> > >>> * Compared checkout with provided sources
> > >>> * Verified pom file versions
> > >>> * Went over NOTICE/pom file changes without finding anything
> suspicious
> > >>> * Deployed standalone session cluster and ran WordCount example in
> > batch
> > >>> and streaming: Nothing suspicious in log files found
> > >>> * Verified Java version of uploaded binaries
> > >>>
> > >>> Thanks for wrapping 1.16 up. :)
> > >>>
> > >>> On Tue, Nov 21, 2023 at 4:55 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> > >>>
> >  +1 (non-binding)
> > 
> >  Verified based on this wiki[1].
> > 
> >  - Verified signatures and sha512
> >  - The source archives do not contain any binaries
> >  - Build the source with Maven 3 and java8 (Checked the license as
> > well)
> >  - bin/start-cluster.sh with java8, it works fine and no any
> unexpected
> > >> LOG
> >  - Ran demo, it's fine:  bin/flink run
> >  examples/streaming/StateMachineExample.jar
> > 
> >  [1]
> > 
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release
> > 
> >  Best,
> >  Rui
> > 
> >  On Fri, Nov 17, 2023 at 11:52 AM Yun Tang  wrote:
> > 
> > > +1 (non-binding)
> > >
> > >
> > > *   Verified signatures
> > > *   Build from source code, and it looks good
> > > *   Verified that jar packages are built with maven-3.2.5 and JDK8
> > > *   Reviewed the flink-web PR
> > > *   Start a local standalone cluster and submit examples
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Rui Fan <1996fan...@gmail.com>
> > > Sent: Monday, November 13, 2023 18:20
> > > To: dev 
> > > Subject: [VOTE] Release 1.16.3, release candidate #1
> > >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #1 for the version
> >  1.16.3,
> > >
> > > as follows:
> > >
> > > [ ] +1, Approve the release
> > >
> > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > >
> > >
> > > The complete staging area is available for your review, which
> > includes:
> > > * JIRA release notes [1],
> > >
> > > * the official Apache source release and binary convenience
> releases
> > to
> >  be
> > > deployed to dist.apache.org [2], which are signed with the key
> with
> > > fingerprint B2D64016B940A7E0B9B72E0D7D0528B28037D8BC [3],
> > >
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > >
> > > * source code tag "release-1.16.3-rc1" [5],
> > >
> > > * website pull request listing the new release and adding
> > announcement
> >  blog
> > > post [6].
> > >
> > >
> > > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > >
> > > [1]
> > >
> > >
> > 
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353259
> > >
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.16.3-rc1/
> > >
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >
> > > [4]
> > >
> > >>
> https://repository.apache.org/content/

Re: [VOTE] Release 1.17.2, release candidate #1

2023-11-23 Thread Feng Jin
+1(non-binding)

- verified signatures and hashsums
- Verified there are no binaries in the source archive
- Built Flink from sources
- Started local standalone cluster, submitted sql job using sql-client

Best,
Feng


On Thu, Nov 23, 2023 at 11:08 PM Hang Ruan  wrote:

> +1 (non-binding)
>
> - verified signatures
> - verified hashsums
> - Verified there are no binaries in the source archive
> - reviewed the web PR
> - built Flink from sources
>
> Best,
> Hang
>
> Jiabao Sun  于2023年11月23日周四 22:09写道:
>
> > Thanks for driving this release.
> >
> > +1(non-binding)
> >
> > - Checked the tag in git
> > - Verified signatures and hashsums
> > - Verified there are no binaries in the source archive
> > - Built Flink from sources
> >
> > Best,
> > Jiabao
> >
> >
> > > 2023年11月21日 20:46,Matthias Pohl  写道:
> > >
> > > +1 (binding)
> > >
> > > * Downloaded artifacts
> > > * Built Flink from sources
> > > * Verified SHA512 checksums & GPG signatures
> > > * Compared checkout with provided sources
> > > * Verified pom file versions
> > > * Went over NOTICE/pom file changes without finding anything suspicious
> > > * Deployed standalone session cluster and ran WordCount example in
> batch
> > > and streaming: Nothing suspicious in log files found
> > > * Verified Java version of uploaded binaries
> > >
> > > Thanks Yun Tang for taking care of it.
> > >
> > > On Thu, Nov 16, 2023 at 7:02 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> - Verified signatures
> > >> - Reviewed the flink-web PR, left a couple of comments
> > >> - The source archives do not contain any binaries
> > >> - Build the source with Maven 3 and java8 (Checked the license as
> well)
> > >> - bin/start-cluster.sh with java8, it works fine and no any unexpected
> > LOG
> > >> - Ran demo, it's fine:  bin/flink run
> > >> examples/streaming/StateMachineExample.jar
> > >>
> > >> Best,
> > >> Rui
> > >>
> > >> On Mon, Nov 13, 2023 at 4:04 PM Yun Tang  wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> Please review and vote on the release candidate #1 for the version
> > >> 1.17.2,
> > >>>
> > >>> as follows:
> > >>>
> > >>> [ ] +1, Approve the release
> > >>>
> > >>> [ ] -1, Do not approve the release (please provide specific comments)
> > >>>
> > >>>
> > >>> The complete staging area is available for your review, which
> includes:
> > >>> * JIRA release notes [1],
> > >>>
> > >>> * the official Apache source release and binary convenience releases
> to
> > >> be
> > >>> deployed to dist.apache.org [2], which are signed with the key with
> > >>> fingerprint 2E0E1AB5D39D55E608071FB9F795C02A4D2482B3 [3],
> > >>>
> > >>> * all artifacts to be deployed to the Maven Central Repository [4],
> > >>>
> > >>> * source code tag "release-1.17.2-rc1" [5],
> > >>>
> > >>> * website pull request listing the new release and adding
> announcement
> > >>> blog post [6].
> > >>>
> > >>>
> > >>> The vote will be open for at least 72 hours. It is adopted by
> majority
> > >>> approval, with at least 3 PMC affirmative votes.
> > >>>
> > >>>
> > >>> [1]
> > >>>
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353260
> > >>>
> > >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.17.2-rc1/
> > >>>
> > >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >>>
> > >>> [4]
> > >>>
> > https://repository.apache.org/content/repositories/orgapacheflink-1669/
> > >>>
> > >>> [5] https://github.com/apache/flink/releases/tag/release-1.17.2-rc1
> > >>>
> > >>> [6] https://github.com/apache/flink-web/pull/696
> > >>>
> > >>> Thanks,
> > >>> Release Manager
> > >>>
> > >>
> >
> >
>


Re: [VOTE] FLIP-390: Support System out and err to be redirected to LOG or discarded

2023-11-22 Thread Feng Jin
+1(non-binding)

Best,
Feng

On Wed, Nov 22, 2023 at 6:37 PM Leonard Xu  wrote:

> +1(binding)
>
> Best,
> Leonard
>
> > 2023年11月22日 下午6:19,Roman Khachatryan  写道:
> >
> > +1 (binding)
> >
> >
> > Thanks for the proposal
> >
> > Regards,
> > Roman
> >
> > On Wed, Nov 22, 2023, 10:08 AM Piotr Nowojski 
> > wrote:
> >
> >> Thanks Rui!
> >>
> >> +1 (binding)
> >>
> >> Best,
> >> Piotrek
> >>
> >> śr., 22 lis 2023 o 08:05 Hangxiang Yu  napisał(a):
> >>
> >>> +1 (binding)
> >>> Thanks for your efforts!
> >>>
> >>> On Mon, Nov 20, 2023 at 11:53 AM Rui Fan <1996fan...@gmail.com> wrote:
> >>>
>  Hi everyone,
> 
>  Thank you to everyone for the feedback on FLIP-390: Support
>  System out and err to be redirected to LOG or discarded[1]
>  which has been discussed in this thread [2].
> 
>  I would like to start a vote for it. The vote will be open for at
> least
> >>> 72
>  hours unless there is an objection or not enough votes.
> 
>  [1] https://cwiki.apache.org/confluence/x/4guZE
>  [2] https://lists.apache.org/thread/47pdjggh0q0tdkq0cwt6y5o2o8wrl9jl
> 
>  Best,
>  Rui
> 
> >>>
> >>>
> >>> --
> >>> Best,
> >>> Hangxiang.
> >>>
> >>
>
>


Re: [DISCUSS] FLIP-390: Support System out and err to be redirected to LOG or discarded

2023-11-14 Thread Feng Jin
gt; > > > > Or implement your own loop? It shouldn't be more than a couple of
> > > > lines.
> > > > > >
> > > > > > Best,
> > > > > > Piotrek
> > > > > >
> > > > > > czw., 9 lis 2023 o 06:43 Rui Fan <1996fan...@gmail.com>
> > napisał(a):
> > > > > >
> > > > > > > Hi Piotr, Archit, Feng and Hangxiang:
> > > > > > >
> > > > > > > Thanks a lot for your feedback!
> > > > > > >
> > > > > > > Following is my comment, please correct me if I misunderstood
> > > > anything!
> > > > > > >
> > > > > > > To Piotr:
> > > > > > >
> > > > > > > > Is there a reason why you are suggesting to copy out bytes
> from
> > > > `buf`
> > > > > > to
> > > > > > > `bytes`,
> > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int
> > > > > aToIndex,
> > > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > > >
> > > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int
> aFromIndex,
> > > int
> > > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > > > >
> > > > > > > Also, this code doesn't run in production. As the comment of
> > > > > > > System.lineSeparator():
> > > > > > >
> > > > > > > > On UNIX systems, it returns {@code "\n"}; on Microsoft
> > > > > > > > Windows systems it returns {@code "\r\n"}.
> > > > > > >
> > > > > > > So Mac and Linux just return one character, we will compare
> > > > > > > one byte directly.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > To Feng:
> > > > > > >
> > > > > > > > Will they be written to the taskManager.log file by default
> > > > > > > > or the taskManager.out file?
> > > > > > >
> > > > > > > I prefer LOG as the default value for
> > taskmanager.system-out.mode.
> > > > > > > It's useful for job stability and doesn't introduce significant
> > > > impact
> > > > > to
> > > > > > > users. Also, our production has already used this feature for
> > > > > > > more than 1 years, it works well.
> > > > > > >
> > > > > > > However, I write the DEFAULT as the default value for
> > > > > > > taskmanager.system-out.mode, because when the community
> > introduces
> > > > > > > new options, the default value often selects the original
> > behavior.
> > > > > > >
> > > > > > > Looking forward to hear more thoughts from community about this
> > > > > > > default value.
> > > > > > >
> > > > > > > > If we can make taskmanager.out splittable and rolling, would
> it
> > > be
> > > > > > > > easier for users to use this feature?
> > > > > > >
> > > > > > > Making taskmanager.out splittable and rolling is a good choice!
> > > > > > > I have some concerns about it:
> > > > > > >
> > > > > > > 1. Users may also want to use LOG.info in their code and just
> > > > > > >   accidentally use System.out.println. It is possible that they
> > > will
> > > > > > >   also find the logs directly in taskmanager.log.
> > > > > > > 2. I'm not sure whether the rolling strategy is easy to
> > implement.
> > > > > > >   If we do it, it's necessary to define a series of flink
> options
> > > > > similar
> > > > > > >   to log options, such as: fileMax(how many files should be
> > > > retained),
> > > > > > >   fileSize(The max size each file), fileNamePatten (The suffix
> of
> > > > file
> > > > > > > name),
> > > > > > > 3. Check the file size periodica

Re: [DISCUSS] FLIP-390: Support System out and err to be redirected to LOG or discarded

2023-11-08 Thread Feng Jin
Hi, Rui.

Thank you for initiating this proposal.

I have a question regarding redirecting stdout and stderr to LOG:

Will they be written to the taskManager.log file by default or the
taskManager.out file?
If we can make taskmanager.out splittable and rolling, would it be easier
for users to use this feature?

Best,
Feng

On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal 
wrote:

> Hi Rui,
>
> Thanks for the proposal.
>
> The proposed solution of supporting System out and err to be redirected to
> LOG or discarded and introducing an enum and two options to manage this,
> seems reasonable.
>
> +1
>
> Thanks,
> Archit Goyal
>
>
> From: Piotr Nowojski 
> Date: Wednesday, November 8, 2023 at 7:38 AM
> To: dev@flink.apache.org 
> Subject: Re: [DISCUSS] FLIP-390: Support System out and err to be
> redirected to LOG or discarded
> Hi Rui,
>
> Thanks for the proposal.
>
> +1 I don't have any major comments :)
>
> One nit. In `SystemOutRedirectToLog` in this code:
>
>System.arraycopy(buf, count - LINE_SEPARATOR_LENGTH, bytes, 0,
> LINE_SEPARATOR_LENGTH);
> return Arrays.equals(LINE_SEPARATOR_BYTES, bytes)
>
> Is there a reason why you are suggesting to copy out bytes from `buf` to
> `bytes`,
> instead of using `Arrays.equals(int[] a, int aFromIndex, int aToIndex,
> int[] b, int bFromIndex, int bToIndex)`?
>
> Best,
> Piotrek
>
> śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com> napisał(a):
>
> > Hi all!
> >
> > I would like to start a discussion of FLIP-390: Support System out and
> err
> > to be redirected to LOG or discarded[1].
> >
> > In various production environments, either cloud native or physical
> > machines, the disk space that Flink TaskManager can use is limited.
> >
> > In general, the flink users shouldn't use the `System.out.println` in
> > production,
> > however this may happen when the number of Flink jobs and job developers
> > is very large. Flink job may use System.out to output a large amount of
> > data
> > to the taskmanager.out file. This file will not roll, it will always
> > increment.
> > Eventually the upper limit of what the TM can be used for is reached.
> >
> > We can support System out and err to be redirected to LOG or discarded,
> > the LOG can roll and won't increment forever.
> >
> > This feature is useful for SREs who maintain Flink environments, they can
> > redirect System.out to LOG by default. Although the cause of this problem
> > is
> > that the user's code is not standardized, for SRE, pushing users to
> modify
> > the code one by one is usually a very time-consuming operation. It's also
> > useful for job stability where System.out is accidentally used.
> >
> > Looking forward to your feedback, thanks~
> >
> > [1]
> https://cwiki.apache.org/confluence/x/4guZE
> 
> >
> > Best,
> > Rui
> >
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Feng Jin
Thanks for the great work! Congratulations


Best,
Feng Jin

On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:

> Congratulations, Well done!
>
> Best,
> Leonard
>
> On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
> wrote:
>
> > Thanks for the great work! Congrats all!
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Jing Ge  于2023年10月27日周五 00:16写道:
> >
> > > The Apache Flink community is very happy to announce the release of
> > Apache
> > > Flink 1.18.0, which is the first release for the Apache Flink 1.18
> > series.
> > >
> > > Apache Flink® is an open-source unified stream and batch data
> processing
> > > framework for distributed, high-performing, always-available, and
> > accurate
> > > data applications.
> > >
> > > The release is available for download at:
> > > https://flink.apache.org/downloads.html
> > >
> > > Please check out the release blog post for an overview of the
> > improvements
> > > for this release:
> > >
> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > >
> > > The full release notes are available in Jira:
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > >
> > > We would like to thank all contributors of the Apache Flink community
> who
> > > made this release possible!
> > >
> > > Best regards,
> > > Konstantin, Qingsheng, Sergey, and Jing
> > >
> >
>


Re: [VOTE] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-22 Thread Feng Jin
+1(non-binding)


Best,
Feng


On Mon, Oct 23, 2023 at 11:58 AM Xuyang  wrote:

> +1(non-binding)
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2023-10-23 11:38:15, "Jane Chan"  wrote:
> >Hi developers,
> >
> >Thanks for all the feedback on FLIP-373: Support Configuring Different
> >State TTLs using SQL Hint [1].
> >Based on the discussion [2], we have reached a consensus, so I'd like to
> >start a vote.
> >
> >The vote will last for at least 72 hours (Oct. 26th at 10:00 A.M. GMT)
> >unless there is an objection or insufficient votes.
> >
> >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> >[2] https://lists.apache.org/thread/3s69dhv3rp4s0kysnslqbvyqo3qf7zq5
> >
> >Best,
> >Jane
>


Re: [ANNOUNCE] New Apache Flink Committer - Ron Liu

2023-10-15 Thread Feng Jin
Congratulations, Ron!

Best,
Feng

On Mon, Oct 16, 2023 at 11:22 AM yh z  wrote:

> Congratulations, Ron!
>
> Best,
> Yunhong (SwuferHong)
>
> Yuxin Tan  于2023年10月16日周一 11:12写道:
>
> > Congratulations, Ron!
> >
> > Best,
> > Yuxin
> >
> >
> > Junrui Lee  于2023年10月16日周一 10:24写道:
> >
> > > Congratulations Ron !
> > >
> > > Best,
> > > Junrui
> > >
> > > Yun Tang  于2023年10月16日周一 10:22写道:
> > >
> > > > Congratulations, Ron!
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: yu zelin 
> > > > Sent: Monday, October 16, 2023 10:16
> > > > To: dev@flink.apache.org 
> > > > Cc: ron9@gmail.com 
> > > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Ron Liu
> > > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Yu Zelin
> > > >
> > > > > 2023年10月16日 09:56,Jark Wu  写道:
> > > > >
> > > > > Hi, everyone
> > > > >
> > > > > On behalf of the PMC, I'm very happy to announce Ron Liu as a new
> > Flink
> > > > > Committer.
> > > > >
> > > > > Ron has been continuously contributing to the Flink project for
> many
> > > > years,
> > > > > authored and reviewed a lot of codes. He mainly works on Flink SQL
> > > parts
> > > > > and drove several important FLIPs, e.g., USING JAR (FLIP-214),
> > Operator
> > > > > Fusion CodeGen (FLIP-315), Runtime Filter (FLIP-324). He has a
> great
> > > > > knowledge of the Batch SQL and improved a lot of batch performance
> in
> > > the
> > > > > past several releases. He is also quite active in mailing lists,
> > > > > participating in discussions and answering user questions.
> > > > >
> > > > > Please join me in congratulating Ron Liu for becoming a Flink
> > > Committer!
> > > > >
> > > > > Best,
> > > > > Jark Wu (on behalf of the Flink PMC)
> > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Jane Chan

2023-10-15 Thread Feng Jin
Congratulations Jane!

Best,
Feng

On Mon, Oct 16, 2023 at 11:23 AM yh z  wrote:

> Congratulations Jane!
>
> Best,
> Yunhong (swuferHong)
>
> Yuxin Tan  于2023年10月16日周一 11:11写道:
>
> > Congratulations Jane!
> >
> > Best,
> > Yuxin
> >
> >
> > xiangyu feng  于2023年10月16日周一 10:27写道:
> >
> > > Congratulations Jane!
> > >
> > > Best,
> > > Xiangyu
> > >
> > > Xuannan Su  于2023年10月16日周一 10:25写道:
> > >
> > > > Congratulations Jane!
> > > >
> > > > Best,
> > > > Xuannan
> > > >
> > > > On Mon, Oct 16, 2023 at 10:21 AM Yun Tang  wrote:
> > > > >
> > > > > Congratulations, Jane!
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Rui Fan <1996fan...@gmail.com>
> > > > > Sent: Monday, October 16, 2023 10:16
> > > > > To: dev@flink.apache.org 
> > > > > Cc: qingyue@gmail.com 
> > > > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Jane Chan
> > > > >
> > > > > Congratulations Jane!
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Mon, Oct 16, 2023 at 10:15 AM yu zelin 
> > > wrote:
> > > > >
> > > > > > Congratulations!
> > > > > >
> > > > > > Best,
> > > > > > Yu Zelin
> > > > > >
> > > > > > > 2023年10月16日 09:58,Jark Wu  写道:
> > > > > > >
> > > > > > > Hi, everyone
> > > > > > >
> > > > > > > On behalf of the PMC, I'm very happy to announce Jane Chan as a
> > new
> > > > Flink
> > > > > > > Committer.
> > > > > > >
> > > > > > > Jane started code contribution in Jan 2021 and has been active
> in
> > > the
> > > > > > Flink
> > > > > > > community since. She authored more than 60 PRs and reviewed
> more
> > > > than 40
> > > > > > > PRs. Her contribution mainly revolves around Flink SQL,
> including
> > > > Plan
> > > > > > > Advice (FLIP-280), operator-level state TTL (FLIP-292), and
> ALTER
> > > > TABLE
> > > > > > > statements (FLINK-21634). Jane participated deeply in
> development
> > > > > > > discussions and also helped answer user question emails. Jane
> was
> > > > also a
> > > > > > > core contributor of Flink Table Store (now Paimon) when the
> > project
> > > > was
> > > > > > in
> > > > > > > the early days.
> > > > > > >
> > > > > > > Please join me in congratulating Jane Chan for becoming a Flink
> > > > > > Committer!
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark Wu (on behalf of the Flink PMC)
> > > > > >
> > > > > >
> > > >
> > >
> >
>


Re: Re: [DISCUSS] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-10 Thread Feng Jin
Hi Jane,

Thank you for providing this explanation.

Another small question: since no exception is thrown when the STATE_TTL
hint is set incorrectly,
should we somehow show that the TTL setting has taken effect?
For instance, by displaying the TTL option within the operator's description?

Best,
Feng

On Tue, Oct 10, 2023 at 7:51 PM Xuyang  wrote:

> Hi, Jane.
>
>
> I think this syntax will be easier for users to set operator ttl. So big
> +1. I left some minor comments here.
>
>
> I notice that using STATE_TTL hints wrongly will not throw any exceptions.
> But it seems that in the current join hint scenario,
> if user uses an unknown table name as the chosen side, a validation
> exception will be thrown.
> Maybe we should distinguish which exceptions need to be thrown explicitly.
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2023-10-10 18:23:55, "Jane Chan"  wrote:
> >Hi Feng,
> >
> >Thank you for your valuable comments. The reason for not including the
> >scenarios above is as follows:
> >
> >For <1>, the automatically inferred stateful operators are not easily
> >expressible in SQL. This issue was discussed in FLIP-292, and besides
> >ChangelogNormalize, SinkUpsertMateralizer also faces the same problem.
> >
> >For <2> and <3>, the challenge lies in internal implementation. During the
> >default_rewrite phase, the row_number expression in LogicalProject is
> >transformed into LogicalWindow by Calcite's
> >CoreRules#PROJECT_TO_LOGICAL_PROJECT_AND_WINDOW. However, CalcRelSplitter
> >does not pass the hints as an input argument when creating LogicalWindow,
> >resulting in the loss of the hint at this step. To support this, we may
> >need to rewrite some optimization rules in Calcite, which could be a
> >follow-up work if required.
> >
> >Best,
> >Jane
> >
> >On Tue, Oct 10, 2023 at 1:40 AM Feng Jin  wrote:
> >
> >> Hi Jane,
> >>
> >> Thank you for proposing this FLIP.
> >>
> >> I believe that this FLIP will greatly enhance the flexibility of setting
> >> state, and by setting different operators' TTL, it can also increase job
> >> stability, especially in regular join scenarios.
> >> The parameter design is very concise, big +1 for this, and it is also
> >> relatively easy to use for users.
> >>
> >>
> >> I have a small question: in the FLIP, it only mentions join and group.
> >> Should we also consider other scenarios?
> >>
> >> 1. the auto generated deduplicate operator[1].
> >> 2. the deduplicate query[2].
> >> 3. the topN query[3].
> >>
> >> [1]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
> >> [2]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
> >> [3]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/
> >>
> >>
> >> Best,
> >> Feng
> >>
> >> On Sun, Oct 8, 2023 at 5:53 PM Jane Chan  wrote:
> >>
> >> > Hi devs,
> >> >
> >> > I'd like to initiate a discussion on FLIP-373: Support Configuring
> >> > Different State TTLs using SQL Hint [1]. This proposal is on top of
> the
> >> > FLIP-292 [2] to address typical scenarios with unambiguous semantics
> and
> >> > hint propagation.
> >> >
> >> > I'm looking forward to your opinions!
> >> >
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> >> > [2]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
> >> >
> >> > Best,
> >> > Jane
> >> >
> >>
>


Re: [DISCUSS] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-09 Thread Feng Jin
Hi Jane,

Thank you for proposing this FLIP.

I believe this FLIP will greatly enhance the flexibility of configuring state TTL,
and setting different TTLs for different operators can also improve job stability,
especially in regular join scenarios.
The parameter design is very concise, big +1 for this, and it is also
quite easy for users to adopt.
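
As a quick illustration, the hint usage could look like this for a regular join
(a sketch based on my understanding of the proposed syntax; table names are made up):

```
SELECT /*+ STATE_TTL('Orders' = '1d', 'Customers' = '7d') */ *
FROM Orders
JOIN Customers
  ON Orders.customer_id = Customers.id;
```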


I have a small question: in the FLIP, it only mentions join and group.
Should we also consider other scenarios?

1. the auto generated deduplicate operator[1].
2. the deduplicate query[2].
3. the topN query[3].

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
[2]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
[3]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/


Best,
Feng

On Sun, Oct 8, 2023 at 5:53 PM Jane Chan  wrote:

> Hi devs,
>
> I'd like to initiate a discussion on FLIP-373: Support Configuring
> Different State TTLs using SQL Hint [1]. This proposal is on top of the
> FLIP-292 [2] to address typical scenarios with unambiguous semantics and
> hint propagation.
>
> I'm looking forward to your opinions!
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
>
> Best,
> Jane
>


Re: [VOTE] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-10-09 Thread Feng Jin
+1 (non-binding)

Best,
Feng

On Mon, Oct 9, 2023 at 3:12 PM Yangze Guo  wrote:

> +1 (binding)
>
> Best,
> Yangze Guo
>
> On Mon, Oct 9, 2023 at 2:46 PM Yun Tang  wrote:
> >
> > +1 (binding)
> >
> > Best
> > Yun Tang
> > 
> > From: Weihua Hu 
> > Sent: Monday, October 9, 2023 12:03
> > To: dev@flink.apache.org 
> > Subject: Re: [VOTE] FLIP-367: Support Setting Parallelism for Table/SQL
> Sources
> >
> > +1 (binding)
> >
> > Best,
> > Weihua
> >
> >
> > On Mon, Oct 9, 2023 at 11:47 AM Shammon FY  wrote:
> >
> > > +1 (binding)
> > >
> > >
> > > On Mon, Oct 9, 2023 at 11:12 AM Benchao Li 
> wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Zhanghao Chen  于2023年10月9日周一 10:20写道:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Thanks for all the feedback on FLIP-367: Support Setting
> Parallelism
> > > for
> > > > Table/SQL Sources [1][2].
> > > > >
> > > > > I'd like to start a vote for FLIP-367. The vote will be open until
> Oct
> > > > 12th 12:00 PM GMT) unless there is an objection or insufficient
> votes.
> > > > >
> > > > > [1]
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429150
> > > > > [2]
> https://lists.apache.org/thread/gtpswl42jzv0c9o3clwqskpllnw8rh87
> > > > >
> > > > > Best,
> > > > > Zhanghao Chen
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best,
> > > > Benchao Li
> > > >
> > >
>


Re: [VOTE] FLIP-314: Support Customized Job Lineage Listener

2023-09-25 Thread Feng Jin
+1(no-binding)


thanks for driving this proposal


Best,
Feng

On Mon, Sep 25, 2023 at 11:19 PM Jing Ge  wrote:

> +1(binding) Thanks!
>
> Best Regards,
> Jing
>
> On Sun, Sep 24, 2023 at 10:30 PM Shammon FY  wrote:
>
> > Hi devs,
> >
> > Thanks for all the feedback on FLIP-314: Support Customized Job Lineage
> > Listener [1] in thread [2].
> >
> > I would like to start a vote for it. The vote will be opened for at least
> > 72 hours unless there is an objection or insufficient votes.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener
> > [2] https://lists.apache.org/thread/wopprvp3ww243mtw23nj59p57cghh7mc
> >
> > Best,
> > Shammon FY
> >
>


Re: [DISCUSS] Implementing SQL remote functions

2023-09-20 Thread Feng Jin
Hi Alan

I believe that supporting asynchronous UDFs is a valuable
feature. Currently, there is a similar FLIP [1] available.
Can it meet your needs?

[1].
https://cwiki.apache.org/confluence/display/FLINK/FLIP-313%3A+Add+support+of+User+Defined+AsyncTableFunction


Best,
Feng

On Thu, Sep 21, 2023 at 1:12 PM Alan Sheinberg
 wrote:

> Hi Ron,
>
> Thanks for your response.  I've answered some of your questions below.
>
> I think one solution is to support Mini-Batch Lookup Join by the framework
> > layer, do a RPC call by a batch input row, which can improve throughput.
>
>
> Would the idea be to collect a batch and then do a single RPC (or at least
> handle a number of rpcs in a single AsyncLookupFunction call)?  That is an
> interesting idea and could simplify things.  For our use cases,
> technically, I can write a AsyncLookupFunction and utilize
> AsyncWaitOperator using unbatched RPCs and do a Lookup Join without any
> issue. My hesitation is that I'm afraid that callers will find it
> unintuitive to join with a table where the underlying RPC is not being
> modeled in that manner.  For example, it could be a text classifier
> IS_POSITIVE_SENTIMENT(...) where there's no backing table, just
> computation.
>
> how does the remote function help to solve your problem?
>
>
> The problem is pretty open-ended.  There are jobs where you want to join
> data with some external data source and inject it into your pipeline, but
> others might also be offloading some computation to an external system.
> The external system might be owned by a different party, have different
> permissions, have different hardware to do a computation (e.g. train a
> model), or just block for a while.  The most intuitive invocation for this
> is just a scalar function in SQL.  You just want it to be able to run at a
> high throughput.
>
> Regarding implementing the Remote Function, can you go into more detail
> > about your idea, how we should support it, and how users should use it,
> > from API design to semantic explanation? and
>
>
> The unimplemented option I gave the most thought to is 3).  You can imagine
> an AsyncScalarFunction definition and example class like:
>
> public class AsyncScalarFunction extends UserDefinedFunction {
>   @Override public final FunctionKind getKind() {
> return FunctionKind.ASYNC_SCALAR;
>   }
>   @Override public TypeInference getTypeInference(DataTypeFactory
> typeFactory) {
>return TypeInferenceExtractor.forAsyncScalarFunction(typeFactory,
> getClass());
>   }
> }
>
> class MyScalarFunction extends AsyncScalarFunction {
>   // Eval method with a future to use as a callback, with arbitrary
> additional arguments
>   public void eval(CompletableFuture result, String input) {
> // Example which uses an async http client
> AsyncHttpClient httpClient = new AsyncHttpClient();
> // Do the request and then invoke the callback depending on the
> outcome.
> Future responseFuture = httpClient.doPOST(getRequestBody(
> input));
> responseFuture.handle((response, throwable) -> {
> if (throwable != null) {
>   result.completeExceptionally(throwable);
> } else {
>   result.complete(response.getBody());
> }
>});
>  }
>  ...
> }
>
> Then you can register it in your Flink program as with other UDFs and call
> it:
> tEnv.createTemporarySystemFunction("MY_FUNCTION", MyScalarFunction.class);
> TableResult result = tEnv.executeSql("SELECT MY_FUNCTION(input) FROM
> (SELECT i.input from Inputs i ORDER BY i.timestamp)");
>
> I know there are questions about SQL semantics to consider.  For example,
> does invocation of MY_FUNCTION preserve the order of the subquery above.
> To be SQL compliant, I believe it must, so any async request we send out
> must be output in order, regardless of when they complete.  There are
> probably other considerations as well.   This for example is implemented as
> an option already in AsyncWaitOperator.
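> 
> As a point of reference, here is a minimal sketch of how the existing DataStream API
> already exposes that ordering choice (AsyncDataStream.orderedWait vs. unorderedWait);
> how the SQL layer would surface it is of course still an open question, and 'input'
> and 'fn' below are assumed to be defined elsewhere:
> 
> import java.util.concurrent.TimeUnit;
> import org.apache.flink.streaming.api.datastream.AsyncDataStream;
> import org.apache.flink.streaming.api.datastream.DataStream;
> import org.apache.flink.streaming.api.functions.async.AsyncFunction;
> 
> class OrderedAsyncSketch {
>   // 30s timeout, capacity of 100 in-flight requests.
>   static DataStream<String> applyOrdered(DataStream<String> input, AsyncFunction<String, String> fn) {
>     // Results are emitted in input order even if the async calls complete out of order.
>     return AsyncDataStream.orderedWait(input, fn, 30, TimeUnit.SECONDS, 100);
>     // unorderedWait(...) would emit results as soon as they complete instead.
>   }
> }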
>
> I could do a similar dive into option 2) if that would also be helpful,
> though maybe this is a good starting point for conversation.
>
> Hope that addressed your questions,
> Alan
>
> On Mon, Sep 18, 2023 at 6:51 PM liu ron  wrote:
>
> > Hi, Alan
> >
> > Thanks for driving this proposal. It sounds interesting.
> > Regarding implementing the Remote Function, can you go into more detail
> > about your idea, how we should support it, and how users should use it,
> > from API design to semantic explanation? And how does the remote function
> > help to solve your problem?
> >
> > I understand that your core pain point is that there are performance
> issues
> > with too many RPC calls. For the three solutions you have explored.
> > Regarding the Lookup Join Cons,
> >
> > >> *Lookup Joins:*
> > Pros:
> > - Part of the Flink codebase
> > - High throughput
> > Cons:
> > - Unintuitive syntax
> > - Harder to do multiple remote calls per input row
> >
> > I think one solution is to support Mini-Batch Lookup Join by the
> framework
> > layer, do a RPC 

Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-16 Thread Feng Jin
Hi, Zhanghao

Thank you for proposing this FLIP, it is a very meaningful feature.

I agree that currently we may only consider the parallelism setting of the
source itself. If we consider the parallelism setting of other operators,
it may make the entire design more complex.

Regarding the situation where the parallelism of the source is different
from that of downstream tasks, I did not find a more detailed description
in the FLIP.

By default, if the parallelism between two operators is different, the
rebalance partitioner will be used.
But in the SQL scenario, I believe that we should keep the behavior of
parallelism setting consistent with that of the sink.

1. When the source only generates insert-only data, if there is a mismatch
in parallelism between the source and downstream operators, rebalance is
used by default.

2. When the source generates update and delete data, we should require the
source to configure a primary key and then build a hash partitioner based
on that primary key.
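
To make the second case concrete, here is a rough sketch of the kind of DDL I have in
mind (the 'some-cdc-connector' name and the 'scan.parallelism' option key are only
placeholders here, mirroring the existing 'sink.parallelism'; they are not part of the
current FLIP text):

```
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SourceParallelismSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // An updating source with a declared primary key: when its parallelism differs from
        // the downstream operators, data could be redistributed by hashing on 'id' instead
        // of using the default rebalance partitioner.
        tEnv.executeSql(
                "CREATE TABLE users ("
                        + " id BIGINT,"
                        + " name STRING,"
                        + " PRIMARY KEY (id) NOT ENFORCED"
                        + ") WITH ("
                        + " 'connector' = 'some-cdc-connector',"
                        + " 'scan.parallelism' = '4'"
                        + ")");
    }
}
```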

WDYT ?


Best,
Feng


On Sat, Sep 16, 2023 at 5:58 PM Jane Chan  wrote:

> Hi Zhanghao,
>
> Thanks for the explanation.
>
> For Q1, I think the key lies in determining the boundary where the chain
> should be broken. However, this boundary is ultimately determined by the
> specific requirements of each user query.
>
> The most straightforward approach is breaking the chain after the source
> operator, even though it involves a tradeoff. This is because there may be
> instances of `StreamExecWatermarkAssigner`, `StreamExecMiniBatchAssigner`,
> or `StreamExecChangelogNormalize` occurring before the `StreamExecCalc`
> node, and it would be complex and challenging to enumerate all possible
> match patterns.
>
> A more complex workaround would be to provide an entry point for users to
> configure the specific operator that should serve as the breakpoint.
> Meanwhile, this would further increase the complexity of this FLIP.
>
> However, if the parallelism of each operator can be configured (in the
> future), then this problem would not exist (it might be beyond the scope of
> discussion for this FLIP).
>
> I personally lean towards keeping the FLIP concise and focused by choosing
> the most straightforward approach. I would also like to hear other's
> opinions.
>
> Best,
> Jane
>
> On Sat, Sep 16, 2023 at 10:21 AM Yun Tang  wrote:
>
> > Hi Zhanghao,
> >
> > Certainly, I think we shall leave this FLIP focus on setting the source
> > parallelism via DDL's properties. I just want to clarify that setting
> > parallelism for individual operators is also profitable from my
> experience,
> > which is slighted in your FLIP.
> >
> > @ConradJam BTW, compared with SQL hint, I think using `scan.parallelism`
> > is better to align with current `sink.parallelism`. And once we introduce
> > such option, we can also use SQL hint of dynamic table options[1] to
> > configure the source parallelism.
> >
> > [1]
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options
> >
> >
> > Best
> > Yun Tang
> > 
> > From: ConradJam 
> > Sent: Friday, September 15, 2023 22:52
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for
> Table/SQL
> > Sources
> >
> > + 1 Thanks for the FLIP and the discussion. I would like to ask whether
> to
> > use SQL Hint syntax to set this parallelism?
> >
> > Martijn Visser  于2023年9月15日周五 20:52写道:
> >
> > > Hi everyone,
> > >
> > > Thanks for the FLIP and the discussion. I find it exciting. Thanks for
> > > pushing for this.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Fri, Sep 15, 2023 at 2:25 PM Chen Zhanghao <
> zhanghao.c...@outlook.com
> > >
> > > wrote:
> > >
> > > > Hi Jane,
> > > >
> > > > Thanks for the valuable suggestions.
> > > >
> > > > For Q1, it's indeed an issue. Some possible ideas include
> introducing a
> > > > fake transformation after the source that takes the global default
> > > > parallelism, or simply make exec nodes to take the global default
> > > > parallelism, but both ways prevent potential chaining opportunity and
> > I'm
> > > > not sure if that's good to go. We'll need to give deeper thoughts in
> it
> > > and
> > > > polish our proposal. We're also more than glad to hear your inputs on
> > it.
> > > >
> > > > For Q2, scan.parallelism will take high precedence, as the more
> > specific
> > > > config should take higher precedence.
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > 发件人: Jane Chan 
> > > > 发送时间: 2023年9月15日 11:56
> > > > 收件人: dev@flink.apache.org 
> > > > 抄送: dewe...@outlook.com 
> > > > 主题: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL
> > > > Sources
> > > >
> > > > Hi, Zhanghao, Dewei,
> > > >
> > > > Thanks for initiating this discussion. This feature is valuable in
> > > > providing more flexibility for performance tuning for SQL pipelines.
> > > >
> > > > Here are 

Re: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and support the Standalone Autoscaler

2023-09-13 Thread Feng Jin
Thanks for driving this, looking forward to this feature.


+1 (non-binding)

Best,
Feng

On Wed, Sep 13, 2023 at 9:11 PM Chen Zhanghao 
wrote:

> Thanks for driving this. +1 (non-binding)
>
> Best,
> Zhanghao Chen
> 
> 发件人: Rui Fan <1996fan...@gmail.com>
> 发送时间: 2023年9月13日 10:52
> 收件人: dev 
> 主题: [VOTE] FLIP-334: Decoupling autoscaler and kubernetes and support the
> Standalone Autoscaler
>
> Hi all,
>
> Thanks for all the feedback about the FLIP-334:
> Decoupling autoscaler and kubernetes and
> support the Standalone Autoscaler[1].
> This FLIP was discussed in [2].
>
> I'd like to start a vote for it. The vote will be open for at least 72
> hours (until Sep 16th 11:00 UTC+8) unless there is an objection or
> insufficient votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-334+%3A+Decoupling+autoscaler+and+kubernetes+and+support+the+Standalone+Autoscaler
> [2] https://lists.apache.org/thread/kmm03gls1vw4x6vk1ypr9ny9q9522495
>
> Best,
> Rui
>


[jira] [Created] (FLINK-33070) Add doc for 'unnest'

2023-09-11 Thread Feng Jin (Jira)
Feng Jin created FLINK-33070:


 Summary: Add doc for 'unnest' 
 Key: FLINK-33070
 URL: https://issues.apache.org/jira/browse/FLINK-33070
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Reporter: Feng Jin


Transforming between rows and columns is a common operation. In Flink SQL, we can 
use UNNEST for this purpose.

However, the usage and support of UNNEST are not explained in the documentation.

 

 I think we can at least add it to the built-in functions section 
(https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/#scalar-functions),
or provide some examples.
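
For example, an UNNEST query could look like the following (illustrative only; the exact wording for the documentation is open):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnnestExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // Expand an ARRAY column into one row per element with CROSS JOIN UNNEST.
        tEnv.executeSql(
                "SELECT order_id, tag "
                        + "FROM (VALUES (1, ARRAY['flink', 'sql'])) AS orders (order_id, tags) "
                        + "CROSS JOIN UNNEST(tags) AS t (tag)")
                .print();
    }
}
{code}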

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS][FLINK-31788][FLINK-33015] Add back Support emitUpdateWithRetract for TableAggregateFunction

2023-09-10 Thread Feng Jin
Thanks Jane for following up on this issue!

+1 for adding it back first.

Supporting emitUpdateWithRetract for TableAggregateFunction is a good
feature; we should support it unless there are better alternatives.
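
For context, a trimmed sketch of what overriding it looks like, loosely based on the
Top-2 retraction example in the UDF documentation (details simplified):

```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.functions.TableAggregateFunction;

// Trimmed sketch of the documented Top-2 retraction example; merge/retract methods omitted.
public class Top2 extends TableAggregateFunction<Tuple2<Integer, Integer>, Top2.Top2Accumulator> {

    public static class Top2Accumulator {
        public Integer first;
        public Integer second;
        public Integer oldFirst;
        public Integer oldSecond;
    }

    @Override
    public Top2Accumulator createAccumulator() {
        Top2Accumulator acc = new Top2Accumulator();
        acc.first = acc.second = acc.oldFirst = acc.oldSecond = Integer.MIN_VALUE;
        return acc;
    }

    public void accumulate(Top2Accumulator acc, Integer value) {
        if (value > acc.first) {
            acc.second = acc.first;
            acc.first = value;
        } else if (value > acc.second) {
            acc.second = value;
        }
    }

    // The method in question: emit incremental updates and retract previously emitted rows.
    // This is the hook that the planner's code generator currently never calls.
    public void emitUpdateWithRetract(
            Top2Accumulator acc, RetractableCollector<Tuple2<Integer, Integer>> out) {
        if (!acc.first.equals(acc.oldFirst)) {
            if (acc.oldFirst != Integer.MIN_VALUE) {
                out.retract(Tuple2.of(acc.oldFirst, 1));
            }
            out.collect(Tuple2.of(acc.first, 1));
            acc.oldFirst = acc.first;
        }
        if (!acc.second.equals(acc.oldSecond)) {
            if (acc.oldSecond != Integer.MIN_VALUE) {
                out.retract(Tuple2.of(acc.oldSecond, 2));
            }
            out.collect(Tuple2.of(acc.second, 2));
            acc.oldSecond = acc.second;
        }
    }
}
```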


Best,
Feng

On Thu, Sep 7, 2023 at 11:01 PM Lincoln Lee  wrote:

> Thanks to Jane for following up on this issue!  +1 for adding it back
> first.
>
> For the deprecation, considering that users aren't usually motivated to
> upgrade to a major version (1.14, from two years ago, wasn't that old,
> which may be
> part of the reason for not receiving more feedback), I'd recommend holding
> off on removing `TableAggregateFunction` until we have a replacement for
> it,
> e.g., user-defined-operator as Jark mentioned or something else.
>
> Best,
> Lincoln Lee
>
>
> Best,
> Lincoln Lee
>
>
> Jark Wu  于2023年9月7日周四 21:30写道:
>
> > +1 to fix it first.
> >
> > I also agree to deprecate it if there are few people using it,
> > but this should be another discussion thread within dev+user ML.
> >
> > In the future, we are planning to introduce user-defined-operator
> > based on the TVF functionality which I think can fully subsume
> > the UDTAG, cc @Timo Walther .
> >
> > Best,
> > Jark
> >
> > On Thu, 7 Sept 2023 at 11:44, Jane Chan  wrote:
> >
> > > Hi devs,
> > >
> > > Recently, we noticed an issue regarding a feature regression related to
> > > Table API. `org.apache.flink.table.functions.TableAggregateFunction`
> > > provides an API `emitUpdateWithRetract` [1] to cope with updated
> values,
> > > but it's not being called in the code generator. As a result, even if
> > users
> > > override this method, it does not work as intended.
> > >
> > > This issue has been present since version 1.15 (when the old planner
> was
> > > deprecated), but surprisingly, only two users have raised concerns
> about
> > it
> > > [2][3].
> > >
> > > So, I would like to initiate a discussion to bring it back. Of course,
> if
> > > few users use it, we can also consider deprecating it.
> > >
> > > [1]
> > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/udfs/#retraction-example
> > > [2] https://lists.apache.org/thread/rnvw8k3636dqhdttpmf1c9colbpw9svp
> > > [3]
> https://www.mail-archive.com/user-zh@flink.apache.org/msg15230.html
> > >
> > > Best,
> > > Jane
> > >
> >
>


[jira] [Created] (FLINK-32976) NullpointException when starting flink cluster

2023-08-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32976:


 Summary: NullpointException when starting flink cluster
 Key: FLINK-32976
 URL: https://issues.apache.org/jira/browse/FLINK-32976
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration
Affects Versions: 1.17.1
Reporter: Feng Jin


The error message is as follows:

 

 
{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.flink.runtime.security.token.hadoop.HadoopFSDelegationTokenProvider.getFileSystemsToAccess(HadoopFSDelegationTokenProvider.java:173) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.security.token.hadoop.HadoopFSDelegationTokenProvider.lambda$obtainDelegationTokens$1(HadoopFSDelegationTokenProvider.java:113) ~[flink-dist-1.17.1.jar:1.17.1]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_281]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_281]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) ~[flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar:3.1.1.7.2.1.0-327-9.0]
    at org.apache.flink.runtime.security.token.hadoop.HadoopFSDelegationTokenProvider.obtainDelegationTokens(HadoopFSDelegationTokenProvider.java:108) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.security.token.DefaultDelegationTokenManager.lambda$obtainDelegationTokensAndGetNextRenewal$1(DefaultDelegationTokenManager.java:264) ~[flink-dist-1.17.1.jar:1.17.1]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_281]
    at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1628) ~[?:1.8.0_281]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_281]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_281]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_281]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_281]
    at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:479) ~[?:1.8.0_281]
    at java.util.stream.ReferencePipeline.min(ReferencePipeline.java:520) ~[?:1.8.0_281]
    at org.apache.flink.runtime.security.token.DefaultDelegationTokenManager.obtainDelegationTokensAndGetNextRenewal(DefaultDelegationTokenManager.java:286) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.security.token.DefaultDelegationTokenManager.obtainDelegationTokens(DefaultDelegationTokenManager.java:242) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(...) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint...(...) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint...(ClusterEntrypoint.java:232) ~[flink-dist-1.17.1.jar:1.17.1]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_281]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_281]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) ~[flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar:3.1.1.7.2.1.0-327-9.0]
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:229) ~[flink-dist-1.17.1.jar:1.17.1]
    ... 2 more
{code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl

2023-08-04 Thread Feng Jin
Congratulations, Matthias!

Best regards

Feng

On Fri, Aug 4, 2023 at 4:29 PM weijie guo  wrote:

> Congratulations, Matthias!
>
> Best regards,
>
> Weijie
>
>
> Wencong Liu  于2023年8月4日周五 15:50写道:
>
> > Congratulations, Matthias!
> >
> > Best,
> > Wencong Liu
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2023-08-04 11:18:00, "Xintong Song"  wrote:
> > >Hi everyone,
> > >
> > >On behalf of the PMC, I'm very happy to announce that Matthias Pohl has
> > >joined the Flink PMC!
> > >
> > >Matthias has been consistently contributing to the project since Sep
> 2020,
> > >and became a committer in Dec 2021. He mainly works in Flink's
> distributed
> > >coordination and high availability areas. He has worked on many FLIPs
> > >including FLIP195/270/285. He helped a lot with the release management,
> > >being one of the Flink 1.17 release managers and also very active in
> Flink
> > >1.18 / 2.0 efforts. He also contributed a lot to improving the build
> > >stability.
> > >
> > >Please join me in congratulating Matthias!
> > >
> > >Best,
> > >
> > >Xintong (on behalf of the Apache Flink PMC)
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu

2023-08-04 Thread Feng Jin
Congratulations Weihua!

Best regards,

Feng

On Fri, Aug 4, 2023 at 4:28 PM weijie guo  wrote:

> Congratulations Weihua!
>
> Best regards,
>
> Weijie
>
>
> Lijie Wang  于2023年8月4日周五 15:28写道:
>
> > Congratulations, Weihua!
> >
> > Best,
> > Lijie
> >
> > yuxia  于2023年8月4日周五 15:14写道:
> >
> > > Congratulations, Weihua!
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > - 原始邮件 -
> > > 发件人: "Yun Tang" 
> > > 收件人: "dev" 
> > > 发送时间: 星期五, 2023年 8 月 04日 下午 3:05:30
> > > 主题: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu
> > >
> > > Congratulations, Weihua!
> > >
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Jark Wu 
> > > Sent: Friday, August 4, 2023 15:00
> > > To: dev@flink.apache.org 
> > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu
> > >
> > > Congratulations, Weihua!
> > >
> > > Best,
> > > Jark
> > >
> > > On Fri, 4 Aug 2023 at 14:48, Yuxin Tan  wrote:
> > >
> > > > Congratulations Weihua!
> > > >
> > > > Best,
> > > > Yuxin
> > > >
> > > >
> > > > Junrui Lee  于2023年8月4日周五 14:28写道:
> > > >
> > > > > Congrats, Weihua!
> > > > > Best,
> > > > > Junrui
> > > > >
> > > > > Geng Biao  于2023年8月4日周五 14:25写道:
> > > > >
> > > > > > Congrats, Weihua!
> > > > > > Best,
> > > > > > Biao Geng
> > > > > >
> > > > > > 发送自 Outlook for iOS
> > > > > > 
> > > > > > 发件人: 周仁祥 
> > > > > > 发送时间: Friday, August 4, 2023 2:23:42 PM
> > > > > > 收件人: dev@flink.apache.org 
> > > > > > 抄送: Weihua Hu 
> > > > > > 主题: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu
> > > > > >
> > > > > > Congratulations, Weihua~
> > > > > >
> > > > > > > 2023年8月4日 14:21,Sergey Nuyanzin  写道:
> > > > > > >
> > > > > > > Congratulations, Weihua!
> > > > > > >
> > > > > > > On Fri, Aug 4, 2023 at 8:03 AM Chen Zhanghao <
> > > > > zhanghao.c...@outlook.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Congratulations, Weihua!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Zhanghao Chen
> > > > > > >> 
> > > > > > >> 发件人: Xintong Song 
> > > > > > >> 发送时间: 2023年8月4日 11:18
> > > > > > >> 收件人: dev 
> > > > > > >> 抄送: Weihua Hu 
> > > > > > >> 主题: [ANNOUNCE] New Apache Flink Committer - Weihua Hu
> > > > > > >>
> > > > > > >> Hi everyone,
> > > > > > >>
> > > > > > >> On behalf of the PMC, I'm very happy to announce Weihua Hu as
> a
> > > new
> > > > > > Flink
> > > > > > >> Committer!
> > > > > > >>
> > > > > > >> Weihua has been consistently contributing to the project since
> > May
> > > > > > 2022. He
> > > > > > >> mainly works in Flink's distributed coordination areas. He is
> > the
> > > > main
> > > > > > >> contributor of FLIP-298 and many other improvements in
> > large-scale
> > > > job
> > > > > > >> scheduling and improvements. He is also quite active in
> mailing
> > > > lists,
> > > > > > >> participating discussions and answering user questions.
> > > > > > >>
> > > > > > >> Please join me in congratulating Weihua!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >>
> > > > > > >> Xintong (on behalf of the Apache Flink PMC)
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best regards,
> > > > > > > Sergey
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Support read ROWKIND metadata by ROW_KIND() function

2023-07-24 Thread Feng Jin
Hi casel,

Thank you for initiating the discussion about RowKind metadata. I believe that
in certain scenarios, it is necessary to expose RowKind. We also have
similar situations internally:

In simple terms, we need to be able to control the behavior of RowKind in
both Source and Sink:
- When reading data from the Source, filter out unnecessary delete records.
- When writing data to the Sink, filter out unnecessary delete records.


Firstly, I don't think it is appropriate to expose RowKind at the Flink SQL
framework level as it may lead to ambiguity in certain semantic scenarios.
Therefore, I am more inclined to handle this kind of information at
the format layer.

Here is my preliminary proposal:
Introduce a proxy format where we can freely control which RowKinds are
needed and which ones are not. At the same time, we can also expose RowKind.

1. For your scenario one, using a proxy format allows us to do this:

```

CREATE TABLE kafka_source (
f1 VARCHAR,
f2 VARCHAR
) with (
'connector' = 'kafka',
'format' = 'proxy',
'proxy.format' = 'canal-json',
'proxy.filter' = 'DELETE,UPDATE_BEFORE'
);


CREATE TABLE kafka_sink (
f1 VARCHAR,
f2 VARCHAR
) with (
'connector' = 'kafka',
'format' = 'json'
);


INSERT INTO kafka_sink select * from kafka_source;

```

By using a proxy, we can filter out the unwanted RowKind data types and
transform the original RetractStream into an AppendOnly Stream.


2. For scenario 2, we can  also use the proxy-format to achieve this.

```

CREATE TABLE kafka_source (
f1 VARCHAR,
f2 VARCHAR
) with (
'connector' = 'kafka',
'format' = 'canal-json'
);


CREATE TABLE kafka_sink (
f1 VARCHAR,
f2 VARCHAR
) with (
'connector' = 'kafka',
'format' = 'proxy',
'proxy.format' = 'json',
'proxy.filter' = 'DELETE,UPDATE_BEFORE'
);


INSERT INTO kafka_sink select * from kafka_source;

```

Through the proxy format, the original append-only sink is enabled to
accept a retract stream.


This is my preliminary idea. There are probably many more details to
consider, but the overall concept is to use proxy format to implement some
logic that we want to achieve without affecting the original format.


Best,
Feng


On Wed, Jul 19, 2023 at 10:06 PM casel.chen  wrote:

> CDC format like debezium-json and canal-json support read ROWKIND metadata.
>
>
> 1. Our first scenario is syncing data of operational tables into our
> streaming warehouse.
> All operational data in MySQL should NOT be physically deleted, so we use an
> "is_deleted" column for logical deletion, and there should NOT be any
> delete operations happening on our streaming warehouse.
> But as data grows quickly, we need to delete old data (such as data from half a
> year ago) from the operational table to keep the table size manageable and to
> keep query performance from degrading. These records, deleted for maintenance
> purposes, should not be synced into our streaming warehouse, so we have to filter
> them out in our Flink SQL jobs. But currently it is not convenient to do
> ROWKIND filtering. That is why I ask Flink to support reading ROWKIND metadata via
> a ROW_KIND() function. Then we can use the following Flink SQL to do the
> filtering. For example:
>
> create table customer_source (
>   id BIGINT PRIMARY KEY NOT ENFORCED,
>   name STRING,
>   region STRING
>) with (
>   'connector' = 'kafka',
>   'format' = 'canal-json',
>   ...
>);
>
>
>create table customer_sink (
>   id BIGINT PRIMARY KEY NOT ENFORCED,
>   name STRING,
>   region STRING
>) with (
>   'connector' = 'paimon'
>   ...
>);
>
>
>   INSERT INTO customer_sink SELECT * FROM customer_source WHERE ROW_KIND()
> <> '-D';
>
>
> 2. Our second scenario is that we need to sink aggregation results into an MQ which
> does NOT support retract data. Although Flink provides an upsert-kafka
> connector, unfortunately our sink system is NOT Kafka, so we would have to
> write a customized upsert-like connector again. If Flink SQL supported
> filtering data by ROWKIND, we wouldn't need to write any more upsert-xxx connectors.
> For example,
>
>create table customer_source (
>   id BIGINT PRIMARY KEY NOT ENFORCED,
>   name STRING,
>   region STRING
>) with (
>   'connector' = 'kafka',
>   'format' = 'canal-json',
>   ...
>);
>
>
>create table customer_agg_sink (
>   region STRING,
>   cust_count BIGINT
>) with (
>   'connector' = 'MQ',
>   'format' = 'json',
>   ...
>);
>
>
>INSERT INTO customer_agg_sink SELECT * FROM (SELECT region, count(1) as
> cust_count  from customer_source group by region) t WHERE ROW_KIND() <>
> '-U' AND ROW_KIND() <> '-D';
>
>
> How do you think? Looking forward to your feedback, thanks!


Re: [ANNOUNCE] New Apache Flink Committer - Yong Fang

2023-07-24 Thread Feng Jin
Congratulations, Yong Fang

Best,
Feng


On Mon, Jul 24, 2023 at 3:12 PM Leonard Xu  wrote:

> Congratulations, Yong Fang
>
> Best,
> Leonard
>
> > On Jul 24, 2023, at 2:01 PM, yuxia  wrote:
> >
> > Congrats, Shammon!
> >
> > Best regards,
> > Yuxia
> >
> > - 原始邮件 -
> > 发件人: "Benchao Li" 
> > 收件人: "dev" 
> > 抄送: "Shammon FY" 
> > 发送时间: 星期一, 2023年 7 月 24日 下午 1:23:55
> > 主题: Re: [ANNOUNCE] New Apache Flink Committer - Yong Fang
> >
> > Congratulations, Yong! Well deserved!
> >
> > Yangze Guo  于2023年7月24日周一 12:16写道:
> >
> >> Congrats, Yong!
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Mon, Jul 24, 2023 at 12:02 PM xiangyu feng 
> >> wrote:
> >>>
> >>> Congratulations, Yong!
> >>>
> >>> Best,
> >>> Xiangyu
> >>>
> >>> liu ron  于2023年7月24日周一 11:48写道:
> >>>
>  Congratulations,
> 
>  Best,
>  Ron
> 
>  Qingsheng Ren  于2023年7月24日周一 11:18写道:
> 
> > Congratulations and welcome aboard, Yong!
> >
> > Best,
> > Qingsheng
> >
> > On Mon, Jul 24, 2023 at 11:14 AM Chen Zhanghao <
>  zhanghao.c...@outlook.com>
> > wrote:
> >
> >> Congrats, Shammon!
> >>
> >> Best,
> >> Zhanghao Chen
> >> 
> >> 发件人: Weihua Hu 
> >> 发送时间: 2023年7月24日 11:11
> >> 收件人: dev@flink.apache.org 
> >> 抄送: Shammon FY 
> >> 主题: Re: [ANNOUNCE] New Apache Flink Committer - Yong Fang
> >>
> >> Congratulations!
> >>
> >> Best,
> >> Weihua
> >>
> >>
> >> On Mon, Jul 24, 2023 at 11:04 AM Paul Lam 
>  wrote:
> >>
> >>> Congrats, Shammon!
> >>>
> >>> Best,
> >>> Paul Lam
> >>>
>  2023年7月24日 10:56,Jingsong Li  写道:
> 
>  Shammon
> >>>
> >>>
> >>
> >
> 
> >>
> >
> >
> > --
> >
> > Best,
> > Benchao Li
>
>


[jira] [Created] (FLINK-32653) Add doc for catalog store

2023-07-23 Thread Feng Jin (Jira)
Feng Jin created FLINK-32653:


 Summary: Add doc for catalog store
 Key: FLINK-32653
 URL: https://issues.apache.org/jira/browse/FLINK-32653
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32647) Support config catalog store in python table environment

2023-07-22 Thread Feng Jin (Jira)
Feng Jin created FLINK-32647:


 Summary: Support config catalog store in python table environment
 Key: FLINK-32647
 URL: https://issues.apache.org/jira/browse/FLINK-32647
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32569) Fix the incomplete serialization of ResolvedCatalogTable caused by the newly introduced time travel interface

2023-07-10 Thread Feng Jin (Jira)
Feng Jin created FLINK-32569:


 Summary: Fix the incomplete serialization of ResolvedCatalogTable 
caused by the newly introduced  time travel interface
 Key: FLINK-32569
 URL: https://issues.apache.org/jira/browse/FLINK-32569
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems Award

2023-07-04 Thread Feng Jin
Congratulations!

Best,
Feng Jin



On Tue, Jul 4, 2023 at 4:13 PM Yuxin Tan  wrote:

> Congratulations!
>
> Best,
> Yuxin
>
>
> Dunn Bangui  于2023年7月4日周二 16:04写道:
>
> > Congratulations!
> >
> > Best,
> > Bangui Dunn
> >
> > Yangze Guo  于2023年7月4日周二 15:59写道:
> >
> > > Congrats everyone!
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Jul 4, 2023 at 3:53 PM Rui Fan <1996fan...@gmail.com> wrote:
> > > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Rui Fan
> > > >
> > > > On Tue, Jul 4, 2023 at 2:08 PM Zhu Zhu  wrote:
> > > >
> > > > > Congratulations everyone!
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > Hang Ruan  于2023年7月4日周二 14:06写道:
> > > > > >
> > > > > > Congratulations!
> > > > > >
> > > > > > Best,
> > > > > > Hang
> > > > > >
> > > > > > Jingsong Li  于2023年7月4日周二 13:47写道:
> > > > > >
> > > > > > > Congratulations!
> > > > > > >
> > > > > > > Thank you! All of the Flink community!
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > > > On Tue, Jul 4, 2023 at 1:24 PM tison 
> > wrote:
> > > > > > > >
> > > > > > > > Congrats and with honor :D
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > tison.
> > > > > > > >
> > > > > > > >
> > > > > > > > Mang Zhang  于2023年7月4日周二 11:08写道:
> > > > > > > >
> > > > > > > > > Congratulations!--
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > > Mang Zhang
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 在 2023-07-04 01:53:46,"liu ron"  写道:
> > > > > > > > > >Congrats everyone
> > > > > > > > > >
> > > > > > > > > >Best,
> > > > > > > > > >Ron
> > > > > > > > > >
> > > > > > > > > >Jark Wu  于2023年7月3日周一 22:48写道:
> > > > > > > > > >
> > > > > > > > > >> Congrats everyone!
> > > > > > > > > >>
> > > > > > > > > >> Best,
> > > > > > > > > >> Jark
> > > > > > > > > >>
> > > > > > > > > >> > 2023年7月3日 22:37,Yuval Itzchakov 
> 写道:
> > > > > > > > > >> >
> > > > > > > > > >> > Congrats team!
> > > > > > > > > >> >
> > > > > > > > > >> > On Mon, Jul 3, 2023, 17:28 Jing Ge via user <
> > > > > > > u...@flink.apache.org
> > > > > > > > > >> <mailto:u...@flink.apache.org>> wrote:
> > > > > > > > > >> >> Congratulations!
> > > > > > > > > >> >>
> > > > > > > > > >> >> Best regards,
> > > > > > > > > >> >> Jing
> > > > > > > > > >> >>
> > > > > > > > > >> >>
> > > > > > > > > >> >> On Mon, Jul 3, 2023 at 3:21 PM yuxia <
> > > > > > > luoyu...@alumni.sjtu.edu.cn
> > > > > > > > > >> <mailto:luoyu...@alumni.sjtu.edu.cn>> wrote:
> > > > > > > > > >> >>> Congratulations!
> > > > > > > > > >> >>>
> > > > > > > > > >> >>> Best regards,
> > > > > > > > > >> >>> Yuxia
> > > > > > > > > >> >>

[jira] [Created] (FLINK-32475) Add doc for time travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32475:


 Summary: Add doc for time travel
 Key: FLINK-32475
 URL: https://issues.apache.org/jira/browse/FLINK-32475
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32474) Support time travel in table planner

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32474:


 Summary: Support time travel in table planner 
 Key: FLINK-32474
 URL: https://issues.apache.org/jira/browse/FLINK-32474
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32473) Introduce base interfaces for time travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32473:


 Summary: Introduce base interfaces for time travel
 Key: FLINK-32473
 URL: https://issues.apache.org/jira/browse/FLINK-32473
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32472) FLIP-308: Support Time Travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32472:


 Summary: FLIP-308: Support Time Travel
 Key: FLINK-32472
 URL: https://issues.apache.org/jira/browse/FLINK-32472
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Feng Jin


Umbrella issue for 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Feng Jin
+1 (no-binding)


Best
Feng

On Wed, Jun 28, 2023 at 11:03 PM Jing Ge  wrote:

> +1(binding)
>
> On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:
>
> > +1 (no-binding)
> >
> >
> > --
> >
> > Best regards,
> > Mang Zhang
> >
> >
> >
> >
> >
> > At 2023-06-28 17:48:15, "yuxia"  wrote:
> > >Hi everyone,
> > >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> > SELECT statement[1]. Based on the discussion [2], we have come to a
> > consensus, so I would like to start a vote.
> > >The vote will be open for at least 72 hours (until July 3th, 10:00AM
> GMT)
> > unless there is an objection or an insufficient number of votes.
> > >
> > >
> > >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> > >[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> > >
> > >
> > >Best regards,
> > >Yuxia
> >
>


Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener

2023-06-27 Thread Feng Jin
Hi Shammon
Thank you for proposing this FLIP. I think Flink job lineage is a very
useful feature.
I have a few questions:

1. For DataStream Jobs, users need to set up lineage relationships when
building DAGs for their custom sources and sinks.
However, for some common connectors such as Kafka Connector and JDBC
Connector, we can add a lineage interface like `supportReportLineage`, so
that these connectors can implement it.
This way, in the scenario of DataStream Jobs, lineages can be automatically
reported. What do you think?
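
A very rough sketch of what I mean (the interface name and shape below are purely
hypothetical, just to illustrate the idea; LineageEntity refers to the entity type
proposed in this FLIP):

```
// Hypothetical sketch only; not part of the FLIP. A connector implementing this
// could be asked by the framework, while the StreamGraph is built, to describe the
// external table/topic it reads from or writes to.
public interface SupportsLineageReporting {

    LineageEntity reportLineage();
}
```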


2. From the current design, it seems that we need to analyze column lineage
through the pipeline. As far as I know, it is relatively easy to obtain column
lineage through the Calcite metadata query API.
Would you consider using this approach? Or do we need to implement another
parsing process based on the pipeline?
```
RelMetadataQuery metadataQuery = relNode.getCluster().getMetadataQuery();
// Returns the set of origin (table, column) pairs for column i of inputRel.
Set<RelColumnOrigin> origins = metadataQuery.getColumnOrigins(inputRel, i);
```
Best,
Feng


On Sun, Jun 25, 2023 at 8:06 PM Shammon FY  wrote:

> Hi yuxia and Yun,
>
> Thanks for your input.
>
> For yuxia:
> > 1: What kinds of JobStatus will the `JobExecutionStatusEven` including?
>
> At present, we only need to notify the listener when a job goes to
> termination, but I think it makes sense to add generic `oldStatus` and
> `newStatus` in the listener and users can update the job state in their
> service as needed.
>
> > 2: I'm really confused about the `config()` included in `LineageEntity`,
> where is it from and what is it for ?
>
> The `config` in `LineageEntity` is used for users to get options for source
> and sink connectors. As the examples in the FLIP, users can add
> server/group/topic information in the config for kafka and create lineage
> entities for `DataStream` jobs, then the listeners can get this information
> to identify the same connector in different jobs. Otherwise, the `config`
> in `TableLineageEntity` will be the same as `getOptions` in
> `CatalogBaseTable`.
>
> > 3: Regardless whether `inputChangelogMode` in `TableSinkLineageEntity` is
> needed or not, since `TableSinkLineageEntity` contains
> `inputChangelogMode`, why `TableSourceLineageEntity` don't contain
> changelogmode?
>
> At present, we do not actually use the changelog mode. It can be deleted,
> and I have updated FLIP.
>
> > Btw, since there're a lot interfaces proposed, I think it'll be better to
> give an example about how to implement a listener in this FLIP to make us
> know better about the interfaces.
>
> I have added the example in the FLIP and the related interfaces and
> examples are in branch [1].
>
> For Yun:
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP does not touch them, and will them become part of the
> List sources() or adding another interface?
>
> You're right, currently lookup join dim tables were not considered in the
> 'proposed changed' section of this FLIP. But the interface for lineage is
> universal and we can give `TableLookupSourceLineageEntity` which implements
> `TableSourceLineageEntity` in the future without modifying the public
> interface.
>
> > By the way, if you want to focus on job lineage instead of data column
> lineage in this FLIP, why we must introduce so many column-lineage related
> interface here?
>
> The lineage information in SQL jobs includes table lineage and column
> lineage. Although SQL jobs currently do not support column lineage, we
> would like to support this in the next step. So we have comprehensively
> considered the table lineage and column lineage interfaces here, and
> defined these two interfaces together clearly
>
>
> [1]
>
> https://github.com/FangYongs/flink/commit/d4bfe57e7a5315b790e79b8acef8b11e82c9187c
>
> Best,
> Shammon FY
>
>
> On Sun, Jun 25, 2023 at 4:17 PM Yun Tang  wrote:
>
> > Hi Shammon,
> >
> > I like the idea in general and it will help to analysis the job lineages
> > no matter FlinkSQL or Flink jar jobs in production environments.
> >
> > For Qingsheng's concern, I'd like the name of JobType more than
> > RuntimeExecutionMode, as the latter one is not easy to understand for
> users.
> >
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP
> > does not touch them, and will them become part of the List
> > sources()​ or adding another interface?
> >
> > By the way, if you want to focus on job lineage instead of data column
> > lineage in this FLIP, why we must introduce so many column-lineage
> related
> > interface here?
> >
> >
> > Best
> > Yun Tang
> > 
> > From: Shammon FY 
> > Sent: Sunday, June 25, 2023 16:13
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener
> >
> > Hi Qingsheng,
> >
> > Thanks for your valuable feedback.
> >
> > > 1. Is there any specific use case to expose the batch / streaming info
> to
> > listeners or meta services?
> >
> > I agree with you that Flink is evolving towards batch-streaming
> > uni

[jira] [Created] (FLINK-32433) Add built-in FileCatalogStore

2023-06-25 Thread Feng Jin (Jira)
Feng Jin created FLINK-32433:


 Summary: Add built-in FileCatalogStore 
 Key: FLINK-32433
 URL: https://issues.apache.org/jira/browse/FLINK-32433
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32432) Support CatalogStore in Flink SQL gateway

2023-06-25 Thread Feng Jin (Jira)
Feng Jin created FLINK-32432:


 Summary: Support CatalogStore in Flink SQL gateway
 Key: FLINK-32432
 URL: https://issues.apache.org/jira/browse/FLINK-32432
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Gateway
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32431) Support configuring CatalogStore in Table API

2023-06-25 Thread Feng Jin (Jira)
Feng Jin created FLINK-32431:


 Summary: Support configuring CatalogStore in Table API
 Key: FLINK-32431
 URL: https://issues.apache.org/jira/browse/FLINK-32431
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

