Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Benchao Li
+1 (binding)

Yuxin Tan  于2023年6月25日周日 12:27写道:

> +1 (non-binding)
>
> Best,
> Yuxin
>
>
> Yangze Guo  于2023年6月25日周日 12:21写道:
>
> > +1 (binding)
> >
> > Best,
> > Yangze Guo
> >
> > On Sun, Jun 25, 2023 at 11:41 AM Jark Wu  wrote:
> > >
> > > +1 (binding)
> > >
> > > Best,
> > > Jark
> > >
> > > > 2023年6月25日 10:04,Xia Sun  写道:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Best Regards,
> > > >
> > > > Xia
> > > >
> > > > yuxia  于2023年6月25日周日 09:23写道:
> > > >
> > > >> +1 (binding)
> > > >> Thanks Lijie driving it.
> > > >>
> > > >> Best regards,
> > > >> Yuxia
> > > >>
> > > >> - 原始邮件 -
> > > >> 发件人: "Yuepeng Pan" 
> > > >> 收件人: "dev" 
> > > >> 发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
> > > >> 主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch
> Jobs
> > > >>
> > > >> +1 (non-binding)
> > > >>
> > > >> Thanks,
> > > >> Yuepeng Pan
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> At 2023-06-23 23:49:53, "Lijie Wang" 
> > wrote:
> > > >>> Hi all,
> > > >>>
> > > >>> Thanks for all the feedback about the FLIP-324: Introduce Runtime
> > Filter
> > > >>> for Flink Batch Jobs[1]. This FLIP was discussed in [2].
> > > >>>
> > > >>> I'd like to start a vote for it. The vote will be open for at least
> > 72
> > > >>> hours (until June 29th 12:00 GMT) unless there is an objection or
> > > >>> insufficient votes.
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> > > >>> [2]
> https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
> > > >>>
> > > >>> Best,
> > > >>> Lijie
> > > >>
> > >
> >
>


-- 

Best,
Benchao Li


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Yuxin Tan
+1 (non-binding)

Best,
Yuxin


Yangze Guo  于2023年6月25日周日 12:21写道:

> +1 (binding)
>
> Best,
> Yangze Guo
>
> On Sun, Jun 25, 2023 at 11:41 AM Jark Wu  wrote:
> >
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > > 2023年6月25日 10:04,Xia Sun  写道:
> > >
> > > +1 (non-binding)
> > >
> > > Best Regards,
> > >
> > > Xia
> > >
> > > yuxia  于2023年6月25日周日 09:23写道:
> > >
> > >> +1 (binding)
> > >> Thanks Lijie driving it.
> > >>
> > >> Best regards,
> > >> Yuxia
> > >>
> > >> - 原始邮件 -
> > >> 发件人: "Yuepeng Pan" 
> > >> 收件人: "dev" 
> > >> 发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
> > >> 主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
> > >>
> > >> +1 (non-binding)
> > >>
> > >> Thanks,
> > >> Yuepeng Pan
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> At 2023-06-23 23:49:53, "Lijie Wang" 
> wrote:
> > >>> Hi all,
> > >>>
> > >>> Thanks for all the feedback about the FLIP-324: Introduce Runtime
> Filter
> > >>> for Flink Batch Jobs[1]. This FLIP was discussed in [2].
> > >>>
> > >>> I'd like to start a vote for it. The vote will be open for at least
> 72
> > >>> hours (until June 29th 12:00 GMT) unless there is an objection or
> > >>> insufficient votes.
> > >>>
> > >>> [1]
> > >>>
> > >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> > >>> [2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
> > >>>
> > >>> Best,
> > >>> Lijie
> > >>
> >
>


Re: [DISCUSS] Persistent SQL Gateway

2023-06-24 Thread Jark Wu
Hi Ferenc,

Making SQL Gateway to be an easy-to-use platform infrastructure of Flink
SQL
is one of the important roadmaps [1].

The persistence ability of the SQL Gateway is a major work in 1.18 release.
One of the persistence demand is that the registered catalogs are currently
kept in memory and lost when Gateway restarts. There is an accepted FLIP
(FLIP-295)[2] target to resolve this issue and make Gateway can persist the
registered catalogs information into files or databases.

I'm not sure whether this is something you are looking for?

Best,
Jark


[1]: https://flink.apache.org/roadmap/#a-unified-sql-platform
[2]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations

On Fri, 23 Jun 2023 at 00:25, Ferenc Csaky 
wrote:

> Hello devs,
>
> I would like to open a discussion about persistence possibilitis for the
> SQL Gateway. At Cloudera, we are happy to see the work already done on this
> project and looking for ways to utilize it on our platform as well, but
> currently it lacks some features that would be essential in our case, where
> we could help out.
>
> I am not sure if any thought went into gateway persistence specifics
> already, and this feature could be implemented in fundamentally differnt
> ways, so I think the frist step could be to agree on the basics.
>
> First, in my opinion, persistence should be an optional feature of the
> gateway, that can be enabled if desired. There can be a lot of
> implementation details, but there can be some major directions to follow:
>
> - Utilize Hive catalog: The Hive catalog can already be used to have
> persistenct meta-objects, so the crucial thing that would be missing in
> this case is other catalogs. Personally, I would not pursue this option,
> because in my opinion it would limit the usability of this feature too much.
> - Serialize the session as is: Saving the whole session (or its context)
> [1] as is to durable storage, so it can be kept and picked up again.
> - Serialize the required elements (catalogs, tables, functions, etc.), not
> necessarily as a whole: The main point here would be to serialize a
> different object, so the persistent data will not be that sensitive to
> changes of the session (or its context). There can be numerous factors
> here, like try to keep the model close to the session itself, so the
> boilerplate required for the mapping can be kept to minimal, or focus on
> saving what is actually necessary, making the persistent storage more
> portable.
>
> WDYT?
>
> Cheers,
> F
>
> [1]
> https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Yangze Guo
+1 (binding)

Best,
Yangze Guo

On Sun, Jun 25, 2023 at 11:41 AM Jark Wu  wrote:
>
> +1 (binding)
>
> Best,
> Jark
>
> > 2023年6月25日 10:04,Xia Sun  写道:
> >
> > +1 (non-binding)
> >
> > Best Regards,
> >
> > Xia
> >
> > yuxia  于2023年6月25日周日 09:23写道:
> >
> >> +1 (binding)
> >> Thanks Lijie driving it.
> >>
> >> Best regards,
> >> Yuxia
> >>
> >> - 原始邮件 -
> >> 发件人: "Yuepeng Pan" 
> >> 收件人: "dev" 
> >> 发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
> >> 主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
> >>
> >> +1 (non-binding)
> >>
> >> Thanks,
> >> Yuepeng Pan
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> At 2023-06-23 23:49:53, "Lijie Wang"  wrote:
> >>> Hi all,
> >>>
> >>> Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
> >>> for Flink Batch Jobs[1]. This FLIP was discussed in [2].
> >>>
> >>> I'd like to start a vote for it. The vote will be open for at least 72
> >>> hours (until June 29th 12:00 GMT) unless there is an objection or
> >>> insufficient votes.
> >>>
> >>> [1]
> >>>
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> >>> [2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
> >>>
> >>> Best,
> >>> Lijie
> >>
>


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Jark Wu
+1 (binding)

Best,
Jark

> 2023年6月25日 10:04,Xia Sun  写道:
> 
> +1 (non-binding)
> 
> Best Regards,
> 
> Xia
> 
> yuxia  于2023年6月25日周日 09:23写道:
> 
>> +1 (binding)
>> Thanks Lijie driving it.
>> 
>> Best regards,
>> Yuxia
>> 
>> - 原始邮件 -
>> 发件人: "Yuepeng Pan" 
>> 收件人: "dev" 
>> 发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
>> 主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
>> 
>> +1 (non-binding)
>> 
>> Thanks,
>> Yuepeng Pan
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> At 2023-06-23 23:49:53, "Lijie Wang"  wrote:
>>> Hi all,
>>> 
>>> Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
>>> for Flink Batch Jobs[1]. This FLIP was discussed in [2].
>>> 
>>> I'd like to start a vote for it. The vote will be open for at least 72
>>> hours (until June 29th 12:00 GMT) unless there is an objection or
>>> insufficient votes.
>>> 
>>> [1]
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
>>> [2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
>>> 
>>> Best,
>>> Lijie
>> 



Re: [DISCUSS] FLIP-321: Introduce an API deprecation process

2023-06-24 Thread Jingsong Li
Thanks Becket and all for your discussion.

> 1. We say this FLIP is enforced starting release 2.0. For current 1.x APIs,
we provide a migration period with best effort, while allowing exceptions
for immediate removal in 2.0. That means we will still try with best effort
to get the ProcessFuncion API ready and deprecate the DataStream API in
1.x, but will also be allowed to remove DataStream API in 2.0 if it's not
deprecated 2 minor releases before the major version bump.

> 2. We strictly follow the process in this FLIP, and will quickly bump the
major version from 2.x to 3.0 once the migration period for DataStream API
is reached.

Sorry, I didn't read the previous detailed discussion because the
discussion list was so long.

I don't really like either of these options.

Considering that DataStream is such an important API, can we offer a third
option:

3. Maintain the DataStream API throughout 2.X and remove it until 3.x. But
there's no need to assume that 2.X is a short version, it's still a normal
major version.

Best,
Jingsong

Becket Qin 于2023年6月22日 周四16:02写道:

> Thanks much for the input, John, Stefan and Jing.
>
> I think Xingtong has well summarized the pros and cons of the two options.
> Let's collect a few more opinions here and we can move forward with the one
> more people prefer.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Jun 21, 2023 at 3:20 AM Jing Ge 
> wrote:
>
> > Hi all,
> >
> > Thanks Xingtong for the summary. If I could only choose one of the given
> > two options, I would go with option 1. I understood that option 2 worked
> > great with Kafka. But the bridge release will still confuse users and my
> > gut feeling is that many users will skip 2.0 and be waiting for 3.0 or
> even
> > 3.x. And since fewer users will use Flink 2.x, the development focus will
> > be on Flink 3.0 with the fact that the current Flink release is 1.17 and
> we
> > are preparing 2.0 release. That is weird for me.
> >
> > THB, I would not name the change from @Public to @Retired as a demotion.
> > The purpose of @Retire is to extend the API lifecycle with one more
> stage,
> > like in the real world, people born, studied, graduated, worked, and
> > retired. Afaiu from the previous discussion, there are two rules we'd
> like
> > to follow simultaneously:
> >
> > 1. Public APIs can only be changed between major releases.
> > 2. A smooth migration phase should be offered to users, i.e. at least 2
> > minor releases after APIs are marked as @deprecated. There should be new
> > APIs as the replacement.
> >
> > Agree, those rules are good to improve the user friendliness. Issues we
> > discussed are rising because we want to fulfill both of them. If we take
> > care of deprecation very seriously, APIs can be marked as @Deprecated,
> only
> > when the new APIs as the replacement provide all functionalities the
> > deprecated APIs have. In an ideal case without critical bugs that might
> > stop users adopting the new APIs. Otherwise the expected "replacement"
> will
> > not happen. Users will still stick to the deprecated APIs, because the
> new
> > APIs can not be used. For big features, it will need at least 4 minor
> > releases(ideal case), i.e. 2+ years to remove deprecated APIs:
> >
> > - 1st minor release to build the new APIs as the replacement and waiting
> > for feedback. It might be difficult to mark the old API as deprecated in
> > this release, because we are not sure if the new APIs could cover 100%
> > functionalities.
> > -  In the lucky case,  mark all old APIs as deprecated in the 2nd minor
> > release. (I would even suggest having the new APIs released at least for
> > two minor releases before marking it as deprecated to make sure they can
> > really replace the old APIs, in case we care more about smooth migration)
> > - 3rd minor release for the migration period
> > -  In another lucky case, the 4th release is a major release, the
> > deprecated APIs could be removed.
> >
> > The above described scenario works only in an ideal case. In reality, it
> > might take longer to get the new APIs ready and mark the old API
> > deprecated. Furthermore, if the 4th release is not a major release, we
> will
> > have to maintain both APIs for many further minor releases. The question
> is
> > how to know the next major release in advance, especially 4 minor
> releases'
> > period, i.e. more than 2 years in advance? Given that Flink contains many
> > modules, it is difficult to ask devs to create a 2-3 years deprecation
> plan
> > for each case. In case we want to build major releases at a fast pace,
> > let's say every two years, it means devs must plan any API deprecation
> > right after each major release. Afaiac, it is quite difficult.
> >
> > The major issue is, afaiu, if we follow rule 2, we have to keep all
> @Public
> > APIs, e.g. DataStream, that are not marked as deprecated yet, to 2.0.
> Then
> > we have to follow rule 1 to keep it unchanged until we have 3.0. That is
> > why @Retired is 

Re: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing

2023-06-24 Thread yuxia
Hi, Bangui Dunn
Review done. 
But please remember we should reach consensus about it in the dicsussion before 
we can merge it.
Let's keep the discussion for a while to see any further feedback.

Best regards,
Yuxia

- 原始邮件 -
发件人: "Dunn Bangui" 
收件人: "dev" 
发送时间: 星期日, 2023年 6 月 25日 上午 10:34:50
主题: Re: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing

Hi, Yuxia.

Could you please provide me with an update on the review process? I am
eager to move forward and would appreciate any guidance or feedback you can
provide.

Best regards, late Dragon Boat Festival blessing : )
Bangui Dunn

Gary Ccc  于2023年6月21日周三 14:44写道:

> Hi, Yuxia.
>
> Thank you for your suggestion! I agree with your point and have made the
> necessary modification.
> If you have any additional feedback or further suggestions, please feel
> free to let me know. : )
>
> Best regards,
> Bangui Dunn
>
> yuxia  于2023年6月21日周三 10:56写道:
>
>> Correct what I said in the previous email:
>>
>> "But will it better to make it asList ? Something
>> like:`.stringType().asList()`."  =>  "But will it better to make it no
>> default value? Something like:`.stringType().asList().noDefaultValue`."
>>
>> Best regards,
>> Yuxia
>>
>> - 原始邮件 -
>> 发件人: "yuxia" 
>> 收件人: "dev" 
>> 发送时间: 星期三, 2023年 6 月 21日 上午 10:25:47
>> 主题: Re: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing
>>
>> Hi, Bangui Dunn.
>> Thanks for reaching us out.
>> Generally + 1 for the configuration.  But will it better to make it
>> asList? Something like:
>> `.stringType().asList()`.
>>
>> Best regards,
>> Yuxia
>>
>> - 原始邮件 -
>> 发件人: "Gary Ccc" 
>> 收件人: "dev" 
>> 发送时间: 星期二, 2023年 6 月 20日 下午 5:44:10
>> 主题: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing
>>
>> To the Apache Flink Community Members,
>> I hope this email finds you well. I am writing to discuss a potential
>> improvement for the implementation of a custom commit policy in Flink.
>>
>> Background:
>> We have encountered challenges in utilizing a custom commit policy due to
>> the inability to pass parameters.
>> This limitation restricts our ability to add additional functionality to
>> the commit policy, such as monitoring the files associated with each
>> commit.
>>
>> Purpose:
>> The purpose of this improvement is to allow the passing of parameters to
>> the custom PartitionCommitPolicy. By enabling parameter passing, users can
>> extend the functionality of their custom commit policy.
>>
>> Example:
>> Suppose we have a custom commit policy called "MyPolicy" that requires
>> parameters such as "key" and "url" for proper functionality.
>> Currently, it is not possible to pass these parameters when using a custom
>> commit policy.
>> However, by introducing the concept of
>> SINK_PARTITION_COMMIT_POLICY_CLASS_PARAMETERS, users can now pass
>> parameters in the following way:
>>
>> By adding SINK_PARTITION_COMMIT_POLICY_CLASS_PARAMETERS, you can pass
>> parameters when using a custom commit policy, for example
>> 'sink.partition-commit.policy.kind'='custom',
>> 'sink.partition-commit.policy.class’='MyPolicy',
>> 'sink.partition-commit.policy.class.parameters’=‘key;url'
>>
>> Eeffect:
>> By adding PartitionCommitPolicyFactory constructor, to ensure backward
>> compatibility for existing user programs.
>>
>> Code PR:
>> To support this improvement, I have submitted a pull request with the
>> necessary code changes.
>> You can find the details of the pull request at the following link:
>> https://github.com/apache/flink/pull/22831/files
>>
>> Best regards,
>> Bangui Dunn
>>
>


[jira] [Created] (FLINK-32423) Flink-sql-runner-example application fails if multiple execute() called in one sql file

2023-06-24 Thread Guozhen Yang (Jira)
Guozhen Yang created FLINK-32423:


 Summary: Flink-sql-runner-example application fails if multiple 
execute() called in one sql file
 Key: FLINK-32423
 URL: https://issues.apache.org/jira/browse/FLINK-32423
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Guozhen Yang


h2. Summary:

flink-sql-runner-example application fails if multiple execute() called in one 
sql file
h2. Background:

We have a series of batch jobs running on a table partitioned by date. The jobs 
need to be run sequencially in chronological order. Which means only after the 
batch job #1 finishes running 2023-06-01 partition, the batch job #2 running 
2023-06-02 partition starts running. So we loop through dates and submit 
multiple jobs in a single application, and the flink application is deployed in 
application mode with HA turned off.

According to [flink 
document|https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/overview/#application-mode],
 the Application Mode allows the submission of applications consisting of 
multiple jobs, but High-Availability is not supported in these cases.
h2. The problem:

The application consisted of multiple jobs fails when the second job is 
executed.

Stack trace is shown as below:
{noformat}
2023-06-21 03:21:44,720 ERROR 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred 
in the cluster entrypoint.
java.util.concurrent.CompletionException: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) 
~[?:?]
at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) 
~[?:?]
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(Unknown 
Source) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) 
~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(Unknown 
Source) ~[?:?]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:337)
 ~[flink-dist-1.1
6.2.jar:1.16.2]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.lambda$runApplicationAsync$2(ApplicationDispatcherBootstrap.java:254)
 ~[flink-dist
-1.16.2.jar:1.16.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
~[?:?]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
at 
org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:171)
 ~[flink-rpc-a
kka_0e3d2618-241c-420f-a71d-2f4d1edcb5a1.jar:1.16.2]
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
 ~[flink-rpc-akka_0e3d2618-241c-420f-a71d-2f4d1ed
cb5a1.jar:1.16.2]
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$withContextClassLoader$0(ClassLoadingUtils.java:41)
 ~[flink-rpc-akka_0e3d2618-241c-420f-a71d-2
f4d1edcb5a1.jar:1.16.2]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49) 
[flink-rpc-akka_0e3d2618-241c-420f-a71d-2f4d1edcb5a1.jar:1.16.2]
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48)
 [flink-rpc-akka_0e3d2618-241c-420f-a71d-2f4d1edcb5a1.jar
:1.16.2]
at java.util.concurrent.ForkJoinTask.doExec(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) 
[?:?]
at java.util.concurrent.ForkJoinPool.scan(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) [?:?]
Caused by: 
org.apache.flink.client.deployment.application.ApplicationExecutionException: 
Could not execute application.
... 14 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main 
method caused an error: Failed to execute sql
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
 ~[flink-dist-1.16.2.jar:1.16.2]
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
 ~[flink-dist-1.16.2.jar:1.16.2]
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:98) 
~[flink-dist-1.16.2.jar:1.16.2]
at 
org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:301)
 ~[flink-dist-1.1
6.2.jar:1.16.2]
... 13 more
Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
at 

Re: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing

2023-06-24 Thread Dunn Bangui
Hi, Yuxia.

Could you please provide me with an update on the review process? I am
eager to move forward and would appreciate any guidance or feedback you can
provide.

Best regards, late Dragon Boat Festival blessing : )
Bangui Dunn

Gary Ccc  于2023年6月21日周三 14:44写道:

> Hi, Yuxia.
>
> Thank you for your suggestion! I agree with your point and have made the
> necessary modification.
> If you have any additional feedback or further suggestions, please feel
> free to let me know. : )
>
> Best regards,
> Bangui Dunn
>
> yuxia  于2023年6月21日周三 10:56写道:
>
>> Correct what I said in the previous email:
>>
>> "But will it better to make it asList ? Something
>> like:`.stringType().asList()`."  =>  "But will it better to make it no
>> default value? Something like:`.stringType().asList().noDefaultValue`."
>>
>> Best regards,
>> Yuxia
>>
>> - 原始邮件 -
>> 发件人: "yuxia" 
>> 收件人: "dev" 
>> 发送时间: 星期三, 2023年 6 月 21日 上午 10:25:47
>> 主题: Re: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing
>>
>> Hi, Bangui Dunn.
>> Thanks for reaching us out.
>> Generally + 1 for the configuration.  But will it better to make it
>> asList? Something like:
>> `.stringType().asList()`.
>>
>> Best regards,
>> Yuxia
>>
>> - 原始邮件 -
>> 发件人: "Gary Ccc" 
>> 收件人: "dev" 
>> 发送时间: 星期二, 2023年 6 月 20日 下午 5:44:10
>> 主题: [DISCUSS] Flexible Custom Commit Policy with Parameter Passing
>>
>> To the Apache Flink Community Members,
>> I hope this email finds you well. I am writing to discuss a potential
>> improvement for the implementation of a custom commit policy in Flink.
>>
>> Background:
>> We have encountered challenges in utilizing a custom commit policy due to
>> the inability to pass parameters.
>> This limitation restricts our ability to add additional functionality to
>> the commit policy, such as monitoring the files associated with each
>> commit.
>>
>> Purpose:
>> The purpose of this improvement is to allow the passing of parameters to
>> the custom PartitionCommitPolicy. By enabling parameter passing, users can
>> extend the functionality of their custom commit policy.
>>
>> Example:
>> Suppose we have a custom commit policy called "MyPolicy" that requires
>> parameters such as "key" and "url" for proper functionality.
>> Currently, it is not possible to pass these parameters when using a custom
>> commit policy.
>> However, by introducing the concept of
>> SINK_PARTITION_COMMIT_POLICY_CLASS_PARAMETERS, users can now pass
>> parameters in the following way:
>>
>> By adding SINK_PARTITION_COMMIT_POLICY_CLASS_PARAMETERS, you can pass
>> parameters when using a custom commit policy, for example
>> 'sink.partition-commit.policy.kind'='custom',
>> 'sink.partition-commit.policy.class’='MyPolicy',
>> 'sink.partition-commit.policy.class.parameters’=‘key;url'
>>
>> Eeffect:
>> By adding PartitionCommitPolicyFactory constructor, to ensure backward
>> compatibility for existing user programs.
>>
>> Code PR:
>> To support this improvement, I have submitted a pull request with the
>> necessary code changes.
>> You can find the details of the pull request at the following link:
>> https://github.com/apache/flink/pull/22831/files
>>
>> Best regards,
>> Bangui Dunn
>>
>


[DISCUSS] FLIP-325: Support configuring end-to-end allowed latency

2023-06-24 Thread Yunfeng Zhou
Hi all,

Dong(cc'ed) and I are opening this thread to discuss our proposal to
support configuring end-to-end allowed latency for Flink jobs, which
has been documented in FLIP-325
.

By configuring the latency requirement for a Flink job, users would be
able to optimize the throughput and overhead of the job while still
acceptably increasing latency. This approach is particularly useful
when dealing with records that do not require immediate processing and
emission upon arrival.

Please refer to the FLIP document for more details about the proposed
design and implementation. We welcome any feedback and opinions on
this proposal.

Best regards.

Dong and Yunfeng


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Xia Sun
+1 (non-binding)

Best Regards,

Xia

yuxia  于2023年6月25日周日 09:23写道:

> +1 (binding)
> Thanks Lijie driving it.
>
> Best regards,
> Yuxia
>
> - 原始邮件 -
> 发件人: "Yuepeng Pan" 
> 收件人: "dev" 
> 发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
> 主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
>
> +1 (non-binding)
>
> Thanks,
> Yuepeng Pan
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> At 2023-06-23 23:49:53, "Lijie Wang"  wrote:
> >Hi all,
> >
> >Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
> >for Flink Batch Jobs[1]. This FLIP was discussed in [2].
> >
> >I'd like to start a vote for it. The vote will be open for at least 72
> >hours (until June 29th 12:00 GMT) unless there is an objection or
> >insufficient votes.
> >
> >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> >[2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
> >
> >Best,
> >Lijie
>


Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

2023-06-24 Thread yuxia
Thanks Jing for briging this to dicuss.
I agree it's not a blocker for graduting the FileSink to @PublicEvolving since 
the Sink which is the rootcause has marked as @PublicEvolving.
But I do also share the same concern with Galen. At least it should be a 
blocker for removing StreamingFileSink.
Btw, seems it's really a big headache for migrating to Sink, we may need to pay 
more attention to this ticket and try to fix it.

Best regards,
Yuxia

- 原始邮件 -
发件人: "Galen Warren" 
收件人: "dev" 
发送时间: 星期五, 2023年 6 月 23日 下午 7:47:24
主题: Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Thanks Jing. I can only offer my perspective on this, others may view it
differently.

If FileSink is subject to data loss in the "stop-on-savepoint then restart"
scenario, that makes it unusable for me, and presumably for anyone who uses
it in a long-running streaming application and who cannot tolerate data
loss. I still use the (deprecated!) StreamingFileSink for this reason.

The bigger picture here is that StreamingFileSink is deprecated and will
presumably ultimately be removed, to be replaced with FileSink. Graduating
the status of FileSink seems to be a step along that path; I'm concerned
about continuing down that path with such a critical issue present.
Ultimately, my concern is that FileSink will graduate fully and that
StreamingFileSink will be removed and that there will be no remaining
option to reliably stop/start streaming jobs that write to files without
incurring the risk of data loss.

I'm sure I'd feel better about things if there were an ongoing effort to
address this FileSink issue and/or a commitment that StreamingFileSink
would not be removed until this issue is addressed.

My two cents -- thanks.


On Fri, Jun 23, 2023 at 1:47 AM Jing Ge  wrote:

> Hi Galen,
>
> Thanks for the hint which is helpful for us to have a clear big picture.
> Afaiac, this will not be a blocking issue for the graduation. There will
> always be some (potential) bugs in the implementation. The API is very
> stable from 2020. The timing is good to graduate. WDYT?
> Furthermore, I'd like to have more opinions. All opinions together will
> help the community build a mature API graduation process.
>
> Best regards,
> Jing
>
> On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
>  wrote:
>
> > Is this issue still unresolved?
> >
> > https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
> >
> > Based on prior discussion, I believe this could lead to data loss with
> > FileSink.
> >
> >
> >
> > On Tue, Jun 20, 2023, 5:41 AM Jing Ge 
> wrote:
> >
> > > Hi all,
> > >
> > > The FileSink has been marked as @Experimental[1] since Oct. 2020.
> > > According to FLIP-197[2], I would like to propose to graduate it
> > > to @PublicEvloving in the upcoming 1.18 release.
> > >
> > > On the other hand, as a related topic, FileSource was marked
> > > as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion
> > too.
> > > To keep this discussion lean and efficient, let's focus on FlieSink in
> > this
> > > thread. There will be another discussion thread for the FileSource.
> > >
> > > I was wondering if anyone might have any concerns. Looking forward to
> > > hearing from you.
> > >
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > >
> > >
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > [3]
> > >
> > >
> >
> https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95
> > >
> >
>


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread yuxia
+1 (binding)
Thanks Lijie driving it.

Best regards,
Yuxia

- 原始邮件 -
发件人: "Yuepeng Pan" 
收件人: "dev" 
发送时间: 星期六, 2023年 6 月 24日 下午 9:06:53
主题: Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

+1 (non-binding)

Thanks,
Yuepeng Pan















At 2023-06-23 23:49:53, "Lijie Wang"  wrote:
>Hi all,
>
>Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
>for Flink Batch Jobs[1]. This FLIP was discussed in [2].
>
>I'd like to start a vote for it. The vote will be open for at least 72
>hours (until June 29th 12:00 GMT) unless there is an objection or
>insufficient votes.
>
>[1]
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
>[2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
>
>Best,
>Lijie


Re: [DISCUSS] Status of Statefun Project

2023-06-24 Thread Galen Warren
Great -- thanks!

I'm going to be out of town for about a week but I'll take a look at this
when I'm back.

On Tue, Jun 20, 2023 at 8:46 AM Martijn Visser  wrote:

> Hi Galen,
>
> Yes, I'll be more than happy to help with Statefun releases.
>
> Best regards,
>
> Martijn
>
> On Tue, Jun 20, 2023 at 2:21 PM Galen Warren 
> wrote:
>
>> Thanks.
>>
>> Martijn, to answer your question, I'd need to do a small amount of work
>> to get a PR ready, but not much. Happy to do it if we're deciding to
>> restart Statefun releases -- are we?
>>
>> -- Galen
>>
>> On Sat, Jun 17, 2023 at 9:47 AM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> > Perhaps he could weigh in on whether the combination of automated
>>> tests plus those smoke tests should be sufficient for testing with new
>>> Flink versions
>>>
>>> What we usually did at the bare minimum for new StateFun releases was
>>> the following:
>>>
>>>1. Build tests (including the smoke tests in the e2e module, which
>>>covers important tests like exactly-once verification)
>>>2. Updating the flink-statefun-playground repo and manually running
>>>all language examples there.
>>>
>>> If upgrading Flink versions was the only change in the release, I'd
>>> probably say that this is sufficient.
>>>
>>> Best,
>>> Gordon
>>>
>>> On Thu, Jun 15, 2023 at 5:25 AM Martijn Visser 
>>> wrote:
>>>
 Let me know if you have a PR for a Flink update :)

 On Thu, Jun 8, 2023 at 5:52 PM Galen Warren via user <
 u...@flink.apache.org> wrote:

> Thanks Martijn.
>
> Personally, I'm already using a local fork of Statefun that is
> compatible with Flink 1.16.x, so I wouldn't have any need for a released
> version compatible with 1.15.x. I'd be happy to do the PRs to modify
> Statefun to work with new versions of Flink as they come along.
>
> As for testing, Statefun does have unit tests and Gordon also sent me
> instructions a while back for how to do some additional smoke tests which
> are pretty straightforward. Perhaps he could weigh in on whether the
> combination of automated tests plus those smoke tests should be sufficient
> for testing with new Flink versions (I believe the answer is yes).
>
> -- Galen
>
>
>
> On Thu, Jun 8, 2023 at 8:01 AM Martijn Visser <
> martijnvis...@apache.org> wrote:
>
>> Hi all,
>>
>> Apologies for the late reply.
>>
>> I'm willing to help out with merging requests in Statefun to keep them
>> compatible with new Flink releases and create new releases. I do
>> think that
>> validation of the functionality of these releases depends a lot on
>> those
>> who do these compatibility updates, with PMC members helping out with
>> the
>> formal process.
>>
>> > Why can't the Apache Software Foundation allow community members to
>> bring
>> it up to date?
>>
>> There's nothing preventing anyone from reviewing any of the current
>> PRs or
>> opening new ones. However, none of them are approved [1], so there's
>> also
>> nothing to merge.
>>
>> > I believe that there are people and companies on this mailing list
>> interested in supporting Apache Flink Stateful Functions.
>>
>> If so, then now is the time to show.
>>
>> Would there be a preference to create a release with Galen's merged
>> compatibility update to Flink 1.15.2, or do we want to skip that and
>> go
>> straight to a newer version?
>>
>> Best regards,
>>
>> Martijn
>>
>> [1]
>>
>> https://github.com/apache/flink-statefun/pulls?q=is%3Apr+is%3Aopen+review%3Aapproved
>>
>> On Tue, Jun 6, 2023 at 3:55 PM Marco Villalobos <
>> mvillalo...@kineteque.com>
>> wrote:
>>
>> > Why can't the Apache Software Foundation allow community members to
>> bring
>> > it up to date?
>> >
>> > What's the process for that?
>> >
>> > I believe that there are people and companies on this mailing list
>> > interested in supporting Apache Flink Stateful Functions.
>> >
>> > You already had two people on this thread express interest.
>> >
>> > At the very least, we could keep the library versions up to date.
>> >
>> > There are only a small list of new features that might be
>> worthwhile:
>> >
>> > 1. event time processing
>> > 2. state rest api
>> >
>> >
>> > On Jun 6, 2023, at 3:06 AM, Chesnay Schepler 
>> wrote:
>> >
>> > If you were to fork it *and want to redistribute it* then the short
>> > version is that
>> >
>> >1. you have to adhere to the Apache licensing requirements
>> >2. you have to make it clear that your fork does not belong to
>> the
>> >Apache Flink project. (Trademarks and all that)
>> >
>> > Neither should be significant hurdles (there should also be plenty
>> of
>> > online resources 

Re:[VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Yuepeng Pan
+1 (non-binding)

Thanks,
Yuepeng Pan















At 2023-06-23 23:49:53, "Lijie Wang"  wrote:
>Hi all,
>
>Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
>for Flink Batch Jobs[1]. This FLIP was discussed in [2].
>
>I'd like to start a vote for it. The vote will be open for at least 72
>hours (until June 29th 12:00 GMT) unless there is an objection or
>insufficient votes.
>
>[1]
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
>[2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
>
>Best,
>Lijie


Re: [VOTE] FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-06-24 Thread Zhu Zhu
+1 (binding)

Thanks,
Zhu

Rui Fan <1996fan...@gmail.com> 于2023年6月24日周六 12:28写道:
>
> +1(binding), thanks for driving this improvement.
>
> Best,
> Rui Fan
>
> On Sat, Jun 24, 2023 at 4:55 AM Jing Ge  wrote:
>
> > +1(binding)
> >
> > Best Regards,
> > Jing
> >
> > On Fri, Jun 23, 2023 at 5:50 PM Lijie Wang 
> > wrote:
> >
> > > Hi all,
> > >
> > > Thanks for all the feedback about the FLIP-324: Introduce Runtime Filter
> > > for Flink Batch Jobs[1]. This FLIP was discussed in [2].
> > >
> > > I'd like to start a vote for it. The vote will be open for at least 72
> > > hours (until June 29th 12:00 GMT) unless there is an objection or
> > > insufficient votes.
> > >
> > > [1]
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> > > [2] https://lists.apache.org/thread/mm0o8fv7x7k13z11htt88zhy7lo8npmg
> > >
> > > Best,
> > > Lijie
> > >
> >