Re: [Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Xiaolong Wang
Hi Dongwoo, Since the Flink SQL gateway should run on a Flink session cluster, I think it'd be easier to add more fields to the `FlinkSessionJob` CRD, e.g.:
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkSessionJob
    metadata:
      name: sql-gateway
    spec:
      sqlGateway:
        endpoint: "hiveserver2"

Re: [Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Dongwoo Kim
Hi all, *@Gyula* Thanks for the consideration, Gyula. My initial idea for the CR was roughly like below. I focused on simplifying the setup in a k8s environment, but I agree with your opinion that for the sql gateway we don't need custom operator logic to handle and most of the requirements can be

Re: [DISCUSS] FLIP-331: Support EndOfStreamWindows and isOutputOnEOF operator attribute to optimize task deployment

2023-09-14 Thread Dong Lin
Hi Xintong, Thanks for the comments! Please see my reply inline. On Thu, Sep 14, 2023 at 4:17 PM Xintong Song wrote: > Thanks for preparing this FLIP, Dong & Jinhao. > > I'm overall +1 to this proposal. This is helpful for some cases that we are > dealing with. > - Wencong and I are preparing

Re: [DISCUSS] FLIP-331: Support EndOfStreamWindows and isOutputOnEOF operator attribute to optimize task deployment

2023-09-14 Thread Dong Lin
Hi Wencong, Thanks for your comments! Please see my reply inline. On Thu, Sep 14, 2023 at 12:30 PM Wencong Liu wrote: > Dear Dong, > > I have thoroughly reviewed the proposal for FLIP-331 and believe it would be > a valuable addition to Flink. However, I do have a few questions that I >

Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-14 Thread Jane Chan
Hi Zhanghao, Dewei, Thanks for initiating this discussion. This feature is valuable in providing more flexibility for performance tuning of SQL pipelines. Here are my two cents: 1. In the FLIP, you mentioned concerns about the parallelism of the calc node and concluded to "leave the behavior

Re: [DISCUSS] FLIP-328: Allow source operators to determine isProcessingBacklog based on watermark lag

2023-09-14 Thread Dong Lin
Hi Jark, Please see my comments inline. On Fri, Sep 15, 2023 at 10:35 AM Jark Wu wrote: > Hi Dong, > > Please see my comments inline below. > > Hmm.. can you explain what you mean by "different watermark delay > > definitions for each source"? > > For example, "table1" defines a watermark

Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-14 Thread Yun Tang
Thanks for creating this FLIP. Many users need to configure source parallelism, just as they configure sink parallelism via DDL. Looking forward to this feature. BTW, I think setting parallelism for each operator would also be valuable, and this should work with the compiled plan [1]

Re: [DISCUSS] FLIP-328: Allow source operators to determine isProcessingBacklog based on watermark lag

2023-09-14 Thread Jark Wu
Hi Dong, Please see my comments inline below. > Hmm.. can you explain what you mean by "different watermark delay > definitions for each source"? For example, "table1" defines a watermark with delay 5 seconds, "table2" defines a watermark with delay 10 seconds. They have different watermark

Re: [Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Shammon FY
Hi, Currently `sql-gateway` can be started with the script `sql-gateway.sh` on an existing node; it is more like a simple "standalone" node. I think it would be valuable if we could do more work to start it in k8s. For Xiaolong: Do you want to start a sql-gateway instance in the jobmanager pod? I think

Re: [Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Xiaolong Wang
Hi, I've experimented with this feature on K8s recently; here is what I tried:
1. Create a new kubernetes-jobmanager.sh script with the following content:
    #!/usr/bin/env bash
    $FLINK_HOME/bin/sql-gateway.sh start
    $FLINK_HOME/bin/kubernetes-jobmanager1.sh kubernetes-session
2. Build your own Flink

Re: [DISCUSS] FLIP-328: Allow source operators to determine isProcessingBacklog based on watermark lag

2023-09-14 Thread Dong Lin
Hi Jark, Do you have any follow-up comments? My gut feeling is that if we need to support per-source watermark lag specification in the future (not sure we have a use case for this right now), we can add such a config with a follow-up FLIP. The job-level config will still be

[jira] [Created] (FLINK-33091) Limit on outgoing connections to 64 seems unnecessary

2023-09-14 Thread Rogan Morrow (Jira)
Rogan Morrow created FLINK-33091: Summary: Limit on outgoing connections to 64 seems unnecessary Key: FLINK-33091 URL: https://issues.apache.org/jira/browse/FLINK-33091 Project: Flink Issue

Re: Flink and Flink shaded dependency

2023-09-14 Thread Sergey Nuyanzin
Yes, that's a reasonable question, thanks for raising it. I think this is not only about flink-shaded but about dependencies in general. I guess there is no rule of thumb, or at least I'm not aware of one. Here are my thoughts: 1. If bumping a dependency doesn't require breaking changes and passes

Re: [VOTE] Apache Flink Stateful Functions Release 3.3.0, release candidate #2

2023-09-14 Thread Robert Metzger
I did a shallow pass over the release for it to get the +3 votes. Please verify other aspects of the release when voting ;) +1 (binding) - maven clean install on the source tgz (not on an M1 macbook because of protoc, but on x86 linux) (not on Java 17 either ;) ) - staging repo seems fine -

Re: [DISCUSS]clean up the savepoints compatibility table

2023-09-14 Thread Jing Ge
Hi, According to the community update policy [1] for old releases, there is another, cleaner option: option 3: only keep the last three versions. Compatibility info for older versions can still be found in the old releases. If there are no concerns, the upcoming 1.18 release will choose option 3

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Jing Ge
Thanks for the information! On Thu, Sep 14, 2023 at 9:07 PM Gyula Fóra wrote: > https://flink.apache.org/downloads/#update-policy-for-old-releases > > On Thu, Sep 14, 2023 at 8:47 PM Jing Ge > wrote: > > > +1 Thanks! I have an off-track question: where could we find the > reference > > that

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Gyula Fóra
https://flink.apache.org/downloads/#update-policy-for-old-releases On Thu, Sep 14, 2023 at 8:47 PM Jing Ge wrote: > +1 Thanks! I have an off-track question: where could we find the reference > that the community only supports the last 2 minor releases? Thanks! > > > Best Regards, > Jing > > On

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Jing Ge
+1 Thanks! I have an off-track question: where could we find the reference that the community only supports the last 2 minor releases? Thanks! Best Regards, Jing On Thu, Sep 14, 2023 at 3:13 PM Ahmed Hamdy wrote: > Makes sense, > Thanks for the clarification. > Best Regards > Ahmed Hamdy > >

[jira] [Created] (FLINK-33090) CheckpointsCleaner clean individual checkpoint states in parallel

2023-09-14 Thread Yi Zhang (Jira)
Yi Zhang created FLINK-33090: Summary: CheckpointsCleaner clean individual checkpoint states in parallel Key: FLINK-33090 URL: https://issues.apache.org/jira/browse/FLINK-33090 Project: Flink

[jira] [Created] (FLINK-33089) Drop Flink 1.14 support

2023-09-14 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-33089: Summary: Drop Flink 1.14 support Key: FLINK-33089 URL: https://issues.apache.org/jira/browse/FLINK-33089 Project: Flink Issue Type: Improvement

Re: Inconsistent build Flink from source in readme and docs

2023-09-14 Thread Sergey Nuyanzin
Hi David, The links you mentioned seem correct. The move [1] to Maven 3.8.6 applies from 1.18.x onward, and this is mentioned in the docs for the master branch. At the same time, 1.17.x still requires 3.3.x, as stated in the 1.17 docs you mentioned, so it depends on the Flink version. [1]

Inconsistent build Flink from source in readme and docs

2023-09-14 Thread David Radley
Hello, I am looking to build Flink from source. I notice that the documentation https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/flinkdev/building/ says: In addition you need Maven 3 and a JDK (Java

Re: [Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Gyula Fóra
Hi! I don't completely understand what the content of such a CRD would be. Could you give a minimal example of what the Flink SQL Gateway CR YAML would look like? Adding a CRD would mean you need to add some operator/controller logic as well. Why not simply use a Deployment / StatefulSet in Kubernetes?
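As a rough illustration of the plain-Deployment alternative raised in this message, the manifest below is a minimal sketch, not a tested setup. The names `flink-sql-gateway` and `flink-session-rest`, the image tag, and the assumption that an existing session cluster's REST service is reachable under that name are all hypothetical. The `sql-gateway.sh start-foreground` mode and the `sql-gateway.endpoint.rest.address` option are taken from the Flink SQL Gateway documentation.

```yaml
# Hypothetical sketch: run the SQL Gateway as an ordinary Deployment,
# without any CRD or custom operator logic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-sql-gateway            # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink-sql-gateway
  template:
    metadata:
      labels:
        app: flink-sql-gateway
    spec:
      containers:
        - name: sql-gateway
          image: flink:1.17          # assumed image tag
          # Run in the foreground so the container stays alive.
          command: ["/opt/flink/bin/sql-gateway.sh"]
          args:
            - start-foreground
            - -Dsql-gateway.endpoint.rest.address=0.0.0.0
            # Assumed: REST service name of an existing session cluster.
            - -Drest.address=flink-session-rest
          ports:
            - containerPort: 8083    # default SQL Gateway REST port
```

A Service in front of this Deployment would still be needed to expose the gateway to clients; whether this covers the requirements discussed in the thread is left open here.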

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Ahmed Hamdy
Makes sense, Thanks for the clarification. Best Regards Ahmed Hamdy On Thu, 14 Sept 2023 at 14:07, Gyula Fóra wrote: > Hi Ahmed! > > As I mentioned in the first email, the Flink Operator explicitly aims to > make running Flink and Flink Platforms on Kubernetes easy. As most users > are

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Gyula Fóra
Hi Ahmed! As I mentioned in the first email, the Flink Operator explicitly aims to make running Flink and Flink Platforms on Kubernetes easy. As most users are platform teams supporting Flink inside a company or running a service it's basically always required to support several Flink versions at

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Ahmed Hamdy
Thanks Gyula, +1 for the proposal in general. May I ask why we are interested in supporting more versions than the ones supported by the community? For example, I understand all versions prior to 1.16 are now out of support, so why should we tie our compatibility to 4 versions behind? Best Regards Ahmed Hamdy

[jira] [Created] (FLINK-33088) Fix NullPointerException in RemoteTierConsumerAgent of tiered storage

2023-09-14 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-33088: Summary: Fix NullPointerException in RemoteTierConsumerAgent of tiered storage Key: FLINK-33088 URL: https://issues.apache.org/jira/browse/FLINK-33088 Project: Flink

Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-14 Thread Benchao Li
Thanks Zhanghao, Dewei for preparing the FLIP. I think this is a long-awaited feature, and I appreciate your effort, especially the "Other concerns" part you listed. Regarding the parallelism of transformations following the source transformation, it's indeed a problem that we initially want to

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread ConradJam
+1 On Thu, Sep 14, 2023 at 4:15 PM Yang Wang wrote: > Since the users could always use the old Flink Kubernetes Operator version > along with old Flink versions, I am totally in favor of this proposal to > reduce maintenance burden. > > Best, > Yang > > On Wed, Sep 6, 2023 at 6:15 PM Biao Geng wrote: > > > +1 for the

[DISCUSS]clean up the savepoints compatibility table

2023-09-14 Thread Jing Ge
Hi folks, The compatibility table [1] contains old Flink versions. Does it make sense to clean it up and remove some very old versions? My proposal would be: option 1: keep fewer than the last 10 versions in this table so that in most cases readers do not need to scroll to the right to read

[Discuss] CRD for flink sql gateway in the flink k8s operator

2023-09-14 Thread Dongwoo Kim
Hi all, I've been working on setting up a Flink SQL Gateway in a k8s environment and it got me thinking: what if we had a CRD for this? So I have a few quick questions below. 1. Is there ongoing work to create a CRD for the Flink SQL Gateway? 2. If not, would the community be open to considering a

Re: [VOTE] FLIP-357: Deprecate Iteration API of DataStream

2023-09-14 Thread Yangze Guo
+1 (binding) Best, Yangze Guo On Thu, Sep 14, 2023 at 6:02 PM Yuxin Tan wrote: > > +1 (non-binding) > > Best, > Yuxin > > > On Thu, Sep 14, 2023 at 5:14 PM Xintong Song wrote: > > > +1 (binding) > > > > Best, > > > > Xintong > > > > > > > > On Thu, Sep 14, 2023 at 3:48 PM Jing Ge > > wrote: > > > > >

Re: [VOTE] FLIP-357: Deprecate Iteration API of DataStream

2023-09-14 Thread Yuxin Tan
+1 (non-binding) Best, Yuxin On Thu, Sep 14, 2023 at 5:14 PM Xintong Song wrote: > +1 (binding) > > Best, > > Xintong > > > > On Thu, Sep 14, 2023 at 3:48 PM Jing Ge > wrote: > > > +1(binding) > > > > Best regards, > > Jing > > > > On Thu, Sep 14, 2023 at 7:31 AM Dong Lin wrote: > > > > > Thanks Wencong

Re: [VOTE] FLIP-361: Improve GC Metrics

2023-09-14 Thread Maximilian Michels
+1 (binding) On Thu, Sep 14, 2023 at 4:26 AM Venkatakrishnan Sowrirajan wrote: > > +1 (non-binding) > > On Wed, Sep 13, 2023, 6:55 PM Matt Wang wrote: > > > +1 (non-binding) > > > > > > Thanks for driving this FLIP > > > > > > > > > > -- > > > > Best, > > Matt Wang > > > > > > Replied

Re: [VOTE] FLIP-357: Deprecate Iteration API of DataStream

2023-09-14 Thread Xintong Song
+1 (binding) Best, Xintong On Thu, Sep 14, 2023 at 3:48 PM Jing Ge wrote: > +1(binding) > > Best regards, > Jing > > On Thu, Sep 14, 2023 at 7:31 AM Dong Lin wrote: > > > Thanks Wencong for the FLIP. > > > > +1 (binding) > > > > On Thu, Sep 14, 2023 at 12:36 PM Wencong Liu > wrote: > > >

Re: [DISCUSS] FLIP-327: Support stream-batch unified operator to improve job throughput when processing backlog data

2023-09-14 Thread Xintong Song
Sorry to join the discussion late. Overall, I think it's a good idea to support dynamically switching the operator algorithms between Streaming (optimized towards low latency + checkpointing support) and Batch (optimized towards throughput). This is indeed a big and complex topic, and I really

Re: [DISCUSS] FLIP-331: Support EndOfStreamWindows and isOutputOnEOF operator attribute to optimize task deployment

2023-09-14 Thread Xintong Song
Thanks for preparing this FLIP, Dong & Jinhao. I'm overall +1 to this proposal. This is helpful for some cases that we are dealing with. - Wencong and I are preparing guidelines for migrating from DataSet API to DataStream API. We noticed that users have to define a custom trigger in order to

Re: [DISSCUSS] Kubernetes Operator Flink Version Support Policy

2023-09-14 Thread Yang Wang
Since the users could always use the old Flink Kubernetes Operator version along with old Flink versions, I am totally in favor of this proposal to reduce maintenance burden. Best, Yang On Wed, Sep 6, 2023 at 6:15 PM Biao Geng wrote: > +1 for the proposal. > > Best, > Biao Geng > > Gyula Fóra, on Wed, Sep 6, 2023

Re: [VOTE] FLIP-355: Add parent dir of files to classpath using yarn.provided.lib.dirs

2023-09-14 Thread Biao Geng
+1 (non-binding) Best, Biao Geng On Thu, Sep 14, 2023 at 12:07 PM Yang Wang wrote: > +1 (binding) > > Best, > Yang > > On Thu, Sep 14, 2023 at 11:01 AM Becket Qin wrote: > > > +1 (binding) > > > > Thanks for the FLIP, Archit. > > > > Cheers, > > > > Jiangjie (Becket) Qin > > > > > > On Thu, Sep 14, 2023 at 10:31 AM Dong Lin

Re: [VOTE] FLIP-357: Deprecate Iteration API of DataStream

2023-09-14 Thread Jing Ge
+1(binding) Best regards, Jing On Thu, Sep 14, 2023 at 7:31 AM Dong Lin wrote: > Thanks Wencong for the FLIP. > > +1 (binding) > > On Thu, Sep 14, 2023 at 12:36 PM Wencong Liu wrote: > > > Hi dev, > > > > > > I'd like to start a vote on FLIP-357. > > > > > > Discussion thread: > >

[jira] [Created] (FLINK-33087) FlinkSql unable to parse field annotation information

2023-09-14 Thread yuanfenghu (Jira)
yuanfenghu created FLINK-33087: Summary: FlinkSql unable to parse field annotation information Key: FLINK-33087 URL: https://issues.apache.org/jira/browse/FLINK-33087 Project: Flink Issue

[DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-14 Thread Chen Zhanghao
Hi Devs, Dewei (cced) and I would like to start a discussion on FLIP-367: Support Setting Parallelism for Table/SQL Sources [1]. Currently, Flink Table/SQL jobs do not expose fine-grained control of operator parallelism to users. FLIP-146 [2] brings us support for setting parallelism for