What the default partition assignment strategy for KafkaSourceBuilder

2024-07-02 Thread Lei Wang
A simple flink task that consumes a kafka topic message and does some calculation. The number of partitions of the topic is 48, I set the parallel also 48 and expect one parallel consumes one partition. But after submitting the task I found that there's 5 parallels consuming two partitions and 5

Re:Re: Checkpoints and windows size

2024-06-19 Thread Feifan Wang
se. Maybe someone with more experience can give more valuable advice. > How can I find idle check point size of my project, I found below link but it > is not talking about parallelism. What do you mean "idle checkpoint size" ? —— Best regards, Feifan Wang 在

Re:Checkpoints and windows size

2024-06-19 Thread Feifan Wang
overhead, so this is a trade-off. My personal suggestion is to set the checkpoint interval to 5 minutes when using rocksdb incremental checkpoint. You can also make your own choice based on the impacts mentioned above. —— Best regards, Feifan Wang At 2024-06-19 12:08:57, "

Re: Problems with multiple sinks using postgres-cdc connector

2024-06-17 Thread Hongshun Wang
-> process1 -> process2 -> sink2 > `--> sink1 > > I get the errors described, where it appears that a second process is > created that attempts to use the current slot twice. > > On Mon, Jun 17, 2024 at 1:58 AM Hongshun Wang > wrote: > >> Hi David, >> >

Re: Problems with multiple sinks using postgres-cdc connector

2024-06-17 Thread Hongshun Wang
Hi David, > When I add this second sink, the postgres-cdc connector appears to add a second reader from the replication log, but with the same slot name. I don't understand what you mean by adding a second sink. Do they share the same source, or does each have a separate pipeline? If the former

Re: Force to commit kafka offset when stop a job.

2024-06-11 Thread Lei Wang
; Zhanghao Chen > ------ > *From:* Lei Wang > *Sent:* Thursday, June 6, 2024 16:54 > *To:* Zhanghao Chen ; ruanhang1...@gmail.com < > ruanhang1...@gmail.com> > *Cc:* user > *Subject:* Re: Force to commit kafka offset when stop a job. > > Thanks Zhanghao && Hang. >

Re: Force to commit kafka offset when stop a job.

2024-06-06 Thread Lei Wang
avepoint [1]. Flink which will > trigger a final offset commit on the final savepoint. > > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint > > Best, > Zhanghao Chen > ---

Force to commit kafka offset when stop a job.

2024-06-05 Thread Lei Wang
When stopping a flink job that consuming kafka message, how to force it to commit kafka offset Thanks, Lei

Re: problem with the heartbeat interval feature

2024-05-18 Thread Hongshun Wang
/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/source/reader/IncrementalSourceReaderWithCommit.java#L93 On Sat, May 18, 2024 at 12:34 AM Thomas Peyric wrote: > thanks Hongshun for your response ! > > Le ven. 17 mai 2024 à 07:51, Hongshun Wang a > écrit : &g

Re: problem with the heartbeat interval feature

2024-05-16 Thread Hongshun Wang
Hi Thomas, In debezium dos says: For the connector to detect and process events from a heartbeat table, you must add the table to the PostgreSQL publication specified by the publication.name

Re: Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-23 Thread Lei Wang
share the related rocksdb log which > may contain more detailed info ? > > On Fri, Apr 12, 2024 at 12:49 PM Lei Wang wrote: > >> >> I enable RocksDB native metrics and do some performance tuning. >> >> state.backend.rocksdb.block.cache-size is set to 128m,4 slots f

Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-11 Thread Lei Wang
I enable RocksDB native metrics and do some performance tuning. state.backend.rocksdb.block.cache-size is set to 128m,4 slots for each TaskManager. The observed result for one specific parallel slot: state.backend.rocksdb.metrics.block-cache-capacity is about 14.5M

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
Thanks very much, it finally works On Thu, Apr 11, 2024 at 8:27 PM Zhanghao Chen wrote: > Add a space between -yD and the param should do the trick. > > Best, > Zhanghao Chen > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 19:40

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
tyled CLI for YARN jobs where "-yD" instead of "-D" > should be used. > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 12:39 > *To:* Biao Geng > *Cc:* user > *Subject:* Re: How to enable RocksDB native metrics? &g

Re: Optimize exact deduplication for tens of billions data per day

2024-04-11 Thread Lei Wang
Hi Peter, I tried,this improved performance significantly,but i don't know exactly why. According to what i know, the number of keys in RocksDB doesn't decrease. Any specific technical material about this? Thanks, Lei On Fri, Mar 29, 2024 at 9:49 PM Lei Wang wrote: > Perhaps I can ke

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
ocksdb-native-metrics> >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> >> Sent from my iPhone >> >> On Apr 7, 2024, at 6:03 PM, Lei Wang wrote: >> >>  >> I want to enable

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
cs> >> [image: favicon.png] >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> >> >

Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-09 Thread Xiaolong Wang
Hi, I found a ClassNotFound exception when using Flink 1.19's AsyncScalarFunction. Stack trace: Caused by: java.lang.ClassNotFoundException: > org.apache.commons.text.StringSubstitutor > > at java.net.URLClassLoader.findClass(Unknown Source) ~[?:?] > > at java.lang.ClassLoader.loadClass(Unknown

Re: How to enable RocksDB native metrics?

2024-04-07 Thread Lei Wang
g/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics> > > Sent from my iPhone > > On Apr 7, 2024, at 6:03 PM, Lei Wang wrote: > >  > I want to enable it only for specified jobs, how can I specify the > configurations on cmd line when submitti

Re: How to enable RocksDB native metrics?

2024-04-07 Thread Lei Wang
; https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#rocksdb-native-metrics >> (RocksDB Native Metrics) >> >> >> Best, >> Zakelly >> >> On Sun, Apr 7, 2024 at 4:47 PM Lei Wang wrote: >> >>> >>> U

How to enable RocksDB native metrics?

2024-04-06 Thread Lei Wang
Using big state and want to do some performance tuning, how can i enable RocksDB native metrics? I am using Flink 1.14.4 Thanks, Lei

Re: Optimize exact deduplication for tens of billions data per day

2024-03-29 Thread Lei Wang
helps, > Peter > > On Fri, Mar 29, 2024, 09:08 Lei Wang wrote: > >> >> Use RocksDBBackend to store whether the element appeared within the last >> one day, here is the code: >> >> *public class DedupFunction extends KeyedProcessFunction {* >>

Optimize exact deduplication for tens of billions data per day

2024-03-29 Thread Lei Wang
Use RocksDBBackend to store whether the element appeared within the last one day, here is the code: *public class DedupFunction extends KeyedProcessFunction {* *private ValueState isExist;* *public void open(Configuration parameters) throws Exception {* *ValueStateDescriptor

Re:Understanding checkpoint/savepoint storage requirements

2024-03-27 Thread Feifan Wang
Hi Robert : Your understanding are right ! Add some more information : JobManager not only responsible for cleaning old checkpoints, but also needs to write metadata file to checkpoint storage after all taskmanagers have taken snapshots. --- Best Feifan Wang

Re: Understanding RocksDBStateBackend in Flink on Yarn on AWS EMR

2024-03-26 Thread Yang Wang
Usually, you should use the HDFS nameservice instead of the NameNode hostname:port to avoid NN failover. And you could find the supported nameservice in the hdfs-site.xml in the key *dfs.nameservices*. Best, Yang On Fri, Mar 22, 2024 at 8:33 PM Sachin Mittal wrote: > So, when we create an EMR

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Huajie Wang
Congratulations Best, Huajie Wang Leonard Xu 于2024年3月20日周三 21:36写道: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - Git

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Huajie Wang
Congratulations Best, Huajie Wang Leonard Xu 于2024年3月20日周三 21:36写道: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - Git

Re: flink operator 高可用任务偶发性报错unable to update ConfigMapLock

2024-03-20 Thread Yang Wang
这种一般是因为APIServer那边有问题导致单次的ConfigMap renew lease annotation的操作失败,Flink默认会重试的 如果你发现因为这个SocketTimeoutException原因导致了任务Failover,可以把下面两个参数调大 high-availability.kubernetes.leader-election.lease-duration: 60s high-availability.kubernetes.leader-election.renew-deadline: 60s Best, Yang On Tue, Mar 12,

Re: Re:RE: RE: flink cdc动态加表不生效

2024-03-07 Thread Hongshun Wang
Hi, casel chan, 社区已经对增量框架实现动态加表(https://github.com/apache/flink-cdc/pull/3024 ),预计3.1对mongodb和postgres暴露出来,但是Oracle和Sqlserver目前并没暴露,你可以去社区参照这两个框架,将参数打开,并且测试和适配。 Best, Hongshun

Re: Jobmanager restart after it has been requested to stop

2024-02-02 Thread Yang Wang
If you could find the "Deregistering Flink Kubernetes cluster, clusterId" in the JobManager log, then it is not the expected behavior. Having the full logs of JobManager Pod before restarted will help a lot. Best, Yang On Fri, Feb 2, 2024 at 1:26 PM Liting Liu (litiliu) via user <

Re: [DISCUSS] Hadoop 2 vs Hadoop 3 usage

2024-01-15 Thread Yang Wang
I could share some metrics about Alibaba Cloud EMR clusters. The ratio of Hadoop2 VS Hadoop3 is 1:3. Best, Yang On Thu, Dec 28, 2023 at 8:16 PM Martijn Visser wrote: > Hi all, > > I want to get some insights on how many users are still using Hadoop 2 > vs how many users are using Hadoop 3.

Re: Flink HA with Zookeeper and Docker Compose: unable to startup a working setup.

2024-01-15 Thread Yang Wang
Could you please configure the same HA configurations for TaskManager as well? It seems that the TaskManager container does not use a correct URL when contacting with ResourceManager. Best, Yang On Fri, Dec 29, 2023 at 11:13 PM Alessio Bernesco Làvore < alessio.berne...@gmail.com> wrote: >

Re: Deploying the K8S operator sample on GKE Autopilot : Association with remote system [akka.tcp://flink@basic-example.default:6123] has failed,

2024-01-15 Thread Yang Wang
Could you please directly use the JobManager Pod IP address instead of K8s service name(basic-example.default) and have a try with curl/wget? It seems that the JobManager K8s service could not be accessed. Best, Yang On Sat, Jan 13, 2024 at 1:24 AM LINZ, Arnaud wrote: > Hi, > > Some more

Re: Flink Kubernetes HA

2024-01-15 Thread Yang Wang
The fabric8 K8s client is using PATCH to replace get-and-update in v6.6.2. That's why you also need to give PATCH permission for the K8s service account. This would help to decrease the pressure of K8s APIServer. You could find more information here[1]. [1].

[flink-k8s-connector] In-place scaling up often takes several times till it succeeds.

2023-12-06 Thread Xiaolong Wang
Hi, I'm playing with a Flink 1.18 demo with the auto-scaler and the adaptive scheduler. The operator can correctly collect data and order the job to scale up, but it'll take the job several times to reach the required parallelism. E.g. The original parallelism for each vertex is something like

Re: Flink Kubernetes operator keeps reporting REST client timeout.

2023-11-20 Thread Xiaolong Wang
Seems the operator didn't get restarted automatically after the configmap is changed. After a roll-out restart, the exception disappeared. Never mind this issue. Thanks. On Tue, Nov 21, 2023 at 11:31 AM Xiaolong Wang wrote: > Hi, > > Recently I upgraded the flink-kubernetes-operator f

Flink Kubernetes operator keeps reporting REST client timeout.

2023-11-20 Thread Xiaolong Wang
Hi, Recently I upgraded the flink-kubernetes-operator from 1.4.0 to 1.6.1 to use Flink 1.18. After that, the operator kept reporting the following exception: 2023-11-21 03:26:50,505 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO > ][sn-push/sn-push-decision-maker-log-s3-hive-prd] Resource

Re:Custom Metrics not showing in prometheus

2023-09-18 Thread Matt Wang
to see if there is any error information about the metrics; [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#metrics-reporter-%3Cname%3E-filter-excludes -- Best, Matt Wang Replied Message | From | patricia lee | | Date | 09/18/2023 16:58

Re: Flink K8S operator does not support IPv6

2023-09-05 Thread Xiaolong Wang
FYI, adding environment variables of ` KUBERNETES_DISABLE_HOSTNAME_VERIFICATION=true` works for me. This env variable needs to be added to both the Flink operator and the Flink job definition. On Tue, Aug 8, 2023 at 12:03 PM Xiaolong Wang wrote: > Ok, thank you. > > On Tue, Aug 8, 2

Re: Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Ok, thank you. On Tue, Aug 8, 2023 at 11:22 AM Peter Huang wrote: > We will handle it asap. Please check the status of this jira > https://issues.apache.org/jira/browse/FLINK-32777 > > On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang > wrote: > >> Hi, >> >> I w

Flink K8S operator does not support IPv6

2023-08-07 Thread Xiaolong Wang
Hi, I was testing flink-kubernetes-operator in an IPv6 cluster and found out the below issues: *Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname > fd70:e66a:970d::1 not verified:* > > *certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=* > > *DN:

[Bug-report]Flink-operator 1.6.0 repo does not exist yet

2023-08-02 Thread Xiaolong Wang
Hi, I noticed that the newest documentation of the flink-operator has pointed to v1.6.0, yet when using the `helm repo add flink-operator-repo https://downloads.apache.org/flink/flink-kubernetes-operator-1.6.0/` command to install, it turns out that the given URL does not exist. I suppose that

Re: Parallelism under reactive scaling with slot sharing groups

2023-07-31 Thread Allen Wang
; using the master branch. > > > Best, > Weihua > > > On Tue, Jul 25, 2023 at 2:56 AM Allen Wang wrote: > >> Hello, >> >> Our job has operators of source -> sink -> global committer. We have >> created two slot sharing groups, one for source and

Re: 退订

2023-07-26 Thread Edward Wang
退订 wang <24248...@163.com> 于2023年7月13日周四 07:34写道: > 退订 -- Best Regards, *Yaohua Wang 王耀华* School of Software Technology, Xiamen University Tel: (+86)187-0189-5935 E-mail: wangyaohua2...@gmail.com

退订

2023-07-26 Thread Edward Wang
*退订*

Parallelism under reactive scaling with slot sharing groups

2023-07-24 Thread Allen Wang
Hello, Our job has operators of source -> sink -> global committer. We have created two slot sharing groups, one for source and sink and one for global committer. The global committer has specified max parallelism of 1. No max parallelism set with the source/sink while there is a system level

Re: How to resume a job from checkpoint with the SQL gateway.

2023-07-18 Thread Xiaolong Wang
able/sqlclient/#terminating-a-job > > Best, > Shammon FY > > > On Tue, Jul 18, 2023 at 9:56 AM Xiaolong Wang > wrote: > >> Hi, Shammon, >> >> I know that the job manager can auto-recover via HA configurations, but >> what if I want to upgrade the running F

Unsubscribe

2023-07-17 Thread wang
Unsubscribe

Unsubscribe

2023-07-17 Thread William Wang

Hadoop Error on ECS Fargate

2023-07-13 Thread Wang, Mengxi X via user
persist. Can anybody help please? Best wishes, Mengxi Wang This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, malicious content and monitoring of electronic messages. If you

How to resume a job from checkpoint with the SQL gateway.

2023-07-12 Thread Xiaolong Wang
Hi, I'm currently working on providing a SQL gateway to submit both streaming and batch queries. My question is, if a streaming SQL is submitted and then the jobmanager crashes, is it possible to resume the streaming SQL from the latest checkpoint with the SQL gateway ?

退订

2023-07-12 Thread wang
退订

Unsubscribe

2023-07-12 Thread wang
Unsubscribe

Part files generated in reactive mode

2023-07-04 Thread Wang, Mengxi X via user
Hi, We want to process one 2GB file and the output should also be a single 2GB file, but after we enabled reactive mode it generated several hundred small output files instead of one 2GB file. Can anybody help please? Best wishes, Mengxi Wang This message is confidential and subject to terms

SQL-gateway Failed to Run

2023-07-03 Thread Xiaolong Wang
Hi, I've tested the Flink SQL-gateway to run some simple Hive queries but met some exceptions. Environment Description: Run on : Kubernetes Deployment Mode: Session Mode (created by a flink-kubernetes-operator) Steps to run: 1. Apply a `flinkdeployment` of flink session cluster to flink operator

Re: Default Log4j properties in Native Kubernetes

2023-06-21 Thread Yang Wang
I assume you are using "*bin/flink run-application*" to submit a Flink application to K8s cluster. Then you could simply update your local log4j-console.properties, it will be shipped and mounted to JobManager/TaskManager pods via ConfigMap. Best, Yang Vladislav Keda 于2023年6月20日周二 22:15写道: >

Re: 退订

2023-05-10 Thread Hongshun Wang
如果需要取消订阅 user-zh@flink.apache.org 邮件组,请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org ,参考[1] [1] https://flink.apache.org/zh/community/ On Wed, May 10, 2023 at 1:38 AM Zhanshun Zou wrote: > 退订 >

Re: 使用Flink SQL如何实现支付对帐超时告警?

2023-05-10 Thread Hongshun Wang
Hi casel.chen, 我理解你的意思是: 希望在ThirdPartyPaymentStream一条数据达到的30分钟后,*再触发查询* ,如果此时该数据在PlatformPaymentStream中还未出现,说明超时未支付,则输入到下游。而不是等ThirdPartyPaymentStream数据达到时再判断是否超时,因为此时虽然超时达到,但是也算已支付,没必要再触发报警了。 如果是流计算,可以采用timer定时器延时触发。 对于sql, 我个人的一个比较绕的想法是(供参考,不一定对):是通过Pulsar

Re: 退订

2023-05-09 Thread Hongshun Wang
Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh-unsubscr...@flink.apache.org , and you can refer[1][2] for more details. 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理邮件订阅。

Re: 退订

2023-05-06 Thread Hongshun Wang
Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh-unsubscr...@flink.apache.org , and you can refer[1][2] for more details. 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理邮件订阅。

Re: streaming.api.operators和streaming.runtime.operators的区别是啥?

2023-05-06 Thread Hongshun Wang
我来谈一下我个人的看法,streaming.api.operators是提供给用户使用的stream api, 用户可以使用和扩展该接口。而streaming.runtime.operators是用户侧不感知,在执行时由flink自动调用的。比如: Sink用户可以自己设置,如kafkaSink。但是输出时的state处理和事务commit(CommitterOperator)是Flink根据不同类型的Sink自动生成的统一逻辑,用户无需自己设置和实现。 Best Hongshun On Sat, May 6, 2023 at 11:57 AM yidan zhao wrote:

Re: flink issue可以登录,但是flink中文邮箱账号密码错误,是出现什么原因了嘛

2023-05-05 Thread Hongshun Wang
> > flink issue可以登录 这个是jira账号吗? flink中文邮箱账号密码 什么是flink中文邮箱账号 ?有无登陆页面链接 On Wed, Apr 19, 2023 at 11:36 AM kcz <573693...@qq.com.invalid> wrote: > 请帮忙看看是我哪里出问题了嘛?我的账号是kcz。我想咨询大佬flink avro的问题 > > > > > kcz > 573693...@qq.com > > > >

Re: 退订

2023-05-05 Thread Hongshun Wang
Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh-unsubscr...@flink.apache.org , and you can refer[1][2] for more details. 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理邮件订阅。

Re: 退订

2023-05-05 Thread Hongshun Wang
Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh-unsubscr...@flink.apache.org , and you can refer[1][2] for more details. 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理邮件订阅。

Re: 退订

2023-05-05 Thread Hongshun Wang
Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh-unsubscr...@flink.apache.org , and you can refer[1][2] for more details. 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理邮件订阅。 On

Re: 退订

2023-05-05 Thread Hongshun Wang
如果需要取消订阅 u...@flink.apache.org 和 d...@flink.apache.org 邮件组,请发送任意内容的邮件到 user-unsubscr...@flink.apache.org 和 dev-unsubscr...@flink.apache.org ,参考[1] [1] https://flink.apache.org/zh/community/ On Fri, May 5, 2023 at 3:24 PM wuzhongxiu wrote: > 退订 > > > > | | > go574...@163.com > | > | >

Re: Flink实时计算平台在k8s上以Application模式启动作业如何实时同步作业状态到平台?

2023-04-11 Thread Yang Wang
可以通过JobResultStore[1]来获取任务最终的状态,flink-kubernetes-operator也是这样来获取的 [1]. https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+Introduce+the+JobResultStore Best, Yang Weihua Hu 于2023年3月22日周三 10:27写道: > Hi > > 我们内部最初版本是通过 cluster-id 来唯一标识一个 application,同时认为流式任务是长时间运行的,不应该主动退出。如果该 >

Re: 监控flink的prometheus经常OOM

2023-04-11 Thread Yang Wang
可以通过给Prometheus server来配置metric_relabel_configs[1]来控制采集哪些metrics [1]. https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs Best, Yang casel.chen 于2023年3月22日周三 13:47写道: >

Whether Flink SQL window operations support "Allow Lateness and SideOutput"?

2023-02-20 Thread wang
Hi dear engineers, One question as title: Whether Flink SQL window operations support "Allow Lateness and SideOutput"? Just as supported in Datastream api (allowedLateness and sideOutputLateData) like: SingleOutputStreamOperator<>sumStream = dataStream.keyBy().timeWindow()

Whether Flink SQL window operations support "Allow Lateness and SideOutput"?

2023-02-20 Thread wang
Hi dear engineers, One question as title: Whether Flink SQL window operations support "Allow Lateness and SideOutput"? Just as supported in Datastream api (allowedLateness and sideOutputLateData) like: SingleOutputStreamOperator<>sumStream = dataStream.keyBy().timeWindow()

Re: "Error while retrieving the leader gateway" when using Kubernetes HA

2023-01-30 Thread Yang Wang
I assume you are using the standalone mode. Right? For the native K8s mode, the leader address should be *akka.tcp://flink@JM_POD_IP:6123/user/rpc/dispatcher_1 *when HA enabled. Best, Yang Anton Ippolitov via user 于2023年1月31日周二 00:21写道: > This is actually what I'm already doing, I'm only

Re: Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Yang Wang
The "JAR file does not exist" exception comes from the JobManager side, not on the client. Please be aware that the local:// scheme in the jarURI means the path in the JobManager pod. You could use an init-container to download your user jar and mount it to the JobManager main-container. Refer to

Re: Supplying jar stored at S3 to flink to run the job in kubernetes

2023-01-16 Thread Yang Wang
Do you build your own flink-kubernetes-operator image with the flink-s3-fs plugin bundled[1]? [1]. https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.3/docs/custom-resource/overview/#flinksessionjob-spec-overview Best, Yang Weihua Hu 于2023年1月17日周二 10:47写道: > Hi, Rahul

Re: Flink Job Manager Recovery from EKS Node Terminations

2023-01-11 Thread Yang Wang
First, JobManager does not store any persistent data to local when the Kubernetes HA + S3 used. It means that you do not need to mount a PV for JobMananger deployment. Secondly, node failures or terminations should not cause the CrashLoopBackOff status. One possible reason I could imagine is a

Re: The use of zookeeper in flink

2023-01-03 Thread Yang Wang
The reason why the running jobs try to failover with zookeeper outage is that the JobManager lost leadership. Having a standby JobManager or not makes no difference. Best, Yang Matthias Pohl via user 于2023年1月2日周一 20:51写道: > And I screwed up the reply again. -.- Here's my previous response for

Re: How to get failed streaming Flink job log in Flink Native K8s mode?

2023-01-03 Thread Yang Wang
I think you might need a sidecar container or daemonset to collect the Flink logs and store into a persistent storage. You could find more information here[1]. [1]. https://www.alibabacloud.com/blog/best-practices-of-kubernetes-log-collection_596356 Best, Yang hjw 于2022年12月22日周四 23:28写道: > On

Re: 退订

2022-12-27 Thread Lijie Wang
Hi,退订请发送任意内容至邮箱user-zh-unsubscr...@flink.apache.org Best, Lijie DannyLau 于2022年12月27日周二 09:54写道: > 退订 > | | > 刘朝兵 > |

Re: Stand alone K8s HA mode with Static Tokens Used by Service Accounts

2022-11-24 Thread Yang Wang
IIUC, the fabric8 Kubernetes-client 5.5.0 should already support to reload the latest kube config if received 401 error. Refer to the following PR[1] for more information. Please share your feedback here if it still could not work. [1]. https://github.com/fabric8io/kubernetes-client/pull/2731

Re: flink作业提交运行后如何监听作业状态发生变化?

2022-11-23 Thread Yang Wang
其实可以参考Flink Kubernetes Operator里面的做法,设置execution.shutdown-on-application-finish参数为false 然后通过轮询Flink RestAPI拿到job的状态,job结束了再主动停掉Application cluster Best, Yang JasonLee <17610775...@163.com> 于2022年11月24日周四 09:59写道: > Hi > > > 可以通过 Flink 的 Metric 和 Yarn 的 Api 去获取任务的状态(任务提交到 yarn 的话) > > > Best >

Re: Optimize ApplicationDeployer API design

2022-11-23 Thread Yang Wang
Just Kindly remind, you attached images could not show normally. Given that *ApplicationDeployer* is not only used for Yarn application mode, but also native Kubernetes, I am not sure which way you are referring to return the applicationId. We already print the applicationId in the client logs.

Re: support escaping `#` in flink job spec in Flink-operator

2022-11-08 Thread Yang Wang
This is a known limit of the current Flink options parser. Refer to FLINK-15358[1] for more information. [1]. https://issues.apache.org/jira/browse/FLINK-15358 Best, Yang Gyula Fóra 于2022年11月8日周二 14:41写道: > It is also possible that this is a problem of the Flink native Kubernetes >

Re: flink on k8s 提交作业,使用 oss 作为 checkpoint 地址,但找不到 oss

2022-11-07 Thread Lijie Wang
flink-oss-fs-hadoop-1.13.6.jar 这个 jar 需要放到 flink 的 lib 目录下 Best, Lijie highfei2011 于2022年11月1日周二 16:23写道: > 包冲突了。 > > > 在 2022年11月1日 15:39,highfei2011 写道: > > > flink 版本:apache flink 1.13.6 flink operator 版本: 1.2.0 > 提交命令:kubernetes-jobmanager.sh kubernetes-application 异常: Caused by: >

Re: [DISCUSS ] add --jars to support users dependencies jars.

2022-10-27 Thread Yang Wang
Thanks Jacky Lau for starting this discussion. I understand that you are trying to find a convenient way to specify dependency jars along with user jar. However, let's try to narrow down by differentiating deployment modes. # Standalone mode No matter you are using the standalone mode on virtual

Re: configMap value error when using flink-operator?

2022-10-26 Thread Yang Wang
Maybe we could change the values of *taskmanager.numberOfTaskSlots* and *parallelism.default *in flink-conf.yaml of Kubernetes operator to 1, which are aligned with the default values in Flink codebase. Best, Yang Gyula Fóra 于2022年10月26日周三 15:17写道: > Hi! > > I agree that this might be

Re: batch job 结束时, flink-k8s-operator crd 状态展示不清晰

2022-10-25 Thread Yang Wang
从1.15开始,任务结束不会主动把JobManager删除掉了。所以Kubernetes Operator就可以正常查到Job状态并且更新 Best, Yang ¥¥¥ 于2022年10月25日周二 15:58写道: > 退订 > > > > > --原始邮件-- > 发件人: > "user-zh" > > 发送时间:2022年10月25日(星期二) 下午3:33 > 收件人:"user-zh" > 主题:batch

Re: Flink Native K8S RBAC

2022-10-20 Thread Yang Wang
I have created a ticket[1] to fill the missing part in the native K8s documentation. [1]. https://issues.apache.org/jira/browse/FLINK-29705 Best, Yang Gyula Fóra 于2022年10月20日周四 13:37写道: > Hi! > > As a reference you can look at how the Flink Kubernetes Operator manages > RBAC settings: > > >

Re: Activate Flink HA without checkpoints on k8S

2022-10-19 Thread Yang Wang
Add some more information to Gyula's comment. For application mode without checkpoint, you do not need to activate the HA since it will not take any effect and the Flink job will be submitted again after the JobManager restarted. Because the job submission happens on the JobManager side. For

Watermark generating mechanism in Flink SQL

2022-10-17 Thread wang
Hi dear engineers, I have one question about watermark generating mechanism in Flink SQL. There are two mechanisms called Periodic Watermarks and Punctuated Watermarks, I want to use Periodic Watermarks with interval 5 seconds (meaning watermarks will be generated every 5 seconds), how

Watermark generating mechanism in Flink SQL

2022-10-17 Thread wang
Hi dear engineers, I have one question about watermark generating mechanism in Flink SQL. There are two mechanisms called Periodic Watermarks and Punctuated Watermarks, I want to use Periodic Watermarks with interval 5 seconds (meaning watermarks will be generated every 5 seconds), how

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
Currently, exporting the env "HADOOP_CONF_DIR" could only work for native K8s integration. The flink client will try to create the hadoop-config-volume automatically if hadoop env found. If you want to set the HADOOP_CONF_DIR in the docker image, please also make sure the specified hadoop conf

Re: PartitionNotFoundException

2022-09-28 Thread Lijie Wang
Hi, 可以尝试增大一下 taskmanager.network.request-backoff.max 的值。默认值是 1,也就是 10 s。 上下游可能是并发部署的,所以是有可能下游请求 partition 时,上游还没部署完成,增大 taskmanager.network.request-backoff.max 可以增加下游的等待时间和重试次数,减小出现 PartitionNotFoundException 的概率。 Best, Lijie yidan zhao 于2022年9月28日周三 17:35写道: >

Re: Flink任务异常停止

2022-09-27 Thread Lijie Wang
建议 dump 下 TM 内存看下具体内存使用情况 Best, Lijie lxk 于2022年9月28日周三 09:46写道: > 最近Flink任务运行一段时间后就会自动停止。从JM和TM能看到的有效信息只有下面这段: > > 2022-09-24 07:18:16,303 INFO > org.apache.flink.yarn.YarnTaskExecutorRunner [] - RECEIVED > SIGNAL 15: SIGTERM. Shutting down as requested. > 2022-09-24

Re: native k8s部署模式下使用HA架构的HDFS集群无法正常连接

2022-09-21 Thread Yang Wang
你在Flink client端提交任务前设置一下HADOOP_CONF_DIR环境变量 然后再运行flink run-application命令 Best, Yang yanfei lei 于2022年9月22日周四 11:04写道: > Hi Tino, > 从org.apache.flink.core.fs.FileSystem.java > < >

Re: serviceAccount permissions issue for high availability in operator 1.1

2022-09-20 Thread Yang Wang
ay to change that to use standalone K8s? I haven't > seen anything about that in the docs, besides a mention that standalone > support is coming in version 1.2 of the operator. > > Thanks, > > Javier > > > On Thu, Sep 8, 2022, 22:50 Yang Wang wrote: > >> Since t

Re: serviceAccount permissions issue for high availability in operator 1.1

2022-09-08 Thread Yang Wang
Since the flink-kubernetes-operator is using native K8s integration[1] by default, you need to give the permissions of pod and deployment as well as ConfigMap. You could find more information about the RBAC here[2]. [1].

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-08 Thread Yang Wang
uld be a warning then > > What about the 1st error we encountered regarding the kube/config file > exception? > > > Thank you so much, > Best, > Tamir > > -- > *From:* Yang Wang > *Sent:* Thursday, September 8, 2022 7:08 AM > *To:*

Re: [Flink Kubernetes Operator] FlinkSessionJob crd spec jarURI

2022-09-08 Thread Yang Wang
Given that the "local://" schema means the jar is available in the image/container of JobManager, so it could only be supported in the K8s application mode. If you configure the jarURI to "file://" schema for session cluster, it means that this jar file should be available in the

Re: Deploying Jobmanager on k8s as a Deployment

2022-09-07 Thread Yang Wang
For native K8s integration, the Flink ResourceManager will delete the JobManager K8s deployment as well as the HA data once the job reached a globally terminal state. However, it is indeed a problem for standalone mode since the JobManager will be restarted again even the job has finished. I

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-07 Thread Yang Wang
-events-insertion-cluster-config-map" already > exists. > > Log file is enclosed. > > Thanks, > Tamir. > > -- > *From:* Yang Wang > *Sent:* Monday, September 5, 2022 3:03 PM > *To:* Tamir Sagi > *Cc:* user@flink.apache.org ; Lihi Peretz < > lihi.per...@

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

2022-09-05 Thread Yang Wang
Could you please check whether the "kubernetes.config.file" is configured to /opt/flink/.kube/config in the Flink configmap? It should be removed before creating the Flink configmap. Best, Yang Tamir Sagi 于2022年9月4日周日 18:08写道: > Hey All, > > We recently updated to Flink 1.15.1. We deploy

  1   2   3   4   5   6   7   8   9   10   >