Flink 的 Taskmanager 间网络连接数、task之间的result sub partition 数对任务性能的影响。

2022-10-13 Thread yidan zhao
任务假设: 任务从kafka读取数据,经过若干复杂处理(process、window、join、等等),然后sink到kafka。 并发最高240(kafka分区数),当前采用全部算子相同并发方式部署。 算子间存在 hash、forward、rebalance 等分区情况。 此处假设 A 和 B 算子之间是 rebalance。 C 和 D 算子直接是 hash 分区(无数据倾斜)。ABCD都是240并发。 其他算子暂忽略。 TM连接数: Flink 的 taskmanager 之间的共享 tcp

Re: Flink SQL 中同时写入多个 sink 时,是否能够保证先后次序

2022-10-13 Thread Zhiwen Sun
好的,谢谢大家,之前也想过这个方案,复用/继承 JdbcDynamicTableSink 相关代码自定义 connector 。 Zhiwen Sun On Fri, Oct 14, 2022 at 10:08 AM yidan zhao wrote: > 在一个自定义sink中实现先写database,再发消息。 > > 或者2个都是自定义的,但是不能通过sink,因为sink后就没数据了。通过process,第一个process完成写入database后,后续process发送消息。 > > Shuo Cheng 于2022年10月12日周三 16:59写道: > > >

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Jark Wu
Hi Xintong, In terms of code, I think it's not complicated. It's all about we need a public discussion for the new metric name. And we don't want to block the release for the rarely used metric. Best, Jark On Fri, 14 Oct 2022 at 10:07, Xintong Song wrote: > @Qingsheng, > > I'm overall +1 to

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Xintong Song
@Qingsheng, I'm overall +1 to your proposal, with only one question: How complicated is it to come up with a metric for the internal traffic? I'm asking because, as the new feature is already out for 1.15 & 1.16, it would be nice if the corresponding new metrics can also be available in these

Re: Utilizing Kafka headers in Flink Kafka connector

2022-10-13 Thread Shengkai Fang
hi. You can use SQL API to parse or write the header in the Kafka record[1] if you are using Flink SQL. Best, Shengkai [1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/#available-metadata Yaroslav Tkachenko 于2022年10月13日周四 02:21写道: > Hi, > > You can

Re: Re: flink cdc能否同步DDL语句?

2022-10-13 Thread Shengkai Fang
Hi. 可以从这个地方入手看看 https://github.com/ververica/flink-cdc-connectors/blob/master/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/source/reader/MySqlRecordEmitter.java#L95 Best, Shengkai casel.chen 于2022年10月11日周二 10:58写道: > 可以给一些hints吗?看哪些类? > > > > > > > > > > > > > >

Re: 退订

2022-10-13 Thread Shengkai Fang
你好,可以发送邮件到 user-zh-unsubscr...@flink.apache.org 来退订。 Best, Shengkai 13341000780 <13341000...@163.com> 于2022年10月10日周一 18:21写道: > > 退订 > > > > > -- > 发自我的网易邮箱手机智能版

Re: Build failing when Flink version upgrade from 1.11.6 to 1.15.0

2022-10-13 Thread Shengkai Fang
Hi. I read the trace and I find nothing is related about the flink... could you also give us some code snippets about the blocking test. Best, Shengkai Pappula, Prasanna via user 于2022年10月14日周五 00:06写道: > > > I have upgraded the flink version from 1.11.6 to 1.15.0. Build is failing. > It

Re: HA not working in standalone mode for operator 1.2

2022-10-13 Thread Javier Vegas
The jars that my build version creates have a version number, something like myapp-2.2.11.jar. I am lazy and want to avoid having to update the jarURI param (required in native mode) every time I deploy a new version of my app, and just update the Docker image I am using. Another solution would be

Re: Activate Flink HA without checkpoints on k8S

2022-10-13 Thread Gyula Fóra
Without HA, if the jobmanager goes down, job information is lost so the job won’t be restarted after the JM comes back up. Gyula On Thu, 13 Oct 2022 at 19:07, marco andreas wrote: > > > Hello, > > Can someone explain to me what is the point of using HA when deploying an > application cluster

Re: HA not working in standalone mode for operator 1.2

2022-10-13 Thread Gyula Fóra
Before we dive further into this can you please explain the jarURI problem your are trying to solve by switching to standalone? The native mode should work well in almost any setup. Gyula On Thu, 13 Oct 2022 at 21:41, Javier Vegas wrote: > Hi, I have a S3 HA Flink app that works as expected

Re: Validation error trying to use standalone mode with operator 1.2.0

2022-10-13 Thread Javier Vegas
Thanks, that fixed the problem! Sadly I am now running into a different problem with S3 HA when running in standalone mode, see https://lists.apache.org/thread/rf62htkr6govpr41fj3br4mzplsg9vg8 Cheers, Javier El vie, 7 oct 2022 a las 22:02, Gyula Fóra () escribió: > Hi! > > Seems like you still

HA not working in standalone mode for operator 1.2

2022-10-13 Thread Javier Vegas
Hi, I have a S3 HA Flink app that works as expected deployed via operator 1.2 in native mode, but I am seeing errors when switching to standalone mode (which I want to do mostly to save me having to set jarURI explicitly). I can see the job manager writes the JobGraph in S3, and in the web UI I

Re: allowNonRestoredState doesn't seem to be working

2022-10-13 Thread Yaroslav Tkachenko
I'm confident I'm hitting a bug, I guess I'm the first one trying this recovery in the standalone mode :-D Created https://issues.apache.org/jira/browse/FLINK-29633 On Thu, Oct 13, 2022 at 8:45 AM Yaroslav Tkachenko wrote: > Thanks folks, I understand this can be a limitation when redeploying.

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Qingsheng Ren
Hi devs and users, It looks like we are getting an initial consensus in the discussion so I started a voting thread [1] just now. Looking forward to your feedback! [1] https://lists.apache.org/thread/ozlf82mkm6ndx2n1vdgq532h156p4lt6 Best, Qingsheng On Thu, Oct 13, 2022 at 10:41 PM Jing Ge

Activate Flink HA without checkpoints on k8S

2022-10-13 Thread marco andreas
Hello, Can someone explain to me what is the point of using HA when deploying an application cluster with a single JM and the checkpoints are not activated. AFAK when the pod of the JM goes down kubernetes will restart it anyway so we don't need to activate the HA in this case. Maybe there's

Re: allowNonRestoredState doesn't seem to be working

2022-10-13 Thread Yaroslav Tkachenko
Thanks folks, I understand this can be a limitation when redeploying. I did try to delete my job and start it from scratch using "initialSavepointPath"... and I got the same issue. Going to investigate this more today. On Thu, Oct 13, 2022 at 12:18 AM Evgeniy Lyutikov wrote: > The problem is

Limiting backpressure during checkpoints

2022-10-13 Thread Robin Cassan via user
Hello all, hope you're well :) We are attempting to build a Flink job with minimal and stable latency (as much as possible) that consumes data from Kafka. Currently our main limitation happens when our job checkpoints the RocksDB state: backpressure is applied on the stream, causing latency. I am

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Jing Ge
Hi Qingsheng, Thanks for the clarification. +1, I like the idea. Pointing both numXXXOut and numXXXSend to the same external data transfer metric does not really break the new SinkV2 design, since there was no requirement to monitor the internal traffic. So, I think both developer and user can

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Qingsheng Ren
Hi Jing, Thanks for the reply! Let me rephrase my proposal: we’d like to use numXXXOut registered on SinkWriterOperator to reflect the traffic to the external system for compatibility with old versions before 1.15, and make numXXXSend have the same value as numXXXOut for compatibility within

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
Currently, exporting the env "HADOOP_CONF_DIR" could only work for native K8s integration. The flink client will try to create the hadoop-config-volume automatically if hadoop env found. If you want to set the HADOOP_CONF_DIR in the docker image, please also make sure the specified hadoop conf

Re: [DISCUSS] FLIP-265 Deprecate and remove Scala API support

2022-10-13 Thread Márton Balassi
Hi Martjin, After some more careful consideration I am in favor of dropping the Scala API support in with Flink 2.0 given that we add Java 17 support earlier or latest at the same time. Best, Marton On Thu, Oct 13, 2022 at 12:01 PM Chesnay Schepler wrote: > Support for records has not been

Re: [DISCUSS] FLIP-265 Deprecate and remove Scala API support

2022-10-13 Thread Chesnay Schepler
Support for records has not been investigated yet. We're still at the stage of getting things to run at all on Java 17. It _may_ be possible, it _may_ not be. On 13/10/2022 07:39, Salva Alcántara wrote: Hi Martijn, Maybe a bit of an off-topic, but regarding Java 17 support, will it be

Re: [ANNOUNCE] Apache Flink Table Store 0.2.1 released

2022-10-13 Thread Martijn Visser
Congratulations and thanks to all those involved! On Thu, Oct 13, 2022 at 4:47 AM Jingsong Lee wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink Table Store 0.2.1. > > Apache Flink Table Store is a unified storage to build dynamic tables > for both

Re: [ANNOUNCE] Apache Flink Table Store 0.2.1 released

2022-10-13 Thread Martijn Visser
Congratulations and thanks to all those involved! On Thu, Oct 13, 2022 at 4:47 AM Jingsong Lee wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink Table Store 0.2.1. > > Apache Flink Table Store is a unified storage to build dynamic tables > for both

Re: Is Flink 1.15.3 planned in foreseeable future?

2022-10-13 Thread Xintong Song
Actually, this is an on-going discussion related to 1.15.3. The community discovered a breaking change in 1.15.x and is discussing how to resolve this right now [1]. There is very likely a 1.15.3 release after this is resolved. Best, Xintong [1]

Re: Job Manager getting restarted while restarting task manager

2022-10-13 Thread Puneet Duggal
Hi,We use session deployment mode with HA setup. Currently we have 3 job managers and 3 task managers running on flink version 1.12.1. Please find attached the complete job manager logs. jobManager.log Description: Binary data On 13-Oct-2022, at 7:28 AM, Xintong Song

Re: Job uptime metric in Flink Operator managed cluster

2022-10-13 Thread Mason Chen
Hi all, I think what Meghajit is trying to understand is how to measure the uptime of a submitted Flink job. Prior to the K8s operator, perhaps the job manager was torn down with the job shutdown so the uptime value would stop; therefore, the uptime value also measures how long the job was

Is Flink 1.15.3 planned in foreseeable future?

2022-10-13 Thread Maciek Próchniak
Hello, I suppose that committers are heavily concentrated on 1.16, but are there plans to have 1.15.3 out? We've been affected by https://issues.apache.org/jira/browse/FLINK-28488 and it's preventing us from using 1.15.x at this moment. thanks, maciek

Re: allowNonRestoredState doesn't seem to be working

2022-10-13 Thread Evgeniy Lyutikov
The problem is that changing the FlinkDeployment specification (new jar version, changing pod resources, etc.) for JobManager is just a restart. 2022-09-16 09:30:52,526 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Restoring job from

Re: Kubernetes operator assign Job ID

2022-10-13 Thread Gyula Fóra
Hi! This change aligns with how newer Flink 1.16+ versions handle application job ids. There are some good reasons for doing this please see: https://issues.apache.org/jira/browse/FLINK-19358 https://issues.apache.org/jira/browse/FLINK-29109 If you want to go back to the old behaviour you need

Re: KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-13 Thread Mason Chen
Hi Andrew and Martijn, Thanks for looping me in, this is an interesting discussion! I'm trying to solve a higher level problem about Kafka topic routing/assignment with FLIP-246. The main idea is that there can exist an external service that can provide the coordination between Kafka and Flink to

Kubernetes operator assign Job ID

2022-10-13 Thread Evgeniy Lyutikov
Hi everyone After updating kuberneter operator to version 1.2.0 noticed that it started generating jobid for all deployments. 2022-10-13 06:18:30,724 o.a.f.k.o.c.FlinkDeploymentController [INFO ][infojob/infojob] Starting reconciliation 2022-10-13 06:18:30,725 o.a.f.k.o.l.AuditUtils

Re: Job uptime metric in Flink Operator managed cluster

2022-10-13 Thread Gyula Fóra
Sorry, what I said applies to Flink 1.15+ and the savepoint upgrade mode (not stateless). In any case if there is no job manager there are no metrics... So not sure how to answer your question. Gyula On Thu, Oct 13, 2022 at 8:24 AM Meghajit Mazumdar < meghajit.mazum...@gojek.com> wrote: > Hi

Re: Job uptime metric in Flink Operator managed cluster

2022-10-13 Thread Meghajit Mazumdar
Hi Gyula, Thanks for the prompt response. > The Flink operator currently does not delete the jobmanager pod when a deployment is suspended. Are you sure this is true ? I have re-tried this many times, but each time the pods get deleted, along with the deployment resources. Additionally, the

Re: allowNonRestoredState doesn't seem to be working

2022-10-13 Thread Gyula Fóra
Hi! If you have last-state upgrade mode configured it may happen that the allowNonRestoredState config is ignored by Flink (as the last-state upgrade mechanism somewhat bypasses the regular submission). Worst case scenario, you can suspend the deployment, manually record the last