Unsubscribe

2024-03-11 Thread 王阳
Unsubscribe

Unsubscribe

2024-03-11 Thread 熊柱
Unsubscribe

Re: flink operator HA job intermittently reports "unable to update ConfigMapLock"

2024-03-11 Thread kellygeorg...@163.com
Is there an expert who can offer some pointers??? Waiting online. Original message | From | kellygeorg...@163.com | | Date | 2024-03-11 20:29 | | To | user-zh | | Cc | | | Subject | flink operator HA job intermittently reports "unable to update ConfigMapLock" | The jobmanager error is shown below; what could be the cause? Exception occurred while renewing lock: Unable to update ConfigMapLock Caused

flink operator HA job intermittently reports "unable to update ConfigMapLock"

2024-03-11 Thread kellygeorg...@163.com
The jobmanager error is shown below; what could be the cause? Exception occurred while renewing lock: Unable to update ConfigMapLock Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [replace] for kind: [ConfigMap] with name: [flink task xx- configmap] in namespace: [default] Caused by:

Re: TTL in pyflink does not seem to work

2024-03-11 Thread Ivan Petrarka
Thanks! We’ve created an issue for that: https://issues.apache.org/jira/browse/FLINK-34625 Yep, planning to use timers as a workaround for now On Mar 10, 2024 at 02:59 +0400, David Anderson , wrote: > My guess is that this only fails when pyflink is used with the heap state > backend, in which

Flink performance

2024-03-11 Thread Kamal Mittal via user
Hello, Can you please point me to documentation, if any is available, where Flink documents performance numbers w.r.t. certain use cases? Rgds, Kamal

Re: Unsubscribe

2024-03-10 Thread Hang Ruan
Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh-unsubscr...@flink.apache.org mailing list; you can refer to [1][2] to manage your mailing list subscriptions. Best, Hang [1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8 [2] https://flink.apache.org/community.html#mailing-lists 王新隆 wrote on Mon, Mar 11, 2024

Re: Entering a busy loop when adding a new sink to the graph + checkpoint enabled

2024-03-10 Thread Hang Ruan
Hi, Xuyang & Daniel. I have checked this part of the code. I think it is expected behavior. As noted in the code comments, this loop makes sure that the transactions before this checkpoint id are re-created. The situation Daniel mentioned will happen only when all checkpoints between 1 and 2

Unsubscribe

2024-03-10 Thread 王新隆
Unsubscribe

Re: Entering a busy loop when adding a new sink to the graph + checkpoint enabled

2024-03-10 Thread Xuyang
Hi, Danny. When the problem occurs, can you use a flame graph to confirm whether the loop in this code is causing the busyness? Since I'm not particularly familiar with the kafka connector, I can't give you an accurate reply. I think Hang Ruan is an expert in this field :). Hi, Ruan Hang. Can you

Re: FlinkSQL connector jdbc: can the database username/password be encrypted/hidden?

2024-03-10 Thread gongzhongqiang
hi, 东树. Hiding sensitive information in SQL needs to be done by an external big-data platform. For example, with StreamPark's variable management you can maintain the configuration ahead of time and reference it when writing SQL; when the platform submits to flink, it parses the SQL and substitutes the variables. Best, Zhongqiang Gong 杨东树 wrote on Sun, Mar 10, 2024 at 21:50: > Hello, > Given database username/password security concerns, when using the FlinkSQL connector > jdbc, how can the database username/password be encrypted/hidden? For example, the password in the following common sql: > CREATE TABLE wordcount_sink ( >

Re: FlinkSQL connector jdbc: can the database username/password be encrypted/hidden?

2024-03-10 Thread Feng Jin
1. The JDBC connector itself does not support encryption at the moment; I think you can encrypt/decrypt the SQL text when it is submitted, or do some variable substitution to hide the password. 2. You can consider creating the jdbc catalog ahead of time, so that writing DDL does not expose the password. Best, Feng On Sun, Mar 10, 2024 at 9:50 PM 杨东树 wrote: > Hello, > Given database username/password security concerns, when using the FlinkSQL connector > jdbc, how can the database username/password be encrypted/hidden? For example, the password in the following common sql: > CREATE TABLE

Re:Re: Schema Evolution & Json Schemas

2024-03-10 Thread Jensen
Unsubscribe At 2024-02-26 20:55:19, "Salva Alcántara" wrote: Awesome Andrew, thanks a lot for the info! On Sun, Feb 25, 2024 at 4:37 PM Andrew Otto wrote: > the following code generator Oh, and FWIW we avoid code generation and POJOs, and instead rely on Flink's Row or RowData

FlinkSQL connector jdbc: can the database username/password be encrypted/hidden?

2024-03-10 Thread 杨东树
Hello, Given database username/password security concerns, when using the FlinkSQL connector jdbc, how can the database username/password be encrypted/hidden? For example, the password in the following common sql: CREATE TABLE wordcount_sink ( word String, cnt BIGINT, primary key (word) not enforced ) WITH ( 'connector' = 'jdbc', 'url' = 'jdbc:mysql://localhost:3306/flink', 'username' = 'root',
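The replies in this thread suggest pre-creating a jdbc catalog so that individual job DDL never embeds the password. A hedged Flink SQL sketch of that approach follows; the catalog and database names are hypothetical, and the `${DB_PASSWORD}` placeholder assumes an external platform (such as the variable management mentioned in the replies) substitutes it before submission — Flink itself does not resolve such variables.

```sql
-- Sketch: register a JDBC catalog once (e.g. in a platform-managed
-- initialization script), so job SQL no longer repeats credentials.
-- Catalog/database names and ${DB_PASSWORD} are illustrative assumptions.
CREATE CATALOG my_jdbc_catalog WITH (
  'type' = 'jdbc',
  'default-database' = 'flink',
  'username' = 'root',
  'password' = '${DB_PASSWORD}',
  'base-url' = 'jdbc:mysql://localhost:3306'
);

-- Job SQL can then reference the sink table without repeating credentials:
INSERT INTO my_jdbc_catalog.flink.wordcount_sink
SELECT word, COUNT(*) AS cnt
FROM source_table
GROUP BY word;
```

The design trade-off: the password still lives somewhere (the catalog definition or platform secret store), but it is kept out of every job's versioned SQL text.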

Re: TTL in pyflink does not seem to work

2024-03-09 Thread David Anderson
My guess is that this only fails when pyflink is used with the heap state backend, in which case one possible workaround is to use the RocksDB state backend instead. Another workaround would be to rely on timers in the process function, and clear the state yourself. David On Fri, Mar 8, 2024 at

Re: Re: Running Flink SQL in production

2024-03-08 Thread Robin Moffatt via user
That makes sense, thank you. I found FLIP-316 [1] and will keep an eye on it too. Thanks, Robin. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-316%3A+Support+application+mode+for+SQL+Gateway On Fri, 8 Mar 2024 at 13:56, Zhanghao Chen wrote: > Hi Robin, > > It's better to use

Re: Re: Running Flink SQL in production

2024-03-08 Thread Zhanghao Chen
Hi Robin, It's better to use Application mode [1] for mission-critical long-running SQL jobs as it provides better isolation, you can utilize the table API to package a jar as suggested by Feng to do so. Neither SQL client nor SQL gateway supports submitting SQL in Application mode for now,

Re: Re: How can a Flink DataStream job obtain job lineage?

2024-03-08 Thread Zhanghao Chen
It is in fact feasible. You can directly modify the source code of StreamExecutionEnvironment to register your customized listener on every job by default, and then expose that information through some channel. In FLIP-314 [1], we plan to provide such an interface natively in Flink so you can register your own listener to obtain lineage information, but it has not been released yet, so you can do it yourself first. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-314:+Support+Customized+Job+Lineage+Listener

Re: How can a Flink DataStream job obtain job lineage?

2024-03-08 Thread 阿华田
We want to modify the source code so that any job submitted to the realtime platform obtains the lineage information when the DAG is initialized. Registering on StreamExecutionEnvironment can only be written inside the job itself, which does not meet our needs. | | 阿华田 | | a15733178...@163.com | Signature customized by NetEase Mail Master. On Mar 8, 2024 at 18:23, Zhanghao Chen wrote: You can look at how OpenLineage integrates with Flink [1]: it registers a JobListener in StreamExecutionEnvironment (through which you can get the JobClient and then the job id). Then from

Re: Re: How can a Flink DataStream job obtain job lineage?

2024-03-08 Thread Zhanghao Chen
You can look at how OpenLineage integrates with Flink [1]: it registers a JobListener in StreamExecutionEnvironment (through which you can get the JobClient and then the job id). Then the transformation information can be extracted from the execution environment and processed [2]. [1] https://openlineage.io/docs/integrations/flink/ [2]

Re: How can a Flink DataStream job obtain job lineage?

2024-03-08 Thread 阿华田
"JobGraph can provide the transformation information" — can the JobGraph directly obtain the transformation information? We obtain the connection information from SourceTransformation and SinkTransformation via reflection, but cannot get the flink job id there. Can the JobGraph provide both the source/sink connection information and the flink job id? | | 阿华田 | | a15733178...@163.com | JobGraph can provide the transformation information transformation Signature customized by NetEase Mail Master.

Re: TTL in pyflink does not seem to work

2024-03-08 Thread lorenzo.affetti.ververica.com via user
Hello Ivan! Could you please create a JIRA issue out of this? That seems the proper place to discuss this. It seems a bug, as the two versions of the code you posted look identical, and the behavior should be consistent. On Mar 7, 2024 at 13:09 +0100, Ivan Petrarka , wrote: > Note, that in

Re: Re: How can a Flink DataStream job obtain job lineage?

2024-03-08 Thread Zhanghao Chen
The JobGraph has a field that is the jobid. Best, Zhanghao Chen From: 阿华田 Sent: Friday, March 8, 2024 14:14 To: user-zh@flink.apache.org Subject: Re: How can a Flink DataStream job obtain job lineage? After obtaining the Source or DorisSink information, how do we know which flink job it comes from? It seems the flink job id cannot be obtained. | | 阿华田 | | a15733178...@163.com |

Re: How can a Flink DataStream job obtain job lineage?

2024-03-07 Thread 阿华田
After obtaining the Source or DorisSink information, how do we know which flink job it comes from? It seems the flink job id cannot be obtained. | | 阿华田 | | a15733178...@163.com | Signature customized by NetEase Mail Master. On Feb 26, 2024 at 20:04, Feng Jin wrote: The transformation information can be obtained through the JobGraph; you can get the concrete Source or Doris Sink, and then extract the properties inside via reflection. You can refer to the implementation in OpenLineage[1]. 1.

Re: Flink Checkpoint & Offset Commit

2024-03-07 Thread Yanfei Lei
Hi Jacob, > I have multiple upstream sources to connect to depending on the business > model which are not Kafka. Based on criticality of the system and publisher > dependencies, we cannot switch to Kafka for these. Sounds like you want to implement some custom connectors, [1][2] may be

Re: Re: flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread iasiuide
OK, the sql fragment has been pasted. At 2024-03-08 11:02:34, "Xuyang" wrote: >Hi, your image did not come through; you can use an image host or paste the SQL directly > > > > >-- > >Best! >Xuyang > > > > >At 2024-03-08 10:54:19, "iasiuide" wrote: > > > > > >In the sql fragment below >ods_ymfz_prod_sys_divide_order is a kafka source table >dim_ymfz_prod_sys_trans_log is a mysql dimension table

Re: Re: flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread iasiuide
Hi, we are using versions 1.13.2 and 1.15.4. Looking at the flink ui, both versions show the same dimension-table join conditions in the lookup execution plan for the sql fragment below. At 2024-03-08 11:08:51, "Yu Chen" wrote: >Hi iasiuide, >Could you share the flink version and jdbc connector version you are using? As far as I know, the jdbc >connector fixed the issue of lost lookup join conditions in FLINK-33365[1]. > >[1] https://issues.apache.org/jira/browse/FLINK-33365 > >Best, >

Re: flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread Yu Chen
Hi iasiuide, could you share the flink version and jdbc connector version you are using? As far as I know, the jdbc connector fixed the issue of lost lookup join conditions in FLINK-33365[1]. [1] https://issues.apache.org/jira/browse/FLINK-33365 Best, > On Mar 8, 2024 at 11:02, iasiuide wrote: > > > > The image may not load; below is the sql fragment from the image > .. > END AS trans_type, > >

Re: flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread Xuyang
Hi, your image did not come through; you can use an image host or paste the SQL directly. -- Best! Xuyang At 2024-03-08 10:54:19, "iasiuide" wrote: In the sql fragment below, ods_ymfz_prod_sys_divide_order is a kafka source table, dim_ymfz_prod_sys_trans_log is a mysql dimension table, and dim_ptfz_ymfz_merchant_info is a mysql dimension table. The execution plan fragment from the flink web ui is as follows:

Re: flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread iasiuide
The image may not load; below is the sql fragment from the image .. END AS trans_type, a.div_fee_amt, a.ts FROM ods_ymfz_prod_sys_divide_order a LEFT JOIN dim_ymfz_prod_sys_trans_log FOR SYSTEM_TIME AS OF a.proc_time AS b ON a.bg_rel_trans_id = b.bg_rel_trans_id AND

flink sql: question about dimension-table join conditions in the lookup execution plan

2024-03-07 Thread iasiuide
In the sql fragment below, ods_ymfz_prod_sys_divide_order is a kafka source table, dim_ymfz_prod_sys_trans_log is a mysql dimension table, and dim_ptfz_ymfz_merchant_info is a mysql dimension table. The execution plan fragment from the flink web ui is as follows: [1]:TableSourceScan(table=[[default_catalog, default_database, ods_ymfz_prod_sys_divide_order, watermark=[-(CASE(IS
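The snippets in this thread are truncated, but they describe a temporal (lookup) join against MySQL dimension tables. A minimal reconstruction of the join shape follows; the column names come from the fragments, while the SELECT list is an illustrative assumption.

```sql
-- Minimal sketch of the lookup join shape described in this thread.
-- Table and column names follow the truncated fragments; everything
-- beyond them is a hypothetical placeholder.
SELECT
  a.bg_rel_trans_id,
  a.div_fee_amt,
  a.ts,
  b.trans_type
FROM ods_ymfz_prod_sys_divide_order AS a
LEFT JOIN dim_ymfz_prod_sys_trans_log
  FOR SYSTEM_TIME AS OF a.proc_time AS b
  ON a.bg_rel_trans_id = b.bg_rel_trans_id;
```

The `FOR SYSTEM_TIME AS OF a.proc_time` clause is what makes this a lookup join: each incoming kafka row probes the dimension table at processing time, and FLINK-33365 (mentioned in the replies) concerns join conditions being dropped from exactly this kind of plan.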

Re:Re: Using User-defined Table Aggregates Functions in SQL

2024-03-07 Thread Xuyang
Hi, Jad. IIUC, TableAggregateFunction has not been supported in SQL. The original Flip[1] only implements it in the Table API. You can send an email to the dev mailing list for more detail and create an improvement jira[2] for it. [1]

Re: Flink Checkpoint & Offset Commit

2024-03-07 Thread xia rui
Hi Jacob. Flink uses a "notification" to let an operator be called back on the completion of a checkpoint. After gathering all checkpoint-done messages from the TMs, the JM sends a "notify checkpoint completed" RPC to all TMs. Operators will handle this notification, where checkpoint success callbacks are invoked.

Re:Re: Handling late events with Table API / SQL

2024-03-07 Thread Xuyang
Hi, Sunny. A watermark always comes from one subtask of this window operator's input(s), and this window operator will retain all watermarks about multi input subtasks. The `currentWatermark` in the window operator is the min value of these watermarks. -- Best! Xuyang At

Re: Re: RE: RE: flink cdc dynamic table addition does not take effect

2024-03-07 Thread Hongshun Wang
Hi, casel chan, the community has implemented dynamic table addition in the incremental framework (https://github.com/apache/flink-cdc/pull/3024); it is expected to be exposed for mongodb and postgres in 3.1, but it is currently not exposed for Oracle and Sqlserver. You can refer to these two connectors in the community, turn the option on, and test and adapt it. Best, Hongshun

Re:Window properties can only be used on windowed tables

2024-03-07 Thread 周尹
public static void main(String[] args) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); StreamTableEnvironment tEnv = StreamTableEnvironment.create(env); List<Person> list = new ArrayList<>(); list.add(new Person("Fred", 35));

Re:Window properties can only be used on windowed tables

2024-03-07 Thread 周尹
Window properties are being used on a non-windowed table. At 2024-03-08 09:28:10, "ha.fen...@aisino.com" wrote: >public static void main(String[] args) { >StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); >StreamTableEnvironment tEnv = StreamTableEnvironment.create(env); >

Re:Window properties can only be used on windowed tables

2024-03-07 Thread Xuyang
Hi, fengqi. It looks like, in the select statement, you cannot directly use a proctime or event-time function that does not originate from a window agg. It is currently unclear whether this is the expected behavior; if convenient, you can file a bug on the community jira[1] to check. As a quick workaround, you can try the following code: DataStream flintstones = env.fromCollection(list); // Table select = table.select($("name"), $("age"), $("addtime").proctime()); Table table = tEnv.fromDataStream(

Re: Re: Running Flink SQL in production

2024-03-07 Thread Feng Jin
Hi, If you need to use Flink SQL in a production environment, I think it would be better to use the Table API [1] and package it into a jar. Then submit the jar to the cluster environment. [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/common/#sql Best, Feng On

Re: Using User-defined Table Aggregates Functions in SQL

2024-03-07 Thread Jad Naous
Hi Junrui, Thank you for the pointer. I had read that page, and I can use the function with the Java Table API ok, but I'm trying to use the Top2 accumulator with a SQL function. I can't use a left lateral join on it since the planner fails with "not a table function". I don't think a join is the

Fwd: Flink Checkpoint & Offset Commit

2024-03-07 Thread Jacob Rollings
Hello, I am implementing proof-of-concept Flink realtime streaming solutions. I came across the below lines in the out-of-the-box Flink Kafka connector documents. *https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/*

Re: Using User-defined Table Aggregates Functions in SQL

2024-03-07 Thread Junrui Lee
Hi Jad, You can refer to the CREATE FUNCTION section ( https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-function) and the Table Aggregate Functions section (

Re: Handling late events with Table API / SQL

2024-03-07 Thread Sunny S
Thanks for the response! Sad that the side output for late data is not supported in the Table API and SQL. I will start the discussions regarding this. In the meanwhile, I am trying to use the built-in function CURRENT_WATERMARK(rowtime) to be able to collect late data. The scenario I have is : I

Using User-defined Table Aggregates Functions in SQL

2024-03-07 Thread Jad Naous
Hi, The docs don't mention the correct syntax for using UDTAGGs in SQL. Is it possible to use them with SQL? Thanks, Jad Naous Grepr, CEO/Founder ᐧ

Re: TTL in pyflink does not seem to work

2024-03-07 Thread Ivan Petrarka
Note, that in Java code, it prints `State: Null`, `State: Null`, as I was expecting in, unlike pyflink code On Mar 7, 2024 at 15:59 +0400, Ivan Petrarka , wrote: > Hi! I’ve created a basic pyflink pipeline with ttl and it does not seem to > work. I have reproduced the exact same code in Java and

TTL in pyflink does not seem to work

2024-03-07 Thread Ivan Petrarka
Hi! I’ve created a basic pyflink pipeline with ttl and it does not seem to work. I have reproduced the exact same code in Java and it works! Is this a pyflink bug? If so - how can I report it? If not - what can I try to do? Flink: 1.18.0 image: flink:1.18.0-scala_2.12-java11 Code to

Failed to register an avro schema with the confluent schema registry

2024-03-07 Thread casel.chen
I got an error when registering the schema for a kafka topic with the confluent schema registry; I'd like to know what caused the problem and how to fix it. io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema being registered is incompatible with an earlier schema for subject "rtdp_test-test_schema-value", details:

Re:Re: Running Flink SQL in production

2024-03-07 Thread Xuyang
Hi. Hmm, if I'm mistaken, please correct me. Using a SQL client might not be very convenient for those who need to verify the results of submissions, such as checking for exceptions related to submission failures, and so on. -- Best! Xuyang At 2024-03-07 17:32:07, "Robin

RE: SecurityManager in Flink

2024-03-06 Thread Kirti Dhar Upadhyay K via user
Hi Gabor, The issue is that the read permission is not checked when the Flink FileSource lists the files under the given source directory. This happens because the Security Manager is null. public String[] list() { SecurityManager security = System.getSecurityManager(); -> Here

RE: SecurityManager in Flink

2024-03-06 Thread Kirti Dhar Upadhyay K via user
Hi Yanfei, I am facing this issue on jdk1.8/11. Thanks for the pointer, I will try to set the Security Manager and check the behaviour. Regards, Kirti Dhar -Original Message- From: Yanfei Lei Sent: Wednesday, March 6, 2024 4:37 PM To: Kirti Dhar Upadhyay K Cc: User@flink.apache.org

RE: SecurityManager in Flink

2024-03-06 Thread Kirti Dhar Upadhyay K via user
Hi Hang, You got it right. The problem is exactly at the same line where you pointed [1]. I have used the below solution as of now. ``` if (!Files.isReadable(Paths.get(fileStatus.getPath().getPath()))) { throw new FlinkRuntimeException("Cannot list files under " + fileStatus.getPath()); }

Re:Reading Iceberg Tables in Batch mode performance

2024-03-06 Thread Xuyang
Hi, can you provide more details about this Flink batch job? For instance, a flame graph showing which tasks the threads spend most of their time on. -- Best! Xuyang At 2024-03-07 08:40:32, "Charles Tan" wrote: Hi all, I have been looking into using

Re:Running Flink SQL in production

2024-03-06 Thread Xuyang
Hi, IMO, both the SQL Client and the Restful API can provide connections to the SQL Gateway service for submitting jobs. A slight difference is that the SQL Client also offers a command-line visual interface for users to view results. In your production scenes, placing the SQL to be submitted

Reading Iceberg Tables in Batch mode performance

2024-03-06 Thread Charles Tan
Hi all, I have been looking into using Flink in batch mode to process Iceberg tables. I noticed that the performance for queries in Flink's batch mode is quite slow, especially when compared to Spark. I'm wondering if there are any configurations that I'm missing to get better performance out of

Running Flink SQL in production

2024-03-06 Thread Robin Moffatt via user
I'm reading the deployment guide[1] and wanted to check my understanding. For deploying a SQL job into production, would the pattern be to write the SQL in a file that's under source control, and pass that file as an argument to SQL Client with -f argument (as in this docs example[2])? Or script a

Re: SecurityManager in Flink

2024-03-06 Thread Gabor Somogyi
Hi Kirti, Not sure what is the exact issue here but I'm not convinced that having FlinkSecurityManager is going to solve it. Here is the condition however: * cluster.intercept-user-system-exit != DISABLED (this must be changed) * cluster.processes.halt-on-fatal-error == false (this is good by

Re: SecurityManager in Flink

2024-03-06 Thread Hang Ruan
Hi, Kirti. Could you please provide the stack trace of this NPE? I check the code and I think maybe the problem lies in LocalFileSystem#listStatus. The code in line 161[1] may return null, which will let LocalFileSystem#listStatus return null. Then the `containedFiles` is null and the NPE occurs.

Re: SecurityManager in Flink

2024-03-06 Thread Yanfei Lei
Hi Kirti Dhar, What is your java version? I guess this problem may be related to FLINK-33309[1]. Maybe you can try adding "-Djava.security.manager" to the java options. [1] https://issues.apache.org/jira/browse/FLINK-33309 Kirti Dhar Upadhyay K via user wrote on Wed, Mar 6, 2024 at 18:10: > > Hi Team, > > >

Re: Question about time-based operators with RocksDB backend

2024-03-06 Thread xia rui
Hi Gabriele, using (or extending) the window operator provided by Flink is a better idea. A window operator in Flink manages two types of state: - Window state: accumulate data for windows, and provide data to the window function when a window reaches its end time. - Timer state: store the end

SecurityManager in Flink

2024-03-06 Thread Kirti Dhar Upadhyay K via user
Hi Team, I am using the Flink File Source with the Local File System. I am facing an issue: if the source directory does not have read permission, it returns the list of files as null instead of throwing a permission exception (refer to the highlighted line below), resulting in an NPE. final FileStatus[]

Re: Question about time-based operators with RocksDB backend

2024-03-05 Thread Jinzhong Li
Hi Gabriele, The keyed state APIs (ValueState、ListState、etc) are supported by all types of state backend (hashmap、rocksdb、etc.). And the built-in window operators are implemented with these state APIs internally. So you can use these built-in operators/functions with the RocksDB state backend

Re: I used the yarn per job mode to submit tasks, which will end in 4 minutes

2024-03-05 Thread Junrui Lee
Hello, The issue you're encountering is related to a new heartbeat mechanism between the client and job in Flink-1.17. If the job does not receive any heartbeats from the client within a specific timeout, it will cancel itself to avoid hanging indefinitely. To address this, you have two options:

I used the yarn per job mode to submit tasks, which will end in 4 minutes

2024-03-05 Thread 程意
In versions 1.17.1 and 1.18.1, I used the yarn per job mode to submit tasks, which will end in 4 minutes. But I tried it on Flink 1.13.1, 1.15.2, and 1.16.3, all of which were normal. command line at 1.17.1 version: ``` ./bin/flink run -t yarn-per-job -ys 1 -yjm 1G -ytm 3G -yqu default -p 1
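The reply above attributes this to the client-job heartbeat mechanism introduced in Flink 1.17. A hedged configuration sketch of the usual mitigations follows; the values are illustrative examples, not recommendations, and whether these options apply depends on your deployment mode.

```yaml
# Illustrative flink-conf.yaml fragment for the client-job heartbeat
# (attached-mode submissions, Flink 1.17+). Values are examples only.
client.heartbeat.interval: 30000   # ms between heartbeats sent by the client
client.heartbeat.timeout: 300000   # ms without a heartbeat before the job cancels itself
```

Alternatively, submitting the job in detached mode (`flink run -d ...`) sidesteps the client heartbeat entirely, since the client exits after submission instead of staying attached.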

Re: Handling late events with Table API / SQL

2024-03-05 Thread Feng Jin
You can use the CURRENT_WATERMARK(rowtime) function for some filtering, please refer to [1] for details. https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/ Best, Feng On Wed, Mar 6, 2024 at 1:56 AM Sunny S wrote: > Hi, > > I am using Flink SQL to

Re:Handling late events with Table API / SQL

2024-03-05 Thread Xuyang
Hi, for out-of-order events, the watermark can handle them. However, for late events, Flink Table & SQL do not support outputting them to a side channel the way the DataStream API does. There have been some JIRAs related to this.[1][2] If you really need this feature, you may consider initiating related

Handling late events with Table API / SQL

2024-03-05 Thread Sunny S
Hi, I am using Flink SQL to create a table something like this : CREATE TABLE some-table (   ...,   ...,   ...,   ...,   event_timestamp as TO_TIMESTAMP_LTZ(event_time*1000, 3),   WATERMARK FOR event_timestamp AS event_timestamp - INTERVAL '30' SECOND ) WITH (   
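The replies in this thread point to `CURRENT_WATERMARK()` for detecting late rows. A hedged sketch of how the DDL above could be combined with it follows; the column names, connector options, and the late-data sink are hypothetical placeholders.

```sql
-- Sketch only: combine the thread's watermark DDL with CURRENT_WATERMARK()
-- to route late rows to a separate sink. Names are illustrative assumptions.
-- CURRENT_WATERMARK() returns NULL before the first watermark is emitted,
-- hence the IS NOT NULL guard.
CREATE TABLE events (
  id STRING,
  event_time BIGINT,
  event_timestamp AS TO_TIMESTAMP_LTZ(event_time * 1000, 3),
  WATERMARK FOR event_timestamp AS event_timestamp - INTERVAL '30' SECOND
) WITH (
  'connector' = 'kafka'
  -- remaining connector options omitted
);

INSERT INTO late_events_sink
SELECT *
FROM events
WHERE CURRENT_WATERMARK(event_timestamp) IS NOT NULL
  AND event_timestamp < CURRENT_WATERMARK(event_timestamp);
```

This is a workaround, not an equivalent of the DataStream side-output: rows that a downstream window has already dropped cannot be recovered this way, only rows observed to be behind the watermark at this point in the pipeline.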

Re: Temporal join on rolling aggregate

2024-03-05 Thread Gyula Fóra
Hi Everyone! I have discussed this with Sébastien Chevalley, he is going to prepare and drive the FLIP while I will assist him along the way. Thanks Gyula On Tue, Mar 5, 2024 at 9:57 AM wrote: > I do agree with Ron Liu. > This would definitely need a FLIP as it would impact SQL and extend it

Re: Temporal join on rolling aggregate

2024-03-05 Thread lorenzo.affetti.ververica.com via user
I do agree with Ron Liu. This would definitely need a FLIP as it would impact SQL and extend it with the equivalent of TimestampAssigners in the Java API. Is there any existing JIRA here, or is anybody willing to drive a FLIP? On Feb 26, 2024 at 02:36 +0100, Ron liu , wrote: > +1, > But I think

Re: Batch mode execution

2024-03-04 Thread irakli.keshel...@sony.com
Thank you both! I'll try to switch the scheduler to "AdaptiveBatchScheduler". Best, Irakli From: Junrui Lee Sent: 05 March 2024 03:50 To: user Subject: Re: Batch mode execution Hello Irakli, The error is due to the fact that the Adaptive Scheduler doesn’t

Re: Unsubscribe

2024-03-04 Thread Shawn Huang
Hi, to unsubscribe, send an email with any content to user-zh-unsubscr...@flink.apache.org to stop receiving mail from the user-zh@flink.apache.org mailing list; for subscription management, you can refer to [1] [1] https://flink.apache.org/zh/what-is-flink/community/ Best, Shawn Huang 雷刚 wrote on Thu, Feb 29, 2024 at 14:41: > Unsubscribe

Re: How to measure end-to-end latency for a flink sql job

2024-03-04 Thread Shawn Huang
Flink has an end-to-end latency metric; you can refer to the following documentation [1] and see whether it helps. [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/zh/docs/ops/metrics/#end-to-end-latency-tracking Best, Shawn Huang casel.chen wrote on Wed, Feb 21, 2024 at 15:31: > The flink sql job consumes canal data from mysql via kafka

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zakelly Lan
Hi Gabriele, Quick answer: You can use the built-in window operators which have been integrated with state backends including RocksDB. Thanks, Zakelly On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen wrote: > Hi Gabriele, > > I'd recommend extending the existing window function whenever

Re: Batch mode execution

2024-03-04 Thread Junrui Lee
Hello Irakli, The error is due to the fact that the Adaptive Scheduler doesn’t support batch jobs, as detailed in the Flink documentation[1]. When operating in reactive mode, Flink automatically decides the type of scheduler to use. For batch execution, the default scheduler is

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zhanghao Chen
Hi Gabriele, I'd recommend extending the existing window function whenever possible, as Flink will automatically cover state management for you, and there is no need to be concerned with state backend details. Incremental aggregation for reducing state size is also out of the box if your usage can be

Re: What sql type does the java.util.Date type in a Table correspond to?

2024-03-04 Thread Xuyang
Hi, java.util.Date has no corresponding regular sql type, so the fallback Raw type (a structured type) is used. In fact, java.sql.Date corresponds to the sql Date. For details, refer to this table: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/types/#data-type-extraction -- Best! Xuyang At 2024-03-05 09:23:38, "ha.fen...@aisino.com" wrote: > Converting from a stream to a Table

Re: Batch mode execution

2024-03-04 Thread lorenzo.affetti.ververica.com via user
Hello Irakli and thank you for your question. I guess that somehow Flink enters the "reactive" mode while the adaptive scheduler is not configured. I would go with 2 options to isolate your issue: • Try with forcing the scheduling mode

Batch mode execution

2024-03-04 Thread irakli.keshel...@sony.com
Hello, I have a Flink job which is processing bounded number of events. Initially, I was running the job in the "STREAMING" mode, but I realized that running it in the "BATCH" mode was better as I don't have to deal with the Watermark Strategy. The job is reading the data from the Kafka topic

Re: flink-operator-1.5.0 supports which versions of Kubernetes

2024-03-04 Thread Gyula Fóra
It should be compatible. There is no compatibility matrix but it is compatible with most versions that are in use (at the different companies/users etc) Gyula On Thu, Feb 29, 2024 at 6:21 AM 吴圣运 wrote: > Hi, > > I'm using flink-operator-1.5.0 and I need to deploy it to Kubernetes 1.20. > I

Question about time-based operators with RocksDB backend

2024-03-04 Thread Gabriele Mencagli
Dear Flink Community, I am using Flink with the DataStream API and operators implemented using RichFunctions. I know that Flink provides a set of window-based operators with time-based semantics and tumbling/sliding windows. By reading the Flink documentation, I understand that there is

Re: Support for ConfigMap for Runtime Arguments in Flink Kubernetes Operator

2024-03-04 Thread Surendra Singh Lilhore
Hi Arjun, I have raised a Jira for this case and attached a patch: https://issues.apache.org/jira/browse/FLINK-34565 -Surendra On Wed, Feb 21, 2024 at 12:48 AM Surendra Singh Lilhore < surendralilh...@apache.org> wrote: > Hi Arjun, > > Yes, direct support for external configuration files

Re: Can the JobGraph information be obtained from the flink job web url?

2024-03-03 Thread Zhanghao Chen
To add to Yanquan's answer: what /jobs/:jobid/plan actually returns is the JobGraph information represented as JSON (generated by the JsonPlanGenerator class, covering most of the commonly used information in the jobgraph), which should meet your needs. From: casel.chen Sent: Saturday, March 2, 2024 14:17 To: user-zh@flink.apache.org Subject: Can the JobGraph information be obtained from the flink job web url?

Re: Unsubscribe

2024-03-03 Thread Hang Ruan
Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh-unsubscr...@flink.apache.org mailing list; you can refer to [1][2] to manage your mailing list subscriptions. Best, Hang [1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8 [2] https://flink.apache.org/community.html#mailing-lists 4kings...@gmail.com

Unsubscribe

2024-03-02 Thread 4kings...@gmail.com
Unsubscribe 4kings...@gmail.com Email: 4kings...@gmail.com

Re: Can the JobGraph information be obtained from the flink job web url?

2024-03-01 Thread Yanquan Lv
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-plan The ExecutionGraph information can be obtained through /jobs/:jobid/plan; I am not sure whether it contains the information you need. casel.chen wrote on Sat, Mar 2, 2024 at 14:19: > Can a running flink job's JobGraph information be obtained through the web url it exposes?

Can the JobGraph information be obtained from the flink job web url?

2024-03-01 Thread casel.chen
Can a running flink job's JobGraph information be obtained through the web url it exposes?

Re: mysql cdc stream api and sql api outputs behave differently

2024-03-01 Thread Feng Jin
These two print implementations are different. dataStream().print adds a PrintSinkFunction; that operator prints data as soon as it receives it, and the result is printed on the TM. table.execute().print() collects the final result through a collect_sink and sends it back to the client; the result is printed on the client's stdout, and it is only sent back to the client when a checkpoint is made, so its visibility period is limited by the checkpoint interval. Best, Feng Jin On Fri, Mar 1, 2024 at 4:45

Re: How does the debezium under flink cdc register schemas with the confluent schema registry?

2024-02-29 Thread Hang Ruan
Hi, casel.chen. This part is not covered in the CDC project; CDC relies on the engine part of debezium to read out the change data directly, and does not write to Kafka the way debezium itself does. You may consider asking the Debezium community about this; the Debezium developers should be more familiar with it. Best, Hang casel.chen wrote on Thu, Feb 29, 2024 at 18:11: > I searched the debezium source code but did not find where the > SchemaRegistryClient.register method is called; how does it register schemas with the confluent

Re: mysql cdc stream api and sql api outputs behave differently

2024-02-29 Thread Hang Ruan
Hello, ha.fengqi. The MySQL CDC connector only relies on checkpoint completion to switch to the incremental phase when running with multiple parallelism. From the code you provided, the job runs with a parallelism of 1, so the Source behavior should not differ between the two. Could this difference come from how the downstream is used in the two approaches? You can observe the concrete IO metrics to see whether the Source really emits messages in time; if you are familiar with the code, you can also add print logs yourself to verify. Best, Hang

[ANNOUNCE] Apache flink-connector-parent 1.1.0 released

2024-02-29 Thread Etienne Chauchot
The Apache Flink community is very happy to announce the release of Apache flink-connector-parent 1.1.0. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for

How does the debezium under flink cdc register schemas with the confluent schema registry?

2024-02-29 Thread casel.chen
I searched the debezium source code but did not find where the SchemaRegistryClient.register method is called; how does it register schemas with the confluent schema registry?

flink-operator-1.5.0 supports which versions of Kubernetes

2024-02-28 Thread 吴圣运
Hi, I'm using flink-operator-1.5.0 and I need to deploy it to Kubernetes 1.20. I want to confirm if this version of flink-operator is compatible with Kubernetes 1.20. I cannot find the compatible matrix in the github page. Could you help me confirm? Thanks  shengyun.wu

Re: Unsubscribe

2024-02-28 Thread Shawn Huang
Hi, to unsubscribe, send an email with any content to user-zh-unsubscr...@flink.apache.org to stop receiving mail from the user-zh@flink.apache.org mailing list; for subscription management, you can refer to [1] [1] https://flink.apache.org/zh/what-is-flink/community/ Best, Shawn Huang 18679131354 <18679131...@163.com> wrote on Tue, Feb 27, 2024 at 14:32: > Unsubscribe

Re: flink restart mechanism

2024-02-27 Thread Yanquan Lv
The image did not come through. Container scheduling is controlled by yarn, and yarn prefers running nodes. In principle, containers should not be scheduled to a node that has gone offline; have you confirmed this through the yarn web UI or yarn node -list? chenyu_opensource wrote on Tue, Feb 27, 2024 at 18:30: > Hi, the flink job was submitted to yarn, and the flink job failed because a node went offline, as follows: > > At the same time, retries exceeded the limit and the job failed, as shown below: > > I would like to ask, in flink's retry mechanism >

flink restart mechanism

2024-02-27 Thread chenyu_opensource
Hi, the flink job was submitted to yarn, and it failed because a node went offline, as follows: At the same time, retries exceeded the limit and the job failed, as shown below: I would like to ask: in flink's retry mechanism, will the task not be rescheduled to a container on a new node? Why does it stay on the same node, causing the whole job to fail? Is this scheduling controlled by yarn or by flink's own code? If there is relevant code, please let me know as well. Looking forward to a reply, thanks!

Unsubscribe

2024-02-26 Thread 18679131354
Unsubscribe

Mail from 杨作青

2024-02-26 Thread 杨作青
Unsubscribe

Re: Schema Evolution & Json Schemas

2024-02-26 Thread Salva Alcántara
Awesome Andrew, thanks a lot for the info! On Sun, Feb 25, 2024 at 4:37 PM Andrew Otto wrote: > > the following code generator > Oh, and FWIW we avoid code generation and POJOs, and instead rely on > Flink's Row or RowData abstractions. > > > > > > On Sun, Feb 25, 2024 at 10:35 AM Andrew Otto

Re: How can a Flink DataStream job obtain job lineage?

2024-02-26 Thread Feng Jin
The transformation information can be obtained through the JobGraph; you can get the concrete Source or Doris Sink, and then extract the properties inside via reflection. You can refer to the implementation in OpenLineage[1]. 1. https://github.com/OpenLineage/OpenLineage/blob/main/integration/flink/shared/src/main/java/io/openlineage/flink/visitor/wrapper/FlinkKafkaConsumerWrapper.java Best,

How can a Flink DataStream job obtain job lineage?

2024-02-26 Thread casel.chen
A Flink DataStream job consumes from mysql cdc, processes the data, and writes to apache doris. Is there a way (from the JobGraph/StreamGraph) to obtain the source/sink connector information, including the connection string, database name, table name, etc.?

Re: Flink Scala Positions in India or USA !

2024-02-26 Thread Martijn Visser
Hi, Please don't use the mailing list for this purpose. Best regards, Martijn On Wed, Feb 21, 2024 at 4:08 PM sri hari kali charan Tummala wrote: > > Hi Folks, > > I am currently seeking full-time positions in Flink Scala in India or the USA > (non consulting) , specifically at the Principal
