Re:flink1.18 regular join 支持两侧流各自配置state ttl

2024-01-05 Thread Xuyang
Hi, 文档中“The current TTL value for both left and right side is "0 ms", which means the state retention is not enabled.”,指的其实是并没有开启state ttl的意思,也就是并不会清理state、永久保留state,对应的是public involving api中的StateTTLConfig#UpdateType.Disabled[1],文档上的表述确实可以更加清晰一些,方便的话可以提一个jira improve一下文档。

Re: flink1.18 regular join 支持两侧流各自配置state ttl

2024-01-05 Thread Thomas Yang
1 *杨勇* Thomas Yang 于2024年1月5日周五 17:17写道: > 本地测试发现 默认生成0ms,实际测试是两侧保留了永久状态,但是官方文档意思是0ms表示两侧都不保存状态. 是不是文档有错误? > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#idle-state-retention-time > > 另外咨询下: 如果0表示永久保留state 那么想不保存state应该使用什么值? > *谢谢!* > > > *杨勇* >

Continuous Reading of File using FileSource does not process the existing files in version 1.17

2024-01-05 Thread Prasanna kumar
Hi Flink Community, I hope this email finds you well. I am currently in the process of migrating my Flink application from version 1.12.7 to 1.17.2 and have encountered a behavior issue with the FileSource while reading data from an S3 bucket. In the previous version (1.12.7), I was utilizing

flink1.18 regular join 支持两侧流各自配置state ttl

2024-01-05 Thread Thomas Yang
本地测试发现 默认生成0ms,实际测试是两侧保留了永久状态,但是官方文档意思是0ms表示两侧都不保存状态. 是不是文档有错误? https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#idle-state-retention-time 另外咨询下: 如果0表示永久保留state 那么想不保存state应该使用什么值? *谢谢!* *杨勇*

Re:CUMULATE 窗口状态过大导致CK超时

2024-01-04 Thread ouywl
HI Jiaotong: 我的建议如下: 1. 本地存储使用高吞吐的SSD 2. taskmanager.memory.managed.size 增加并且确保rocksdb memtable内存增加,减少rocksdb 刷磁盘的量 3. 如果有物化sink算子,关闭物化sink算子,减小state。 The following is the content of the forwarded email

Re:CUMULATE 窗口状态过大导致CK超时

2024-01-04 Thread Xuyang
Hi, 一般来说,业务上如果坚持要使用大state,可以尝试下尽可能的给多并发(让每个并发都持有一部分key的state,摊平大state)和内存(尽可能减少访问落盘的数据,减少IO)来提高性能。 对于你这个case来说,CUMULATE Window TVF在实现层面已经尽可能将小窗口的数据进行merge了[1]。可以dump下来看下具体是哪里的问题,是不是有进一步优化的空间。 [1]

Create slack channel for Flink Kubernetes operator

2024-01-03 Thread Tamir Sagi
Hi Would you please create a dedicated channel for Flink Kubernetes operator. it would be very helpful. Thanks, Tamir Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions

RE: Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2024-01-03 Thread Schwalbe Matthias
Hi Vladimir, I might be mistaken, here my observations: * List res = CollectionUtil.iteratorToList(result.execute().collect()); will block until the job is finished * However, we have a unbounded streaming job which will not finish until you cancel it * If you just want to print

Re:回复: Re: 滑动窗口按照处理时间触发的问题

2024-01-02 Thread haishui
比如并行度是4,任务执行图是: Source(p=1) ==reblance=> flatMap和Timestamp/watermrk(p=4) =hash=> window(p=4) window的水位线取上游四个算子水位线的最小值, 你需要写4个数据,才能让四个子任务水位线更新,window的水位线才有一次更新 Best regards, haishui 在 2024-01-03 14:25:48,"ha.fen...@aisino.com" 写道: >设置并行度1确实可以了。env.setParallelism(1);

Re:回复: Re: 滑动窗口按照处理时间触发的问题

2024-01-02 Thread haishui
Hi, 应该是并行度的原因,你可以先将并行度设置为1试试。 Best regards, haishui 在 2024-01-03 12:24:20,"ha.fen...@aisino.com" 写道: >帮我看看代码,感觉是代码的问题,使用滚动窗口问题一样,5分钟的滚动,也是输入1704130441000才触发函数的 >public static void main(String[] args) throws Exception { >StreamExecutionEnvironment env = >

Re:Re: 滑动窗口按照处理时间触发的问题

2024-01-02 Thread Xuyang
Hi, 基本思路和Jinsui说的差不多,我怀疑也是watermark没有推进导致窗口没有开窗。具体可以debug一下EventTimeTrigger里的‘onElement’方法和‘onEventTime’方法。 -- Best! Xuyang 在 2024-01-02 23:31:54,"Jinsui Chen" 写道: >Hi, > >请问是否可以将所有代码贴出来,尤其是水位线相关的。因为事件时间的推进和水位线策略紧密相关。 > >假设这样一种情况,将时间戳作为事件时间,假设你的水位线容错间隔设置为10min,就会出现上述情况,原因如下: >1.

Re: 滑动窗口按照处理时间触发的问题

2024-01-02 Thread Jinsui Chen
Hi, 请问是否可以将所有代码贴出来,尤其是水位线相关的。因为事件时间的推进和水位线策略紧密相关。 假设这样一种情况,将时间戳作为事件时间,假设你的水位线容错间隔设置为10min,就会出现上述情况,原因如下: 1. 首先是时间窗口的对齐逻辑。窗口是根据 Epoch 时间(1970-01-01 00:00:00 UTC)来对齐的。例如,如果窗口大小为5分钟,那么窗口的开始时间会是00:00、00:05、00:10等等很整的值,而不是事件时间。这也是为什么你的第一条数据会落在 00:20 - 01:20 这个时间窗口上。 2.

RE: 如何在IDE里面调试flink cdc 3.0作业?

2024-01-02 Thread Jiabao Sun
Hi, 可以参考下这篇文档[1],进行简单的测试。 Best, Jiabao [1] https://docs.google.com/document/d/1L6cJiqYkAsZ_nDa3MgRwV3SKQuw5OrMbqGC4YgzgKR4/edit#heading=h.aybxdd96r62i On 2024/01/02 08:02:10 "casel.chen" wrote: > 我想在Intellij Idea里面调试mysql-to-doris.yaml作业要如何操作呢?flink-cdc-cli模块需要依赖 >

Re: Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2024-01-02 Thread Alexey Novakov via user
Hi Vladimir, As I see, your SQL query is reading data from the Kafka topic and pulls all data to the client side. The "*.collect" method is quite network/memory intensive. You probably do want that. If you want to debug/print the ingested data via SQL, I would recommend the "print" connector.

如何在IDE里面调试flink cdc 3.0作业?

2024-01-02 Thread casel.chen
我想在Intellij Idea里面调试mysql-to-doris.yaml作业要如何操作呢?flink-cdc-cli模块需要依赖 flink-cdc-pipeline-connector-mysql 和 flink-cdc-pipeline-connector-doris 模块么?

Re: Flink CDC中如何在Snapshot阶段读取数据时进行限流?

2024-01-01 Thread Jiabao Sun
Hi, GuavaFlinkConnectorRateLimiter 目前只在 flink-connector-gcp-pubsub[1] 有使用。 Flink CDC 还未支持限流[2],目前可以尝试降低 snapshot 并发数来缓解数据库压力。 Best, Jiabao [1]

Flink CDC中如何在Snapshot阶段读取数据时进行限流?

2024-01-01 Thread casel.chen
业务表存量数据很大,如果不加限流直接使用flink cdc读取snapshot阶段数据的话会造成业务库压力,触发数据库告警,影响在线业务。 请问Flink CDC中如何在Snapshot阶段读取数据时进行限流? 我看到社区之前有人提议过,但issue一直是open状态 https://issues.apache.org/jira/browse/FLINK-18740 另外,我在flink最新master分支源码中有找到 GuavaFlinkConnectorRateLimiter,但没有找到调用它的例子,请问如何在flink作业中使用限流呢?

Flink HA with Zookeeper and Docker Compose: unable to startup a working setup.

2023-12-29 Thread Alessio Bernesco Làvore
Hello, i'm trying to setup a testing environment using: - Flink HA with Zookeeper - Docker Compose While starting the TaskManager generates an exception and then after some restarts if fails. The exception is: "Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing

Re: FileSystem Connector如何优雅的支持同时写入多个路径

2023-12-29 Thread ying lin
从同一个source里select,在flink sql中用statement set 执行两条insert语句到不同的sink表即可 Jiabao Sun 于2023年12月29日周五 16:55写道: > Hi, > > 使用 SQL 的话不太好实现写入多个路径, > 使用 DataStream 的话可以考虑自己实现一个 RichSinkFunction。 > > Best, > Jiabao > > On 2023/12/29 08:37:34 jinzhuguang wrote: > > Flink版本:1.16.0 > > > > 看官网上的案例: > > CREATE

RE: FileSystem Connector如何优雅的支持同时写入多个路径

2023-12-29 Thread Jiabao Sun
Hi, 使用 SQL 的话不太好实现写入多个路径, 使用 DataStream 的话可以考虑自己实现一个 RichSinkFunction。 Best, Jiabao On 2023/12/29 08:37:34 jinzhuguang wrote: > Flink版本:1.16.0 > > 看官网上的案例: > CREATE TABLE MyUserTable ( > column_name1 INT, > column_name2 STRING, > ... > part_name1 INT, > part_name2 STRING > )

FileSystem Connector如何优雅的支持同时写入多个路径

2023-12-29 Thread jinzhuguang
Flink版本:1.16.0 看官网上的案例: CREATE TABLE MyUserTable ( column_name1 INT, column_name2 STRING, ... part_name1 INT, part_name2 STRING ) PARTITIONED BY (part_name1, part_name2) WITH ( 'connector' = 'filesystem', -- 必选:指定连接器类型 'path' = 'file:///path/to/whatever', -- 必选:指定路径

RE: Flink SQL Windowing TVFs

2023-12-28 Thread Jiabao Sun
Hi, 在 1.14.0 版本中,CUMULATE 函数是需要用在GROUP BY聚合场景下的[1]。 部署到生产的 SQL 是否包含了 GROUP BY 表达式? 本地测试的Flink版本是不是1.14.0? Best, Jiabao [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/dev/table/sql/queries/window-tvf/#cumulate On 2023/12/29 04:57:09 "jiaot...@mail.jj.cn" wrote: > Hi,

[DISCUSS] Hadoop 2 vs Hadoop 3 usage

2023-12-28 Thread Martijn Visser
Hi all, I want to get some insights on how many users are still using Hadoop 2 vs how many users are using Hadoop 3. Flink currently requires a minimum version of Hadoop 2.10.2 for certain features, but also extensively uses Hadoop 3 (like for the file system implementations) Hadoop 2 has a

Re: flink cdc 3.0 schema evolution原理是怎样的?

2023-12-27 Thread Jiabao Sun
Hi, 是的,目前来说会 block 住。 flush + apply schema change 一般来说不会持续太长时间, 且 schema 变更一般来说是低频事件,即使 block 也不会有太大性能影响。 Best, Jiabao > 2023年12月28日 12:57,casel.chen 写道: > > > > > 感谢解惑! > 还有一个问题:如果一个 pipeline 涉及多张表数据同步,而只有一个表出现 schema 变更的话,其他表的数据处理也会 block 住吗? > > > > > > > > > 在 2023-12-28

Re:Re: flink cdc 3.0 schema evolution原理是怎样的?

2023-12-27 Thread casel.chen
感谢解惑! 还有一个问题:如果一个 pipeline 涉及多张表数据同步,而只有一个表出现 schema 变更的话,其他表的数据处理也会 block 住吗? 在 2023-12-28 01:16:40,"Jiabao Sun" 写道: >Hi, > >> 为什么最后output.collect(new StreamRecord<>(schemaChangeEvent)); >> 还要发送一次SchemaChangeEvent呢? > >Sink 也会收到 SchemaChangeEvent,因为 Sink 可能需要根据 Schema 变更的情况来调整

Re: flink cdc 3.0 schema evolution原理是怎样的?

2023-12-27 Thread Jiabao Sun
Hi, > 为什么最后output.collect(new StreamRecord<>(schemaChangeEvent)); > 还要发送一次SchemaChangeEvent呢? Sink 也会收到 SchemaChangeEvent,因为 Sink 可能需要根据 Schema 变更的情况来调整 serializer 或 writer,参考 DorisEventSerializer > 最后一行requestReleaseUpstream()执行被block的原因是什么?是如何hold upstream然后再release > upstream的呢? 被 block

flink cdc 3.0 schema evolution原理是怎样的?

2023-12-27 Thread casel.chen
看了infoq介绍flink cdc 3.0文章 https://xie.infoq.cn/article/a80608df71c5291186153600b,我对其中schema. evolution设计原理想不明白,框架是如何做到schema change顺序性的。文章介绍得并不详细。 从mysql binlog产生changeEvent来看,所有的变更都是时间线性的,例如s1, d1, d2, s2, d3, d4, d5, s3, d6 其中d代表数据变更,s代表schema变更 这意味着d1,d2使用的是s1 schema,而d3~d5用的是s2

Re: Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-27 Thread Martijn Visser
Hi, If there's nothing that pushes the watermark forward, then the window won't be able to close. That's a common thing and expected for every operator that relies on watermarks. You can also decide to configure an idleness in order to push the watermark forward if needed. Best regards, Martijn

Re:Re:RE: lock up表过滤条件下推导致的bug

2023-12-25 Thread Xuyang
Hi, 可以贴一下flink的版本么?如果方便的话,也可以贴一下plan和最小可复现数据集。 -- Best! Xuyang 在 2023-12-26 09:25:30,"杨光跃" 写道: > > > > > > >CompiledPlan plan = env.compilePlanSql("insert into out_console " + >" select r.apply_id from t_purch_apply_sent_route r " + >" left join t_purch_apply_sent_route_goods

Re: RE: lock up表过滤条件下推导致的bug

2023-12-25 Thread Benchao Li
这个问题应该是跟 FLINK-33365[1] 中说的是同一个问题,这个已经在修复中了,在最新的 JDBC Connector 版本中会修复它。 [1] https://issues.apache.org/jira/browse/FLINK-33365 杨光跃 于2023年12月26日周二 09:25写道: > > > > > > > > CompiledPlan plan = env.compilePlanSql("insert into out_console " + > " select r.apply_id from t_purch_apply_sent_route r "

Re:RE: lock up表过滤条件下推导致的bug

2023-12-25 Thread 杨光跃
CompiledPlan plan = env.compilePlanSql("insert into out_console " + " select r.apply_id from t_purch_apply_sent_route r " + " left join t_purch_apply_sent_route_goods FOR SYSTEM_TIME AS OF r.pt as t " + "ON t.apply_id = r.apply_id and t.isdel = r.isdel" + " where r.apply_id = 61558439941351

RE: lock up表过滤条件下推导致的bug

2023-12-25 Thread Jiabao Sun
Hi, 邮件中的图片没显示出来,麻烦把 SQL 贴出来一下。 Best, Jiabao On 2023/12/25 12:22:41 杨光跃 wrote: > 我的sql如下: > 、 > > > t_purch_apply_sent_route 是通过flink cdc创建的 > t_purch_apply_sent_route_goods 是普通的jdbc > 我期望的结果是返回符合过滤条件的;但现在执行的结果,会返回t_purch_apply_sent_route表所有数据 > 这显然不符合我的预期,原因应该是因为过滤条件进行了过早的下推 >

lock up表过滤条件下推导致的bug

2023-12-25 Thread 杨光跃
我的sql如下: 、 t_purch_apply_sent_route 是通过flink cdc创建的 t_purch_apply_sent_route_goods 是普通的jdbc 我期望的结果是返回符合过滤条件的;但现在执行的结果,会返回t_purch_apply_sent_route表所有数据 这显然不符合我的预期,原因应该是因为过滤条件进行了过早的下推 这应该算是bug吧,或者要满足我的预期,该怎么写sql?

Re: Re:RE: Flink - Java 17 support

2023-12-24 Thread xiangyu feng
Hi Praveen, I'm not sure what do you need from Java 17. In Bytedance, we have tried to compile Flink on Java 8/Java 11 and run on Java 17. It works well in production and also takes advantage of new Java 17 features. You can try this way if you have urgent need for Java 17. Hope this helps u.

Re: Re: 退订

2023-12-24 Thread Zhanshun Zou
退订 李国辉 于2023年11月22日周三 22:28写道: > > 退订 > > > > > -- > 发自我的网易邮箱手机智能版 > > > > - Original Message - > From: "Junrui Lee" > To: user-zh@flink.apache.org > Sent: Wed, 22 Nov 2023 10:19:32 +0800 > Subject: Re: 退订 > > Hi, > > 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 来取消订阅邮件。 > > Best,

Re: Re: Flink CDC MySqlSplitReader问题

2023-12-24 Thread Hang Ruan
Hi, 我记得这段逻辑是为了保证在新增表后,binlog 读取能和新增表的快照读取一起进行,保证binlog读取不会中断。 这里应该是会先读binlog,然后再读snapshot,再是binlog。这样的切换,来保证binlog 能一直有数据读出来。 Best, Hang casel.chen 于2023年12月22日周五 10:44写道: > 那意思是会优先处理已经在增量阶段的表,再处理新增快照阶段的表?顺序反过来的话会有什么影响?如果新增表全量数据比较多会导致其他表增量处理变慢是么? > > > > > > > > > > > > > > > > > > 在

Re: [Flink Kubernetes Operator] Restoring from an outdated savepoint.

2023-12-22 Thread Gyula Fóra
Please upgrade the operator to the latest release, and if the issue still exists please open a Jira ticket with the details. Gyula On Fri, 22 Dec 2023 at 21:17, Ruibin Xing wrote: > I wanted to talk about an issue we've hit recently with Flink Kubernetes > Operator 1.6.1 and Flink 1.17.1. > >

[Flink Kubernetes Operator] Restoring from an outdated savepoint.

2023-12-22 Thread Ruibin Xing
I wanted to talk about an issue we've hit recently with Flink Kubernetes Operator 1.6.1 and Flink 1.17.1. As we're using the Savepoint upgrade mode, we ran into cases where the lastSavepoint in status doesn't seem to update (still digging into why, could be an exception when cancelling

Re:Re: Flink CDC MySqlSplitReader问题

2023-12-21 Thread casel.chen
那意思是会优先处理已经在增量阶段的表,再处理新增快照阶段的表?顺序反过来的话会有什么影响?如果新增表全量数据比较多会导致其他表增量处理变慢是么? 在 2023-12-20 21:40:05,"Hang Ruan" 写道: >Hi,casel > >这段逻辑应该只有在处理到新增表的时候才会用到。 >CDC 读取数据正常是会在所有SnapshotSplit读取完成后,才会下发BinlogSplit。 >但是如果是在增量阶段重启作业,同时新增加了一些表,就会出现同时有BinlogSplit和SnapshotSplit的情况,此时才会走到这段逻辑。 >

Re: 退订

2023-12-21 Thread Junrui Lee
你好, 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 来取消订阅邮件 Best, Junrui jimandlice 于2023年12月21日周四 17:43写道: > 退订 > | | > jimandlice > | > | > 邮箱:jimandl...@163.com > |

RE: Re:Flink脏数据处理

2023-12-21 Thread Jiabao Sun
Hi, 需要精准控制异常数据的话,就不太推荐flink sql了。 考虑使用DataStream将异常数据用侧流输出[1],再做补偿。 Best, Jiabao [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/side_output/ On 2023/12/06 08:45:20 Xuyang wrote: > Hi, > 目前flink sql主动收集脏数据的行为。有下面两种可行的办法: > 1.

RE: Re:Re:flink sql支持批量lookup join

2023-12-21 Thread Jiabao Sun
Hi, casel. 使用三次lookup join是可以实现的,加上缓存,性能应该不差。 WITH users AS ( SELECT * FROM (VALUES(1, 'zhangsan'), (2, 'lisi'), (3, 'wangwu')) T (id, name) ) SELECT orders.id, u1.name as creator_name, u2.name as approver_name, u3.name as deployer_name FROM ( SELECT *

退订

2023-12-21 Thread jimandlice
退订 | | jimandlice | | 邮箱:jimandl...@163.com |

退订

2023-12-21 Thread jimandlice
退订 | | jimandlice | | 邮箱:jimandl...@163.com |

RE: Pending records

2023-12-21 Thread Jiabao Sun
Hi rania, Does "pending records" specifically refer to the records that have been read from the source but have not been processed yet? If this is the case, FLIP-33[1] introduces some standard metrics for Source, including "pendingRecords," which can be helpful. However, not all Sources

RE: Does Flink on K8s support log4j2 kafka appender? [Flink log] [Flink Kubernetes Operator]

2023-12-20 Thread Jiabao Sun
Hi Chosen, Whether kafka appender is supported or not has no relation to the flink-kubernetes-operator. It only depends on whether log4j2 supports kafka appender. From the error message, it appears that the error is caused by the absence of the log4j-layout-template-json[1] plugin. For the

Flink India Jobs Refferal

2023-12-20 Thread sri hari kali charan Tummala
Hi Community, I got laid off at Apple in Feb 2023 which forced me move out of USA due to immigration problem (h1b) I was a Big Data,Spark,Scala,Python and Flink consultant with over 12+ years of experience. I am still haven't landed in a job in India since then I need referrals in India in

Re:RE: Re:RE: Flink - Java 17 support

2023-12-20 Thread Xuyang
Hi, Praveen Chandna. I don't know much about this plan, you can ask on the dev mailing list. [1] [1]https://flink.apache.org/what-is-flink/community/ -- Best! Xuyang 在 2023-12-20 14:53:52,"Praveen Chandna via user" 写道: Hello Xuyang One more query, is there plan to release

Re: Flink CDC MySqlSplitReader问题

2023-12-20 Thread Hang Ruan
Hi,casel 这段逻辑应该只有在处理到新增表的时候才会用到。 CDC 读取数据正常是会在所有SnapshotSplit读取完成后,才会下发BinlogSplit。 但是如果是在增量阶段重启作业,同时新增加了一些表,就会出现同时有BinlogSplit和SnapshotSplit的情况,此时才会走到这段逻辑。 Best, Hang key lou 于2023年12月20日周三 16:24写道: > 意思是当 有 binlog 就意味着 已经读完了 snapshot > > casel.chen 于2023年12月19日周二 16:45写道: > > >

需要帮助:在使用AsyncLookupFunction时如果asyncLookup抛出异常,则会遇到了SerializedThrowable 的StackOverflowError

2023-12-20 Thread Manong Karl
简单示例: public class TableA implements LookupTableSource { @Nullable private final LookupCache cache; public TableA(@Nullable LookupCache cache) { this.cache = cache; } @Override public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext context) {

Re: Flink CDC MySqlSplitReader问题

2023-12-20 Thread key lou
意思是当 有 binlog 就意味着 已经读完了 snapshot casel.chen 于2023年12月19日周二 16:45写道: > 我在阅读flink-connector-mysql-cdc项目源码过程中遇到一个不清楚的地方,还请大佬指点,谢谢! > > > MySqlSplitReader类有一段代码如下,注释“(1) Reads binlog split firstly and then read > snapshot split”这一句话我不理解。 > 为什么要先读binlog split再读snapshot

RE: Re:RE: Flink - Java 17 support

2023-12-19 Thread Praveen Chandna via user
Hello Xuyang One more query, is there plan to release formal Java 17 support in Flink release 1.19 or 1.20 ? Or do we need to wait till Flink 2.0 ? Thanks !! From: Xuyang Sent: 19 December 2023 16:37 To: user@flink.apache.org Subject: Re:RE: Flink - Java 17 support Hi, Praveen Chandna.

Re:RE: Flink - Java 17 support

2023-12-19 Thread Xuyang
Hi, Praveen Chandna. Please correct me if I'm wrong. From the release note of Flink 1.18 [1], you can see that although 1.18 already supports Java 17, it is still in beta mode. [1] https://nightlies.apache.org/flink/flink-docs-master/zh/release-notes/flink-1.18/ -- Best! Xuyang

RE: Flink - Java 17 support

2023-12-19 Thread Praveen Chandna via user
Hello Team Please helps to reply on Java 17 support. Does the Java 17 experimental support is production ready ? Is it recommended to use Flink 1.18 with Java 17 or is there any risk if we move to Java 17 in production ? Thanks !! From: Praveen Chandna Sent: 17 December 2023 21:28 To: Praveen

Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2023-12-19 Thread Вова Фролов
Hello Flink Community, I am texting to you with an issue I have encountered while using Apache Flink version 1.17.1. In my Flink Job, I am using Kafka version 3.6.0 to ingest data from TPC-DS(current tpcds100 target size tpcds1), and then I am executing SQL queries, specifically, the q77

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-19 Thread Rui Fan
Thanks everyone for the feedback! It doesn't have more feedback here, so I started the new vote[1] just now to update the default value of backoff-multiplier from 1.2 to 1.5. [1] https://lists.apache.org/thread/0b1dcwb49owpm6v1j8rhrg9h0fvs5nkt Best, Rui On Tue, Dec 12, 2023 at 7:14 PM

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-19 Thread Rui Fan
Thanks everyone for the feedback! It doesn't have more feedback here, so I started the new vote[1] just now to update the default value of backoff-multiplier from 1.2 to 1.5. [1] https://lists.apache.org/thread/0b1dcwb49owpm6v1j8rhrg9h0fvs5nkt Best, Rui On Tue, Dec 12, 2023 at 7:14 PM

Re: Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-19 Thread T, Yadhunath
Hi Xuyang, Thanks for the reply! I haven't used a print connector yet. Thanks, Yad From: Xuyang Sent: Monday, December 18, 2023 8:26 AM To: T, Yadhunath Cc: user@flink.apache.org Subject: Re:Re: Re:Re: Event stuck in the Flink operator This Message Is From

Flink CDC MySqlSplitReader问题

2023-12-19 Thread casel.chen
我在阅读flink-connector-mysql-cdc项目源码过程中遇到一个不清楚的地方,还请大佬指点,谢谢! MySqlSplitReader类有一段代码如下,注释“(1) Reads binlog split firstly and then read snapshot split”这一句话我不理解。 为什么要先读binlog split再读snapshot split?为保证记录的时序性,不是应该先读全量的snapshot split再读增量的binlog split么? private MySqlRecords pollSplitRecords() throws

Re: Configuration not propagating unless explicitly initializing FileSystem for Azure File System

2023-12-18 Thread Junrui Lee
Hi, Yuval The issue you're encountering arises because Flink's FileSystem is designed to operate with cluster-level configuration. And this configuration is sourced from the flink-conf.yaml file. Consequently, when the FileSystem is initialized, it isn't able to access the configuration objects

Configuration not propagating unless explicitly initializing FileSystem for Azure File System

2023-12-18 Thread Yuval Itzchakov
Flink: 1.17.1 Hi, I've encountered a weird issue when passing a configuration object to StreamExecutionEnvironment.getExecutionEnvironment does not propagate to the hadoop file system being initialized when running Flink locally in an IDE. I am passing credentials in order to connect to Azure

RE: Feature flag functionality on flink

2023-12-18 Thread Jiabao Sun
Hi, If it is for simplicity, you can also try writing the flag into an external system, such as Redis、Zookeeper or MySQL, and query the flag from the external system when perform data processing. However, Broadcast State is still the mode that I recommend. Perhaps we only need to encapsulate

Re: 退订

2023-12-18 Thread Hang Ruan
Please send email to user-unsubscr...@flink.apache.org if you want to unsubscribe the mail from u...@flink.apache.org, and you can refer [1][2] for more details. 请发送任意内容的邮件到 user-unsubscr...@flink.apache.org 地址来取消订阅来自 u...@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理你的邮件订阅。 Best, Hang [1]

Re: 退订

2023-12-18 Thread Hang Ruan
Please send email to user-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user@flink.apache.org, and you can refer [1][2] for more details. 请发送任意内容的邮件到 user-unsubscr...@flink.apache.org 地址来取消订阅来自 user@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理你的邮件订阅。 Best, Hang [1]

Re: Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Hang Ruan
Hi, Dominik. >From the code, your sink has received an InvalidTxnStateException in KafkaCommittter[1]. And kafka connector treats it as a known exception to invoke `signalFailedWithKnownReason`. `signalFailedWithKnownReason` will not throw an exception. It let the committer to decide fail or

RE: flink1.15-flink1.18官方提供写入Elasticsearch的接口报序列化异常

2023-12-18 Thread Jiabao Sun
Hi, createIndexRequest是否不是静态的,scala的话可以在object中声明该方法。 Lambda中访问非静态方法,并且外部类不是可序列化的,可能会导致lambda无法被序列化。 Best, Jiabao On 2023/12/12 07:53:53 李世钰 wrote: > val result: ElasticsearchSink[String] = new Elasticsearch7SinkBuilder[String] > // This instructs the sink to emit after every element,

退订

2023-12-18 Thread 唐大彪
退订

退订

2023-12-18 Thread 唐大彪
退订

Re: Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Alexis Sarda-Espinosa
Hi Dominik, Sounds like it could be this? https://issues.apache.org/jira/browse/FLINK-28060 It doesn't mention transactions but I'd guess it could be the same mechanism. Regards, Alexis. On Mon, 18 Dec 2023, 07:51 Dominik Wosiński, wrote: > Hey, > I've got a question regarding the

Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Dominik Wosiński
Hey, I've got a question regarding the transaction failures in EXACTLY_ONCE flow with Flink 1.15.3 with Confluent Cloud Kafka. The case is that there is a FlinkKafkaProducer in EXACTLY_ONCE setup with default *transaction.timeout.ms *of 15min. During the

Re: Flink 1.18.0 Checkpoints on cancelled jobs

2023-12-18 Thread Hong Liang
Hi Ethan! Thanks for raising the issue, this is indeed a bug - for the previous code path, it falls back to "execution graph store" for completed jobs. I've raise a JIRA here - https://issues.apache.org/jira/browse/FLINK-33872 I've also managed to RC and fix it in the associated PR -

RE: Java 8 support in Flink 1.18

2023-12-18 Thread Praveen Chandna via user
Thanks a lot for quick response. From: Junrui Lee Sent: 18 December 2023 15:57 To: Praveen Chandna via user Cc: Praveen Chandna Subject: Re: Java 8 support in Flink 1.18 Hello, Praveen According to the Flink release-2.0 wiki page

Re: Java 8 support in Flink 1.18

2023-12-18 Thread Junrui Lee
Hello, Praveen According to the Flink release-2.0 wiki page ( https://cwiki.apache.org/confluence/display/FLINK/2.0+Release), there is a must-have task to remove Java 8 support. Therefore, Flink plans to remove support for Java 8 in the upcoming release-2.0. Best regards, Junrui Praveen

Re: Java 8 support in Flink 1.18

2023-12-18 Thread Junrui Lee
Hello, Praveen Java 8 is still supported in the latest Flink 1.18 release despite its deprecation. You can continue using Java 8 for now. However, deprecation serves as a warning that in future releases, Java 8 may no longer be supported. It is recommended that you begin planning your migration

Java 8 support in Flink 1.18

2023-12-18 Thread Praveen Chandna via user
Hello Team As per the Flink 1.15 release notes, Java 8 support has been deprecated. Will there be any impact ? Can I continue to use Java 8 or is there any risk ? Thanks !! // Regards Praveen Chandna

Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-17 Thread Xuyang
Hi, Yad. These SQLs seem to be fine. Have you tried using the print connector as a sink to test whether there is any problem? If everything goes fine with print sink, then the problem occurs on the kafka sink. -- Best! Xuyang 在 2023-12-15 18:48:45,"T, Yadhunath" 写道: Hi Xuyang,

Flink - Java 17 support

2023-12-17 Thread Praveen Chandna via user
Hello In the Flink version 1.18, there is experimental support for Java 17. What is the plan for Java 17 supported release of Flink ? As the below Jira is already closed for Java 17 support. https://issues.apache.org/jira/browse/FLINK-15736 Thanks !! // Regards Praveen Chandna

RE: Control who can manage Flink jobs

2023-12-17 Thread Jiabao Sun
Hi, I don't have much experience with Beam. If you only need to submit Flink tasks, I would recommend StreamPark[1]. Best, Jiabao [1] https://streampark.apache.org/docs/user-guide/Team On 2023/11/30 09:21:50 Поротиков Станислав Вячеславович via user wrote: > Hello! > Is there any way to

RE: Socket timeout when report metrics to pushgateway

2023-12-17 Thread Jiabao Sun
Hi, The pushgateway uses push mode to report metrics. When deployed on a single machine under high load, there may be some performance issues. A simple solution is to set up multiple pushgateways and push the metrics to different pushgateways based on different task groups. There are other

Re: Avoid dynamic classloading in native mode with Kubernetes Operator

2023-12-15 Thread Trystan
Is it better to create a Jira issue for this? From what I can tell it is currently not possible to use native k8s mode while having the jar in the system classpath. This seems like something that would generally be useful in order to avoid dynamic classloading when using the operator. On Mon, Nov

Scaling in stateful application

2023-12-15 Thread rania duni
Hello all! I have enabled flink Kubernetes operator's scaler and I have a stateful application. In the yaml file I have set the savepoint directory and the upgrade mode is *savepoint. *What configurations should be enabled to ensure a safe upgrade with savepoints during scaling operations?

multiple jobs in same deployment file

2023-12-15 Thread Shravan Kumar Reddy
Hi Team, How can we configure multiple task managers and multiple jobs with the same deployment file with flink operator. *Deployment.yaml* apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: basic-example spec: image: flink:1.17 flinkVersion: v1_17

Re: Re:Re: Event stuck in the Flink operator

2023-12-15 Thread T, Yadhunath
Hi Xuyang, Thanks for the reply! I don't have any dataset to share with you at this time. The last block in the Flink pipeline performs 2 functions - temporal join( 2nd temporal join) and writing data into the sink topic. This is what Flink SQL code looks like - SELECT * FROM TABLE1 T1

Re: Event stuck in the Flink operator

2023-12-15 Thread T, Yadhunath
Hi Alex, Thanks for the reply! I have 3 Flink SQL table which is joined using temporal join. I am using the timestamp field in the first table as the event time attribute. eg: SELECT * FROM TABLE1 T1 LEFT JOIN TABLE2 FOR SYSTEM_TIME AS OF T1.timestamp as T2 -- Temporal join ON

Re:Re: Event stuck in the Flink operator

2023-12-14 Thread Xuyang
Hi, Yad. Can you share the smallest set of sql that can reproduce this problem? BTW, the last flink operator you mean is the sink with kafka connector? -- Best! Xuyang 在 2023-12-15 04:38:21,"Alex Cruise" 写道: Can you share your precise join semantics? I don't know

Re: Event stuck in the Flink operator

2023-12-14 Thread Alex Cruise
Can you share your precise join semantics? I don't know about Flink SQL offhand, but here are a couple ways to do this when you're using the DataStream API: * use the Session Window join

Event stuck in the Flink operator

2023-12-14 Thread T, Yadhunath
Hi, I am using Flink version 1.16 and I have a streaming job that uses PyFlinkSQL API. Whenever a new streaming event comes in it is not getting processed in the last Flink operator ( it performs temporal join along with writing data into Kafka topic) and it will be only pushed to Kafka on

Re: Flink 1.18.0 Checkpoints on cancelled jobs

2023-12-14 Thread Ethan T Yang
Hi Hong Liang Teoh, I think you are the owner of the ticket below. Can you take a look see if a bug in the code that breaks retrieving checkpoint history of the cancelled job? Thanks, Ivan > On Dec 10, 2023, at 8:46 AM, Surendra Singh Lilhore > wrote: > > Hi Ethan, > > Looks like this got

Re:OpenSearch connector for Flink version 1.18

2023-12-13 Thread Xuyang
Hi, Praveen Chandna. The latest news about opensearch connector is here[1]. You can subscribe the dev maillist[2] to ask someone related or comment under the relevant pr or jira. [1] https://lists.apache.org/thread/wzkpt7q166qpx4vd504vz9x5j01vh7y9 [2]

Re:关于文档中基于Table API 实现实时报表的问题

2023-12-13 Thread Xuyang
Hi, 你可以试一下用TO_TIMESTAMP(FROM_UNIXTIME(transaction_time)) 将long转为timestamp -- Best! Xuyang 在 2023-12-13 15:36:50,"ha.fen...@aisino.com" 写道: >文档中数据来源于kafka >tEnv.executeSql("CREATE TABLE transactions (\n" + >"account_id BIGINT,\n" + >"amount BIGINT,\n" +

OpenSearch connector for Flink version 1.18

2023-12-12 Thread Praveen Chandna via user
Hello In the Flink version 1.18, the connector for OpenSearch sink is not available. It's mentioned "There is no connector (yet) available for Flink version 1.18." https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/opensearch/ What is the plan, when will the

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff between restarting fast but not putting too much pressure on the cluster due to restarts. -Max On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote: > > Hi Maximilian and Mason, > > Thanks a lot for your feedback! >

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff between restarting fast but not putting too much pressure on the cluster due to restarts. -Max On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote: > > Hi Maximilian and Mason, > > Thanks a lot for your feedback! >

Socket timeout when report metrics to pushgateway

2023-12-12 Thread 李琳
hello, we build flink report metrics to prometheus pushgateway, the program has been running for a period of time, with a amount of data reported to pushgateway, pushgateway response socket timeout exception, and much of metrics data reported failed. following is the exception: 2023-12-12

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-11 Thread Rui Fan
Hi Maximilian and Mason, Thanks a lot for your feedback! After an offline consultation with Max, I guess I understand your concern for now: when flink job restarts, it will make a bunch of calls to the Kubernetes API, e.g. read/write to config maps, create task managers. Currently, the default

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-11 Thread Rui Fan
Hi Maximilian and Mason, Thanks a lot for your feedback! After an offline consultation with Max, I guess I understand your concern for now: when flink job restarts, it will make a bunch of calls to the Kubernetes API, e.g. read/write to config maps, create task managers. Currently, the default

Re: Flink operator autoscaler scaling down

2023-12-11 Thread Gyula Fóra
Could you please elaborate a little in which scenarios you find that the pending record metrics are not good to track Kafka lag? Thanks Gyula On Mon, Dec 11, 2023 at 4:26 PM Yang LI wrote: > Hello, > > Following our recent discussion, I've successfully implemented a Flink > operator featuring

Re: Flink operator autoscaler scaling down

2023-12-11 Thread Yang LI
Hello, Following our recent discussion, I've successfully implemented a Flink operator featuring a "memory protection" patch. However, in the course of my testing, I've encountered an issue: the Flink operator relies on the 'pending_record' metric to gauge backlog. Unfortunately, this metric

退订

2023-12-11 Thread RoyWilde
退订

Re: keyby mapState use question

2023-12-10 Thread Zakelly Lan
Hi, This should not happen. I guess the `onTimer` and `processElement` you are testing are triggered under different keyby keys. Note that the keyed states are partitioned by the keyby key first, so if querying or setting the state, you are only manipulating the specific partition which does not

<    10   11   12   13   14   15   16   17   18   19   >