Flink HA with Zookeeper and Docker Compose: unable to start up a working setup.

2023-12-29 Thread Alessio Bernesco Làvore
Hello, I'm trying to set up a testing environment using: - Flink HA with Zookeeper - Docker Compose. Starting the TaskManager generates an exception and, after some restarts, it fails. The exception is: "Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing

Re: How can the FileSystem Connector elegantly support writing to multiple paths at the same time

2023-12-29 Thread ying lin
Just select from the same source and use a statement set in Flink SQL to execute two INSERT statements into different sink tables (see the sketch below). Jiabao Sun wrote on Fri, Dec 29, 2023 at 16:55: > Hi, > > It is hard to write to multiple paths nicely with SQL alone; > with DataStream you could implement your own RichSinkFunction. > > Best, > Jiabao > > On 2023/12/29 08:37:34 jinzhuguang wrote: > > Flink version: 1.16.0 > > > > Example from the official website: > > CREATE
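For reference, a minimal Java sketch of the statement-set approach suggested above, assuming Flink 1.16's EXECUTE STATEMENT SET syntax; the source schema, sink paths and formats are placeholders, not taken from the thread:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MultiPathInsertSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        tEnv.executeSql("CREATE TABLE src (id INT, name STRING) WITH ('connector' = 'datagen')");
        tEnv.executeSql("CREATE TABLE sink_a (id INT, name STRING) WITH ("
                + "'connector' = 'filesystem', 'path' = 'file:///tmp/path_a', 'format' = 'json')");
        tEnv.executeSql("CREATE TABLE sink_b (id INT, name STRING) WITH ("
                + "'connector' = 'filesystem', 'path' = 'file:///tmp/path_b', 'format' = 'json')");

        // One statement set containing two INSERTs that read the same source;
        // both writes run in a single job.
        tEnv.executeSql(
                "EXECUTE STATEMENT SET BEGIN "
                        + "INSERT INTO sink_a SELECT id, name FROM src; "
                        + "INSERT INTO sink_b SELECT id, name FROM src; "
                        + "END");
    }
}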

RE: How can the FileSystem Connector elegantly support writing to multiple paths at the same time

2023-12-29 Thread Jiabao Sun
Hi, it is hard to write to multiple paths nicely with SQL alone; with DataStream you could implement your own RichSinkFunction. Best, Jiabao On 2023/12/29 08:37:34 jinzhuguang wrote: > Flink version: 1.16.0 > > Example from the official website: > CREATE TABLE MyUserTable ( > column_name1 INT, > column_name2 STRING, > ... > part_name1 INT, > part_name2 STRING > )

How can the FileSystem Connector elegantly support writing to multiple paths at the same time

2023-12-29 Thread jinzhuguang
Flink version: 1.16.0. Example from the official website: CREATE TABLE MyUserTable ( column_name1 INT, column_name2 STRING, ... part_name1 INT, part_name2 STRING ) PARTITIONED BY (part_name1, part_name2) WITH ( 'connector' = 'filesystem', -- required: specify the connector type 'path' = 'file:///path/to/whatever', -- required: specify the path

RE: Flink SQL Windowing TVFs

2023-12-28 Thread Jiabao Sun
Hi, in version 1.14.0 the CUMULATE function has to be used in a GROUP BY aggregation scenario[1]. Does the SQL deployed to production contain a GROUP BY expression? Is the Flink version used for local testing 1.14.0? Best, Jiabao [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/dev/table/sql/queries/window-tvf/#cumulate On 2023/12/29 04:57:09 "jiaot...@mail.jj.cn" wrote: > Hi,

[DISCUSS] Hadoop 2 vs Hadoop 3 usage

2023-12-28 Thread Martijn Visser
Hi all, I want to get some insights on how many users are still using Hadoop 2 vs how many users are using Hadoop 3. Flink currently requires a minimum version of Hadoop 2.10.2 for certain features, but also extensively uses Hadoop 3 (like for the file system implementations) Hadoop 2 has a

Re: How does flink cdc 3.0 schema evolution work?

2023-12-27 Thread Jiabao Sun
Hi, yes, currently it will block. A flush + apply schema change usually does not last long, and schema changes are generally low-frequency events, so even blocking has little performance impact. Best, Jiabao > On Dec 28, 2023, at 12:57, casel.chen wrote: > > > > > Thanks for the explanation! > One more question: if a pipeline synchronizes several tables and only one of them has a schema change, will data processing of the other tables also be blocked? > > > > > > > > > On 2023-12-28

Re:Re: How does flink cdc 3.0 schema evolution work?

2023-12-27 Thread casel.chen
Thanks for the explanation! One more question: if a pipeline synchronizes several tables and only one of them has a schema change, will data processing of the other tables also be blocked? On 2023-12-28 01:16:40, "Jiabao Sun" wrote: >Hi, > >> Why does output.collect(new StreamRecord<>(schemaChangeEvent)); >> send the SchemaChangeEvent once more at the end? > >The Sink also receives the SchemaChangeEvent, because the Sink may need to adjust according to the schema change

Re: How does flink cdc 3.0 schema evolution work?

2023-12-27 Thread Jiabao Sun
Hi, > Why does output.collect(new StreamRecord<>(schemaChangeEvent)); > send the SchemaChangeEvent once more at the end? The Sink also receives the SchemaChangeEvent, because the Sink may need to adjust its serializer or writer according to the schema change; see DorisEventSerializer. > Why is the final requestReleaseUpstream() call blocked? How is the upstream held and then released? It gets blocked

How does flink cdc 3.0 schema evolution work?

2023-12-27 Thread casel.chen
I read the infoq article introducing flink cdc 3.0, https://xie.infoq.cn/article/a80608df71c5291186153600b, and I cannot figure out the design of its schema evolution: how does the framework guarantee the ordering of schema changes? The article does not explain this in detail. Looking at the changeEvents produced from the mysql binlog, all changes are linear in time, e.g. s1, d1, d2, s2, d3, d4, d5, s3, d6, where d is a data change and s is a schema change. This means d1 and d2 use the s1 schema, while d3~d5 use s2

Re: Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-27 Thread Martijn Visser
Hi, If there's nothing that pushes the watermark forward, then the window won't be able to close. That's a common thing and expected for every operator that relies on watermarks. You can also decide to configure idleness in order to push the watermark forward if needed. Best regards, Martijn
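For the DataStream API, idleness is configured on the WatermarkStrategy; a minimal sketch (event type and timestamp field are placeholders) could look like this. For Flink SQL jobs the analogous knob is the table.exec.source.idle-timeout option.

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class IdlenessSketch {

    public static class MyEvent {
        public long timestampMillis;
    }

    public static WatermarkStrategy<MyEvent> strategy() {
        return WatermarkStrategy
                // Tolerate events up to 5 seconds out of order.
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, recordTs) -> event.timestampMillis)
                // Declare a source split idle after 1 minute without records so the
                // overall watermark can still advance and windows can close.
                .withIdleness(Duration.ofMinutes(1));
    }
}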

Re:Re:RE: Bug caused by filter pushdown on a lookup table

2023-12-25 Thread Xuyang
Hi, could you share the Flink version? If convenient, please also share the plan and a minimal reproducible dataset. -- Best! Xuyang On 2023-12-26 09:25:30, "杨光跃" wrote: > > > > > > >CompiledPlan plan = env.compilePlanSql("insert into out_console " + >" select r.apply_id from t_purch_apply_sent_route r " + >" left join t_purch_apply_sent_route_goods

Re: RE: Bug caused by filter pushdown on a lookup table

2023-12-25 Thread Benchao Li
This should be the same problem as described in FLINK-33365[1]; it is already being fixed and will be fixed in the latest JDBC Connector release. [1] https://issues.apache.org/jira/browse/FLINK-33365 杨光跃 wrote on Tue, Dec 26, 2023 at 09:25: > > > > > > > > CompiledPlan plan = env.compilePlanSql("insert into out_console " + > " select r.apply_id from t_purch_apply_sent_route r "

Re:RE: Bug caused by filter pushdown on a lookup table

2023-12-25 Thread 杨光跃
CompiledPlan plan = env.compilePlanSql("insert into out_console " + " select r.apply_id from t_purch_apply_sent_route r " + " left join t_purch_apply_sent_route_goods FOR SYSTEM_TIME AS OF r.pt as t " + "ON t.apply_id = r.apply_id and t.isdel = r.isdel" + " where r.apply_id = 61558439941351

RE: Bug caused by filter pushdown on a lookup table

2023-12-25 Thread Jiabao Sun
Hi, the image in the email did not come through; please paste the SQL as text. Best, Jiabao On 2023/12/25 12:22:41 杨光跃 wrote: > My SQL is as follows: > > > > t_purch_apply_sent_route is created via flink cdc > t_purch_apply_sent_route_goods is a plain jdbc table > I expect only the rows matching the filter to be returned, but the current result returns all rows of t_purch_apply_sent_route > This is clearly not what I expect; the cause should be that the filter condition is pushed down too early >

Bug caused by filter pushdown on a lookup table

2023-12-25 Thread 杨光跃
My SQL is as follows: t_purch_apply_sent_route is created via flink cdc; t_purch_apply_sent_route_goods is a plain jdbc table. I expect only the rows matching the filter to be returned, but the current result returns all rows of t_purch_apply_sent_route. This is clearly not what I expect; the cause should be that the filter condition is pushed down too early. This should count as a bug, right? Or how should I write the SQL to get the result I expect?

Re: Re:RE: Flink - Java 17 support

2023-12-24 Thread xiangyu feng
Hi Praveen, I'm not sure what you need from Java 17. At Bytedance, we have tried compiling Flink on Java 8/Java 11 and running it on Java 17. It works well in production and also takes advantage of new Java 17 features. You can try this approach if you have an urgent need for Java 17. Hope this helps you.

Re: Re: Unsubscribe

2023-12-24 Thread Zhanshun Zou
Unsubscribe 李国辉 wrote on Wed, Nov 22, 2023 at 22:28: > > Unsubscribe > > > > > -- > Sent from my NetEase mobile mail > > > > - Original Message - > From: "Junrui Lee" > To: user-zh@flink.apache.org > Sent: Wed, 22 Nov 2023 10:19:32 +0800 > Subject: Re: Unsubscribe > > Hi, > > Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe. > > Best,

Re: Re: Flink CDC MySqlSplitReader question

2023-12-24 Thread Hang Ruan
Hi, as far as I remember this logic is there so that, after new tables are added, the binlog read can proceed together with the snapshot read of the new tables, ensuring the binlog read is not interrupted. Here it should read the binlog first, then the snapshot, then the binlog again; this switching keeps binlog data flowing continuously. Best, Hang casel.chen wrote on Fri, Dec 22, 2023 at 10:44: > So does that mean tables already in the incremental phase are handled first, and the newly added snapshot-phase tables afterwards? What would happen if the order were reversed? If the new tables have a lot of full data, will the incremental processing of the other tables slow down? > > > > > > > > > On

Re: [Flink Kubernetes Operator] Restoring from an outdated savepoint.

2023-12-22 Thread Gyula Fóra
Please upgrade the operator to the latest release, and if the issue still exists please open a Jira ticket with the details. Gyula On Fri, 22 Dec 2023 at 21:17, Ruibin Xing wrote: > I wanted to talk about an issue we've hit recently with Flink Kubernetes > Operator 1.6.1 and Flink 1.17.1. > >

[Flink Kubernetes Operator] Restoring from an outdated savepoint.

2023-12-22 Thread Ruibin Xing
I wanted to talk about an issue we've hit recently with Flink Kubernetes Operator 1.6.1 and Flink 1.17.1. As we're using the Savepoint upgrade mode, we ran into cases where the lastSavepoint in status doesn't seem to update (still digging into why, could be an exception when cancelling

Re:Re: Flink CDC MySqlSplitReader question

2023-12-21 Thread casel.chen
So does that mean tables already in the incremental phase are handled first, and the newly added snapshot-phase tables afterwards? What would happen if the order were reversed? If the new tables have a lot of full data, will the incremental processing of the other tables slow down? On 2023-12-20 21:40:05, "Hang Ruan" wrote: >Hi, casel > >This logic should only come into play when newly added tables are being processed. >Normally CDC dispatches the BinlogSplit only after all SnapshotSplits have been read. >But if the job restarts during the incremental phase and some tables have been newly added, there will be both a BinlogSplit and SnapshotSplits at the same time, and only then is this logic used. >

Re: Unsubscribe

2023-12-21 Thread Junrui Lee
Hello, please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe. Best, Junrui jimandlice wrote on Thu, Dec 21, 2023 at 17:43: > Unsubscribe > | | > jimandlice > | > | > Email: jimandl...@163.com > |

RE: Re: Flink dirty data handling

2023-12-21 Thread Jiabao Sun
Hi, if you need precise control over bad records, flink sql is not really recommended. Consider using DataStream and sending the bad records to a side output[1], then compensating for them. Best, Jiabao [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/side_output/ On 2023/12/06 08:45:20 Xuyang wrote: > Hi, > Regarding actively collecting dirty data in flink sql, there are the following two feasible approaches: > 1.
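A minimal DataStream sketch of the side-output pattern referenced in [1]; the parsing logic and tag name are made up for illustration:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class DirtyDataSideOutputSketch {

    // Tag for the dirty-record side output; the anonymous subclass keeps the type information.
    private static final OutputTag<String> DIRTY = new OutputTag<String>("dirty-records") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        SingleOutputStreamOperator<Long> parsed = env
                .fromElements("1", "2", "not-a-number")
                .process(new ProcessFunction<String, Long>() {
                    @Override
                    public void processElement(String value, Context ctx, Collector<Long> out) {
                        try {
                            out.collect(Long.parseLong(value)); // clean records continue downstream
                        } catch (NumberFormatException e) {
                            ctx.output(DIRTY, value);           // dirty records go to the side output
                        }
                    }
                });

        parsed.print();                                         // main stream
        DataStream<String> dirty = parsed.getSideOutput(DIRTY); // compensate / persist dirty records here
        dirty.print();

        env.execute("dirty-data-side-output-sketch");
    }
}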

RE: Re:Re: flink sql support for batch lookup join

2023-12-21 Thread Jiabao Sun
Hi, casel. It can be done with three lookup joins, and with a cache added the performance should not be bad. WITH users AS ( SELECT * FROM (VALUES(1, 'zhangsan'), (2, 'lisi'), (3, 'wangwu')) T (id, name) ) SELECT orders.id, u1.name as creator_name, u2.name as approver_name, u3.name as deployer_name FROM ( SELECT *

Unsubscribe

2023-12-21 Thread jimandlice
Unsubscribe | | jimandlice | | Email: jimandl...@163.com |

Unsubscribe

2023-12-21 Thread jimandlice
Unsubscribe | | jimandlice | | Email: jimandl...@163.com |

RE: Pending records

2023-12-21 Thread Jiabao Sun
Hi rania, Does "pending records" specifically refer to the records that have been read from the source but have not been processed yet? If this is the case, FLIP-33[1] introduces some standard metrics for Source, including "pendingRecords," which can be helpful. However, not all Sources

RE: Does Flink on K8s support log4j2 kafka appender? [Flink log] [Flink Kubernetes Operator]

2023-12-20 Thread Jiabao Sun
Hi Chosen, Whether kafka appender is supported or not has no relation to the flink-kubernetes-operator. It only depends on whether log4j2 supports kafka appender. From the error message, it appears that the error is caused by the absence of the log4j-layout-template-json[1] plugin. For the

Flink India Jobs Referral

2023-12-20 Thread sri hari kali charan Tummala
Hi Community, I got laid off at Apple in Feb 2023, which forced me to move out of the USA due to an immigration problem (H-1B). I was a Big Data, Spark, Scala, Python and Flink consultant with over 12 years of experience. I still haven't landed a job in India since then. I need referrals in India in

Re:RE: Re:RE: Flink - Java 17 support

2023-12-20 Thread Xuyang
Hi, Praveen Chandna. I don't know much about this plan; you can ask on the dev mailing list [1]. [1] https://flink.apache.org/what-is-flink/community/ -- Best! Xuyang On 2023-12-20 14:53:52, "Praveen Chandna via user" wrote: Hello Xuyang One more query, is there a plan to release

Re: Flink CDC MySqlSplitReader question

2023-12-20 Thread Hang Ruan
Hi, casel. This logic should only come into play when newly added tables are being processed. Normally CDC dispatches the BinlogSplit only after all SnapshotSplits have been read. But if the job restarts during the incremental phase and some tables have been newly added, there will be both a BinlogSplit and SnapshotSplits at the same time, and only then is this logic used. Best, Hang key lou wrote on Wed, Dec 20, 2023 at 16:24: > It means that when there is a binlog split, the snapshot has already been fully read > > casel.chen wrote on Tue, Dec 19, 2023 at 16:45: > >

Need help: StackOverflowError from SerializedThrowable when asyncLookup throws an exception while using AsyncLookupFunction

2023-12-20 Thread Manong Karl
A simple example: public class TableA implements LookupTableSource { @Nullable private final LookupCache cache; public TableA(@Nullable LookupCache cache) { this.cache = cache; } @Override public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext context) {

Re: Flink CDC MySqlSplitReader question

2023-12-20 Thread key lou
It means that when there is a binlog split, the snapshot has already been fully read. casel.chen wrote on Tue, Dec 19, 2023 at 16:45: > While reading the source code of the flink-connector-mysql-cdc project I ran into something I don't understand; any pointers would be appreciated, thanks! > > > The MySqlSplitReader class has a piece of code with the comment "(1) Reads binlog split firstly and then read > snapshot split", which I don't understand. > Why read the binlog split first and then the snapshot

RE: Re:RE: Flink - Java 17 support

2023-12-19 Thread Praveen Chandna via user
Hello Xuyang One more query, is there a plan to release formal Java 17 support in Flink release 1.19 or 1.20? Or do we need to wait till Flink 2.0? Thanks !! From: Xuyang Sent: 19 December 2023 16:37 To: user@flink.apache.org Subject: Re:RE: Flink - Java 17 support Hi, Praveen Chandna.

Re:RE: Flink - Java 17 support

2023-12-19 Thread Xuyang
Hi, Praveen Chandna. Please correct me if I'm wrong. From the release note of Flink 1.18 [1], you can see that although 1.18 already supports Java 17, it is still in beta mode. [1] https://nightlies.apache.org/flink/flink-docs-master/zh/release-notes/flink-1.18/ -- Best! Xuyang

RE: Flink - Java 17 support

2023-12-19 Thread Praveen Chandna via user
Hello Team Please help with a reply on Java 17 support. Is the Java 17 experimental support production ready? Is it recommended to use Flink 1.18 with Java 17, or is there any risk if we move to Java 17 in production? Thanks !! From: Praveen Chandna Sent: 17 December 2023 21:28 To: Praveen

Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2023-12-19 Thread Вова Фролов
Hello Flink Community, I am writing to you about an issue I have encountered while using Apache Flink version 1.17.1. In my Flink job, I am using Kafka version 3.6.0 to ingest data from TPC-DS (current tpcds100, target size tpcds1), and then I am executing SQL queries, specifically the q77

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-19 Thread Rui Fan
Thanks everyone for the feedback! There hasn't been any more feedback here, so I started the new vote[1] just now to update the default value of backoff-multiplier from 1.2 to 1.5. [1] https://lists.apache.org/thread/0b1dcwb49owpm6v1j8rhrg9h0fvs5nkt Best, Rui On Tue, Dec 12, 2023 at 7:14 PM

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-19 Thread Rui Fan
Thanks everyone for the feedback! There hasn't been any more feedback here, so I started the new vote[1] just now to update the default value of backoff-multiplier from 1.2 to 1.5. [1] https://lists.apache.org/thread/0b1dcwb49owpm6v1j8rhrg9h0fvs5nkt Best, Rui On Tue, Dec 12, 2023 at 7:14 PM

Re: Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-19 Thread T, Yadhunath
Hi Xuyang, Thanks for the reply! I haven't used a print connector yet. Thanks, Yad From: Xuyang Sent: Monday, December 18, 2023 8:26 AM To: T, Yadhunath Cc: user@flink.apache.org Subject: Re:Re: Re:Re: Event stuck in the Flink operator

Flink CDC MySqlSplitReader question

2023-12-19 Thread casel.chen
While reading the source code of the flink-connector-mysql-cdc project I ran into something I don't understand; any pointers would be appreciated, thanks! The MySqlSplitReader class has the following piece of code, and I don't understand the comment "(1) Reads binlog split firstly and then read snapshot split". Why read the binlog split before the snapshot split? To preserve record ordering, shouldn't the full snapshot split be read first and the incremental binlog split afterwards? private MySqlRecords pollSplitRecords() throws

Re: Configuration not propagating unless explicitly initializing FileSystem for Azure File System

2023-12-18 Thread Junrui Lee
Hi, Yuval The issue you're encountering arises because Flink's FileSystem is designed to operate with cluster-level configuration. And this configuration is sourced from the flink-conf.yaml file. Consequently, when the FileSystem is initialized, it isn't able to access the configuration objects
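The explicit initialization mentioned in the subject could look roughly like the sketch below; the Azure configuration key is a placeholder and the exact keys depend on the file system plugin in use:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.plugin.PluginUtils;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalFileSystemInitSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Placeholder credential key; use whatever keys your file system expects.
        conf.setString("fs.azure.account.key.<account>.blob.core.windows.net", "...");

        // Hand the configuration to the FileSystem machinery explicitly, since it does not
        // see the configuration passed to getExecutionEnvironment() when running in an IDE.
        FileSystem.initialize(conf, PluginUtils.createPluginManagerFromRootFolder(conf));

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... build and execute the job ...
    }
}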

Configuration not propagating unless explicitly initializing FileSystem for Azure File System

2023-12-18 Thread Yuval Itzchakov
Flink: 1.17.1 Hi, I've encountered a weird issue when passing a configuration object to StreamExecutionEnvironment.getExecutionEnvironment does not propagate to the hadoop file system being initialized when running Flink locally in an IDE. I am passing credentials in order to connect to Azure

RE: Feature flag functionality on flink

2023-12-18 Thread Jiabao Sun
Hi, If it is for simplicity, you can also try writing the flag into an external system, such as Redis, Zookeeper or MySQL, and query the flag from the external system when performing data processing. However, Broadcast State is still the mode that I recommend. Perhaps we only need to encapsulate
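A minimal sketch of the broadcast-state variant recommended above; the flag names, streams and descriptor are placeholders:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class FeatureFlagSketch {

    // Broadcast (flag) state: flag name -> enabled.
    static final MapStateDescriptor<String, Boolean> FLAGS =
            new MapStateDescriptor<>("feature-flags", Types.STRING, Types.BOOLEAN);

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> events = env.fromElements("a", "b", "c");
        // In a real job this stream would come from the configuration topic.
        BroadcastStream<String> flagUpdates = env.fromElements("new-logic").broadcast(FLAGS);

        events.connect(flagUpdates)
                .process(new BroadcastProcessFunction<String, String, String>() {
                    @Override
                    public void processElement(String value, ReadOnlyContext ctx, Collector<String> out)
                            throws Exception {
                        Boolean enabled = ctx.getBroadcastState(FLAGS).get("new-logic");
                        out.collect(Boolean.TRUE.equals(enabled) ? "new:" + value : "old:" + value);
                    }

                    @Override
                    public void processBroadcastElement(String flag, Context ctx, Collector<String> out)
                            throws Exception {
                        // Update the flag; every parallel instance receives the broadcast element.
                        ctx.getBroadcastState(FLAGS).put(flag, true);
                    }
                })
                .print();

        env.execute("feature-flag-sketch");
    }
}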

Re: Unsubscribe

2023-12-18 Thread Hang Ruan
Please send an email with any content to user-unsubscr...@flink.apache.org if you want to unsubscribe from the u...@flink.apache.org mailing list; you can refer to [1][2] for more details on managing your subscription. Best, Hang [1]

Re: Unsubscribe

2023-12-18 Thread Hang Ruan
Please send an email with any content to user-unsubscr...@flink.apache.org if you want to unsubscribe from the user@flink.apache.org mailing list; you can refer to [1][2] for more details on managing your subscription. Best, Hang [1]

Re: Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Hang Ruan
Hi, Dominik. From the code, your sink has received an InvalidTxnStateException in KafkaCommitter[1]. The kafka connector treats it as a known exception and invokes `signalFailedWithKnownReason`. `signalFailedWithKnownReason` will not throw an exception; it lets the committer decide whether to fail or

RE: Serialization exception from the official Elasticsearch sink interface in flink1.15-flink1.18

2023-12-18 Thread Jiabao Sun
Hi, is createIndexRequest perhaps not static? In Scala you can declare the method in an object. Accessing a non-static method from a lambda when the outer class is not serializable may make the lambda unserializable. Best, Jiabao On 2023/12/12 07:53:53 李世钰 wrote: > val result: ElasticsearchSink[String] = new Elasticsearch7SinkBuilder[String] > // This instructs the sink to emit after every element,
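In Java, the equivalent of the suggested Scala object method is a static helper, so the emitter lambda captures no enclosing (possibly non-serializable) instance. A sketch, with index name and payload as placeholders:

import java.util.HashMap;
import java.util.Map;
import org.apache.flink.connector.elasticsearch.sink.Elasticsearch7SinkBuilder;
import org.apache.flink.connector.elasticsearch.sink.ElasticsearchSink;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

public class StaticEmitterSketch {

    // Static helper: referencing it from the lambda does not pull the outer class into the closure.
    private static IndexRequest createIndexRequest(String element) {
        Map<String, Object> json = new HashMap<>();
        json.put("data", element);
        return Requests.indexRequest().index("my-index").source(json);
    }

    public static ElasticsearchSink<String> buildSink() {
        return new Elasticsearch7SinkBuilder<String>()
                .setHosts(new HttpHost("127.0.0.1", 9200, "http"))
                .setBulkFlushMaxActions(1) // emit after every element, as in the quoted snippet
                .setEmitter((element, context, indexer) -> indexer.add(createIndexRequest(element)))
                .build();
    }
}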

Unsubscribe

2023-12-18 Thread 唐大彪
Unsubscribe

Unsubscribe

2023-12-18 Thread 唐大彪
Unsubscribe

Re: Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Alexis Sarda-Espinosa
Hi Dominik, Sounds like it could be this? https://issues.apache.org/jira/browse/FLINK-28060 It doesn't mention transactions but I'd guess it could be the same mechanism. Regards, Alexis. On Mon, 18 Dec 2023, 07:51 Dominik Wosiński, wrote: > Hey, > I've got a question regarding the

Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Dominik Wosiński
Hey, I've got a question regarding the transaction failures in an EXACTLY_ONCE flow with Flink 1.15.3 and Confluent Cloud Kafka. The case is that there is a FlinkKafkaProducer in an EXACTLY_ONCE setup with the default transaction.timeout.ms of 15min. During the

Re: Flink 1.18.0 Checkpoints on cancelled jobs

2023-12-18 Thread Hong Liang
Hi Ethan! Thanks for raising the issue, this is indeed a bug - for the previous code path, it falls back to the "execution graph store" for completed jobs. I've raised a JIRA here - https://issues.apache.org/jira/browse/FLINK-33872 I've also managed to RC and fix it in the associated PR -

RE: Java 8 support in Flink 1.18

2023-12-18 Thread Praveen Chandna via user
Thanks a lot for quick response. From: Junrui Lee Sent: 18 December 2023 15:57 To: Praveen Chandna via user Cc: Praveen Chandna Subject: Re: Java 8 support in Flink 1.18 Hello, Praveen According to the Flink release-2.0 wiki page

Re: Java 8 support in Flink 1.18

2023-12-18 Thread Junrui Lee
Hello, Praveen According to the Flink release-2.0 wiki page ( https://cwiki.apache.org/confluence/display/FLINK/2.0+Release), there is a must-have task to remove Java 8 support. Therefore, Flink plans to remove support for Java 8 in the upcoming release-2.0. Best regards, Junrui Praveen

Re: Java 8 support in Flink 1.18

2023-12-18 Thread Junrui Lee
Hello, Praveen Java 8 is still supported in the latest Flink 1.18 release despite its deprecation. You can continue using Java 8 for now. However, deprecation serves as a warning that in future releases, Java 8 may no longer be supported. It is recommended that you begin planning your migration

Java 8 support in Flink 1.18

2023-12-18 Thread Praveen Chandna via user
Hello Team As per the Flink 1.15 release notes, Java 8 support has been deprecated. Will there be any impact ? Can I continue to use Java 8 or is there any risk ? Thanks !! // Regards Praveen Chandna

Re:Re: Re:Re: Event stuck in the Flink operator

2023-12-17 Thread Xuyang
Hi, Yad. These SQLs seem to be fine. Have you tried using the print connector as a sink to test whether there is any problem? If everything is fine with the print sink, then the problem is in the kafka sink. -- Best! Xuyang On 2023-12-15 18:48:45, "T, Yadhunath" wrote: Hi Xuyang,
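A minimal sketch of the print-sink test suggested above; the schema is a placeholder and should mirror the real Kafka sink table:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PrintSinkDebugSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Same columns as the Kafka sink table, but backed by the print connector.
        tEnv.executeSql(
                "CREATE TABLE debug_sink (id STRING, ts TIMESTAMP(3)) WITH ('connector' = 'print')");

        // Point the existing INSERT at debug_sink instead of the Kafka table, e.g.:
        // tEnv.executeSql("INSERT INTO debug_sink SELECT ... FROM <joined result>");
    }
}

If records show up on the print sink but not in Kafka, the problem is on the Kafka sink side; if nothing is printed, the join/watermark side needs another look.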

Flink - Java 17 support

2023-12-17 Thread Praveen Chandna via user
Hello In the Flink version 1.18, there is experimental support for Java 17. What is the plan for Java 17 supported release of Flink ? As the below Jira is already closed for Java 17 support. https://issues.apache.org/jira/browse/FLINK-15736 Thanks !! // Regards Praveen Chandna

RE: Control who can manage Flink jobs

2023-12-17 Thread Jiabao Sun
Hi, I don't have much experience with Beam. If you only need to submit Flink tasks, I would recommend StreamPark[1]. Best, Jiabao [1] https://streampark.apache.org/docs/user-guide/Team On 2023/11/30 09:21:50 Поротиков Станислав Вячеславович via user wrote: > Hello! > Is there any way to

RE: Socket timeout when report metrics to pushgateway

2023-12-17 Thread Jiabao Sun
Hi, The pushgateway uses push mode to report metrics. When deployed on a single machine under high load, there may be some performance issues. A simple solution is to set up multiple pushgateways and push the metrics to different pushgateways based on different task groups. There are other

Re: Avoid dynamic classloading in native mode with Kubernetes Operator

2023-12-15 Thread Trystan
Is it better to create a Jira issue for this? From what I can tell it is currently not possible to use native k8s mode while having the jar in the system classpath. This seems like something that would generally be useful in order to avoid dynamic classloading when using the operator. On Mon, Nov

Scaling in stateful application

2023-12-15 Thread rania duni
Hello all! I have enabled flink Kubernetes operator's scaler and I have a stateful application. In the yaml file I have set the savepoint directory and the upgrade mode is savepoint. What configurations should be enabled to ensure a safe upgrade with savepoints during scaling operations?

multiple jobs in same deployment file

2023-12-15 Thread Shravan Kumar Reddy
Hi Team, How can we configure multiple task managers and multiple jobs with the same deployment file with the flink operator? *Deployment.yaml* apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: basic-example spec: image: flink:1.17 flinkVersion: v1_17

Re: Re:Re: Event stuck in the Flink operator

2023-12-15 Thread T, Yadhunath
Hi Xuyang, Thanks for the reply! I don't have any dataset to share with you at this time. The last block in the Flink pipeline performs 2 functions - temporal join( 2nd temporal join) and writing data into the sink topic. This is what Flink SQL code looks like - SELECT * FROM TABLE1 T1

Re: Event stuck in the Flink operator

2023-12-15 Thread T, Yadhunath
Hi Alex, Thanks for the reply! I have 3 Flink SQL table which is joined using temporal join. I am using the timestamp field in the first table as the event time attribute. eg: SELECT * FROM TABLE1 T1 LEFT JOIN TABLE2 FOR SYSTEM_TIME AS OF T1.timestamp as T2 -- Temporal join ON

Re:Re: Event stuck in the Flink operator

2023-12-14 Thread Xuyang
Hi, Yad. Can you share the smallest set of SQL that can reproduce this problem? BTW, by the last flink operator do you mean the sink with the kafka connector? -- Best! Xuyang On 2023-12-15 04:38:21, "Alex Cruise" wrote: Can you share your precise join semantics? I don't know

Re: Event stuck in the Flink operator

2023-12-14 Thread Alex Cruise
Can you share your precise join semantics? I don't know about Flink SQL offhand, but here are a couple ways to do this when you're using the DataStream API: * use the Session Window join

Event stuck in the Flink operator

2023-12-14 Thread T, Yadhunath
Hi, I am using Flink version 1.16 and I have a streaming job that uses PyFlinkSQL API. Whenever a new streaming event comes in it is not getting processed in the last Flink operator ( it performs temporal join along with writing data into Kafka topic) and it will be only pushed to Kafka on

Re: Flink 1.18.0 Checkpoints on cancelled jobs

2023-12-14 Thread Ethan T Yang
Hi Hong Liang Teoh, I think you are the owner of the ticket below. Can you take a look and see if there is a bug in the code that breaks retrieving the checkpoint history of a cancelled job? Thanks, Ivan > On Dec 10, 2023, at 8:46 AM, Surendra Singh Lilhore > wrote: > > Hi Ethan, > > Looks like this got

Re:OpenSearch connector for Flink version 1.18

2023-12-13 Thread Xuyang
Hi, Praveen Chandna. The latest news about opensearch connector is here[1]. You can subscribe the dev maillist[2] to ask someone related or comment under the relevant pr or jira. [1] https://lists.apache.org/thread/wzkpt7q166qpx4vd504vz9x5j01vh7y9 [2]

Re: Question about the real-time reporting tutorial based on the Table API in the documentation

2023-12-13 Thread Xuyang
Hi, you can try TO_TIMESTAMP(FROM_UNIXTIME(transaction_time)) to convert the long into a timestamp. -- Best! Xuyang On 2023-12-13 15:36:50, "ha.fen...@aisino.com" wrote: >In the documentation the data comes from kafka >tEnv.executeSql("CREATE TABLE transactions (\n" + >"account_id BIGINT,\n" + >"amount BIGINT,\n" +
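A sketch of the suggested conversion as a computed column, assuming transaction_time arrives as epoch seconds in a BIGINT (the datagen connector here is only a stand-in for the kafka table in the tutorial):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TransactionTimestampSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE transactions (\n"
                        + "  account_id BIGINT,\n"
                        + "  amount BIGINT,\n"
                        + "  transaction_time BIGINT,\n"
                        // Computed column: epoch seconds -> STRING -> TIMESTAMP(3),
                        // which can then carry the watermark.
                        + "  ts AS TO_TIMESTAMP(FROM_UNIXTIME(transaction_time)),\n"
                        + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n"
                        + ") WITH ('connector' = 'datagen')");
    }
}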

OpenSearch connector for Flink version 1.18

2023-12-12 Thread Praveen Chandna via user
Hello In the Flink version 1.18, the connector for OpenSearch sink is not available. It's mentioned "There is no connector (yet) available for Flink version 1.18." https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/opensearch/ What is the plan, when will the

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff between restarting fast but not putting too much pressure on the cluster due to restarts. -Max On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote: > > Hi Maximilian and Mason, > > Thanks a lot for your feedback! >

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-12 Thread Maximilian Michels
Thank you Rui! I think a 1.5 multiplier is a reasonable tradeoff between restarting fast but not putting too much pressure on the cluster due to restarts. -Max On Tue, Dec 12, 2023 at 8:19 AM Rui Fan <1996fan...@gmail.com> wrote: > > Hi Maximilian and Mason, > > Thanks a lot for your feedback! >

Socket timeout when report metrics to pushgateway

2023-12-12 Thread 李琳
Hello, we have Flink report metrics to a Prometheus pushgateway. After the program had been running for a period of time, with a large amount of data reported to the pushgateway, the pushgateway responded with a socket timeout exception and much of the metrics data failed to be reported. The following is the exception: 2023-12-12

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-11 Thread Rui Fan
Hi Maximilian and Mason, Thanks a lot for your feedback! After an offline consultation with Max, I guess I understand your concern for now: when flink job restarts, it will make a bunch of calls to the Kubernetes API, e.g. read/write to config maps, create task managers. Currently, the default

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-11 Thread Rui Fan
Hi Maximilian and Mason, Thanks a lot for your feedback! After an offline consultation with Max, I guess I understand your concern for now: when flink job restarts, it will make a bunch of calls to the Kubernetes API, e.g. read/write to config maps, create task managers. Currently, the default

Re: Flink operator autoscaler scaling down

2023-12-11 Thread Gyula Fóra
Could you please elaborate a little in which scenarios you find that the pending record metrics are not good to track Kafka lag? Thanks Gyula On Mon, Dec 11, 2023 at 4:26 PM Yang LI wrote: > Hello, > > Following our recent discussion, I've successfully implemented a Flink > operator featuring

Re: Flink operator autoscaler scaling down

2023-12-11 Thread Yang LI
Hello, Following our recent discussion, I've successfully implemented a Flink operator featuring a "memory protection" patch. However, in the course of my testing, I've encountered an issue: the Flink operator relies on the 'pending_record' metric to gauge backlog. Unfortunately, this metric

Unsubscribe

2023-12-11 Thread RoyWilde
Unsubscribe

Re: keyby mapState use question

2023-12-10 Thread Zakelly Lan
Hi, This should not happen. I guess the `onTimer` and `processElement` you are testing are triggered under different keyby keys. Note that the keyed states are partitioned by the keyby key first, so if querying or setting the state, you are only manipulating the specific partition which does not
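A sketch illustrating the point above: state written in processElement is only visible in onTimer when the timer fires for the same keyBy key, because keyed state is partitioned per key. Names are placeholders:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class PerKeyStateSketch extends KeyedProcessFunction<String, String, String> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        Long current = count.value();
        count.update(current == null ? 1L : current + 1L);
        // The timer is registered under the current key; onTimer will see
        // exactly the state partition of that key.
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 10_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        out.collect(ctx.getCurrentKey() + " -> " + count.value());
    }
}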

Re: Flink 1.18.0 Checkpoints on cancelled jobs

2023-12-10 Thread Surendra Singh Lilhore
Hi Ethan, Looks like this got changed after https://issues.apache.org/jira/browse/FLINK-32469. Now the checkpoint history call throws the below exception for a canceled job. 2023-12-10 21:50:12,990 ERROR org.apache.flink.runtime.rest.handler.job.checkpoints. CheckpointingStatisticsHandler [] -

Re: Production deployment of Flink

2023-12-07 Thread Gyula Fóra
Hi! We recommend using the community supported Flink Kubernetes Operator: https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.7/docs/try-flink-kubernetes-operator/quick-start/ Cheers, Gyula On Thu, Dec 7, 2023 at 6:33 PM Tauseef Janvekar wrote: > Hi Al, > > I am using

Feature flag functionality on flink

2023-12-07 Thread Oscar Perez via user
Hi, We would like to enable sort of a feature flag functionality for flink jobs. The idea would be to use broadcast state reading from a configuration topic and then ALL operators with logic would listen to this state. This documentation:

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-07 Thread Maximilian Michels
Hey Rui, +1 for changing the default restart strategy to exponential-delay. This is something all users eventually run into. They end up changing the restart strategy to exponential-delay. I think the current defaults are quite balanced. Restarts happen quickly enough unless there are consecutive

Re: [DISCUSS] Change the default restart-strategy to exponential-delay

2023-12-07 Thread Maximilian Michels
Hey Rui, +1 for changing the default restart strategy to exponential-delay. This is something all users eventually run into. They end up changing the restart strategy to exponential-delay. I think the current defaults are quite balanced. Restarts happen quickly enough unless there are consecutive

Production deployment of Flink

2023-12-07 Thread Tauseef Janvekar
Hi All, I am using flink in my local setup and it works just fine - I installed it using the confluent example training course. Here I had to manually execute start-cluster.sh and other steps to start task managers. We installed flink on kubernetes using the bitnami helm chart and it works just fine. But

RE: Reading text file from S3

2023-12-07 Thread Fourais
Thank you Jaehyeon Kim. A workaround consists of exporting AWS credentials as environment variables using https://github.com/linaro-its/aws2-wrap The command below will set the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, which makes the Flink code work.

Unable to locate full stderr logs while using Flink Operator

2023-12-07 Thread Edgar H
Hi all! I've just deployed an Apache Beam job using FlinkRunner in k8s and found that the job failed and has the following field: error: >- {"type":"org.apache.flink.util.SerializedThrowable","message":"org.apache.flink.client.program.ProgramInvocationException: The main method caused an

Re: Query on using two sinks for a Flink job (Flink SQL)

2023-12-07 Thread elakiya udhayanan
Hi Chen/ Feng, Thanks for pointing out the mistake I made, after correcting the query I am able to run the job with two sinks successfully. Thanks, Elakiya On Thu, Dec 7, 2023 at 4:37 AM Chen Yu wrote: > Hi Chen, > You should tell flink which table to insert by “INSERT INTO XXX SELECT >

keyby mapState use question

2023-12-07 Thread Jake.zhang
Hi all: KeyBy process function EventKeyedBroadcastProcessFunction { private transient mapstate = null; public void open(Configuration parameters) throws Exception { // initial map state } public void processElement() { // can't get onTimer() function set state key value }

[flink-k8s-connector] In-place scaling up often takes several times till it succeeds.

2023-12-06 Thread Xiaolong Wang
Hi, I'm playing with a Flink 1.18 demo with the auto-scaler and the adaptive scheduler. The operator can correctly collect data and order the job to scale up, but it'll take the job several times to reach the required parallelism. E.g. The original parallelism for each vertex is something like

Re: Query on using two sinks for a Flink job (Flink SQL)

2023-12-06 Thread Chen Yu
Hi Chen, You should tell flink which table to insert into by "INSERT INTO XXX SELECT XXX". For a single non-insert query, flink will collect the output to the console automatically, so it also works without adding an insert. But you must point out the target table specifically when you need to write

Reading text file from S3

2023-12-06 Thread Fourais
Hi, Using Flink 1.18 and Java 17, I am trying to read a text file from S3 using env.readTextFile("s3://mybucket/folder1/file.txt"). When I run the app in the IDE, I get the following error: Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by

Re: Query on using two sinks for a Flink job (Flink SQL)

2023-12-06 Thread Feng Jin
Hi Elakiya, You should use DML in the statement set instead of DQL. Here is a simple example: executeSql("CREATE TABLE source_table1 .."); executeSql("CREATE TABLE source_table2 .."); executeSql("CREATE TABLE sink_table1 .."); executeSql("CREATE TABLE sink_table2 ..");

Re: Query on using two sinks for a Flink job (Flink SQL)

2023-12-06 Thread elakiya udhayanan
Hi Xuyang, Zhanghao, Thanks for your response, I have attached sample job files that I tried with the StatementSet and with two queries. Please let me know if you are able to point out where I am possibly going wrong. Thanks, Elakiya On Wed, Dec 6, 2023 at 4:51 PM Xuyang wrote: > Hi, Elakiya.

Re:Query on using two sinks for a Flink job (Flink SQL)

2023-12-06 Thread Xuyang
Hi, Elakiya. Are you following the example here[1]? Could you attach a minimal, reproducible SQL? [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/insert/ -- Best! Xuyang At 2023-12-06 17:49:17, "elakiya udhayanan" wrote: Hi Team, I would like

Re: Query on using two sinks for a Flink job (Flink SQL)

2023-12-06 Thread Zhanghao Chen
Hi Elakiya, You can try executing TableEnvironmentImpl#executeInternal for non-insert statements, then using StatementSet.addInsertSql to add multiple insertion statements, and finally calling StatementSet#execute. Best, Zhanghao Chen From: elakiya udhayanan
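A minimal sketch of that approach, assuming two placeholder sink tables; DDL goes through executeSql, while the two INSERT statements (DML) are grouped into one StatementSet so both sinks run in a single job:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class TwoSinkJobSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        tEnv.executeSql("CREATE TABLE source_table (id INT, name STRING) WITH ('connector' = 'datagen')");
        tEnv.executeSql("CREATE TABLE sink_table1 (id INT, name STRING) WITH ('connector' = 'print')");
        tEnv.executeSql("CREATE TABLE sink_table2 (id INT, name STRING) WITH ('connector' = 'blackhole')");

        // Plain SELECTs (DQL) would each start their own job; the INSERTs below
        // are added to a single statement set and submitted together.
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO sink_table1 SELECT id, name FROM source_table");
        set.addInsertSql("INSERT INTO sink_table2 SELECT id, name FROM source_table");
        set.execute();
    }
}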
