RE: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-23 Thread Jean-Marc Paulin
Thanks for your insight. I am still trying to understand exactly what happens here. We currently have the default settings in Kafka, and we set "transaction.timeout.ms" to 15 minutes (which also happens to be the default "transaction.max.timeout.ms"). My expectation would be that if our

Job goes into FINISHED state after rescaling - Flink operator

2024-04-22 Thread Maxim Senin via user
Hi. My Flink Deployment is set to use savepoint for upgrades and for taking savepoint before stopping. When rescaling happens, for some reason it scales the JobManager to zero (“Scaling JobManager Deployment to zero with 300 seconds timeout”) and the job goes into FINISHED state. It doesn’t

FlinkCEP

2024-04-22 Thread Esa Heikkinen
Hi, it's been over 5 years since I last did anything with FlinkCEP and Flink. Has there been any significant development in FlinkCEP during this time? BR, Esa

Re: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-21 Thread Yanfei Lei
Hi JM, Yes, `InvalidPidMappingException` occurs because the transaction is lost in most cases. For the short term, setting "transaction.timeout.ms" > "transactional.id.expiration.ms" allows the `InvalidPidMappingException` to be ignored[1]. For the long term, FLIP-319[2] provides a solution. [1]
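
For reference, a minimal sketch of the short-term workaround described above: raising `transaction.timeout.ms` on an exactly-once Flink KafkaSink. Broker address, topic, and prefix are placeholders, and the broker-side `transaction.max.timeout.ms` must be at least as large as the value set here.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceKafkaSinkSketch {
    public static KafkaSink<String> build() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("broker:9092")                 // placeholder
                .setRecordSerializer(
                        KafkaRecordSerializationSchema.builder()
                                .setTopic("output-topic")           // placeholder
                                .setValueSerializationSchema(new SimpleStringSchema())
                                .build())
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                // Must not exceed the broker's transaction.max.timeout.ms.
                .setProperty("transaction.timeout.ms", "900000")    // 15 minutes
                .setTransactionalIdPrefix("my-job")                 // placeholder
                .build();
    }
}
```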

Re: Flink 1.18.1 cannot read from Kafka

2024-04-21 Thread Phil Stavridis
Thanks Biao. Kind regards Phil > On 14 Apr 2024, at 18:04, Biao Geng wrote: > > Hi Phil, > > You can check my github link > > for a detailed tutorial and example codes :). > > Best, > Biao Geng > > Phil

Request to unsubscribe from the mailing list

2024-04-21 Thread Steven Shi
Unsubscribe > Begin forwarded message: > > From: Biao Geng > Subject: Re: Request to unsubscribe, thanks > Date: Apr 2, 2024, 10:17:20 GMT+8 > To: user-zh@flink.apache.org > Reply-To: user-zh@flink.apache.org > > Hi, > > To unsubscribe, send an email with any content to user-zh-unsubscr...@flink.apache.org > . > > > Best, > Biao Geng > > wrote on Sun, Mar 31, 2024 at 22:20: > >>

Processing-time tumbling window fires early

2024-04-20 Thread hhq
I use a processing-time tumbling window with a size of 60s. In the window's process function I compare the window end time with the system time, and occasionally the system time I read is earlier than the window end time (only by a few milliseconds, but I don't know whether this comes from Flink's windowing itself or from a problem in my code). I could not find the cause and would appreciate help. public static void main(String[] args) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
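
One plausible explanation, sketched below: Flink's processing-time trigger registers its timer at the window's `maxTimestamp()`, which is `end - 1` ms, so a wall-clock read inside the process function can legitimately be a millisecond or two before the window end. An illustrative reproduction of the check (class and message names are made up):

```java
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Illustrative check: compare the window end with the wall clock inside a
// processing-time window's process function.
public class WindowEndCheckFunction
        extends ProcessWindowFunction<String, String, String, TimeWindow> {

    @Override
    public void process(String key, Context ctx,
                        Iterable<String> elements, Collector<String> out) {
        long windowEnd = ctx.window().getEnd();   // exclusive end of the window
        long now = System.currentTimeMillis();
        if (now < windowEnd) {
            // The processing-time trigger fires at maxTimestamp() = end - 1 ms,
            // so being slightly "early" here is expected and does not by
            // itself indicate a bug in user code.
            out.collect(key + " fired " + (windowEnd - now) + " ms before window end");
        }
    }
}
```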

Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-19 Thread Jean-Marc Paulin
Hi, we use Flink 1.18 with the Kafka sink, and we enabled `EXACTLY_ONCE` on one of our Kafka sinks. We set the transaction timeout to 15 minutes. When we try to restore from a savepoint, well after that 15-minute window, Flink enters a RESTARTING loop. We see the error: ``` { "exception": {

Unsubscribe

2024-04-19 Thread 冮雪程
| | 冮雪程 | | gxc_bigd...@163.com | Original message | From | jh...@163.com | | Sent | 2024-04-18 16:17 | | To | user-zh | | Subject | Re: Re: Unsubscribe | Unsubscribe jh...@163.com From: 我的意中人是个盖世英雄 Sent: 2024-04-18 16:03 To: user-zh Subject: Re: Unsubscribe Unsubscribe ---Original message--- From: "willluzheng"

Unsubscribe

2024-04-18 Thread junhua . xie

Re: Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-18 Thread Hangxiang Yu
Hi, Lei. It's indeed a bit confusing. Could you share the related RocksDB log, which may contain more detailed info? On Fri, Apr 12, 2024 at 12:49 PM Lei Wang wrote: > > I enable RocksDB native metrics and do some performance tuning. > > state.backend.rocksdb.block.cache-size is set to 128m, 4

Unsubscribe

2024-04-18 Thread dongming

RE: Watermark advancing too quickly when reprocessing events with event time from Kafka

2024-04-18 Thread Tyron Zerafa
Hi, I'm experiencing the same issue on Flink 1.18.1. I have a slightly different job graph: a single Kafka source (parallelism 6) consuming from 2 topics, one topic with 4 partitions and one topic with 6 partitions. Changing autoWatermarkInterval to 0 didn't fix my issue. Did

Re: What should be considered when using Flink's unified stream-batch processing for real-time data warehouse reconciliation?

2024-04-18 Thread Yunfeng Zhou
Stream mode and batch mode differ in watermarks and the semantics of some operators, but I don't see any differences in the Join and Window operators, so this should be supported in batch mode. For a detailed comparison of the two modes, see this document: https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/datastream/execution_mode/ On Thu, Apr 18, 2024 at 9:44 AM casel.chen wrote: > > Has anyone tried this in practice? Could you give some advice? Thanks!

Strange Problem (0 AvailableTask)

2024-04-18 Thread Hemi Grs
Hello, I have several versions of Flink (1.17.0, 1.18.0, 1.18.1 and 1.19.0) on my server. I am still trying them out (on & off), and I was running a job to sync a table from MySQL to Elasticsearch; it was running fine without any problems (I was using the 1.18.1 version). But after a few weeks, I

Async code inside Flink Sink

2024-04-17 Thread Jacob Rollings
Hello, I have a use case where I need to delete a cache file after a successful sink operation (writing to a DB). My Flink pipeline is built using Java. I am contemplating using Java's CompletableFuture.runAsync() to perform the file deletion. I am wondering what issues this might cause
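
A hedged sketch of the pattern being contemplated, using a dedicated executor instead of the shared `ForkJoinPool.commonPool()` so deletions cannot starve other work in the TaskManager JVM. Paths, pool size, and error handling are illustrative assumptions; note also that work still in flight at a checkpoint is not covered by Flink state.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CacheCleanup implements AutoCloseable {
    // A dedicated pool: plain runAsync(Runnable) would use the shared
    // ForkJoinPool.commonPool() of the TaskManager JVM.
    private final ExecutorService executor = Executors.newFixedThreadPool(2);

    /** Fire-and-forget deletion after a successful DB write. */
    public void deleteAsync(Path cacheFile) {
        CompletableFuture
                .runAsync(() -> {
                    try {
                        Files.deleteIfExists(cacheFile);
                    } catch (Exception e) {
                        // Log rather than rethrow so a failed cleanup
                        // never fails the sink itself.
                    }
                }, executor);
    }

    @Override
    public void close() {
        // Call from the sink's close() so threads don't leak on job restarts.
        executor.shutdown();
    }
}
```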

Re: Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Zhanghao Chen
If you have some experience before, I'd recommend setting a good parallelism and TM resource spec first, to give the autotuner a good starting point. Usually, the autoscaler can tune your jobs well within a few attempts (<=3). As for `pekko.ask.timeout`, the default value should be sufficient

Re: What should be considered when using Flink's unified stream-batch processing for real-time data warehouse reconciliation?

2024-04-17 Thread casel.chen
Has anyone tried this in practice? Could you give some advice? Thanks! On 2024-04-15 11:15:34, "casel.chen" wrote: >I am researching data quality assurance for Flink real-time data warehouses, which requires periodically (every 10/20/30 minutes) running batch jobs to reconcile the data produced by the real-time warehouse. The traditional approach is batch Spark jobs, e.g. Apache >DolphinScheduler's data quality module. >But the biggest drawback of this approach is that the Flink >SQL business logic has to be rewritten in Spark SQL, making consistency between the two hard to guarantee. So I am considering whether Flink's stream-batch unification can be used to reuse the Flink

Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Maxim Senin via user
Hi. Does it make sense to specify `parallelism` for task managers or the `job`, and, similarly, to specify the memory amount for the task managers, or is it better to leave it to the autoscaler and autotuner to pick the best values? How many times would the autoscaler need to restart task managers

Re: Understanding default firings in case of allowed lateness

2024-04-17 Thread Sachin Mittal
Hi Xuyang, So if I go the side-output way, then my pipeline would be something like this: final OutputTag lateOutputTag = new OutputTag("late-data"){}; SingleOutputStreamOperator reducedDataStream = dataStream .keyBy(new MyKeySelector())

Re:Understanding default firings in case of allowed lateness

2024-04-17 Thread Xuyang
Hi, Sachin. IIUC, it is in the second situation you listed, that is: [ reduceData (d1, d2, d3), reducedData(ld1, d2, d3, late d4, late d5, late d6) ]. However, because of `table.exec.emit.late-fire.delay`, it could also be such as [ reduceData (d1, d2, d3), reducedData(ld1, d2, d3, late d4,

Understanding default firings in case of allowed lateness

2024-04-17 Thread Sachin Mittal
Hi, Suppose my pipeline is: data .keyBy(new MyKeySelector()) .window(TumblingEventTimeWindows.of(Time.seconds(60))) .allowedLateness(Time.seconds(180)) .reduce(new MyDataReducer()) So I wanted to know if the final output stream would contain reduced data at the end of the
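
Tying the two replies in this thread together, a sketch of routing too-late records to a side output rather than relying only on the default re-firings during the lateness period. The `Event` type and reduce logic are illustrative, not the poster's actual classes.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

public class LateDataSketch {
    public static class Event {
        public String key;
        public long count;
        public Event() {}
        public Event(String key, long count) { this.key = key; this.count = count; }
    }

    public static DataStream<Event> wire(DataStream<Event> data) {
        // Anonymous subclass so the element type survives type erasure.
        final OutputTag<Event> lateTag = new OutputTag<Event>("late-data") {};

        SingleOutputStreamOperator<Event> reduced = data
                .keyBy(e -> e.key)
                .window(TumblingEventTimeWindows.of(Time.seconds(60)))
                .allowedLateness(Time.seconds(180))
                .sideOutputLateData(lateTag) // beyond watermark + allowed lateness
                .reduce((a, b) -> new Event(a.key, a.count + b.count));

        // Too-late records, available for logging or reconciliation
        // instead of being silently dropped.
        return reduced.getSideOutput(lateTag);
    }
}
```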

Re: Elasticsearch8 example

2024-04-17 Thread Hang Ruan
Hi Tauseef. I see that support for Elasticsearch 8[1] will be released in elasticsearch-3.1.0, so there are no docs for the elasticsearch8 connector yet. We could learn to use it from some tests[2] before the docs are ready. Best, Hang [1] https://issues.apache.org/jira/browse/FLINK-26088 [2]

Re: Table Source from Parquet Bug

2024-04-17 Thread Hang Ruan
Hi, David. Have you added the parquet format[1] dependency in your dependencies? It seems that the class ParquetColumnarRowInputFormat cannot be found. Best, Hang [1] https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/table/formats/parquet/ Sohil Shah 于2024年4月17日周三

Re: Iceberg connector

2024-04-16 Thread Péter Váry
Hi Chetas, > the only way out to use only the DataStream API (and not the table api) if I want to use a custom splitComparator? You can use watermark generation, and with that, watermark based split ordering using the table api. OTOH, currently there is no way to define a custom comparator using

Re: Iceberg connector

2024-04-16 Thread Chetas Joshi
Hi Péter, Great! Thanks! The resources are really useful. I don't have TABLE_EXEC_ICEBERG_USE_FLIP27_SOURCE set so it is the FlinkSource

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Robert Prat
Hi Péter, Thanks for pointing this out! I was aware of the difference in version between pyflink and some of the JAR dependencies. I was starting out with PyFlink 1.16 and I had some errors when creating the Dockerfile that seemed to be fixed when upgrading the version to 1.18. Thus the

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Péter Váry
Is it intentional that you are using iceberg-flink-runtime-1.16-1.3.1.jar with PyFlink 1.18.0? This might cause issues later. I would try to synchronize the Flink versions throughout all the dependencies. On Tue, Apr 16, 2024, 11:23 Robert Prat wrote: > I finally managed to make it work

Re: Iceberg connector

2024-04-16 Thread Péter Váry
Hi Chetas, See my answers below: On Tue, Apr 16, 2024, 06:39 Chetas Joshi wrote: > Hello, > > I am running a batch flink job to read an iceberg table. I want to > understand a few things. > > 1. How does the FlinkSplitPlanner decide which fileScanTasks (I think one > task corresponds to one

RE: Table Source from Parquet Bug

2024-04-16 Thread Sohil Shah
Hi David, Since this is a ClassNotFoundException, you may be missing a dependency. Could you share your pom.xml. Thanks -Sohil Project: Braineous https://bugsbunnyshah.github.io/braineous/ On 2024/04/16 15:22:34 David Silva via user wrote: > Hi, > > Our team would like to leverage Flink but

Re: Table Source from Parquet Bug

2024-04-16 Thread Sohil Shah
Hello David, Since this is a ClassNotFoundException, you may be missing a dependency. Could you share your pom.xml? Thanks -Sohil Project: Braineous https://bugsbunnyshah.github.io/braineous/ On Tue, Apr 16, 2024 at 11:25 AM David Silva via user wrote: > Hi, > > Our team would like to leverage

Re: GCS FileSink Read Timeouts

2024-04-16 Thread Dylan Fontana via user
Thanks for the links! We've tried the `gs.writer.chunk.size` before and found it didn't make a meaningful difference unfortunately. The hadoop-connector link you've sent I think is actually not applicable since the gcs Filesystem connector isn't using the hadoop implementation but instead the

Elasticsearch8 example

2024-04-16 Thread Tauseef Janvekar
Dear Team, Can anyone please share an example for flink-connector-elasticsearch8? I found this connector added on GitHub, but no proper documentation is present around it. It would be of great help if sample code were provided for the above connector. Thanks, Tauseef

Re: Pyflink Performance and Benchmark

2024-04-16 Thread Chase Zhang
On Mon, Apr 15, 2024 at 16:17 Niklas Wilcke wrote: > Hi Flink Community, > > I wanted to reach out to you to get some input about Pyflink performance. > Are there any resources available about Pyflink benchmarks and maybe a > comparison with the Java API? I wasn't able to find something

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Robert Prat
I finally managed to make it work following the advice of Robin Moffat who replied to the earlier email: There's a lot of permutations that you've described, so it's hard to take one reproducible test case here to try and identify the error :) It certainly looks JAR related. You could try

Re:Re: Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-16 Thread Xuyang
Thanks for driving this ;) -- Best! Xuyang 在 2024-04-16 10:47:56,"Xiaolong Wang" 写道: Reported. JIRA link: https://issues.apache.org/jira/browse/FLINK-35117?filter=-2 On Tue, Apr 16, 2024 at 10:05 AM Xiaolong Wang wrote: By adding `'org.apache.commons.text` to the

Iceberg connector

2024-04-15 Thread Chetas Joshi
Hello, I am running a batch flink job to read an iceberg table. I want to understand a few things. 1. How does the FlinkSplitPlanner decide which fileScanTasks (I think one task corresponds to one data file) need to be clubbed together within a single split and when to create a new split? 2.

Re: Optimize exact deduplication for tens of billions data per day

2024-04-15 Thread Alex Cruise
It may not be completely relevant to this conversation in this year, but I find myself sharing this article once or twice a year when opining about how hard deduplication at scale can be.  -0xe1a On Thu, Apr 11, 2024 at 10:22 PM Péter Váry

Re: [EXTERNAL]Re: Pyflink Performance and Benchmark

2024-04-15 Thread Niklas Wilcke
Hi Zhanghao Chen, thanks for sharing the link. This looks quite interesting! Regards, Niklas > On 15. Apr 2024, at 12:43, Zhanghao Chen wrote: > > When it comes down to the actual runtime, what really matters is the plan > optimization and the operator impl & shuffling. You might be

Re: Flink job performance

2024-04-15 Thread Kenan Kılıçtepe
How many task managers and servers do you have? Can you also share the task managers page of the Flink dashboard? On Mon, Apr 15, 2024 at 10:58 AM Oscar Perez via user wrote: > Hi community! > > We have an interesting problem with Flink after increasing parallelism in > a certain way. Here is the

Re: Flink job performance

2024-04-15 Thread Oscar Perez via user
Hi, I appreciate your comments and thank you for that. My original question still remains, though: why did the very same job, with only the aforementioned settings changed, show this increase in CPU usage and performance degradation, when we should have expected the opposite behaviour? thanks again, Oscar

Kinesis connector writes wrong sequence number at stop with savepoint

2024-04-15 Thread Vararu, Vadim
I've been investigating a data duplication issue in a Kinesis -> Flink -> Kafka exactly-once setup. Found out that at stop-with-savepoint the following things happen: * The Kafka transaction is committed, with the last processed events being written * The Kinesis sequence number is written in

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
The exception basically says the remote TM is unreachable, probably terminated due to some other reasons. This may not be the root cause. Is there any other exceptions in the log? Also, since the overall resource usage is almost full, could you try allocating more CPUs and see if the

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
Hi, there seems to be something wrong with the two images attached in the latest email; I cannot open them. Best, Zhanghao Chen From: Oscar Perez via user Sent: Monday, April 15, 2024 15:57 To: Oscar Perez via user ; pi-team ; Hermes Team Subject: Flink job

Re: Understanding event time wrt watermarking strategy in flink

2024-04-15 Thread Sachin Mittal
Hi Yunfeng, So regarding the dropping of records for the out-of-order watermark: let's say records later than T - B will be dropped by the first operator after watermarking, which is reading from the source. So then these records will never be forwarded to the step where we do event-time windowing.

Re: Pyflink Performance and Benchmark

2024-04-15 Thread Zhanghao Chen
When it comes down to the actual runtime, what really matters is the plan optimization and the operator impl & shuffling. You might be interested in this blog: https://flink.apache.org/2022/05/06/exploring-the-thread-mode-in-pyflink/, which did a benchmark on the latter with the common the

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
Hi Oscar, The rebalance operation will go over the network stack, but not necessarily involving remote data shuffle. For data shuffling between tasks of the same node, the local channel is used, but compared to chained operators, it still introduces extra data serialization overhead. For data

Data duplication at stop with savepoint

2024-04-15 Thread Vararu, Vadim
Hi community, I need your help to understand whether there is a misconfiguration or it's a Flink bug. I've drawn a schema for better understanding, but here is the problem in a few steps:

Pyflink Performance and Benchmark

2024-04-15 Thread Niklas Wilcke
Hi Flink Community, I wanted to reach out to you to get some input about Pyflink performance. Are there any resources available about Pyflink benchmarks and maybe a comparison with the Java API? I wasn't able to find something valuable, but maybe I missed something? I am aware that

What should be considered when using Flink's unified stream-batch processing for real-time data warehouse reconciliation?

2024-04-14 Thread casel.chen
I am researching data quality assurance for Flink real-time data warehouses, which requires periodically (every 10/20/30 minutes) running batch jobs to reconcile the data produced by the real-time warehouse. The traditional approach is batch Spark jobs, e.g. Apache DolphinScheduler's data quality module. But the biggest drawback of this approach is that the Flink SQL business logic has to be rewritten in Spark SQL, making consistency between the two hard to guarantee. So I am considering whether Flink's stream-batch unification can be used to reuse the Flink SQL: only the data source needs to be switched from CDC or Kafka to a Hologres or StarRocks table, plus a new batch result table, and finally the data of the real-time result table and the batch result table are compared over the same time range. A few questions remain: 1. The original real-time Flink

Re: Understanding event time wrt watermarking strategy in flink

2024-04-14 Thread Yunfeng Zhou
Hi Sachin, First of all, sorry for my misunderstanding about watermarking in the last email. When you configure an out-of-orderness watermark with a tolerance of B, the next watermark emitted after a record with timestamp T would be T-B instead of the T described in my last email. Then let's go back to
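
A compact illustration of the semantics described above, using the thread's bound of B = 60 s. The `Event` type is illustrative; in Flink's `BoundedOutOfOrdernessWatermarks` the emitted watermark is the maximum seen timestamp minus B minus 1 ms.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class WatermarkSketch {
    public static class Event {
        public long timestamp;
    }

    // After seeing a record with max timestamp T, the periodic watermark is
    // T - B - 1 ms. A record with timestamp t becomes late for event-time
    // windows once some record with timestamp greater than t + B was seen.
    public static WatermarkStrategy<Event> strategy() {
        return WatermarkStrategy
                .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(60)) // B = 60 s
                .withTimestampAssigner((event, previousTimestamp) -> event.timestamp);
    }
}
```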

RE: Flink 1.18.1 cannot read from Kafka

2024-04-14 Thread Sohil Shah
Hi Phil, if __name__ == "__main__": process_table() error: link_app | Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'kafka' that implements 'org.apache.flink.table.factories.DynamicTableFactory' in the classpath. flink_app | flink_app |

Flink 1.18.1 cannot read from Kafka

2024-04-14 Thread Sohil Shah
Hi Phil, if __name__ == "__main__": process_table() error: link_app | Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'kafka' that implements 'org.apache.flink.table.factories.DynamicTableFactory' in the classpath. flink_app | flink_app |

Re: Flink 1.18.1 cannot read from Kafka

2024-04-14 Thread Biao Geng
Hi Phil, You can check my github link for a detailed tutorial and example codes :). Best, Biao Geng Phil Stavridis 于2024年4月12日周五 19:10写道: > Hi Biao, > > Thanks for looking into it and providing a detailed example. >

Re: Unsubscribe

2024-04-14 Thread willluzheng
Unsubscribe Original message | From | jimandlice | | Sent | 2024-04-13 19:50 | | To | user-zh | | Subject | Unsubscribe | Unsubscribe jimandlice jimandl...@163.com

Would like feedback on Apache Flink based data pipeline platform: Braineous

2024-04-13 Thread Sohil Shah
Open Source High-Scale Data Pipeline Platform for Enterprise Data, Analytics, and Machine Learning Applications. Documentation: https://bugsbunnyshah.github.io/braineous/guides/developer-guide Get Started: https://bugsbunnyshah.github.io/braineous/get-started/ GitHub:

Unsubscribe

2024-04-13 Thread jimandlice
Unsubscribe jimandlice jimandl...@163.com

Flink 1.18 support for flink stateful functions

2024-04-12 Thread Deshpande, Omkar via user
Hello, Is there a plan to add support for Flink 1.18 in Flink Stateful Functions? Also, Stateful Functions releases generally seem to be slow and to lag behind the Flink release cycle. Is the Stateful Functions project going to be actively maintained? Thanks, Omkar

Re: Using per-window state in ProcessWindowFunction

2024-04-12 Thread gongzhongqiang
Hello, you can use globalState / windowState to access previous state for incremental computation. The demo below should make it easy to understand: public class ProcessWindowFunctionDemo { public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); // use processing time
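
The demo in this reply is truncated by the archive; below is a minimal reconstruction of the same idea, keeping a per-key running total in `globalState` so it survives across windows. The state name and types are illustrative.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class RunningTotalFunction
        extends ProcessWindowFunction<Long, String, String, TimeWindow> {

    private final ValueStateDescriptor<Long> totalDesc =
            new ValueStateDescriptor<>("running-total", Long.class);

    @Override
    public void process(String key, Context ctx,
                        Iterable<Long> elements, Collector<String> out) throws Exception {
        // globalState survives across all windows of the same key;
        // windowState would be scoped to (and cleaned up with) one window.
        ValueState<Long> total = ctx.globalState().getState(totalDesc);
        long sum = total.value() == null ? 0L : total.value();
        for (Long v : elements) {
            sum += v;
        }
        total.update(sum);
        out.collect(key + " running total: " + sum);
    }
}
```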

Pyflink w Nessie and Iceberg in S3 Jars

2024-04-12 Thread Robert Prat
Hi there, For several days I have been trying to find the right configuration for my pipeline which roughly consists in the following schema RabbitMQ->PyFlink->Nessie/Iceberg/S3. For what I am going to explain I have tried both locally and through the official Flink docker images. I have

Re: Understanding event time wrt watermarking strategy in flink

2024-04-12 Thread Sachin Mittal
Hi Yunfeng, I have a question about the tolerance for out-of-order-bound watermarking. What I understand is that when consuming from a source with the out-of-order bound set as B, let's say it gets a record with timestamp T. After that, it will drop all the subsequent records which arrive with the

Re: Understanding event time wrt watermarking strategy in flink

2024-04-12 Thread Yunfeng Zhou
Hi Sachin, 1. When your Flink job performs an operation like map or flatmap, the output records would be automatically assigned with the same timestamp as the input record. You don't need to manually assign the timestamp in each step. So the windowing result in your example should be as you have

Re: Optimize exact deduplication for tens of billions data per day

2024-04-11 Thread Péter Váry
Hi Lei, There is an additional overhead when adding new keys to an operator, since Flink needs to maintain the state, timers, etc. for the individual keys. If you are interested in more details, I suggest using the Flink UI and comparing the flame graphs for the stages. There you can see the difference

Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-11 Thread Lei Wang
I enable RocksDB native metrics and do some performance tuning. state.backend.rocksdb.block.cache-size is set to 128m, 4 slots for each TaskManager. The observed result for one specific parallel slot: state.backend.rocksdb.metrics.block-cache-capacity is about 14.5M
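
For reference, a sketch of enabling the two metrics compared in this thread programmatically; the same keys can go into flink-conf.yaml or, as discussed elsewhere in this archive, be passed with `-yD` on YARN. This assumes the job uses the RocksDB state backend, otherwise the metrics will not report.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbMetricsSketch {
    public static StreamExecutionEnvironment create() {
        Configuration conf = new Configuration();
        // Native metric toggles discussed in this thread.
        conf.setString("state.backend.rocksdb.metrics.block-cache-capacity", "true");
        conf.setString("state.backend.rocksdb.metrics.block-cache-usage", "true");
        // The cache size mentioned in the thread.
        conf.setString("state.backend.rocksdb.block.cache-size", "128m");
        return StreamExecutionEnvironment.getExecutionEnvironment(conf);
    }
}
```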

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
Thanks very much, it finally works On Thu, Apr 11, 2024 at 8:27 PM Zhanghao Chen wrote: > Add a space between -yD and the param should do the trick. > > Best, > Zhanghao Chen > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 19:40 > *To:* Zhanghao Chen >

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Zhanghao Chen
Add a space between -yD and the param should do the trick. Best, Zhanghao Chen From: Lei Wang Sent: Thursday, April 11, 2024 19:40 To: Zhanghao Chen Cc: Biao Geng ; user Subject: Re: How to enable RocksDB native metrics? Hi Zhanghao, flink run -m yarn-cluster

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
Hi Zhanghao, flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G -yqu default -p 8 -yDstate.backend.latency-track.keyed-state-enabled=true -c com.zkj.task.EventCleaningTask SourceDataCleaning-wl_0410.jar --sourceTopic dwd_audio_record --groupId clean_wl_ --sourceServers

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Zhanghao Chen
Hi Lei, You are using an old-styled CLI for YARN jobs where "-yD" instead of "-D" should be used. From: Lei Wang Sent: Thursday, April 11, 2024 12:39 To: Biao Geng Cc: user Subject: Re: How to enable RocksDB native metrics? Hi Biao, I tried, it doesn't

Understanding event time wrt watermarking strategy in flink

2024-04-11 Thread Sachin Mittal
Hello folks, I have few questions: Say I have a source like this: final DataStream data = env.fromSource( source, WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(60)) .withTimestampAssigner((event, timestamp) -> event.timestamp)); My pipeline after

Re: Optimize exact deduplication for tens of billions data per day

2024-04-11 Thread Lei Wang
Hi Peter, I tried; this improved performance significantly, but I don't know exactly why. As far as I know, the number of keys in RocksDB doesn't decrease. Is there any specific technical material about this? Thanks, Lei On Fri, Mar 29, 2024 at 9:49 PM Lei Wang wrote: > Perhaps I can

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
Hi Biao, I tried, it doesn't work. The cmd is: flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G -yqu default -p 8 -Dstate.backend.latency-track.keyed-state-enabled=true -c com.zkj.task.EventCleaningTask SourceDataCleaning-wl_0410.jar --sourceTopic dwd_audio_record

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
Hi Biao, I tried, it does On Mon, Apr 8, 2024 at 9:48 AM Biao Geng wrote: > Hi Lei, > You can use the "-D" option in the command line to set configs for a > specific job. E.g, `flink run-application -t > yarn-application -Djobmanager.memory.process.size=1024m `. > See >

Re:Unable to use Table API in AWS Managed Flink 1.18

2024-04-10 Thread Xuyang
Hi, Perez. Flink uses SPI to find the JDBC connector in the classloader, and at startup the dir '${FLINK_ROOT}/lib' is added to the classpath. That is why the exception is thrown in AWS. IMO there are two ways to solve this question. 1. upload the connector jar to AWS to let the

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, I will check out running with flink run, but should this be run in the Flink JobManager? Would that mean that the container for the Flink JobManager would require both Python installed and a copy of the flink_client.py module? Are there some examples of running flink run in a

Re: How are window's boundaries decided in flink

2024-04-10 Thread Dylan Fontana via user
Hi Sachin, Assignment for tumbling windows is exclusive on the endTime; see description here https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/operators/windows/#tumbling-windows . So in your example it would be assigned to window (60, 120) as in reality the windows

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Biao Geng
Hi Phil, It should be totally ok to use `python -m flink_client.job`. It just seems to me that the flink cli is being used more often. And yes, you also need to add the sql connector jar to the flink_client container. After putting the jar in your client container, add codes like

How are window's boundaries decided in flink

2024-04-10 Thread Sachin Mittal
Hi, Let's say I have defined 1-minute TumblingEventTimeWindows, so it will create windows as: (0, 60), (60, 120), ... Now let's say I have an event at time t = 60. In which window would this get aggregated? The first, the second, or both? Say I want this to get aggregated only in the second window; how
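
Dylan's reply above (the end time is exclusive) follows from how Flink computes window starts; a tiny sketch of the arithmetic, mirroring `TimeWindow.getWindowStartWithOffset` for offset 0 and non-negative timestamps, shows that t = 60 lands only in [60, 120):

```java
public class WindowBoundarySketch {
    // Same arithmetic as TimeWindow.getWindowStartWithOffset with offset 0
    // and non-negative timestamps.
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    public static void main(String[] args) {
        long size = 60; // using seconds as units for readability
        // Windows are [start, start + size): the end is exclusive.
        System.out.println(windowStart(59, size));  // 0  -> window [0, 60)
        System.out.println(windowStart(60, size));  // 60 -> window [60, 120)
        System.out.println(windowStart(119, size)); // 60 -> window [60, 120)
    }
}
```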

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, 1. I have a Flink client container like this: # Flink client flink_client: container_name: flink_client image: flink-client:local build: context: . dockerfile: flink_client/Dockerfile networks: - standard depends_on: - jobmanager - Kafka The flink_client/Dockerfile has this bash file

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Biao Geng
Hi Phil, Your codes look good. I mean how do you run the python script. Maybe you are using flink cli? i.e. run commands like ` flink run -t .. -py job.py -j /path/to/flink-sql-kafka-connector.jar`. If that's the case, the `-j /path/to/flink-sql-kafka-connector.jar` is necessary so that in client

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, For submitting the job, I run t_env.execute_sql. Shouldn’t that be sufficient for submitting the job using the Table API with PyFlink? Isn’t that the recommended way for submitting and running PyFlink jobs on a running Flink cluster? The Flink cluster runs without issues, but there is

Unable to use Table API in AWS Managed Flink 1.18

2024-04-10 Thread Enrique Alberto Perez Delgado
Hi all, I am using AWS Managed Flink 1.18, where I am getting this error when trying to submit my job: ``` Caused by: org.apache.flink.table.api.ValidationException: Cannot discover a connector using option: 'connector'='jdbc' at

Re:Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-10 Thread Xuyang
Hi, Wang. Could you provide more details about this bug, such as a minimal reproducible test case, pom dependencies, etc.? Furthermore, can you try packaging the dependency "commons-text" with version "1.10.0" manually to check if it works? If you can work around this bug that way, I

Re: Completed Flink jobs disappear after a while

2024-04-09 Thread gongzhongqiang
Hello: If you want to keep completed jobs for the long term, the History Server is recommended: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#history-server Best, Zhongqiang Gong ha.fen...@aisino.com wrote on Tue, Apr 9, 2024 at 10:39: > In the Web UI, completed jobs are visible under Completed Jobs, but after a while the data is gone. Is there a configuration that deletes them automatically? >

Re: Flink 1.18.1 cannot read from Kafka

2024-04-09 Thread Biao Geng
Hi Phil, Thanks for sharing the detailed information about the job. As for your question: how do you submit the job? After applying your yaml file, I think you will successfully launch a Flink cluster with 1 JM and 1 TM. Then you would submit the pyflink job to the Flink cluster. As the error you showed

Flink 1.18.1 cannot read from Kafka

2024-04-09 Thread Phil Stavridis
Hello, I have set up Flink and Kafka containers using docker-compose, for testing how Flink works for processing Kafka messages. I primarily want to check how the Table API works but also how the Stream API would process the Kafka messages. I have included the main part of the

Re: Use of data generator source

2024-04-09 Thread Lasse Nedergaard
Hi Tkachenko, Yes I have, and we use it extensively for unit testing. But we also have integration testing as part of our project, and here I run into the problem. In my previous implementation I used the SourceFunction interface and added a delay in the run function, but it's deprecated, so I have

Re: Use of data generator source

2024-04-09 Thread Yaroslav Tkachenko
Hi Lasse, Have you seen this approach https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/testing/#unit-testing-stateful-or-timely-udfs--custom-operators ? On Tue, Apr 9, 2024 at 7:09 AM Lasse Nedergaard < lassenedergaardfl...@gmail.com> wrote: > Hi. > > I my

Re: Debugging Kryo Fallback

2024-04-09 Thread Salva Alcántara
{ "emoji": "", "version": 1 }

Use of data generator source

2024-04-09 Thread Lasse Nedergaard
Hi. In my integration test, running on 1.19 with a mini cluster, I mock all my sources with DataGeneratorSource and it works fine until I have a timer function in a keyed process function. The problem is that the processing time doesn't advance after all data has been produced by the sources.

Re: Debugging Kryo Fallback

2024-04-09 Thread Zhanghao Chen
Hi, you may first enable the Kryo fallback option, submit the job, and search for "be processed as GenericType". Flink prints this in most cases where it falls back to Kryo (with a few exceptions, including the types Class and Object, recursive types, and interfaces). Best, Zhanghao Chen
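
A complementary, stricter option than log-searching, sometimes handy in tests: disabling generic types makes any Kryo fallback fail fast instead of silently degrading. A minimal sketch:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KryoFallbackGuard {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Any type that would fall back to Kryo now throws an
        // UnsupportedOperationException naming the offending type,
        // instead of silently using the slower generic serializer.
        env.getConfig().disableGenericTypes();

        env.fromElements(1, 2, 3).print(); // Integer is fine; a non-POJO would fail
        env.execute("kryo-fallback-guard");
    }
}
```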

Re: Re: OOM during the MySQL full snapshot phase

2024-04-09 Thread gongzhongqiang
Solutions worth trying: - Increase the JM memory (as Shawn Huang said) - Tune the batch read size during the snapshot phase to reduce the state size and thus ease the JM memory pressure during checkpointing Best, Zhongqiang Gong wyk wrote on Tue, Apr 9, 2024 at 16:56: > > Yes, there are quite a lot of splits, more than 17,000 > >

Found an issue when using Flink 1.19's AsyncScalarFunction

2024-04-09 Thread Xiaolong Wang
Hi, I found a ClassNotFound exception when using Flink 1.19's AsyncScalarFunction. Stack trace: Caused by: java.lang.ClassNotFoundException: > org.apache.commons.text.StringSubstitutor > > at java.net.URLClassLoader.findClass(Unknown Source) ~[?:?] > > at java.lang.ClassLoader.loadClass(Unknown

Re: Re: OOM during the MySQL full snapshot phase

2024-04-09 Thread wyk
Yes, there are quite a lot of splits, more than 17,000. The JM memory is currently 2g; after increasing it to 4g the problem still occurs. I worry that if I keep increasing the JM memory, it will be wasted later during the incremental phase. I found the Flink heap-memory parameters on the Flink website, but I don't know them well and am not sure how to tune them properly. Could you help me check which of the parameters below would be appropriate to adjust? The Flink docs page is: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/memory/mem_setup_jobmanager/ | Component |

Re: OOM during the MySQL full snapshot phase

2024-04-08 Thread Shawn Huang
Judging from the error message, the JM heap memory is insufficient; you can try increasing the JM memory. One possible cause is that the MySQL table has many splits during the full snapshot phase, making the SourceEnumerator state large. Best, Shawn Huang wyk wrote on Mon, Apr 8, 2024 at 17:46: > > > Hi developers: > flink version: 1.14.5 > flink-cdc version: 2.2.0 > > When using flink-cdc-mysql for a full snapshot, checkpoints are taken during the full phase, but an OOM occurs during checkpointing. Is there any way around this? >

Re: Completed Flink jobs disappear after a while

2024-04-08 Thread spoon_lz
There is an expiration-time configuration: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#jobstore-expiration-time | | spoon_lz | | spoon...@126.com | Original message | From | ha.fen...@aisino.com | | Sent | 2024-04-09 10:38 | | To | user-zh | | Subject | Completed Flink jobs disappear after a while |

Re: How to debug window step in flink

2024-04-08 Thread Sachin Mittal
Hi, Yes, it was a watermarking issue. There were a few out-of-order records in my stream, and per the watermarking strategy the watermark was advanced into the future, hence current events were getting discarded. I have fixed this by not processing future-timestamped records. Thanks Sachin On Mon,
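
A sketch of the kind of fix described here: clamping timestamps that lie in the future so one bad record cannot drag the watermark ahead and get on-time events dropped. The clamping rule and the `Event` type are illustrative assumptions, not the poster's actual code.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class ClampedTimestampSketch {
    public static class Event {
        public long timestamp;
    }

    public static WatermarkStrategy<Event> strategy() {
        return WatermarkStrategy
                .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(60))
                .withTimestampAssigner((event, previousTimestamp) -> {
                    long now = System.currentTimeMillis();
                    // Never let a future-dated record advance event time past
                    // the wall clock; such records could also be routed to a
                    // side output for inspection instead.
                    return Math.min(event.timestamp, now);
                });
    }
}
```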

Re: Debugging Kryo Fallback

2024-04-08 Thread Salva Alcántara
Yeah, I think you're right and there is no need for anything, really. I was thinking of having more user-friendly tests for my POJOs, for which I checked the Kryo fallback and, if detected, provide an exhaustive list of issues found (vs raising an exception for the first problem, requiring users to

Re: Debugging Kryo Fallback

2024-04-08 Thread Yunfeng Zhou
Hi Salva, Could you please give me some hint about the issues Flink can collect apart from the exception and the existing logs? Suppose we record the exception in the log and the Flink job continues, I can imagine that similar Kryo exceptions from each of the rest records will then appear in the

Re: How to debug window step in flink

2024-04-08 Thread Dominik.Buenzli
Hi Sachin, What exactly does the MyReducer do? Can you provide us with some code? Just a wild guess from my side: did you check the watermarking? If the watermarks aren't progressing, there's no way for Flink to know when to emit a window, and therefore you won't see any outgoing events. Kind

How to debug window step in flink

2024-04-08 Thread Sachin Mittal
Hi, I have a following windowing step in my pipeline: inputData .keyBy(new MyKeySelector()) .window( TumblingEventTimeWindows.of(Time.seconds(60))) .reduce(new MyReducer()) .name("MyReducer"); Same step when I see in Flink UI shows as:
