RuntimeContextInitializationContextAdapters: ClassNotFoundException

2022-06-29 Thread Harald Busch
I get an java.lang.ClassNotFoundException: org.apache.flink.api.common.serialization.RuntimeContextInitializationContextAdapters when running my Apache Flink code locally. It works fine when I use a local generator for my DataStream. As soon as I switch to an AWS FlinkKinesisConsumer to get

Re:Re:使用lombok生成的pojo对象是否支持State Schema Evolution

2022-06-29 Thread Howie Yang
Hi,Xuyang 修改了作业的逻辑,但可能核心问题还是我修改了数据流中的pojo对象(新增了字段),最终导致了这个问题的出现 -- Best, Howie 在 2022-06-29 22:52:04,"Xuyang" 写道: >Hi,请问下是修改了作业的逻辑之后,根据savepoint重启吗?如果是这样,是状态不兼容的原因 >在 2022-06-29 17:57:54,"Howie Yang" 写道: >>flink版本:1.9.0 >>

Re: Flink 1.13 with GCP Object storage

2022-06-29 Thread Krzysztof Chmielewski
Ok, the problem was that I had few conflicting versions of hadoop-common lib on class path. When I fix the dependencies, it start working. Cheers. śr., 29 cze 2022 o 18:20 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> napisał(a): > Hi, > I'm trying to read data from GCP Object Store

Flink 1.13 with GCP Object storage

2022-06-29 Thread Krzysztof Chmielewski
Hi, I'm trying to read data from GCP Object Store with Flink's File source 1.13.6 I followed instruction from [1] But when I started my job on Flink cluster I have this exception: No FileSystem for scheme: gs Any ideas? Cheers [1]

Re: Optimizing parallelism in reactive mode with adaptive scaling

2022-06-29 Thread Vishal Surana
I've attached a screenshot of the job which highlights the "missing slots". [image: Screenshot 2022-06-29 at 9.38.54 PM.png] Coming to slot sharing, it seems that slot sharing isn't being honored. It doesn't matter if I put 2 or 3 of the 3 heavy weight operators - flink is simply ignoring it and

Re: Flink 1.12 StreamRecordQueueEntry is not public class

2022-06-29 Thread Milind Vaidya
Thanks Xuyang, I did something similar to unblock myself. - Milind On Wed, Jun 29, 2022 at 8:40 PM Xuyang wrote: > Hi, Milind. You may notice that these classes are tagged with 'Internal' > and that mean they are may only used in flink itself. But I think you may > do some retrofit work on

Re:Flink 1.12 StreamRecordQueueEntry is not public class

2022-06-29 Thread Xuyang
Hi, Milind. You may notice that these classes are tagged with 'Internal' and that mean they are may only used in flink itself. But I think you may do some retrofit work on flink, and it's a fast way to tag it as public and rebuild flink just for customization. At 2022-06-24 06:27:43, "Milind

Re:使用lombok生成的pojo对象是否支持State Schema Evolution

2022-06-29 Thread Xuyang
Hi,请问下是修改了作业的逻辑之后,根据savepoint重启吗?如果是这样,是状态不兼容的原因 在 2022-06-29 17:57:54,"Howie Yang" 写道: >flink版本:1.9.0 > >问题:使用lombok生成的pojo对象,在数据流进行传输,中途终止任务做savepoint,state中保存应该都是这个对象; >从savepoint重启任务后,报这个error:StateMigrationException: The new state serializer >cannot be incompatible. ... Heap state backend >

Re: Upgrading a job in rolling mode

2022-06-29 Thread Robin Cassan
Hi Weihua! Thanks so much for your answer, it's pretty apparent that it's absolutely not trivial to do with your explanation. Have you (or any other) been able to apply something close to this Green/Blue deployment idea from this Zero Downtime Upgrade presentation:

Re: Optimizing parallelism in reactive mode with adaptive scaling

2022-06-29 Thread Weihua Hu
Hi, Vishal The reactive mode will adjust the parallelism of tasks by slots of cluster. it will not allocate new workers automatically.[1] 1. max parallelism only works to scale up the parallelism of tasks. it will not affect the scheduling of tasks. 2. flink will enable slot sharing by default,

Re: Upgrading a job in rolling mode

2022-06-29 Thread Weihua Hu
Hi, Robin This is a good question, but Flink can't do rolling upgrades. I'll try to explain the cost of Flink's support for RollingUpgrade. 1. There is a shuffle connection between tasks in a Region, and in order to ensure the consistency of the data processed by the upstream and downstream

Optimizing parallelism in reactive mode with adaptive scaling

2022-06-29 Thread Vishal Surana
I have a job which has about 10 operators, 3 of which are heavy weight. I understand that the current implementation of autoscaling gives more or less no configurability besides max parallelism. That is practically useless as the operators I have will inevitably choke if one of the 3 ends up with

Re: The methodlogy behind the join in Table API and Datastream

2022-06-29 Thread yuxia
> any way I can both receive the message of both update. I think you may need outer join[1] [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#outer-equi-join Best regards, Yuxia 发件人: "lan tran" 收件人: "User" 发送时间: 星期三, 2022年 6 月 29日 下午 6:04:30

The methodlogy behind the join in Table API and Datastream

2022-06-29 Thread lan tran
Hi team,I have the question about the methodology behind the joining using SQL-Client and DataStream. I have some scenario like this: I have two tables: t1 and t2 and I consume the WAL log from it and send to Kafka. Next, I will join two tables above together and convert this table in changelog

使用lombok生成的pojo对象是否支持State Schema Evolution

2022-06-29 Thread Howie Yang
flink版本:1.9.0 问题:使用lombok生成的pojo对象,在数据流进行传输,中途终止任务做savepoint,state中保存应该都是这个对象; 从savepoint重启任务后,报这个error:StateMigrationException: The new state serializer cannot be incompatible. ... Heap state backend -- Best, Howie

Re: Overhead on managing Timers with large number of keys

2022-06-29 Thread yuxia
The short answer is yes. In any case, flink wil spend time/cpu to invoke the timer. Best regards, Yuxia 发件人: "Surendra Lalwani" 收件人: "User" 发送时间: 星期三, 2022年 6 月 29日 下午 3:52:32 主题: Overhead on managing Timers with large number of keys Hi Team, I am working on the States and using

Re: How to make current application cdc

2022-06-29 Thread Sid
Hi Yuxia, Thank you so much for your response. Much appreciated. Here, by CDC I meant the incremental changes that have to be pushed from Kafka to my processing layer which is Flink. Let me go through the links shared by you. Sid. On Mon, Jun 27, 2022 at 6:39 AM yuxia wrote: > > I mean CDC

Overhead on managing Timers with large number of keys

2022-06-29 Thread Surendra Lalwani
Hi Team, I am working on the States and using KeyedStreams and Process Functions. I want to store the number of customers in the state and also I am registering onTimer for all the customer_ids. I wanted to understand if we register something around 70 Million Keys and we have onTimer registered

Upgrading a job in rolling mode

2022-06-29 Thread Robin Cassan
Hi all! We are running a flink cluster on kubernetes and deploying a single job on it through "flink run ". Whenever we want to modify the jar, we cancel the job and run the "flink run" command again, with the new jar, and the retained checkpoint URL from the first run. This works well, but this

Re: how to connect to the flink-state store and use it as cache to serve APIs.

2022-06-29 Thread laxmi narayan
Hi Hangxiang, I was thinking , since we already store entire state in the checkpoint dir so why can't we expose it as a service through the Flink queryable state, in this way I can easily avoid introducing a cache and serve realtime APIs via this state itself and I can go to the database for the