Re: Re: flink kryo exception

2021-02-10 Thread Piotr Nowojski
>> --Original Mail -- >> *Sender:*赵一旦 >> *Send Date:*Sun Feb 7 16:13:57 2021 >> *Recipients:*Till Rohrmann >> *CC:*Robert Metzger , user >> *Subject:*Re: flink kryo exception >> >>> It also maybe have somethin

Re: Re: flink kryo exception

2021-02-07 Thread 赵一旦
> command ? > > Best, > Yun > > > --Original Mail -- > *Sender:*赵一旦 > *Send Date:*Sun Feb 7 16:13:57 2021 > *Recipients:*Till Rohrmann > *CC:*Robert Metzger , user > *Subject:*Re: flink kryo exception > >> It also maybe

Re: Re: flink kryo exception

2021-02-07 Thread Yun Gao
Hi yidan, One more thing to confirm: are you creating the savepoint and stopping the job together, using the bin/flink cancel -s [:targetDirectory] :jobId command? Best, Yun --Original Mail -- Sender:赵一旦 Send Date:Sun Feb 7 16:13:57 2021 Recipients:Till Rohrmann

Re: flink kryo exception

2021-02-07 Thread 赵一旦
It may also have something to do with my job's first tasks. The second task has two inputs: one is the Kafka source stream (A), the other is a self-defined MySQL source used as a broadcast stream (B). In A: I have a 'WatermarkReAssigner', a self-defined operator which adds an offset to its input watermark and

Re: flink kryo exception

2021-02-07 Thread 赵一旦
The first problem is critical, since the savepoint does not work. For the second problem, I changed the solution and removed the 'Map' based implementation before the data are sent to the second task, and in this case the savepoint works. The only problem is that I should stop the job and remembe

Re: flink kryo exception

2021-02-05 Thread Till Rohrmann
Could you provide us with a minimal working example which reproduces the problem for you? This would be super helpful in figuring out the problem you are experiencing. Thanks a lot for your help. Cheers, Till On Fri, Feb 5, 2021 at 1:03 PM 赵一旦 wrote: > Yeah, and if it is different, why my job r

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Yeah, and if it is different, why does my job run normally? The problem only occurs when I stop it. On Fri, Feb 5, 2021 at 7:08 PM, Robert Metzger wrote: > Are you 100% sure that the jar files in the classpath (/lib folder) are > exactly the same on all machines? (It can happen quite easily in a > distributed st

Re: flink kryo exception

2021-02-05 Thread Robert Metzger
Are you 100% sure that the jar files in the classpath (/lib folder) are exactly the same on all machines? (It can happen quite easily in a distributed standalone setup that some files are different) On Fri, Feb 5, 2021 at 12:00 PM 赵一旦 wrote: > Flink1.12.0; only using aligned checkpoint; Standal

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Flink 1.12.0; only using aligned checkpoints; standalone cluster. On Fri, Feb 5, 2021 at 6:52 PM, Robert Metzger wrote: > Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which > can lead to corrupted data when using UC) > Can you tell us a little bit about your environment? (How are you > de

Re: flink kryo exception

2021-02-05 Thread Robert Metzger
Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which can lead to corrupted data when using UC) Can you tell us a little bit about your environment? (How are you deploying Flink, which state backend are you using, what kind of job (I guess DataStream API)) Somehow the process r

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Hi all, I find that the failure always occurs in the second task, after the source task. So I changed something in the first chained task: I transform the 'Map' based class object into another normal class object, and the problem disappeared. Based on the new solution, I also tried to stop and restore

Re: flink kryo exception

2021-02-05 Thread 赵一旦
I do not think this is a code-related problem anymore; maybe it is a bug? On Fri, Feb 5, 2021 at 4:30 PM, 赵一旦 wrote: > Hi all, I find that the failure always occurred in the second task, after > the source task. So I do something in the first chaining task, I transform > the 'Map' based class object to an

Re: flink kryo exception

2021-02-03 Thread Till Rohrmann
From these snippets it is hard to tell what's going wrong. Could you maybe give us a minimal example with which to reproduce the problem? Alternatively, have you read through Flink's serializer documentation [1]? Have you tried to use a simple POJO instead of inheriting from a HashMap? The stack

Re: flink kryo exception

2021-02-03 Thread 赵一旦
Some facts are possibly related to this, since another job does not hit these exceptions. The problem job uses a class which contains a field of class MapRecord, and MapRecord is defined to extend HashMap so as to accept variable JSON data. Class MapRecord: @NoArgsConstructor @Slf4j public cla

Re: flink kryo exception

2021-02-03 Thread 赵一旦
Actually, the exception is different every time I stop the job. For example: (1) com.esotericsoftware.kryo.KryoException: Unable to find class: g^XT, with the stack trace I gave above. (2) java.lang.IndexOutOfBoundsException: Index: 46, Size: 17 2021-02-03 18:37:24 java.lang.IndexOutOfBoundsException: Index: 4

Re: flink kryo exception

2021-02-03 Thread Till Rohrmann
Hi, could you show us the job you are trying to resume? Is it a SQL job or a DataStream job, for example? From the stack trace, it looks as if the class g^XT is not on the class path. Cheers, Till On Wed, Feb 3, 2021 at 10:30 AM 赵一旦 wrote: > I have a job, the checkpoint and savepoint all rig