Hi Edward,

From this log (Caused by: java.io.EOFException), it seems that the state
metadata file has been corrupted.
But I can't confirm it. Maybe Stefan knows more details; I'll ping him for you.
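
For reference, the trace shows the EOFException coming from DataInputStream.readUTF, called from PojoSerializer.deserialize. Below is a minimal sketch in plain Java (not Flink code) of how readUTF fails in this way when the stored bytes are shorter than their length prefix says. The class name TruncatedUtfDemo and the string "com.example.SomePojo" are made up for illustration. It only reproduces the exception type; it does not show whether the bytes in RocksDB were actually truncated or why.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical demo class, unrelated to Flink's own serializers.
class TruncatedUtfDemo {

    // Serialize one string with writeUTF: a 2-byte length prefix
    // followed by the encoded characters.
    static byte[] writeUtf(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(value);
        }
        return buffer.toByteArray();
    }

    // Returns true if reading the bytes back with readUTF ends in EOFException.
    static boolean readFailsWithEof(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            in.readUTF();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] complete = writeUtf("com.example.SomePojo"); // made-up value
        // Drop the last 5 bytes so the length prefix promises more than exists.
        byte[] truncated = Arrays.copyOf(complete, complete.length - 5);

        System.out.println("complete bytes fail: " + readFailsWithEof(complete));
        System.out.println("truncated bytes fail: " + readFailsWithEof(truncated));
    }
}
```

The same exception can also appear when the bytes are intact but are read with a serializer that expects a different layout, so this alone does not prove that the checkpoint is corrupted.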

Thanks, vino.

Edward Rojas <edward.roja...@gmail.com> wrote on Fri, Sep 7, 2018 at 1:22 AM:

> Hello all,
>
> We are running Flink 1.5.3 on Kubernetes with RocksDB as the state backend.
> When performing some load testing we got an /OutOfMemoryError: native
> memory exhausted/, causing the job to fail and be restarted.
>
> After the TaskManager is restarted, the job is recovered from a checkpoint,
> but there seems to be a problem when trying to access the state. We got
> the error from the *onTimer* function, invoked via *onProcessingTime*.
>
> Is it possible that the OOM error caused a corrupted state to be
> checkpointed?
>
> We get Exceptions like:
>
> TimerException{java.lang.RuntimeException: Error while retrieving data from RocksDB.}
>         at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:288)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:277)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:191)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>         at java.lang.Thread.run(Thread.java:811)
> Caused by: java.lang.RuntimeException: Error while retrieving data from RocksDB.
>         at org.apache.flink.contrib.streaming.state.RocksDBValueState.value(RocksDBValueState.java:89)
>         at com.xxx.ProcessFunction.*onTimer*(ProcessFunction.java:279)
>         at org.apache.flink.streaming.api.operators.KeyedProcessOperator.invokeUserFunction(KeyedProcessOperator.java:94)
>         at org.apache.flink.streaming.api.operators.KeyedProcessOperator.*onProcessingTime*(KeyedProcessOperator.java:78)
>         at org.apache.flink.streaming.api.operators.HeapInternalTimerService.*onProcessingTime*(HeapInternalTimerService.java:266)
>         at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:285)
>         ... 7 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:208)
>         at java.io.DataInputStream.readUTF(DataInputStream.java:618)
>         at java.io.DataInputStream.readUTF(DataInputStream.java:573)
>         at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:381)
>         at org.apache.flink.contrib.streaming.state.RocksDBValueState.value(RocksDBValueState.java:87)
>         ... 12 more
>
>
> Thanks in advance for any help
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
