Shashank Agarwal created FLINK-7760: ---------------------------------------
Summary: Restore failing from external checkpointing metadata. Key: FLINK-7760 URL: https://issues.apache.org/jira/browse/FLINK-7760 Project: Flink Issue Type: Bug Components: CEP, State Backends, Checkpointing Affects Versions: 1.3.2 Environment: Yarn, Flink 1.3.2, HDFS, FsStateBackend Reporter: Shashank Agarwal My job failed due to failure of cassandra. I have enabled ExternalizedCheckpoints. But when job tried to restore from that checkpoint it's failing continuously with following error. {code:java} 2017-10-04 09:39:20,611 INFO org.apache.flink.runtime.taskmanager.Task - KeyedCEPPatternOperator -> Map (1/2) (8ff7913f820ead571c8b54ccc6b16045) switched from RUNNING to FAILED. java.lang.IllegalStateException: Could not initialize keyed state backend. at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:321) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:217) at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:676) at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:663) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.StreamCorruptedException: invalid type code: 00 at java.io.ObjectInputStream$BlockDataInputStream.readBlockHeader(ObjectInputStream.java:2519) at java.io.ObjectInputStream$BlockDataInputStream.refill(ObjectInputStream.java:2553) at java.io.ObjectInputStream$BlockDataInputStream.skipBlockData(ObjectInputStream.java:2455) at java.io.ObjectInputStream.skipCustomData(ObjectInputStream.java:1951) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1621) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeCondition(NFA.java:1211) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeStates(NFA.java:1169) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:957) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:852) at org.apache.flink.runtime.state.heap.StateTableByKeyGroupReaders$StateTableByKeyGroupReaderV2V3.readMappingsInKeyGroup(StateTableByKeyGroupReaders.java:132) at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restorePartitionedState(HeapKeyedStateBackend.java:518) at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restore(HeapKeyedStateBackend.java:397) at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(StreamTask.java:772) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:311) ... 6 more {code} I have tried to start new job also after failure with parameter {code:java} -s [checkpoint meta data path]{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)