Flink 1.7 job cluster (restore from checkpoint error)

2018-12-04 Thread Hao Sun
I am using Flink 1.7 with a job cluster on k8s. Here is how I start my job:
    docker-entrypoint.sh job-cluster -j com.zendesk.fraud_prevention.examples.ConnectedStreams --allowNonRestoredState
*Seems like --allowNonRestoredState is not honored*
=== Logs ===
java","line":"1041","message":"Restoring

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-05 Thread Till Rohrmann
Hi Hao,
I think you need to provide a savepoint via --fromSavepoint in order to use --allowNonRestoredState. Without a savepoint to resume from, the option is ignored, because it only takes effect when the job resumes from a savepoint.
Cheers, Till
On Wed, Dec 5, 2018 at 12:29 AM Hao Sun wrote:
> I am us
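As a rough illustration of the suggestion above, the start command could pass both flags together. This is only a sketch: the savepoint path is a hypothetical placeholder, the helper function name is made up for the example, and the script prints the command (a dry run) rather than executing it. The entrypoint, class name, and flag names are the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical savepoint location (an assumption for illustration only);
# replace it with the path of a real savepoint.
SAVEPOINT_PATH="s3://my-bucket/savepoints/savepoint-example"

# Assemble the job-cluster arguments. Per the reply above,
# --allowNonRestoredState only takes effect together with --fromSavepoint.
build_job_args() {
  printf '%s' "job-cluster -j com.zendesk.fraud_prevention.examples.ConnectedStreams --fromSavepoint $1 --allowNonRestoredState"
}

# Dry run: print the full command instead of running it.
echo "docker-entrypoint.sh $(build_job_args "$SAVEPOINT_PATH")"
```

Printing the command first makes it easy to check that the savepoint path and both flags end up in the container arguments before the job is actually started.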

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-05 Thread Hao Sun
Till, Flink is automatically trying to recover from a checkpoint, not a savepoint. How can I get allowNonRestoredState applied in this case?
Hao Sun, Team Lead, 1019 Market St. 7F, San Francisco, CA 94103
On Wed, Dec 5, 2018 at 10:09 AM Till Rohrmann wrote:
> Hi Hao,
>
> I think you need to provide

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-06 Thread Till Rohrmann
Hi Hao,
If Flink tries to recover from a checkpoint, then the JobGraph should not be modified and the system should be able to restore the state. Have you changed the JobGraph, and are you now trying to recover from the latest checkpoint stored in ZooKeeper? If so, then you can also start

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-06 Thread Hao Sun
Thanks for the tip! I did change the JobGraph this time.
Hao Sun, Team Lead, 1019 Market St. 7F, San Francisco, CA 94103
On Thu, Dec 6, 2018 at 2:47 AM Till Rohrmann wrote:
> Hi Hao,
>
> if Flink tries to recover from a checkpoint, then the JobGraph should not
> be modified and the system should