Re: Flink job restart at checkpoint interval

2016-11-16 Thread Satish Chandra Gupta
Hi Till, Thanks. Yes, that is what I have been doing. But accessing the GUI over VPN, with Flink running on a YARN cluster on EMR, sometimes becomes very slow (sometimes not even the execution plan gets shown :-)), which is why I thought of this. Thanks, +satish On Wed, Nov 16, 2016 at 6:46 PM, Till Rohrmann w

Re: Flink job restart at checkpoint interval

2016-11-16 Thread Till Rohrmann
Hi Satish, I'm afraid there is currently no way to configure the name of the checkpoint file for a task. For the latest checkpoint, you can see the state sizes of the individual subtasks in the web UI under Checkpoints. Cheers, Till On Tue, Nov 15, 2016 at 10:52 PM, Satish Ch

Re: Flink job restart at checkpoint interval

2016-11-15 Thread Satish Chandra Gupta
Hi Ufuk and Till, Thanks a lot. Both suggestions were useful. An older version of xerces was being loaded from one of the dependencies, and I also fixed the serialization glitch in my code, so checkpointing now works. I have 5 value states apart from a custom trigger. Is

Re: Flink job restart at checkpoint interval

2016-11-15 Thread Till Rohrmann
Hi Satish, your problem seems to be related to reading Hadoop's configuration. According to the internet [1,2,3], selecting a proper xerces version should resolve the problem. [1] http://stackoverflow.com/questions/26974067/org-apache-hadoop-conf-configuration-loadresource-error [2]

Re: Flink job restart at checkpoint interval

2016-11-14 Thread Satish Chandra Gupta
Most of the ValueStates I am using are of type Long or Boolean, except one, which is a map of Long to a Scala case class: ValueState[Map[Long, AnScalaCaseClass]]. Does this serialization happen only for the value state members of operators, or also for other private fields? Thanks +satish On Mon, Nov 14, 201

Re: Flink job restart at checkpoint interval

2016-11-14 Thread Ufuk Celebi
There seems to be an Exception happening when Flink tries to serialize the state of your operator (see the stack trace). What are you trying to store via the ValueState? Maybe you can share a code excerpt? – Ufuk On 14 November 2016 at 10:51:06, Satish Chandra Gupta (scgupt...@gmail.com) wrot

Flink job restart at checkpoint interval

2016-11-14 Thread Satish Chandra Gupta
Hi, I am using ValueState, backed by FsStateBackend on HDFS, as follows: env.setStateBackend(new FsStateBackend(stateBackendPath)) env.enableCheckpointing(checkpointInterval) It is a non-iterative job running on Flink/YARN. The job restarts at checkpointInterval; I have tried intervals varying fro
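
For readers following the thread, the two configuration calls quoted in the original message can be placed in context roughly as below. This is only a sketch of the setup as described, not the poster's actual job: the object name, the HDFS path, and the interval value are hypothetical placeholders, and the job topology is omitted because the thread does not show it.

```scala
import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.streaming.api.scala._

object CheckpointSetupSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical values; the original message does not state them.
    val stateBackendPath = "hdfs:///flink/checkpoints"
    val checkpointInterval = 60000L // milliseconds

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // The two calls quoted in the original message.
    env.setStateBackend(new FsStateBackend(stateBackendPath))
    env.enableCheckpointing(checkpointInterval)

    // Sources, keyed operators using ValueState, and sinks would be
    // defined here, followed by env.execute(...). They are left out
    // because the thread does not include the job's code.
  }
}
```

As the replies in this thread indicate, a restart at every checkpoint interval pointed to a failure while taking the checkpoint (here, a xerces version conflict and a serialization problem), not to the configuration calls themselves.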