Re: updateStateByKey not persisting in Spark 1.5.1

2016-01-21 Thread Ted Yu
I searched for checkpoint-related methods in the various Listener classes but haven't found any. Analyzing the DAG is tedious and fragile, since the DAG may change in future Spark releases.

Cheers

On Thu, Jan 21, 2016 at 8:25 AM, Brian London wrote:
> Thanks. It looks like

Re: updateStateByKey not persisting in Spark 1.5.1

2016-01-21 Thread Brian London
Thanks. It looks like extending my batch duration to 7 seconds is a work-around. I'd like to build a check for the lack of checkpointing in our integration tests. Is there a way to parse the DAG at runtime?

On Wed, Jan 20, 2016 at 2:01 PM Ted Yu wrote:
> This is related:
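
As background on how batch duration and checkpointing interact: the Spark Streaming programming guide states that, for stateful transformations, the default checkpoint interval is a multiple of the batch interval that is at least 10 seconds. The sketch below works out that arithmetic in plain Python. `default_checkpoint_interval_ms` is a hypothetical helper, not a Spark API, and whether this default explains the 7-second work-around is not established in the thread.

```python
def default_checkpoint_interval_ms(batch_ms, minimum_ms=10_000):
    """Smallest multiple of the batch interval that is >= minimum_ms.

    Models the documented default only; an explicit
    dstream.checkpoint(interval) call overrides it in a real job.
    """
    if batch_ms <= 0:
        raise ValueError("batch interval must be positive")
    multiples = -(-minimum_ms // batch_ms)  # integer ceiling division
    return batch_ms * multiples


# Batch durations here are illustrative; the thread only mentions 7 seconds.
for batch_ms in (1_000, 2_000, 7_000):
    print(batch_ms, default_checkpoint_interval_ms(batch_ms))
```

With a 7-second batch this gives 14 seconds, that is, a checkpoint on every second batch under the default.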

Re: updateStateByKey not persisting in Spark 1.5.1

2016-01-20 Thread Shixiong(Ryan) Zhu
Could you share your log?

On Wed, Jan 20, 2016 at 7:55 AM, Brian London wrote:
> I'm running a streaming job that has two calls to updateStateByKey. When
> run in standalone mode both calls to updateStateByKey behave as expected.
> When run on a cluster, however, it

updateStateByKey not persisting in Spark 1.5.1

2016-01-20 Thread Brian London
I'm running a streaming job that has two calls to updateStateByKey. When run in standalone mode both calls to updateStateByKey behave as expected. When run on a cluster, however, it appears that the first call is not being checkpointed as shown in this DAG image: http://i.imgur.com/zmQ8O2z.png
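
For readers unfamiliar with the API, here is a minimal sketch of the update-function contract that updateStateByKey relies on: for each key, the function receives that batch's new values plus the previous state and returns the new state. This is a plain-Python stand-in, not Spark code. `update_count`, `apply_batch`, and the sample batches are hypothetical names chosen for illustration, and nothing here reproduces the checkpointing behavior reported above.

```python
from collections import defaultdict


def update_count(new_values, running_count):
    """Update function in the shape PySpark's updateStateByKey expects:
    (list of new values for a key, previous state or None) -> new state."""
    return sum(new_values) + (running_count or 0)


def apply_batch(state, batch, update_func):
    """Hypothetical stand-in for one micro-batch of updateStateByKey.

    `state` maps key -> previous state; `batch` is a list of (key, value).
    Every key with prior state or new data is passed to `update_func`;
    a key whose update returns None is dropped from the state.
    """
    grouped = defaultdict(list)
    for key, value in batch:
        grouped[key].append(value)

    new_state = {}
    for key in set(state) | set(grouped):
        updated = update_func(grouped.get(key, []), state.get(key))
        if updated is not None:
            new_state[key] = updated
    return new_state


# Illustrative data: three micro-batches of (key, value) pairs.
batches = [[("a", 1), ("b", 1)], [("a", 1)], [("b", 2), ("c", 5)]]
state = {}
for batch in batches:
    state = apply_batch(state, batch, update_count)
```

Because each batch's state is derived from the previous batch's state, the lineage grows with every batch, which is why Spark requires a checkpoint directory for this transformation.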

Re: updateStateByKey not persisting in Spark 1.5.1

2016-01-20 Thread Ted Yu
This is related: SPARK-6847

FYI

On Wed, Jan 20, 2016 at 7:55 AM, Brian London wrote:
> I'm running a streaming job that has two calls to updateStateByKey. When
> run in standalone mode both calls to updateStateByKey behave as expected.
> When run on a cluster,