Re: How to perform clean-up after stateful streaming processes an RDD?

2017-06-06 Thread David Rosenstrauch
Thanks much for the suggestion. Reading over your mail though, I realize that I may not have made something clear: I don't just have a single external service object; the idea is that I have one per executor, so that the code running on each executor accesses the external service independently.

Re: How to perform clean-up after stateful streaming processes an RDD?

2017-06-06 Thread David Rosenstrauch
Thanks much for the suggestion. Reading over your mail though, I realize that I may not have made something clear: I don't just have a single external service object; the idea is that I have one per executor, so that the code running on each executor accesses the external service independently.

Re: How to perform clean-up after stateful streaming processes an RDD?

2017-06-06 Thread Gerard Maas
It looks like the clean up should go into the foreachRDD function: stateUpdateStream.foreachRdd(...) { rdd => // do stuff with the rdd stateUpdater.cleanupExternalService// should work in this position } Code within the foreachRDD(*) executes on the driver, so you can keep the state of

How to perform clean-up after stateful streaming processes an RDD?

2017-06-06 Thread David Rosenstrauch
We have some code we've written using stateful streaming (mapWithState) which works well for the most part. The stateful streaming runs, processes the RDD of input data, calls the state spec function for each input record, and does all proper adding and removing from the state cache. However, I