Re: init / shutdown for complex map job?

2014-12-30 Thread Kevin Burton
Yes, I can do a just-in-time init: I can tell when the first map has run. However, I don't think I can tell when the last map has run, and the shutdown is the key part. Without it, my daemon threads won't exit properly and not all of my messages will be sent over the wire. On Sun, Dec 28, 2014 at
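
For illustration, a just-in-time init of the kind described above could look roughly like the sketch below. It does not answer the "last map" question. It only shows one possible workaround: pairing the lazy init with a JVM shutdown hook. `MessagingClient` and `Client` are hypothetical names standing in for the real ActiveMQ setup, which is not shown in this thread.

```scala
import java.util.concurrent.atomic.AtomicInteger

object MessagingClient {
  // Hypothetical stand-in for a real messaging connection (e.g. to ActiveMQ).
  final class Client {
    @volatile var open: Boolean = true
    def send(msg: String): Unit = require(open, "client already closed")
    def close(): Unit = open = false
  }

  // Counts how many times the init body below has run (illustration only).
  val initCount = new AtomicInteger(0)

  // Just-in-time init: the body runs once per JVM, on first access.
  lazy val client: Client = {
    initCount.incrementAndGet()
    val c = new Client
    // Assumption: closing the client when the JVM exits is acceptable.
    // Shutdown hooks run on a normal exit, not if the process is killed hard.
    sys.addShutdownHook { c.close() }
    c
  }
}
```

Inside a map function, `MessagingClient.client.send(item)` would then trigger the init on first use in each JVM. Whether the shutdown hook fires early enough to flush pending messages depends on how the executor process is stopped, so treat this as a sketch rather than a guarantee.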

Re: init / shutdown for complex map job?

2014-12-28 Thread Sean Owen
(Still pending, but believe it's in progress and being written by a colleague here.) On Sun, Dec 28, 2014 at 2:41 PM, Ray Melton wrote: > A follow-up to the blog cited below was hinted at, per "But Wait, > There's More ... To keep this post brief, the remainder will be left to > a follow-up post.

Re: init / shutdown for complex map job?

2014-12-28 Thread Ray Melton
A follow-up to the blog cited below was hinted at, per "But Wait, There's More ... To keep this post brief, the remainder will be left to a follow-up post." Is this follow-up pending? Is it sort of pending? Did the follow-up happen, but I just couldn't find it on the web? Regards, Ray. On Sun

Re: init / shutdown for complex map job?

2014-12-28 Thread Sean Owen
You can't quite do cleanup in mapPartitions in that way. Here is a bit more explanation (farther down): http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/ On Dec 28, 2014 8:18 AM, "Akhil Das" wrote: > Something like? > > val a = myRDD.mapPartitions(p => { > > >
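
To illustrate the point about cleanup: the function passed to mapPartitions returns a lazy iterator, so a shutdown call placed just before the return would run before any element has been processed. One way around this is to wrap the result iterator so that cleanup runs when it is exhausted. The sketch below uses plain Scala iterators and is not taken from the linked post. `Connection`, `CleanupIterator` and `processPartition` are hypothetical names.

```scala
object PartitionCleanup {
  // Hypothetical stand-in for an external resource such as a queue connection.
  final class Connection {
    var closed: Boolean = false
    def send(msg: String): String = {
      require(!closed, "connection already closed")
      s"sent:$msg"
    }
    def close(): Unit = closed = true
  }

  // Wraps an iterator and runs `cleanup` once, when the wrapped iterator
  // first reports that it has no more elements.
  final class CleanupIterator[A](underlying: Iterator[A], cleanup: () => Unit)
      extends Iterator[A] {
    private var cleaned = false
    override def hasNext: Boolean = {
      val more = underlying.hasNext
      if (!more && !cleaned) {
        cleaned = true
        cleanup()
      }
      more
    }
    override def next(): A = underlying.next()
  }

  // Has the shape mapPartitions expects: Iterator[String] => Iterator[String].
  def processPartition(records: Iterator[String]): Iterator[String] = {
    val conn = new Connection            // init, once per partition
    val results = records.map(conn.send) // lazy: nothing has been sent yet
    new CleanupIterator(results, () => conn.close()) // shutdown after the last record
  }
}
```

Assuming an RDD of strings, this could be used as `myRDD.mapPartitions(PartitionCleanup.processPartition)`. Note that the cleanup only runs if the consumer drains the iterator. If processing throws, or only part of the partition is read, `close()` is never called, so a real job would need additional handling for those cases.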

Re: init / shutdown for complex map job?

2014-12-28 Thread Akhil Das
Something like?

val a = myRDD.mapPartitions(p => {
  // Do the init
  // Perform some operations
  // Shut it down?
})

Thanks
Best Regards

On Sun, Dec 28, 2014 at 1:53 AM, Kevin Burton wrote: > I have a job where I want to map over all data in a cass

init / shutdown for complex map job?

2014-12-27 Thread Kevin Burton
I have a job where I want to map over all data in a Cassandra database. I'm then selectively sending items to my own external system (ActiveMQ) if they match certain criteria. The problem is that I need to do some init and shutdown. Basically on init I need to create ActiveMQ connections and on s