Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-28 Thread Sean Owen
It might kind of work, but you are effectively making all of your workers into mini, separate Spark drivers in their own right. This might cause snags down the line as this isn't the normal thing to do. On Tue, Oct 28, 2014 at 12:11 AM, Localhost shell universal.localh...@gmail.com wrote: Hey

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-27 Thread Localhost shell
Hey lordjoe, Apologies for the late reply. I followed your ThreadLocal approach and it worked fine. I will update the thread if I learn more about this. (I don't know how Spark Scala does it, but what I wanted to achieve in Java is quite common in many Spark Scala GitHub gists.) Thanks. On

How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread Localhost shell
Hey All, I am unable to access objects declared and initialized outside the call() method of JavaRDD. In the code snippet below, the call() method makes a fetch call to C*, but since javaSparkContext is defined outside the scope of call(), the compiler gives a compilation error. stringRdd.foreach(new

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread Sean Owen
In Java, javaSparkContext would have to be declared final in order for it to be accessed inside an inner class like this. But this would still not work as the context is not serializable. You should rewrite this so you are not attempting to use the Spark context inside an RDD. On Thu, Oct 23,
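The serialization failure described above can be reproduced without Spark. The following is a minimal sketch, not the poster's code: FakeContext and SerializableTask are hypothetical stand-ins for a non-serializable context and a serializable function interface, and canSerialize is a helper invented for illustration. It shows that an anonymous inner class capturing a non-serializable local cannot be serialized, while one capturing only a plain String can.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ClosureSerializationSketch {

    // Hypothetical stand-in for a context object: deliberately NOT Serializable.
    static class FakeContext {
        String appName() { return "demo"; }
    }

    // Hypothetical stand-in for a function interface that must be shipped to workers.
    interface SerializableTask extends Serializable {
        String call(String input);
    }

    // Returns true if Java serialization accepts the object,
    // false if it reaches a non-serializable field.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // The captured local must be final (or effectively final) to compile,
    // but the inner class then holds a FakeContext field and cannot be serialized.
    static SerializableTask capturingContext() {
        final FakeContext context = new FakeContext();
        return new SerializableTask() {
            @Override
            public String call(String input) {
                return context.appName() + ":" + input;
            }
        };
    }

    // Extracting the needed value up front leaves only a String in the closure.
    static SerializableTask capturingPlainValue() {
        final String appName = new FakeContext().appName();
        return new SerializableTask() {
            @Override
            public String call(String input) {
                return appName + ":" + input;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(capturingContext()));    // false
        System.out.println(canSerialize(capturingPlainValue())); // true
    }
}
```

The second variant reflects the advice in the message: pull whatever plain data the function needs out of the context before building the function, so the context itself is never captured.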

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread Localhost shell
Bang on, Sean. Before sending the issue mail, I was able to remove the compilation error by making it final, but then got: Caused by: java.io.NotSerializableException: org.apache.spark.api.java.JavaSparkContext (as you mentioned). Now, regarding your suggestion of changing the business logic, 1.

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread Jayant Shekhar
+1 to Sean. Is it possible to rewrite your code so that it does not use SparkContext inside the RDD? Or why does javaFunctions() need the SparkContext? On Thu, Oct 23, 2014 at 10:53 AM, Localhost shell universal.localh...@gmail.com wrote: Bang on, Sean. Before sending the issue mail, I was able to remove the

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread Localhost shell
Hey Jayant, In my previous mail, I mentioned a GitHub gist, https://gist.github.com/rssvihla/6577359860858ccb0b33, which does something very similar to what I want to do, but it uses Scala for Spark. Hence my question (reiterating

Re: How to access objects declared and initialized outside the call() method of JavaRDD

2014-10-23 Thread lordjoe
What I have been doing is building a JavaSparkContext the first time it is needed and keeping it as a ThreadLocal; all my code uses SparkUtilities.getCurrentContext(). On a slave machine you build a new context and don't have to serialize it. The code is in a large project at
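
The lazy, per-thread pattern described above can be sketched in plain Java. This is an illustration under stated assumptions, not the code from the project mentioned: ExpensiveClient is a hypothetical stand-in for the object being cached (no Spark context is created here), and getCurrent() only mirrors the role that SparkUtilities.getCurrentContext() plays in the message. It uses ThreadLocal.withInitial, which requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazyThreadLocalSketch {

    // Hypothetical stand-in for an expensive, non-serializable resource.
    static final class ExpensiveClient {
        final int id;
        ExpensiveClient(int id) { this.id = id; }
    }

    // Counts how many instances have been built across all threads.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Each thread builds its own instance on first access and reuses it afterwards;
    // nothing is ever serialized or shared between threads.
    private static final ThreadLocal<ExpensiveClient> CURRENT =
            ThreadLocal.withInitial(() -> new ExpensiveClient(CREATED.incrementAndGet()));

    static ExpensiveClient getCurrent() {
        return CURRENT.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ExpensiveClient first = getCurrent();
        ExpensiveClient second = getCurrent();
        System.out.println(first == second); // true: same thread, same instance

        ExpensiveClient[] fromOtherThread = new ExpensiveClient[1];
        Thread worker = new Thread(() -> fromOtherThread[0] = getCurrent());
        worker.start();
        worker.join();
        System.out.println(fromOtherThread[0] != first); // true: another thread built its own
    }
}
```

Because the instance is created where it is used, the function shipped to workers only needs to call the static accessor and never carries the resource in its closure. Whether doing this with a full Spark context on workers is advisable is a separate question, as the 2014-10-28 reply in this thread points out.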