It might kind of work, but you are effectively making all of your
workers into mini, separate Spark drivers in their own right. This
might cause snags down the line, as this isn't how Spark is normally used.
On Tue, Oct 28, 2014 at 12:11 AM, Localhost shell
universal.localh...@gmail.com wrote:
Hey lordjoe,
Apologies for the late reply.
I followed your ThreadLocal approach and it worked fine. I will update the
thread if I get to know more on this.
(I don't know how Spark Scala does it, but what I wanted to achieve in Java is
quite common in many spark-scala GitHub gists.)
Thanks.
On
Hey All,
I am unable to access objects declared and initialized outside the call()
method of JavaRDD.
In the code snippet below, the call() method makes a fetch call to C*, but
since javaSparkContext is defined outside the call() method's scope, the
compiler gives a compilation error.
stringRdd.foreach(new
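(The snippet above is cut off in the archive. Below is a minimal sketch of the
pattern being described; the class name Repro and the stand-in for the C* fetch
are assumptions, not the original code.)

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.VoidFunction;

    public class Repro {
        public static void run(final JavaRDD<String> stringRdd,
                               JavaSparkContext javaSparkContext) {
            // The anonymous inner class below captures javaSparkContext
            // from the enclosing method.
            stringRdd.foreach(new VoidFunction<String>() {
                @Override
                public void call(String key) throws Exception {
                    // Two problems: (1) javac rejects the capture unless
                    // javaSparkContext is declared final; (2) even when it is
                    // final, the closure has to be serialized and shipped to
                    // the executors, which fails at runtime with
                    // java.io.NotSerializableException:
                    // org.apache.spark.api.java.JavaSparkContext.
                    javaSparkContext.parallelize(Arrays.asList(key)); // stand-in for the C* fetch
                }
            });
        }
    }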
In Java, javaSparkContext would have to be declared final in order for
it to be accessed inside an inner class like this. But this would
still not work as the context is not serializable. You should rewrite
this so you are not attempting to use the Spark context inside an
RDD.
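(Not necessarily what Sean has in mind, but one common shape for such a
rewrite is to push the external fetch into mapPartitions and open a plain
client connection on the executor, so nothing driver-side is captured.
CassandraClient here is a hypothetical stand-in, not a real connector API; the
call() signature returning Iterable matches the Spark 1.x Java API.)

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.FlatMapFunction;

    public class PerPartitionFetch {
        public static JavaRDD<String> enrich(JavaRDD<String> stringRdd) {
            return stringRdd.mapPartitions(new FlatMapFunction<Iterator<String>, String>() {
                @Override
                public Iterable<String> call(Iterator<String> keys) throws Exception {
                    // One connection per partition, created on the executor;
                    // nothing non-serializable is captured from the driver.
                    // CassandraClient is hypothetical; substitute your client.
                    CassandraClient client = CassandraClient.connect("cassandra-host");
                    List<String> out = new ArrayList<String>();
                    try {
                        while (keys.hasNext()) {
                            out.add(client.fetch(keys.next()));
                        }
                    } finally {
                        client.close();
                    }
                    return out;
                }
            });
        }
    }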
On Thu, Oct 23,
Bang on, Sean.
Before sending the issue mail, I was able to remove the compilation error
by making it final, but then got:
Caused by: java.io.NotSerializableException:
org.apache.spark.api.java.JavaSparkContext (As you mentioned)
Now regarding your suggestion of changing the business logic,
1.
+1 to Sean.
Is it possible to rewrite your code to not use the SparkContext in the RDD? Or
why does javaFunctions() need the SparkContext?
On Thu, Oct 23, 2014 at 10:53 AM, Localhost shell
universal.localh...@gmail.com wrote:
Hey Jayant,
In my previous mail, I mentioned a GitHub gist,
https://gist.github.com/rssvihla/6577359860858ccb0b33, which does something
very similar to what I want to do, but it uses Scala for Spark.
Hence my question (reiterating
What I have been doing is building a JavaSparkContext the first time it is
needed and keeping it as a ThreadLocal; all my code uses
SparkUtilities.getCurrentContext(). On a slave machine you build a new
context and don't have to serialize it.
The code is in a large project at
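(The project link above is cut off in the archive. A minimal sketch of the
ThreadLocal approach described, assuming a lazily built per-thread context;
the SparkConf settings are guesses, and the real SparkUtilities lives in the
project referenced above.)

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkUtilities {
        private static final ThreadLocal<JavaSparkContext> CONTEXT =
            new ThreadLocal<JavaSparkContext>() {
                @Override
                protected JavaSparkContext initialValue() {
                    // Built lazily the first time a thread asks for it, so a
                    // slave machine constructs its own local context rather
                    // than receiving a serialized copy of the driver's.
                    SparkConf conf = new SparkConf()
                        .setAppName("SparkUtilities")
                        .setMaster("local[*]"); // assumed; set as appropriate
                    return new JavaSparkContext(conf);
                }
            };

        public static JavaSparkContext getCurrentContext() {
            return CONTEXT.get();
        }
    }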