Hi Lee, it's actually not related to threading at all - you would still have
the same problem even if you were using a single thread. See this section (
https://spark.apache.org/docs/latest/programming-guide.html#passing-functions-to-spark)
of the Spark docs.
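The docs section above describes how a function that touches instance state drags the whole enclosing object into the closure. A minimal sketch of that effect, without Spark: the `Context` class below is a hypothetical stand-in for SparkContext (any non-serializable handle), and plain `java.io` serialization plays the role of Spark's closure serializer.

```java
import java.io.*;

public class ClosureDemo {
    // Hypothetical stand-in for SparkContext: a non-serializable handle.
    static class Context { }

    static class Rollup implements Serializable {
        final Context ctx;   // not serializable
        Rollup(Context ctx) { this.ctx = ctx; }

        // Referencing `ctx` inside the lambda captures the enclosing
        // instance, so serializing the lambda drags the Context along.
        Runnable badTask() {
            return (Runnable & Serializable) () -> System.out.println(ctx);
        }

        // A lambda that touches no instance state stays self-contained.
        Runnable goodTask() {
            return (Runnable & Serializable) () -> System.out.println("ok");
        }
    }

    // True iff the object survives Java serialization.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;  // NotSerializableException is an IOException
        }
    }

    public static void main(String[] args) {
        Rollup r = new Rollup(new Context());
        System.out.println(serializes(r.badTask()));  // false: ctx captured
        System.out.println(serializes(r.goodTask())); // true: nothing captured
    }
}
```

The same capture rule is what trips up lambdas passed to RDD operations: it is not the RDD call itself but what the lambda implicitly references.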
On June 5, 2015, at 5:12 PM, Lee McFadden wrote:
Hi Will,
That doesn't seem to be the case, and that was part of the source of my
confusion. The code currently in the run method of the Runnable works
perfectly fine with the lambda expressions when it is invoked from the main
method. They also work when they are invoked from within a separate method
Hi Lee, I'm stuck with only mobile devices for correspondence right now, so
I can't get to a shell to play with this issue - this is all supposition; I
think that the lambdas are closing over the context because it's a
constructor parameter to your Runnable class, which is why inlining the
lambdas
Hi all,
I'm having some issues finding any kind of best practices when attempting
to create Spark applications which launch jobs from a thread pool.
Initially I had issues passing the SparkContext to other threads as it is
not serializable. Eventually I found that adding the @transient
On Fri, Jun 5, 2015 at 11:48 AM, Lee McFadden splee...@gmail.com wrote:
Initially I had issues passing the SparkContext to other threads as it is
not serializable. Eventually I found that adding the @transient annotation
prevents a NotSerializableException.
This is really puzzling. How are
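Why `@transient` makes the exception disappear, and what it costs: Scala's `@transient` annotation maps to the JVM's `transient` field flag, which tells serialization to skip the field entirely. A small sketch (the `Context` class is a hypothetical stand-in for SparkContext) shows that the object now serializes, but the skipped field comes back as `null` on the other side, which is why this only hides the underlying capture problem:

```java
import java.io.*;

public class TransientDemo {
    // Hypothetical stand-in for SparkContext: a non-serializable handle.
    static class Context { }

    static class Job implements Serializable {
        // transient: skipped by serialization, so no NotSerializableException,
        // but the field is null after deserialization.
        transient Context ctx;
        final String name;
        Job(Context ctx, String name) { this.ctx = ctx; this.name = name; }
    }

    // Serialize any object to a byte array.
    static byte[] write(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = write(new Job(new Context(), "rollup"));
        Job copy = (Job) new ObjectInputStream(
            new ByteArrayInputStream(bytes)).readObject();
        System.out.println(copy.name);        // "rollup"
        System.out.println(copy.ctx == null); // true: transient field dropped
    }
}
```

So the annotation suppresses the exception, but if any remote code actually needed the context, it would hit a `NullPointerException` instead.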
You can see an example of the constructor for the class which executes a
job in my opening post.
I'm attempting to instantiate and run the class using the code below:
```
val conf = new SparkConf()
  .setAppName(appNameBase.format("Test"))
val connector = CassandraConnector(conf)
```
+1 to the question about serialization. SparkContext is still in the driver
process (even if it has several threads from which you submit jobs).
As for the problem, check your classpath, Scala version, Spark version, etc.
Such errors usually happen when there is some conflict in the classpath. Maybe
you
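The point about the context staying in the driver process can be sketched without Spark: threads in one JVM share a reference directly, and nothing is serialized just by handing that reference to tasks in a pool. The `Context` class here is again a hypothetical stand-in for SparkContext.

```java
import java.util.concurrent.*;

public class PoolDemo {
    // Hypothetical stand-in for SparkContext: one shared handle that never
    // leaves this (driver) process.
    static class Context {
        private int submitted = 0;
        synchronized void submit() { submitted++; }
        synchronized int count() { return submitted; }
    }

    // Submit n jobs from a fixed thread pool; every thread uses the same
    // Context reference directly -- no serialization inside one JVM.
    static int runJobs(int n) throws InterruptedException {
        Context ctx = new Context();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < n; i++) {
            pool.submit(ctx::submit);
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ctx.count();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runJobs(8)); // 8
    }
}
```

Serialization only enters the picture for the closures shipped to executors, never for the context shared among driver-side threads.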
Lee, what cluster do you use? standalone, yarn-cluster, yarn-client, mesos?
In yarn-cluster mode the driver program is executed inside one of the nodes
in the cluster, so it might be that the driver code needs to be serialized to
be sent to some node
On 5 June 2015 at 22:55, Lee McFadden splee...@gmail.com wrote:
Ignoring the serialization thing (seems like a red herring):
On Fri, Jun 5, 2015 at 11:48 AM, Lee McFadden splee...@gmail.com wrote:
15/06/05 11:35:32 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
localhost): java.lang.NoSuchMethodError:
On Fri, Jun 5, 2015 at 12:30 PM Marcelo Vanzin van...@cloudera.com wrote:
Ignoring the serialization thing (seems like a red herring):
People seem surprised that I'm getting the Serialization exception at all -
I'm not convinced it's a red herring per se, but on to the blocking issue...
On Fri, Jun 5, 2015 at 12:55 PM, Lee McFadden splee...@gmail.com wrote:
Regarding serialization, I'm still confused as to why I was getting a
serialization error in the first place, as I'm executing these Runnable
classes from a Java thread pool. I'm fairly new to the Scala/JVM world and
there
Your lambda expressions on the RDDs in the SecondRollup class are closing
over the context, and Spark has special logic to ensure that all variables in
a closure used on an RDD are Serializable - I hate linking to Quora, but
there's a good explanation here:
On Fri, Jun 5, 2015 at 1:00 PM Igor Berman igor.ber...@gmail.com wrote:
Lee, what cluster do you use? standalone, yarn-cluster, yarn-client, mesos?
Spark standalone, v1.2.1.
On Fri, Jun 5, 2015 at 12:58 PM Marcelo Vanzin van...@cloudera.com wrote:
You didn't show the error, so the only thing we can do is speculate. You're
probably sending the object that's holding the SparkContext reference over
the network at some point (e.g. it's used by a task run in an
On Fri, Jun 5, 2015 at 2:05 PM Will Briggs wrbri...@gmail.com wrote:
Your lambda expressions on the RDDs in the SecondRollup class are closing
over the context, and Spark has special logic to ensure that all
variables in a closure used on an RDD are Serializable - I hate linking to
Quora,
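The usual fix for this capture problem is to copy the needed values into local variables before building the lambda, so the closure captures only those values rather than the enclosing instance. A sketch under the same assumptions as before (`Context` is a hypothetical stand-in for SparkContext):

```java
import java.io.*;
import java.util.function.Function;

public class LocalCopyDemo {
    // Hypothetical stand-in for SparkContext: a non-serializable handle.
    static class Context { }

    static class Rollup {
        final Context ctx;
        final int factor;
        Rollup(Context ctx, int factor) { this.ctx = ctx; this.factor = factor; }

        // Capturing the field directly would capture `this` (and ctx with it).
        // Copying the value to a local first means the lambda closes over
        // only the int, so it serializes cleanly.
        Function<Integer, Integer> scale() {
            final int f = factor;  // local copy: the only thing captured
            return (Function<Integer, Integer> & Serializable) x -> x * f;
        }
    }

    // True iff the object survives Java serialization.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Rollup r = new Rollup(new Context(), 3);
        Function<Integer, Integer> fn = r.scale();
        System.out.println(fn.apply(14));    // 42
        System.out.println(serializes(fn));  // true: only the int is captured
    }
}
```

Inlining the lambdas or hoisting fields into locals both work for the same reason: the closure stops referencing the object that holds the context.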