Re: SparkContext Threading

2015-06-06 Thread Lee McFadden
) of the Spark docs. On June 5, 2015, at 5:12 PM, Lee McFadden splee...@gmail.com wrote: On Fri, Jun 5, 2015 at 2:05 PM Will Briggs wrbri...@gmail.com wrote: Your lambda expressions on the RDDs in the SecondRollup class are closing around the context, and Spark has special logic to ensure

SparkContext Threading

2015-06-05 Thread Lee McFadden
Hi all, I'm having some issues finding any kind of best practices when attempting to create Spark applications which launch jobs from a thread pool. Initially I had issues passing the SparkContext to other threads as it is not serializable. Eventually I found that adding the @transient

Re: SparkContext Threading

2015-06-05 Thread Lee McFadden
, although it's not really required at the moment as I am only submitting one job until I get this issue straightened out :) Thanks, Lee On Fri, Jun 5, 2015 at 11:50 AM Marcelo Vanzin van...@cloudera.com wrote: On Fri, Jun 5, 2015 at 11:48 AM, Lee McFadden splee...@gmail.com wrote: Initially I

Re: SparkContext Threading

2015-06-05 Thread Lee McFadden
On Fri, Jun 5, 2015 at 12:30 PM Marcelo Vanzin van...@cloudera.com wrote:
> Ignoring the serialization thing (seems like a red herring):
People seem surprised that I'm getting the Serialization exception at all - I'm not convinced it's a red herring per se, but on to the blocking issue...

Re: SparkContext Threading

2015-06-05 Thread Lee McFadden
On Fri, Jun 5, 2015 at 1:00 PM Igor Berman igor.ber...@gmail.com wrote:
> Lee, what cluster do you use? standalone, yarn-cluster, yarn-client, mesos?
Spark standalone, v1.2.1.

Re: SparkContext Threading

2015-06-05 Thread Lee McFadden
On Fri, Jun 5, 2015 at 12:58 PM Marcelo Vanzin van...@cloudera.com wrote: You didn't show the error so the only thing we can do is speculate. You're probably sending the object that's holding the SparkContext reference over the network at some point (e.g. it's used by a task run in an

Re: SparkContext Threading

2015-06-05 Thread Lee McFadden
On Fri, Jun 5, 2015 at 2:05 PM Will Briggs wrbri...@gmail.com wrote: Your lambda expressions on the RDDs in the SecondRollup class are closing around the context, and Spark has special logic to ensure that all variables in a closure used on an RDD are Serializable - I hate linking to Quora,

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Lee McFadden
Python dependency management. As far as I can tell, there is no core issue, upstream or otherwise. On Tue, May 12, 2015 at 11:39 AM, Lee McFadden splee...@gmail.com wrote: Thanks again for all the help folks. I can confirm that simply switching to `--packages org.apache.spark:spark

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Lee McFadden
Thanks again for all the help, folks. I can confirm that simply switching to `--packages org.apache.spark:spark-streaming-kafka-assembly_2.10:1.3.1` makes everything work as intended. Honestly, I'm not sure what the difference is between the two packages, or why one should be used over the other,

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
: com.yammer.metrics.core.Gauge is in metrics-core jar e.g., in master branch:
[INFO] | \- org.apache.kafka:kafka_2.10:jar:0.8.1.1:compile
[INFO] |    +- com.yammer.metrics:metrics-core:jar:2.2.0:compile
Please make sure metrics-core jar is on the classpath. On Mon, May 11, 2015 at 1:32 PM, Lee McFadden splee

Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
Hi, We've been having some issues getting Spark Streaming running correctly using a Kafka stream, and we've been going around in circles trying to resolve this dependency. Details of our environment and the error are below; if anyone can help resolve this, it would be much appreciated. Submit

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
in the assembly, is it? you'd have to provide it and all its dependencies with your app. You could also build this into your own app jar. Tools like Maven will add in the transitive dependencies. On Mon, May 11, 2015 at 10:04 PM, Lee McFadden splee...@gmail.com wrote: Thanks Ted

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
itself can't be introducing java dependency clashes? On Mon, May 11, 2015, 4:34 PM Lee McFadden splee...@gmail.com wrote: Ted, many thanks. I'm not used to Java dependencies so this was a real head-scratcher for me. Downloading the two metrics packages from the maven repository (metrics-core