Re: Spark submit does automatically upload the jar to cluster?

2015-12-29 Thread jiml
And for more clarification on this: for non-YARN installs, this bug has been filed to make the Spark driver upload jars. The point of confusion that I, along with other newcomers, commonly suffer from is this. In non-YARN installs: *The

Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows

2015-12-28 Thread jiml
rk, please post what the difference is between "WordCount MapReduce job" and "Spark Wordcount" -- that's not clear to me. Post your SparkConf and SparkContext calls. JimL I'm new to Spark. Before I describe the problem, I'd like to let you know the role of the machines that organiz

Re: Spark submit does automatically upload the jar to cluster?

2015-12-28 Thread jiml
That's funny, I didn't delete that answer! I think I have two accounts crossing; here was the answer: I don't know if this is going to help, but I agree that some of the docs would lead one to believe that the Spark driver or master is going to spread your jars around for you. But there's other

Re: SPARK_CLASSPATH out, spark.executor.extraClassPath in?

2015-12-28 Thread jiml
I looked into this a lot more and posted an answer to a similar question on SO, but it's EC2-specific. There still might be some useful info in there, and any comments/corrections/improvements would be greatly appreciated!

Various ways to use --jars? Some undocumented ways?

2016-01-11 Thread jiml
(Sorry to "repost"; I originally answered/replied to an older question, but my part was not expanding.) Question is: looking for all the ways to specify a set of jars using --jars on spark-submit. I know this is old, but I am about to submit a proposed docs change on --jars, and I had an issue with

Re: how to submit multiple jar files when using spark-submit script in shell?

2016-01-11 Thread jiml
Question is: looking for all the ways to specify a set of jars using --jars on spark-submit. I know this is old, but I am about to submit a proposed docs change on --jars, and I had an issue with --jars today. When this user submitted the following command line, is that a proper way to reference a

Re: Kryo serializer Exception during serialization: java.io.IOException: java.lang.IllegalArgumentException:

2016-01-08 Thread jiml
(Point of post is to see if anyone has ideas about the errors at the end of the post.) In addition, the real way to test if it's working is to force serialization. In Java: create an array of all your classes: // the Kryo serializer wants to register all classes that need to be serialized Class[]

Flashback: RDD.aggregate versus accumulables...

2016-03-10 Thread jiml
And Lord Joe, you were right: future versions did protect accumulators in actions. I wonder if anyone has a "modern" take on the accumulator vs. aggregate question. It seems like if I need to do it by key or control partitioning, I would use aggregate. Bottom line question / reason for post: I wonder