Re: If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-09 Thread Marcelo Vanzin
Yes, that's how file: URLs are interpreted everywhere in Spark. (It's also explained in the link to the docs I posted earlier.) The second interpretation below is local: URLs in Spark, but that doesn't work with Yarn on Spark 1.0 (so it won't work with CDH 5.1 and older either). On Mon, Sep 8,
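
For readers without the linked docs at hand, here is a minimal sketch of how the URL schemes mentioned above differ. The wording follows the dependency-management section of the Spark documentation as best recalled. The helper name `describe_jar_url` and the sample path are hypothetical and are not part of Spark.

```python
from urllib.parse import urlparse

# Summary of how Spark's docs describe each scheme for jars and files.
# This table is an illustration, not Spark code.
SCHEME_NOTES = {
    "file": "served by the driver; executors pull the file from the driver",
    "local": "expected to already exist as a local file on each worker node",
    "hdfs": "pulled down from the URI by each node",
}

def describe_jar_url(url):
    """Return a short note on how a jar URL of this scheme is treated."""
    scheme = urlparse(url).scheme
    return SCHEME_NOTES.get(scheme, "scheme not covered by this sketch")

# Hypothetical path, used only to exercise the helper.
print(describe_jar_url("local:/usr/lib/spark/assembly/lib/example.jar"))
```

As noted in the message above, local: URLs do not work with YARN on Spark 1.0, so the table describes intended semantics rather than what every release supports.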

If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-08 Thread Dimension Data, LLC.
Hello friends: It was mentioned in another (Y.A.R.N.-centric) email thread that 'SPARK_JAR' was deprecated, and to use the 'spark.yarn.jar' property instead for YARN submission. For example: user$ pyspark [some-options] --driver-java-options

Re: If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-08 Thread Marcelo Vanzin
On Mon, Sep 8, 2014 at 9:35 AM, Dimension Data, LLC. subscripti...@didata.us wrote:
> user$ pyspark [some-options] --driver-java-options spark.yarn.jar=hdfs://namenode:8020/path/to/spark-assembly-*.jar

This command line does not look correct. spark.yarn.jar is not a JVM command line option.
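
To illustrate the point: the JVM accepts system properties only in the form `-Dkey=value`, so a bare `spark.yarn.jar=...` string is not a valid value for --driver-java-options. The sketch below formats the property that way. The helper names are hypothetical, the HDFS path is the placeholder from the quoted command, and the sketch does not show how Spark itself consumes the property.

```python
def to_java_system_property(key, value):
    """Format a key/value pair as a JVM system-property flag (-Dkey=value)."""
    return "-D{}={}".format(key, value)

def looks_like_jvm_option(arg):
    """Rough check: JVM command-line options start with a dash."""
    return arg.startswith("-")

# Placeholder path copied from the quoted command; not a real location.
assembly = "hdfs://namenode:8020/path/to/spark-assembly-*.jar"

bare = "spark.yarn.jar=" + assembly
flag = to_java_system_property("spark.yarn.jar", assembly)

print(looks_like_jvm_option(bare))  # the form used in the quoted command
print(looks_like_jvm_option(flag))  # the -D form
print(flag)
```

The `-D` form matches what the later messages in this thread use (for example `-Dspark.master=...`).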

Re: If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-08 Thread Marcelo Vanzin
On Mon, Sep 8, 2014 at 10:00 AM, Dimension Data, LLC. subscripti...@didata.us wrote:
> user$ export MASTER=local[nn] # Run spark shell on LOCAL CPU threads.
> user$ pyspark [someOptions] --driver-java-options -Dspark.*XYZ*.jar=' /usr/lib/spark/assembly/lib/spark-assembly-*.jar'
> My question is,

Re: If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-08 Thread Marcelo Vanzin
On Mon, Sep 8, 2014 at 11:52 AM, Dimension Data, LLC. subscripti...@didata.us wrote:
> So just to clarify for me: When specifying 'spark.yarn.jar' as I did above, even if I don't use HDFS to create a RDD (e.g. do something simple like: 'sc.parallelize(range(100))'), it is still necessary to

Re: If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ...

2014-09-08 Thread Marcelo Vanzin
On Mon, Sep 8, 2014 at 3:54 PM, Dimension Data, LLC. subscripti...@didata.us wrote:
> You're probably right about the above because, as seen *below* for pyspark (but probably for other Spark applications too), once '-Dspark.master=[yarn-client|yarn-cluster]' is specified, the app invocation