Confused about class paths in Spark 1.1.0

2014-10-30 Thread Shay Seng
Hi,

I've been trying to move up from Spark 0.9.2 to 1.1.0.
I'm getting a little confused by the setup for a few different use cases;
grateful for any pointers...

(1) spark-shell with jars that are only required by the driver
(1a)
I added spark.driver.extraClassPath  /mypath/to.jar to my
spark-defaults.conf
I launched spark-shell with:  ./spark-shell

Here I see on the WebUI that spark.driver.extraClassPath has been set, but
I am NOT able to access any methods in the jar.

(1b)
I removed spark.driver.extraClassPath from my spark-defaults.conf
I launched spark-shell with:  ./spark-shell --driver-class-path
/mypath/to.jar

Again I see on the WebUI that spark.driver.extraClassPath has been set,
but this time I am able to access the methods in the jar.

Q: Is spark-shell not considered the driver in this case? Why does using
--driver-class-path on the command line behave differently from setting
the same thing in spark-defaults.conf?
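
For reference, here is exactly what I tried in each case (the jar path is a
placeholder for my real jar):

  # (1a) via spark-defaults.conf -- the jar is NOT usable from the shell
  #   conf/spark-defaults.conf contains the line:
  #     spark.driver.extraClassPath  /mypath/to.jar
  ./spark-shell

  # (1b) via the command-line flag -- the jar IS usable from the shell
  ./spark-shell --driver-class-path /mypath/to.jar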


(2) Rather than adding each jar individually, is there a way to use
wildcards? Previously with SPARK_CLASSPATH I was able to use mypath/*,
but --driver-class-path seems to require individual files.
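
For comparison, the 0.9.x setup where the wildcard worked looked roughly
like this:

  # Old-style classpath setup; the Java-style wildcard expanded fine here
  export SPARK_CLASSPATH="/mypath/*"
  ./spark-shell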

tks
Shay


Re: Confused about class paths in Spark 1.1.0

2014-10-30 Thread Matei Zaharia
Try using --jars instead of the driver-only options; those should work with
spark-shell too, but they may be less tested.

Unfortunately, you do have to specify each JAR separately; you could use a
shell script to list a directory and build the list (see the sketch below), or
set up a project that builds all of the dependencies into one assembly JAR.
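
A minimal sketch of the shell-script approach, assuming all the jars sit in
one directory:

  # Join every jar in the directory into one comma-separated list for --jars
  JARS=$(ls /mypath/*.jar | tr '\n' ',' | sed 's/,$//')
  ./spark-shell --jars "$JARS"

The same trick with ':' instead of ',' gives you a list usable with
--driver-class-path.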

Matei




Re: Confused about class paths in Spark 1.1.0

2014-10-30 Thread Shay Seng
--jars does indeed work, but it causes the jars to also get shipped to
the workers, which I don't want for efficiency reasons.

I think you are saying that setting spark.driver.extraClassPath in
spark-defaults.conf ought to behave the same as passing
--driver-class-path to spark-shell. Correct? If so, I will file a bug
report, since this is definitely not the case.





Re: Confused about class paths in Spark 1.1.0

2014-10-30 Thread Matei Zaharia
Yeah, I think you should file this as a bug. The problem is that JARs also
need to be added to the Scala compiler and REPL class loader, and we probably
don't do this for the ones in this driver config property.
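
A quick way to see the symptom from the REPL; com.example.Foo here is just a
stand-in for any class in your jar:

  (1a) property only set in spark-defaults.conf:
    scala> import com.example.Foo   // fails to resolve: jar isn't on the compiler classpath

  (1b) launched with --driver-class-path /mypath/to.jar:
    scala> import com.example.Foo   // resolves fine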

Matei
