Confused about class paths in spark 1.1.0
Hi, I've been trying to move up from Spark 0.9.2 to 1.1.0. I'm getting a little confused with the setup for a few different use cases; grateful for any pointers...

(1) spark-shell with JARs that are only required by the driver

(1a) I added spark.driver.extraClassPath /mypath/to.jar to my spark-defaults.conf and launched spark-shell with: ./spark-shell
Here I see on the WebUI that spark.driver.extraClassPath has been set, but I am NOT able to access any methods in the JAR.

(1b) I removed spark.driver.extraClassPath from my spark-defaults.conf and launched spark-shell with: ./spark-shell --driver-class-path /mypath/to.jar
Again I see on the WebUI that spark.driver.extraClassPath has been set, but this time I AM able to access the methods in the JAR.

Q: Is spark-shell not considered the driver in this case? Why does using --driver-class-path on the command line behave differently from setting the same property in spark-defaults.conf?

(2) Rather than adding each JAR individually, is there a way to use wildcards? Previously, with SPARK_CLASSPATH, I was able to use mypath/*, but --driver-class-path seems to require individual files.

tks
Shay
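For reference, the two setups being compared look roughly like this (a sketch using the placeholder path from the message; not verified against a 1.1.0 install):

```
# (1a) property set in conf/spark-defaults.conf, then a plain launch:
#   spark.driver.extraClassPath  /mypath/to.jar
./spark-shell

# (1b) the same path passed on the command line instead:
./spark-shell --driver-class-path /mypath/to.jar
```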
Re: Confused about class paths in spark 1.1.0
Try using --jars instead of the driver-only options; it should work with spark-shell too, though it may be less well tested. Unfortunately, you do have to specify each JAR separately; you can use a shell script to list a directory and build the full list, or set up a project that builds all of the dependencies into one assembly JAR.

Matei

On Oct 30, 2014, at 5:24 PM, Shay Seng s...@urbanengines.com wrote:
[quoted text trimmed]
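The shell-script approach Matei mentions can be sketched like this. The lib/ directory and its JAR names are throwaway stand-ins created so the snippet is self-contained; the final spark-shell line is commented out since it needs a Spark install:

```shell
# Demo directory standing in for your folder of driver-only JARs.
mkdir -p lib && touch lib/a.jar lib/b.jar

# find lists the JARs; paste joins them with ':' for --driver-class-path,
# which takes a JVM class path. (--jars would want a ',' separator instead.)
DRIVER_CP=$(find lib -name '*.jar' | sort | paste -s -d ':' -)
echo "$DRIVER_CP"   # prints lib/a.jar:lib/b.jar

# ./spark-shell --driver-class-path "$DRIVER_CP"
```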
Re: Confused about class paths in spark 1.1.0
--jars does indeed work, but it causes the JARs to also get shipped to the workers, which I don't want to do for efficiency reasons.

I think you are saying that setting spark.driver.extraClassPath in spark-defaults.conf ought to have the same behavior as providing --driver-class-path to spark-shell. Correct? If so, I will file a bug report, since this is definitely not the case.

On Thu, Oct 30, 2014 at 5:39 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
[quoted text trimmed]
Re: Confused about class paths in spark 1.1.0
Yeah, I think you should file this as a bug. The problem is that JARs need to also be added into the Scala compiler and REPL class loader, and we probably don't do this for the ones in this driver config property.

Matei

On Oct 30, 2014, at 6:07 PM, Shay Seng s...@urbanengines.com wrote:
[quoted text trimmed]
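One way to see the distinction Matei describes from inside spark-shell is to try loading a class directly in the REPL (com.example.MyClass is a made-up name standing in for a class from /mypath/to.jar):

```
scala> Class.forName("com.example.MyClass")
```

If this throws ClassNotFoundException when the JAR is only listed in spark-defaults.conf, but succeeds when it is passed via --driver-class-path, that matches the behavior Shay reports: the property reaches the driver JVM's configuration but not the REPL's class loader.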