Thanks Cheng,

If I do not use --jars, how can I tell Spark to look for the jars (and files) 
on HDFS?

Do you mean the driver will not need to set up an HTTP file server for this 
scenario, and the workers will fetch the jars and files from HDFS?

Thanks
Dong Lei

From: Cheng Lian [mailto:lian.cs....@gmail.com]
Sent: Thursday, June 11, 2015 12:50 PM
To: Dong Lei; dev@spark.apache.org
Cc: Dianfei (Keith) Han
Subject: Re: How to support dependency jars and files on HDFS in standalone 
cluster mode?

Since the jars are already on HDFS, you can access them directly in your Spark 
application without using --jars.

Cheng
On 6/11/15 11:04 AM, Dong Lei wrote:
Hi spark-dev:

I cannot use an HDFS location for the "--jars" or "--files" option when doing a 
spark-submit in standalone cluster mode. For example:
                spark-submit  ...   --jars hdfs://ip/1.jar  ....  hdfs://ip/app.jar (standalone cluster mode)
will not download 1.jar to the driver's HTTP file server (but app.jar will be 
downloaded to the driver's directory).

I figured out that the reason Spark does not download the jars is that when 
sc.addJar adds a jar to the HTTP file server, the function called is 
Files.copy, which does not support a remote location.
And I think that even if Spark could download the jars and add them to the HTTP 
file server, the classpath would not be set correctly, because the classpath 
contains the remote location.
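
To illustrate the point about local copy functions, here is a small analogy in 
Python. It is not Spark's code: shutil.copy merely stands in for a local-only 
copy such as Files.copy, and try_local_copy is a hypothetical helper name. 
Such a function treats "hdfs://ip/1.jar" as an ordinary local path and fails.

```python
import os
import shutil
import tempfile


def try_local_copy(src, dest_dir):
    """Try to copy src into dest_dir using a local-filesystem copy.

    Returns True on success and False if the local copy fails.
    Hypothetical helper, used only to illustrate the failure mode.
    """
    try:
        shutil.copy(src, dest_dir)
        return True
    except OSError:
        # A URI such as hdfs://ip/1.jar is interpreted as a local path,
        # which does not exist, so the copy fails.
        return False


if __name__ == "__main__":
    dest = tempfile.mkdtemp()
    with tempfile.NamedTemporaryFile(suffix=".jar", delete=False) as f:
        local_jar = f.name
    print(try_local_copy(local_jar, dest))          # local file
    print(try_local_copy("hdfs://ip/1.jar", dest))  # remote URI
    os.remove(local_jar)
    shutil.rmtree(dest)
```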

So I'm trying to make it work and have come up with two options, but neither of 
them seems elegant, and I would like to hear your advice:

Option 1:
Modify HTTPFileServer.addFileToDir to recognize an "hdfs" prefix.

This is not good because I think it goes beyond the scope of the HTTP file 
server.

Option 2:
Modify DriverRunner.downloadUserJar to download all the "--jars" and 
"--files" along with the application jar.

This sounds more reasonable than option 1 for downloading files. But this way I 
need to read "spark.jars" and "spark.files" in downloadUserJar or 
DriverRunner.start and replace them with local paths. How can I do that?
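
A minimal sketch of the rewriting step in option 2, written in Python only for 
illustration (it is not Spark code). It assumes the "spark.jars" value is a 
comma-separated list. The names localize and REMOTE_SCHEMES, the set of 
schemes, and the caller-supplied fetch function are all hypothetical.

```python
import os
from urllib.parse import urlparse

# Assumption: URI schemes that must be downloaded before use.
REMOTE_SCHEMES = {"hdfs", "http", "https", "ftp"}


def localize(conf_value, work_dir, fetch):
    """Return conf_value with every remote URI replaced by a local path.

    fetch(uri, dest) is a caller-supplied download function (hypothetical);
    entries that are not remote are kept unchanged.
    """
    local_entries = []
    for entry in (e.strip() for e in conf_value.split(",")):
        if not entry:
            continue  # skip empty items such as a trailing comma
        parsed = urlparse(entry)
        if parsed.scheme in REMOTE_SCHEMES:
            dest = os.path.join(work_dir, os.path.basename(parsed.path))
            fetch(entry, dest)  # download into the driver's working directory
            local_entries.append(dest)
        else:
            local_entries.append(entry)
    return ",".join(local_entries)


if __name__ == "__main__":
    downloaded = []
    new_value = localize(
        "hdfs://ip/1.jar,/opt/libs/2.jar",
        "/tmp/driver-work",
        lambda uri, dest: downloaded.append((uri, dest)),  # stub, no real I/O
    )
    print(new_value)
    print(downloaded)
```

The download function is injected so the rewriting logic can be checked 
without a running HDFS; a real implementation would plug in an actual fetch.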


Do you have a more elegant solution, or is there a plan to support this in the 
future?

Thanks
Dong Lei
