Dong Lei created SPARK-8369:
-------------------------------

             Summary: Support dependency jar and files on HDFS in standalone 
cluster mode
                 Key: SPARK-8369
                 URL: https://issues.apache.org/jira/browse/SPARK-8369
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
            Reporter: Dong Lei


Currently, in standalone cluster mode, Spark can take care of the app jar 
whether it is specified with a file:// or hdfs:// URL. But the dependencies 
specified by --jars and --files do not support an hdfs:// prefix. 

For example:
spark-submit 
 ...
--jars hdfs://path1/1.jar,hdfs://path2/2.jar
--files hdfs://path3/3.file,hdfs://path4/4.file
hdfs://path5/app.jar

only app.jar will be downloaded to the driver and distributed to the 
executors; the others (1.jar, 2.jar, 3.file, 4.file) will not be. 
I think such a feature would be useful for users. 

----------------------------
To support such a feature, I think we can treat the jars and files like the 
app jar in DriverRunner: download them and replace the remote addresses with 
local addresses, so that DriverWrapper is not aware of the change.  

The problem is that replacing these addresses is harder than replacing the 
app-jar location, because the app jar has a dedicated placeholder 
("<<USER_JAR>>") in the driver command while the --jars and --files entries 
do not. We may need to do some string matching to achieve it. 
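The string-matching step could be sketched roughly as below in plain Scala. This is only an illustration of the idea, not Spark's actual DriverRunner code; downloadToLocal is a hypothetical stand-in for the fetch logic DriverRunner already applies to the app jar, and it only simulates the download by computing a local path.

```scala
// Sketch: rewrite remote dependency URIs in a driver command to the local
// paths they would be downloaded to. Names here are hypothetical.
object DependencyLocalizer {
  // Hypothetical helper: in real code this would fetch the file from HDFS;
  // here it only maps the URI to a file name under the driver's work dir.
  def downloadToLocal(uri: String, workDir: String): String =
    workDir + "/" + uri.substring(uri.lastIndexOf('/') + 1)

  // Replace every hdfs:// address in the command with its local copy.
  // Handles comma-separated lists as passed to --jars / --files.
  def localize(command: Seq[String], workDir: String): Seq[String] =
    command.map { arg =>
      arg.split(",").map { part =>
        if (part.startsWith("hdfs://")) downloadToLocal(part, workDir)
        else part
      }.mkString(",")
    }

  def main(args: Array[String]): Unit = {
    val cmd = Seq("--jars", "hdfs://path1/1.jar,hdfs://path2/2.jar",
                  "hdfs://path5/app.jar")
    println(localize(cmd, "/tmp/work"))
  }
}
```

Because there is no placeholder for these arguments, the rewrite has to scan every argument (and every comma-separated element inside it), which is the string matching mentioned above.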



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
