GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/21420

    [SPARK-24377][Spark Submit] Make --py-files work in non-PySpark applications

    ## What changes were proposed in this pull request?
    
    For some Spark applications, even though they are Java/Scala programs, they 
require not only jar dependencies but also Python dependencies. One example is 
the Livy remote SparkContext application: it is essentially an embedded REPL for 
Scala/Python/R, so it needs to load not only jar dependencies but also Python 
and R dependencies, which means users should be able to specify "--py-files" in 
addition to "--jars".
    
    Currently, --py-files only works for PySpark applications, so it cannot be 
used in the case above. This PR proposes to remove that restriction.
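    
    For illustration only, a submission of a Java/Scala application that also 
ships Python dependencies could then look like the sketch below (the class name, 
jar and Python file names are hypothetical):
    
        # Hypothetical example: a jar-based application that also needs Python deps
        $ ./bin/spark-submit \
            --master yarn \
            --deploy-mode cluster \
            --class com.example.ReplDriver \
            --jars repl-deps.jar \
            --py-files deps.zip,helper.py \
            app.jar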
    
    We also found that "spark.submit.pyFiles" only works in a quite limited 
scenario (client mode with local dependencies), so this PR also expands 
"spark.submit.pyFiles" so that it can be used as an alternative to "--py-files".
    
    ## How was this patch tested?
    
    Unit tests added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-24377

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21420.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21420
    
----
commit a41c99bf311aa8f4e0c2e07c1288f5a11e057ea4
Author: jerryshao <sshao@...>
Date:   2018-05-24T06:53:23Z

    make --py-files work in non pyspark application

----


---
