GitHub user lianhuiwang reopened a pull request: https://github.com/apache/spark/pull/3976
[SPARK-5173] Support Python applications running in yarn-cluster mode

Currently, spark-submit does not support running a Python application in yarn-cluster mode, so I modified the submit code and YARN's AM (ApplicationMaster) to support it. By specifying a .py file or a primaryResource file via spark-submit, we can run PySpark in yarn-cluster mode.

Example:
spark-submit --master yarn-cluster --num-executors 1 --driver-memory 1g --executor-memory 1g xx.py --primaryResource yy.conf

This configuration is the same as for PySpark in yarn-client mode.

First, we add the local path of the .py file or primaryResource to YARN's dist.files so that it is distributed to the slave nodes. Then spark-submit passes --py-files and --primaryResource to yarn.Client and sets "org.apache.spark.deploy.PythonRunner" as the user class, which can run .py files on the ApplicationMaster. yarn.Client in turn passes --py-files and --primaryResource to the ApplicationMaster. In the ApplicationMaster, the user class is org.apache.spark.deploy.PythonRunner and the user args are the primaryResource and --py-files, so PySpark runs on the ApplicationMaster.
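The hand-off described above can be illustrated with a small sketch. This is not the code in the patch (the actual change is in Spark's Scala sources); the function name `plan_yarn_cluster_python_submit` and its return shape are hypothetical and only model which files would be distributed and which user class and args the ApplicationMaster would receive. It assumes that files distributed by YARN can be referred to by base name on the ApplicationMaster, which the description does not state.

```python
import os

# Class named in the description as the user class run on the ApplicationMaster.
PYTHON_RUNNER = "org.apache.spark.deploy.PythonRunner"


def plan_yarn_cluster_python_submit(primary_resource, py_files=(), app_args=()):
    """Hypothetical model of the submit flow for a Python app in yarn-cluster mode.

    Returns a tuple (dist_files, user_class, user_args):
      dist_files -- paths to ship to the cluster nodes via YARN's dist.files
      user_class -- class the ApplicationMaster runs as the "user" class
      user_args  -- arguments handed to that class
    """
    if not primary_resource.endswith(".py"):
        raise ValueError("expected a .py primary resource: %r" % primary_resource)

    # Step 1: the primary .py file and any --py-files are distributed by YARN.
    dist_files = [primary_resource, *py_files]

    # Steps 2-3: the runner receives the primary file, the comma-separated
    # py-files, then the application's own arguments.
    # Assumption: distributed files are addressed by base name on the AM.
    local_primary = os.path.basename(primary_resource)
    local_py_files = ",".join(os.path.basename(p) for p in py_files)
    user_args = [local_primary, local_py_files, *app_args]

    return dist_files, PYTHON_RUNNER, user_args


if __name__ == "__main__":
    files, cls, args = plan_yarn_cluster_python_submit(
        "/home/user/xx.py", ["/home/user/lib.py"], ["yy.conf"]
    )
    print("dist.files:", files)
    print("user class:", cls)
    print("user args: ", args)
```

The point of the sketch is only that the client side decides what to ship and the ApplicationMaster side sees PythonRunner as an ordinary user class, so no Python-specific logic is needed in the AM itself.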
@JoshRosen @tgravescs @sryza

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lianhuiwang/spark SPARK-5173

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3976.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3976

----
commit 9c941bc59527e594ee1d155c00cb8e55d7c40fe8
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-09T12:58:24Z

    support python application running on yarn cluster mode

commit 172eec10b9daaf9ed838e821474d28871ab63462
Author: Wang Lianhui <lianhuiwan...@gmail.com>
Date:   2015-01-09T15:01:52Z

    fix a min submit's bug

commit f1f55b6eb4b65499be8e182e857d89a158873234
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-29T11:13:35Z

    when yarn-cluster, all python files can be non-local

commit 905a10610532578c774e58d12b927597330fb9ff
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-31T03:29:09Z

    update with sryza and andrewor 's comments

commit 097a5ec37456bf9d13a952f4108a750b9f9f84d0
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-31T03:59:06Z

    fix line length exceeds 100

commit 5b300648fe53d9de604e8afce7580fddfe6bbaef
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-31T12:18:22Z

    add test

commit d60bc6069cf65637622472ef1cd27153333df53c
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-31T14:07:03Z

    fix test

commit 2adc8f591ddd0f253496c18d32b1910d29e04c8d
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-01-31T16:35:01Z

    add spark.test.home

commit 47d2fc35e53a8851790607085bc67e94736358d6
Author: lianhuiwang <lianhuiwan...@gmail.com>
Date:   2015-02-01T02:40:25Z

    fix test

----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.