Hi,

We are evaluating PySpark and have successfully executed the PySpark examples on YARN.
As a next step, we have a Python project (a bunch of Python scripts using Anaconda packages).

Question: what is the recommended way to execute PySpark on YARN when the application consists of many Python files (~50)? Should they be packaged into an archive? What would the spark-submit command look like with that many files?

Currently the command looks like:

./bin/spark-submit --master yarn --num-executors 3 --driver-memory 4g --executor-memory 2g --executor-cores 1 examples/src/main/python/wordcount.py 1000

Thanks,
Oleg
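For context, one approach we are considering (based on the spark-submit documentation, not yet verified by us) is to zip the project's modules and ship them to the executors with --py-files, keeping only the entry-point script as the main argument. A sketch, assuming a hypothetical project layout with a myproject/ package and a main.py entry point:

```shell
# Package the project's Python modules into an archive
# (the entry-point script itself is passed separately to spark-submit).
zip -r myproject.zip myproject/

# Submit on YARN, distributing the archive to the executors via --py-files
# so that "import myproject..." works inside tasks.
./bin/spark-submit \
  --master yarn \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --py-files myproject.zip \
  main.py 1000
```

Whether this also covers the Anaconda package dependencies (which would need to be present on the cluster nodes or shipped separately) is part of what we are asking about.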