Hi,
   We are evaluating PySpark and have successfully run the PySpark examples
on YARN.

As a next step:
       We have a Python project (a bunch of Python scripts using Anaconda
packages).
Questions:
        What is the recommended way to run PySpark on YARN when the
application consists of many Python files (~50)?
       Should they be packaged into an archive?
       What would the spark-submit command look like with that many files?
Currently the command looks like:

./bin/spark-submit --master yarn --num-executors 3 --driver-memory 4g \
  --executor-memory 2g --executor-cores 1 \
  examples/src/main/python/wordcount.py 1000
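For context, here is a sketch of what we imagine this might look like, based
on the spark-submit --py-files option (which accepts .zip, .egg, and .py
files); the project name and entry-point script here are just placeholders:

```shell
# Package the project's modules into a zip archive
# (assumes the scripts live under a directory named myproject/)
zip -r myproject.zip myproject/

# Submit the main script and ship the archive to the executors;
# --py-files adds the zip to the PYTHONPATH on driver and workers
./bin/spark-submit --master yarn --num-executors 3 --driver-memory 4g \
  --executor-memory 2g --executor-cores 1 \
  --py-files myproject.zip \
  main.py 1000
```

Is this the right approach, or is there a better way to handle the Anaconda
package dependencies on the cluster nodes?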

Thanks
Oleg.
