Hello, I'm trying to run pyspark using the following setup:
- Spark 1.6.1 standalone cluster on EC2
- virtualenv installed on the master
- the app is submitted with the following commands:

    export PYSPARK_DRIVER_PYTHON=/path_to_virtualenv/bin/python
    export PYSPARK_PYTHON=/usr/bin/python
    /root/spark/bin/spark-submit --py-files mypackage.tar.gz myapp.py

I'm getting the following error:

    java.io.IOException: Cannot run program "/path_to_virtualenv/bin/python": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)

It looks like the executor process did not pick up the PYSPARK_PYTHON setting and instead tried to launch the same Python executable the driver uses (the virtualenv python), rather than "/usr/bin/python". What am I doing wrong here?

Thanks,
Tomer
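For what it's worth, a minimal sketch of what I understand the resolution logic to be: the executor launches whatever binary its own environment names in PYSPARK_PYTHON, so an export done only in the driver's shell never reaches the worker nodes. The `resolve_worker_python` helper below is hypothetical, just to illustrate the fallback behavior I'm assuming:

```python
import os

# Hypothetical sketch (not Spark's actual code): each executor resolves
# the worker interpreter from the environment it inherited on its own
# machine, falling back to a default when the variable was never set there.
def resolve_worker_python(env):
    return env.get("PYSPARK_PYTHON", "python")

# Driver shell, after `export PYSPARK_PYTHON=/usr/bin/python`:
driver_env = {"PYSPARK_PYTHON": "/usr/bin/python"}
print(resolve_worker_python(driver_env))   # the interpreter I expected

# Worker node whose environment never received the export:
worker_env = {}
print(resolve_worker_python(worker_env))   # falls back to some default
```

If that picture is right, the export would need to be visible on every worker (e.g. via the workers' environment) rather than only in the shell that runs spark-submit.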