Hello,

I'm trying to run PySpark with the following setup:

- Spark 1.6.1 standalone cluster on EC2
- virtualenv installed on the master
- the app is launched with the following commands:

export PYSPARK_DRIVER_PYTHON=/path_to_virtualenv/bin/python  # driver: the virtualenv python (exists only on the master)
export PYSPARK_PYTHON=/usr/bin/python                        # executors: the system python
/root/spark/bin/spark-submit --py-files mypackage.tar.gz myapp.py
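
For reference, nothing elaborate is needed on the application side to hit this; a stripped-down myapp.py along these lines should exercise the same code path, since any RDD action makes the executors launch Python worker processes (just a sketch, with a placeholder app name):

# minimal sketch of a job that forces the executors to start
# Python workers (placeholder app name; any RDD action would do)
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("myapp")
sc = SparkContext(conf=conf)

# sum() is an action, so each executor spawns a Python worker here
print(sc.parallelize(range(100)).sum())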

I'm getting the following error:

java.io.IOException: Cannot run program "/path_to_virtualenv/bin/python": error=2, No such file or directory
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)

--> It looks like the executor processes did not pick up the PYSPARK_PYTHON setting, and instead tried to launch the same python executable the driver uses (the virtualenv python, which exists only on the master), rather than "/usr/bin/python".
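
Would forwarding the setting to the executors explicitly, via spark.executorEnv, be the right workaround? Something along these lines is what I have in mind (just a sketch, with a placeholder app name; I haven't confirmed that spark.executorEnv is the intended mechanism here):

# sketch: ask Spark to export PYSPARK_PYTHON into each executor's
# environment, so executors use the system python rather than the
# virtualenv python that exists only on the master
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("myapp")  # placeholder name
        .set("spark.executorEnv.PYSPARK_PYTHON", "/usr/bin/python"))
sc = SparkContext(conf=conf)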

What am I doing wrong here?

Thanks,
Tomer
