Matei, thanks. So including the PYTHONPATH in spark-env.sh seemed to work. I
am faced with this issue now. I am doing a large GroupBy in pyspark and the
process fails (at the driver it seems). There is not much of a stack trace
here to see where the issue is happening. This process works locally.
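For anyone hitting the same ImportError: a minimal sketch of what setting the path in conf/spark-env.sh might look like. The package path below is a placeholder, not from this thread; adjust it to your own install, and make sure the file is present on every node.

```shell
# conf/spark-env.sh (on the master and on every worker)
# Placeholder path -- replace with the directory containing your package.
export PYTHONPATH="/path/to/your/package:$PYTHONPATH"
```

Workers spawned by the standalone master inherit this environment, so the package becomes importable in executor Python processes, not just in an interactive PySpark shell.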
I am getting a Python ImportError on a Spark standalone cluster. I have set the
PYTHONPATH on both the master and the slave, and the package imports properly when I
run the PySpark command line on both machines. This only happens with master -
slave communication. Here is the error below:
14/04/10 13:40:19
Kind of strange because we haven’t updated CloudPickle AFAIK. Is this a package
you added on the PYTHONPATH? How did you set the path, was it in
conf/spark-env.sh?
Matei
On Apr 10, 2014, at 7:39 AM, aazout albert.az...@velos.io wrote:
I am getting a python ImportError on Spark standalone