https://issues.apache.org/jira/browse/SPARK-1825
I've run into the following problems getting Windows + PySpark + YARN to work properly:

1. net.ScriptBasedMapping: Exception running /etc/hadoop/conf.cloudera.yarn/topology.py
FIX? Comment out the "net.topology.script.file.name" property in core-site.xml.

2. Error from python worker: /usr/bin/python: No module named pyspark
PYTHONPATH was: /yarn/nm/usercache/bigdata/filecache/63/spark-assembly-1.1.0-hadoop2.3.0-cdh5.0.1.jar
FIX? Add the SPARK_YARN_USER_ENV environment variable to my client (Eclipse) launch configuration, with this value (see the sketch after this comment for doing the same thing from the driver script):
PYTHONPATH=/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip

Is there a simpler way to do this? Am I doing something wrong?
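For reference, here is a minimal sketch of workaround #2 done from the driver script instead of the Eclipse launch configuration. It assumes Spark 1.1 on CDH 5 with the Cloudera parcel paths quoted above, yarn-client mode, and that HADOOP_CONF_DIR is already set; the master and app name are placeholders. The idea is that SPARK_YARN_USER_ENV is read from the client environment by Spark's YARN client, so setting it in os.environ before the SparkContext (and its JVM gateway) is created should have the same effect, though I have only verified the launch-configuration variant.

{code:python}
import os

# PYTHONPATH that the YARN containers should use to find pyspark and py4j
# (Cloudera parcel locations, as in the fix above).
remote_pythonpath = ":".join([
    "/opt/cloudera/parcels/CDH/lib/spark/python",
    "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip",
])

# SPARK_YARN_USER_ENV is picked up from the client environment by the Spark
# YARN client and forwarded to the executors. It must be set before the
# SparkConf/SparkContext is created, because that is when the JVM gateway
# (which inherits this environment) is launched.
os.environ["SPARK_YARN_USER_ENV"] = "PYTHONPATH=" + remote_pythonpath

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("yarn-client").setAppName("pyspark-from-eclipse")
sc = SparkContext(conf=conf)

# Trivial job just to confirm the python workers can import pyspark.
print(sc.parallelize(range(10)).sum())
sc.stop()
{code}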