https://issues.apache.org/jira/browse/SPARK-1825
I've run into the following problems getting Windows + PySpark + YARN to work properly:

1. net.ScriptBasedMapping: Exception running /etc/hadoop/conf.cloudera.yarn/topology.py
FIX? Comment out the "net.topology.script.file.name" property in core-site.xml.

2. Error from python worker: /usr/bin/python: No module named pyspark
PYTHONPATH was: /yarn/nm/usercache/bigdata/filecache/63/spark-assembly-1.1.0-hadoop2.3.0-cdh5.0.1.jar
FIX? Add the SPARK_YARN_USER_ENV environment variable to my client (Eclipse) launch configuration, with this value (see the sketch after this comment for doing the same thing from the driver script):
PYTHONPATH=/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip

Is there a simpler way to do this? Am I doing something wrong?
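For reference, here is a minimal sketch of workaround #2 done from the driver script instead of the Eclipse launch configuration. It assumes Spark 1.1 on CDH 5 with the Cloudera parcel paths quoted above, yarn-client mode, and that HADOOP_CONF_DIR is already set; the master and app name are placeholders. The idea is that SPARK_YARN_USER_ENV is read from the client environment by Spark's YARN client, so setting it in os.environ before the SparkContext (and its JVM gateway) is created should have the same effect, though I have only verified the launch-configuration variant.

{code:python}
import os

# PYTHONPATH that the YARN containers should use to find pyspark and py4j
# (Cloudera parcel locations, as in the fix above).
remote_pythonpath = ":".join([
    "/opt/cloudera/parcels/CDH/lib/spark/python",
    "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip",
])

# SPARK_YARN_USER_ENV is picked up from the client environment by the Spark
# YARN client and forwarded to the executors. It must be set before the
# SparkConf/SparkContext is created, because that is when the JVM gateway
# (which inherits this environment) is launched.
os.environ["SPARK_YARN_USER_ENV"] = "PYTHONPATH=" + remote_pythonpath

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("yarn-client").setAppName("pyspark-from-eclipse")
sc = SparkContext(conf=conf)

# Trivial job just to confirm the python workers can import pyspark.
print(sc.parallelize(range(10)).sum())
sc.stop()
{code}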