[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603973#comment-14603973 ]
Lianhui Wang commented on SPARK-8646:
-------------------------------------

From [~juliet]'s logs, I think you are missing the Python 'pandas.algos' module, which PySpark does not provide. I think you need to install it on the nodes.

> PySpark does not run on YARN
> ----------------------------
>
> Key: SPARK-8646
> URL: https://issues.apache.org/jira/browse/SPARK-8646
> Project: Spark
> Issue Type: Bug
> Components: PySpark, YARN
> Affects Versions: 1.4.0
> Environment: SPARK_HOME=local/path/to/spark1.4install/dir
> also with
> SPARK_HOME=local/path/to/spark1.4install/dir
> PYTHONPATH=$SPARK_HOME/python/lib
> Spark apps are submitted with the command:
> $SPARK_HOME/bin/spark-submit outofstock/data_transform.py
> hdfs://foe-dev/DEMO_DATA/FACT_POS hdfs:/user/juliet/ex/ yarn-client
> data_transform contains a main method, and the rest of the args are parsed in
> my own code.
> Reporter: Juliet Hougland
> Attachments: pi-test.log, spark1.4-SPARK_HOME-set-PYTHONPATH-set.log,
> spark1.4-SPARK_HOME-set-inline-HADOOP_CONF_DIR.log,
> spark1.4-SPARK_HOME-set.log
>
>
> Running pyspark jobs results in a "no module named pyspark" error when run in
> yarn-client mode in Spark 1.4.
> [I believe this JIRA represents the change that introduced this error.|https://issues.apache.org/jira/browse/SPARK-6869]
> This does not represent a binary compatible change to Spark. Scripts that
> worked on previous Spark versions (i.e., commands that use spark-submit) should
> continue to work without modification between minor versions.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
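One way to check the diagnosis in the comment above (a Python module missing on the nodes) is to try importing the module with the same interpreter the executors use. The following is a minimal sketch, not part of the original thread; the helper name `missing_modules` is hypothetical, and it uses only the standard-library `importlib.import_module`.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported here.

    Hypothetical helper for illustration; run it with the Python
    interpreter that the YARN nodes actually use.
    """
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too (a subclass on Python 3).
            missing.append(name)
    return missing


# Example call on a single node:
#   missing_modules(["pandas", "pandas.algos"])
#
# Assumed cluster-wide variant, given an existing SparkContext `sc`
# (only meaningful once pyspark itself is importable on the executors):
#   sc.parallelize(range(100), 100) \
#     .mapPartitions(lambda _: missing_modules(["pandas.algos"])) \
#     .distinct().collect()
```

An empty list means every listed module was importable in that interpreter. Any name returned would need to be installed on the node, as the comment suggests.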