[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-19095.
-------------------------------
    Resolution: Duplicate

If this isn't something supported yet, it's not a bug, and I'd resolve this as a duplicate of the parent.

> virtualenv example does not work in yarn cluster mode
> -----------------------------------------------------
>
>                 Key: SPARK-19095
>                 URL: https://issues.apache.org/jira/browse/SPARK-19095
>             Project: Spark
>          Issue Type: Sub-task
>            Reporter: Yesha Vora
>            Priority: Critical
>
> Spark version: 2
> Steps:
> * install virtualenv on all nodes
> * create requirement1.txt with "numpy > requirement1.txt "
> * Run kmeans.py application in yarn-cluster mode.
> {code}
> spark-submit --master yarn --deploy-mode cluster \
>   --conf "spark.pyspark.virtualenv.enabled=true" \
>   --conf "spark.pyspark.virtualenv.type=native" \
>   --conf "spark.pyspark.virtualenv.requirements=/tmp/requirements1.txt" \
>   --conf "spark.pyspark.virtualenv.bin.path=/usr/bin/virtualenv" \
>   --jars /usr/hdp/current/hadoop-client/lib/hadoop-lzo.jar \
>   kmeans.py /tmp/in/kmeans_data.txt 3
> {code}
> The application fails to find numpy.
> {code}
> LogType:stdout
> Log Upload Time:Thu Jan 05 20:05:49 +0000 2017
> LogLength:134
> Log Contents:
> Traceback (most recent call last):
>   File "kmeans.py", line 27, in <module>
>     import numpy as np
> ImportError: No module named numpy
> End of LogType:stdout
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
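
For anyone reproducing the failure quoted above, a minimal diagnostic sketch may help narrow it down. It is not part of the original report, and `describe_module` is a hypothetical helper name. Placed at the top of the submitted script, it prints which Python interpreter the process is using and whether numpy is importable from it. This only assumes that stdout ends up in the YARN container log, as the traceback above did.

```python
import sys


def describe_module(name):
    """Return a one-line report on whether `name` imports under this interpreter.

    Hypothetical helper for illustration; not part of Spark or of kmeans.py.
    """
    try:
        module = __import__(name)
    except ImportError as exc:
        # Same failure class as the traceback in the report.
        return "%s: NOT importable from %s (%s)" % (name, sys.executable, exc)
    # Built-in modules have no __file__ attribute.
    location = getattr(module, "__file__", "<built-in>")
    return "%s: importable from %s (interpreter %s)" % (
        name, location, sys.executable)


if __name__ == "__main__":
    # Print to stdout so the report appears in the container log,
    # next to any traceback from the application itself.
    print(describe_module("numpy"))
```

If the interpreter path printed is the system Python rather than one inside a virtualenv, that would suggest the virtualenv was not in effect for that process. That is only one possible reading of the output, not a confirmed diagnosis of this issue.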