[jira] [Commented] (SPARK-9235) PYSPARK_DRIVER_PYTHON env variable is not set on the YARN Node manager acting as driver in yarn-cluster mode

2015-09-04 Thread Aaron Glahe (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730680#comment-14730680 ]

Aaron Glahe commented on SPARK-9235:


You set it in spark-env.sh, e.g., since we use conda as our "python env":

SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/srv/software/anaconda/bin/python"
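
For completeness, a minimal spark-env.sh sketch combining that workaround with 
the usual exports (paths are the Anaconda install from this issue; adjust for 
your cluster):

# spark-env.sh on the submitting machine
export PYSPARK_PYTHON='/srv/software/anaconda/bin/python'
export PYSPARK_DRIVER_PYTHON='/srv/software/anaconda/bin/python'
# In yarn-cluster mode the driver runs inside the YARN ApplicationMaster,
# so the interpreter path must be pushed into the container environment:
SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/srv/software/anaconda/bin/python"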





[jira] [Updated] (SPARK-9235) PYSPARK_DRIVER_PYTHON env variable is not set on the YARN Node manager acting as driver in yarn-cluster mode

2015-07-21 Thread Aaron Glahe (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Glahe updated SPARK-9235:
---
Summary: PYSPARK_DRIVER_PYTHON env variable is not set on the YARN Node 
manager acting as driver in yarn-cluster mode  (was: PYSPARK_DRIVER_PYTHON env 
variable is not set on the YARN Node manager acting as driver when yarn-cluster 
mode)


[jira] [Created] (SPARK-9235) PYSPARK_DRIVER_PYTHON env variable is not set on the YARN Node manager acting as driver when yarn-cluster mode

2015-07-21 Thread Aaron Glahe (JIRA)
Aaron Glahe created SPARK-9235:
--

 Summary: PYSPARK_DRIVER_PYTHON env variable is not set on the YARN 
Node manager acting as driver when yarn-cluster mode
 Key: SPARK-9235
 URL: https://issues.apache.org/jira/browse/SPARK-9235
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 1.4.1, 1.5.0
 Environment: CentOS 6.6, python 2.7, Spark 1.4.1 tagged version, YARN 
Cluster Manager, CDH 5.4.1 (Hadoop 2.6.0++), Java 1.7
Reporter: Aaron Glahe
Priority: Minor


Relates to SPARK-9229

Env:  Spark on YARN, Java 1.7, CentOS 6.6, CDH 5.4.1 (Hadoop 2.6.0++), Anaconda 
Python 2.7.10 installed in the /srv/software directory

On a client/submitting machine, we set the PYSPARK_DRIVER_PYTHON env var in 
spark-env.sh to point to the Anaconda Python executable, which was present on 
every YARN node: 

export PYSPARK_DRIVER_PYTHON='/srv/software/anaconda/bin/python'

Side note: export PYSPARK_PYTHON='/srv/software/anaconda/bin/python' was set in 
spark-env.sh as well.

Run the command:
spark-submit --master yarn --deploy-mode cluster test.py
(note that spark-submit options must precede the application file; anything 
after test.py would be passed to the script as application arguments)

It appears as though the NodeManager hosting the DRIVER does not use the 
PYSPARK_DRIVER_PYTHON interpreter, but instead falls back to the CentOS system 
default (which in this case is Python 2.6).

The workaround appears to be setting the Python path in SPARK_YARN_USER_ENV.
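
As an aside (a sketch, not verified on this setup): Spark on YARN can also push 
environment variables to the ApplicationMaster, where the driver runs in 
cluster mode, via spark.yarn.appMasterEnv.*:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/srv/software/anaconda/bin/python \
  test.py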



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org