Kevin Grealish created SPARK-16110:
--------------------------------------

             Summary: Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set
                 Key: SPARK-16110
                 URL: https://issues.apache.org/jira/browse/SPARK-16110
             Project: Spark
          Issue Type: Bug
          Components: Deploy
    Affects Versions: 1.6.1
         Environment: Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-38-generic x86_64), Spark 1.6.1, Azure HDInsight 3.4
            Reporter: Kevin Grealish
When a cluster has the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables set (needed to use a non-system Python, e.g. /usr/bin/anaconda/bin/python), you cannot override them per submission in YARN cluster mode. When using spark-submit (in this case via LIVY) to submit with an override:

spark-submit --master yarn --deploy-mode cluster --conf 'spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=python3' --conf 'spark.yarn.appMasterEnv.PYSPARK_PYTHON=python3' probe.py

the environment variable values override the conf settings. A workaround for some is to unset the env vars, but that is not always possible (e.g. when submitting a batch via LIVY, where you can only pass parameters through to spark-submit). The expectation is that the conf values above override the environment variables. The fix is to change the order in which conf and env vars are applied in the YARN client.

Related discussion: https://issues.cloudera.org/browse/LIVY-159

Backporting this to 1.6 would be great and unblocking for me.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
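The proposed fix (apply conf settings after environment variables, so conf wins) can be sketched as follows. This is a hypothetical Python illustration of the precedence logic, not Spark's actual YARN client code (which is Scala); the helper name and structure are assumptions.

```python
# Hypothetical sketch of the desired precedence when building the YARN
# application master's environment: inherit cluster environment variables
# first, then apply spark.yarn.appMasterEnv.* conf entries last so that
# per-submission conf overrides take effect.

APP_MASTER_ENV_PREFIX = "spark.yarn.appMasterEnv."

def build_am_env(cluster_env, spark_conf):
    """Return the environment dict the YARN application master should see."""
    env = {}
    # Step 1: inherit the cluster's Python settings (the current behavior
    # lets these win, which is the bug being reported).
    for key in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
        if key in cluster_env:
            env[key] = cluster_env[key]
    # Step 2: apply conf settings afterwards, so they take precedence.
    for conf_key, value in spark_conf.items():
        if conf_key.startswith(APP_MASTER_ENV_PREFIX):
            env[conf_key[len(APP_MASTER_ENV_PREFIX):]] = value
    return env

# Cluster defaults to Anaconda, but the submission requests python3:
cluster = {"PYSPARK_PYTHON": "/usr/bin/anaconda/bin/python"}
conf = {"spark.yarn.appMasterEnv.PYSPARK_PYTHON": "python3"}
print(build_am_env(cluster, conf))  # {'PYSPARK_PYTHON': 'python3'}
```

With the steps reversed (conf first, env vars last), the same inputs would yield the Anaconda path, which matches the buggy behavior described above.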