GitHub user pgandhi999 opened a pull request:

    https://github.com/apache/spark/pull/21468

    [SPARK-22151] : PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly
    
    In YARN cluster mode, setting PYTHONPATH via
    spark.yarn.appMasterEnv.PYTHONPATH does not work.
    
    The YARN Client code builds the path from the client's own environment
    variables:
    val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    But a value set through spark.yarn.appMasterEnv is put into the local env,
    not into the process environment that sys.env reads.
    
    As a result, the Python path set in spark.yarn.appMasterEnv is not applied
    properly.
    
    As a workaround in cluster mode, you can set PYTHONPATH in the client's
    environment when submitting, for example:
    
    PYTHONPATH=./addon/python/ spark-submit
    
    ## What changes were proposed in this pull request?
    In Client.scala, PYTHONPATH was being overridden. The code now appends
    values to the existing PYTHONPATH instead of overriding it.
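    
    The difference between overriding and appending can be sketched roughly as
    follows. This is a simplified model, not the code in Client.scala: the
    launch environment is assumed to be a plain mutable map, the helper names
    `overridePythonPath` and `appendPythonPath` are hypothetical, the archive
    names are placeholders, and `File.pathSeparator` stands in for whatever
    separator the real code uses.
    
    ```scala
    import java.io.File
    import scala.collection.mutable
    
    object PythonPathSketch {
      // Hypothetical model of the old behavior: any value already stored under
      // PYTHONPATH (e.g. from spark.yarn.appMasterEnv.PYTHONPATH) is lost.
      def overridePythonPath(env: mutable.Map[String, String], entries: Seq[String]): Unit = {
        env("PYTHONPATH") = entries.mkString(File.pathSeparator)
      }
    
      // Hypothetical model of the proposed behavior: keep the existing value
      // and append the new entries after it.
      def appendPythonPath(env: mutable.Map[String, String], entries: Seq[String]): Unit = {
        val merged = env.get("PYTHONPATH").toSeq ++ entries
        env("PYTHONPATH") = merged.mkString(File.pathSeparator)
      }
    
      def main(args: Array[String]): Unit = {
        val configured = "./addon/python/"
        val sparkEntries = Seq("pyspark.zip", "py4j.zip") // placeholder names
    
        val before = mutable.Map("PYTHONPATH" -> configured)
        overridePythonPath(before, sparkEntries)
        println(s"override: ${before("PYTHONPATH")}") // configured value is gone
    
        val after = mutable.Map("PYTHONPATH" -> configured)
        appendPythonPath(after, sparkEntries)
        println(s"append:   ${after("PYTHONPATH")}") // configured value comes first
      }
    }
    ```
    
    In this model the configured entry comes first after appending, so it
    takes precedence when Python resolves imports.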
    
    ## How was this patch tested?
    Added log statements to ApplicationMaster.scala to print the PYTHONPATH
    environment variable, ran a Spark job in cluster mode before the change,
    and confirmed the issue. Repeated the same test after the change and
    confirmed the fix.
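    
    The log statements themselves are not part of this description. A check of
    this kind might look like the sketch below, which is an assumption rather
    than the code that was added: `describePythonPath` is a hypothetical helper,
    and `println` stands in for the logging calls used in
    ApplicationMaster.scala.
    
    ```scala
    object PythonPathCheck {
      // Hypothetical helper: formats the message a log statement might emit
      // for a given environment map.
      def describePythonPath(env: Map[String, String]): String =
        env.get("PYTHONPATH") match {
          case Some(value) => s"PYTHONPATH=$value"
          case None        => "PYTHONPATH is not set"
        }
    
      def main(args: Array[String]): Unit = {
        // sys.env is an immutable snapshot of the current JVM's environment.
        println(describePythonPath(sys.env))
      }
    }
    ```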


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pgandhi999/spark SPARK-22151

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21468.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21468
    
----
commit 0aee8faad9cb60721b153c9bc2187f87a4036b9e
Author: pgandhi <pgandhi@...>
Date:   2018-05-31T14:36:13Z

    [SPARK-22151] : PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly
    

----


