Dan Sanduleac created SPARK-20001:
-------------------------------------

             Summary: Support PythonRunner executing inside a Conda env
                 Key: SPARK-20001
                 URL: https://issues.apache.org/jira/browse/SPARK-20001
             Project: Spark
          Issue Type: New Feature
          Components: PySpark, Spark Core
    Affects Versions: 2.2.0
            Reporter: Dan Sanduleac


Similar to SPARK-13587, I'm trying to allow the user to configure a Conda 
environment that PythonRunner will run from. 
This change remembers the conda environment found on the driver and installs the 
same packages on the executor side, only once per PythonWorkerFactory. The list 
of requested conda packages is added to the PythonWorkerFactory cache, so two 
collects using the same environment (including packages) can re-use the same 
running executors.
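
As a rough illustration of the caching idea only (this is not the Spark code, 
and every name below is hypothetical), a cache keyed on the interpreter plus the 
set of requested conda packages could look like this. The `create_factory` 
callback stands in for whatever would set up the environment and install the 
packages.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# Hypothetical cache key: interpreter path plus the set of requested packages.
EnvKey = Tuple[str, FrozenSet[str]]


class WorkerFactoryCache:
    """Toy stand-in for a per-executor cache of worker factories."""

    def __init__(self, create_factory: Callable[[str, FrozenSet[str]], object]):
        # create_factory is assumed to do the expensive work (env setup, installs).
        self._create_factory = create_factory
        self._factories: Dict[EnvKey, object] = {}

    def get(self, python_exec: str, conda_packages: Iterable[str]) -> object:
        # frozenset makes the key hashable and independent of package order.
        key = (python_exec, frozenset(conda_packages))
        if key not in self._factories:
            self._factories[key] = self._create_factory(*key)
        return self._factories[key]


installs = []  # records each simulated environment setup


def fake_create(python_exec: str, packages: FrozenSet[str]) -> object:
    installs.append(sorted(packages))
    return object()


cache = WorkerFactoryCache(fake_create)
first = cache.get("python", ["numpy", "pandas"])
second = cache.get("python", ["pandas", "numpy"])  # same environment, reused
third = cache.get("python", ["numpy"])             # different package set
```

Here `first` and `second` are the same object, so the simulated install runs 
once for that environment, while `third` triggers a second setup.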

This issue requires that the conda binary is already available on the driver as 
well as on the executors; you only have to specify where it can be found.
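
A minimal sketch of that precondition, assuming the location is passed in as a 
plain configuration entry: the key name `example.conda.binaryPath` is a 
placeholder invented for this example, not a setting named in this issue. The 
check only verifies that the configured path points at an executable file on 
the current node.

```python
import os
from typing import Mapping

# Hypothetical configuration key, used only for this illustration.
CONDA_BINARY_KEY = "example.conda.binaryPath"


def resolve_conda_binary(conf: Mapping[str, str]) -> str:
    """Return the configured conda binary path, or fail fast if it is unusable."""
    path = conf.get(CONDA_BINARY_KEY)
    if not path:
        raise ValueError(f"{CONDA_BINARY_KEY} must be set to the conda binary")
    # The binary is expected to exist already; nothing is downloaded or installed.
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        raise FileNotFoundError(f"no executable conda binary at {path!r}")
    return path
```

Running the same check on the driver and on each executor would surface a 
missing binary before any package installation is attempted.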

Please see the attached issue on palantir/spark for additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
