Dan Sanduleac created SPARK-20001:
-------------------------------------

             Summary: Support PythonRunner executing inside a Conda env
                 Key: SPARK-20001
                 URL: https://issues.apache.org/jira/browse/SPARK-20001
             Project: Spark
          Issue Type: New Feature
          Components: PySpark, Spark Core
    Affects Versions: 2.2.0
            Reporter: Dan Sanduleac

Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from.

This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (including packages) can reuse the same running executors.

This issue requires that the conda binary is already available on the driver as well as on the executors; you just have to specify where it can be found.

Please see the attached issue on palantir/spark for additional details.
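
To illustrate the caching idea described above, here is a minimal sketch in plain Python. It is not the actual Spark code: the class names `WorkerFactory` and `WorkerFactoryCache`, the method `get_or_create`, and the choice to use the package set as part of the cache key are all assumptions made for illustration. It only shows how two requests with the same environment could share one factory, so that the environment is set up once.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass
class WorkerFactory:
    """Hypothetical stand-in for a per-environment worker factory."""
    python_exec: str
    conda_packages: FrozenSet[str]
    env_setups: int = 0  # how many times the environment was set up

    def __post_init__(self) -> None:
        # A real implementation would install the conda packages here.
        # This sketch only records that the setup step ran.
        self.env_setups += 1


class WorkerFactoryCache:
    """Hypothetical cache keyed by (python executable, conda packages)."""

    def __init__(self) -> None:
        self._factories: Dict[Tuple[str, FrozenSet[str]], WorkerFactory] = {}

    def get_or_create(
        self, python_exec: str, conda_packages: Iterable[str]
    ) -> WorkerFactory:
        # frozenset makes the key hashable and independent of package order.
        key = (python_exec, frozenset(conda_packages))
        factory = self._factories.get(key)
        if factory is None:
            factory = WorkerFactory(python_exec, key[1])
            self._factories[key] = factory
        return factory


cache = WorkerFactoryCache()
a = cache.get_or_create("python", ["numpy", "pandas"])
b = cache.get_or_create("python", ["pandas", "numpy"])  # same environment
c = cache.get_or_create("python", ["numpy"])            # different packages
```

In this sketch `a` and `b` are the same factory object, while `c` is a separate one because its package set differs.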