Hello Jeff,

Well, this would also mean that you have to manage the same virtualenv (same path) on all nodes and install your packages into it the same way you would if you installed the packages to the default python path.

In any case, at the moment you can already do what you propose by creating identical virtualenvs at the same path on all nodes and changing the Spark python path to point to the virtualenv.
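For example, something along these lines should work (the paths, package names, and script name below are only placeholders):

    # on every node, create the same virtualenv at the same path
    # and install your dependencies into it
    virtualenv /opt/venvs/myapp
    /opt/venvs/myapp/bin/pip install numpy

    # then point PySpark at that interpreter when submitting
    export PYSPARK_PYTHON=/opt/venvs/myapp/bin/python
    bin/spark-submit --master yarn --deploy-mode client my_job.py

Depending on your setup you may also need to pass the variable to the executors, e.g. via spark.executorEnv.PYSPARK_PYTHON.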
Best Regards,
Mohannad

On Mar 1, 2016 06:07, "Jeff Zhang" <zjf...@gmail.com> wrote:

> I have created a jira for this feature; comments and feedback are welcome
> about how to improve it and whether it's valuable for users.
>
> https://issues.apache.org/jira/browse/SPARK-13587
>
> Here's some background info and the status of this work.
>
> Currently, it's not easy for users to add third-party python packages in
> pyspark.
>
> - One way is to use --py-files (suitable for simple dependencies, but
>   not for complicated dependencies, especially with transitive
>   dependencies)
> - Another way is to install packages manually on each node (time-wasting,
>   and not easy when switching to a different environment)
>
> Python now has 2 different virtualenv implementations: one is the native
> virtualenv, the other is through conda.
>
> I have implemented a POC for this feature. Here's one simple command showing
> how to use a virtualenv in pyspark:
>
> bin/spark-submit --master yarn --deploy-mode client --conf
> "spark.pyspark.virtualenv.enabled=true" --conf
> "spark.pyspark.virtualenv.type=conda" --conf
> "spark.pyspark.virtualenv.requirements=/Users/jzhang/work/virtualenv/conda.txt"
> --conf "spark.pyspark.virtualenv.path=/Users/jzhang/anaconda/bin/conda"
> ~/work/virtualenv/spark.py
>
> There are 4 properties that need to be set:
>
> - spark.pyspark.virtualenv.enabled (enable virtualenv)
> - spark.pyspark.virtualenv.type (native/conda are supported, default
>   is native)
> - spark.pyspark.virtualenv.requirements (requirements file for the
>   dependencies)
> - spark.pyspark.virtualenv.path (path to the executable for
>   virtualenv/conda)
>
> Best Regards
>
> Jeff Zhang
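(As a side note for anyone trying this out: the requirements file referenced by spark.pyspark.virtualenv.requirements, conda.txt in the example above, would presumably just be a plain list of packages, one per line, for instance

    numpy=1.10.4
    pandas=0.17.1

with the exact version-pin syntax depending on whether the native virtualenv/pip or conda is used; the package names and versions here are only an illustration.)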