[ 
https://issues.apache.org/jira/browse/SPARK-32187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177765#comment-17177765
 ] 

Fabian Höring commented on SPARK-32187:
---------------------------------------

Regarding this ticket: https://issues.apache.org/jira/browse/SPARK-13587 and these
settings:

spark-submit --deploy-mode cluster --master yarn \
  --py-files parallelisation_hack-0.1-py2.7.egg \
  --conf spark.pyspark.virtualenv.enabled=true \
  --conf spark.pyspark.virtualenv.type=native \
  --conf spark.pyspark.virtualenv.requirements=requirements.txt \
  --conf spark.pyspark.virtualenv.bin.path=virtualenv \
  --conf spark.pyspark.python=python3 \
  pyspark_poc_runner.py

I don't know whether they still work, but personally I would close the ticket and
not put this in the doc. I don't think it is the right way to do it: it doesn't
scale to 100 executors, and it can produce race conditions between tasks running
on the same executor (multiple pip installs at the same time on the same node).
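
To make the race-condition concern concrete, here is a toy simulation. It is not Spark code, and it makes no claim about how the virtualenv feature is actually implemented. It only assumes that every task on a node checks whether the environment exists and installs it if not. The names run_tasks, env_ready and installs are hypothetical.

```python
import threading


def run_tasks(num_tasks, use_lock):
    """Simulate tasks on one node that each set up an environment if missing.

    Returns the number of simulated installs that actually ran.
    """
    state = {"env_ready": False, "installs": 0}
    setup_lock = threading.Lock()    # serializes check-then-install (guarded variant)
    counter_lock = threading.Lock()  # protects the install counter only
    # The barrier forces all tasks to reach the same point together, so the
    # outcome of the simulation is deterministic.
    barrier = threading.Barrier(num_tasks)

    def install():
        # Stand-in for "pip install -r requirements.txt" (assumption).
        with counter_lock:
            state["installs"] += 1
        state["env_ready"] = True

    def task():
        if use_lock:
            barrier.wait()
            with setup_lock:
                # Check and install happen atomically: only the first task installs.
                if not state["env_ready"]:
                    install()
        else:
            # Unguarded check-then-act: every task sees "not ready" ...
            needs_install = not state["env_ready"]
            barrier.wait()
            # ... and then every task installs into the same location.
            if needs_install:
                install()

    threads = [threading.Thread(target=task) for _ in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["installs"]


if __name__ == "__main__":
    print("unguarded installs:", run_tasks(4, use_lock=False))
    print("guarded installs:  ", run_tasks(4, use_lock=True))
```

In the unguarded variant each of the four simulated tasks runs its own install at the same time. The locked variant is included only for contrast and is not a proposed fix for Spark.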

 

> User Guide - Shipping Python Package
> ------------------------------------
>
>                 Key: SPARK-32187
>                 URL: https://issues.apache.org/jira/browse/SPARK-32187
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, PySpark
>    Affects Versions: 3.1.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> - Zipped file
> - Python files
> - PEX (?) (see also SPARK-25433)


