[ https://issues.apache.org/jira/browse/SPARK-32187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17178728#comment-17178728 ]
Hyukjin Kwon commented on SPARK-32187:
--------------------------------------

The draft looks good as a start. A couple of comments from my cursory look:
- Let's make sure we have copy-and-pastable examples, and let's try to write the de facto standard, given that there are multiple other sites such as [http://alkaline-ml.com/2018-07-02-conda-spark/] and [https://jcristharif.com/venv-pack/spark.html].
- Let's place the section about shipping zip, egg and .py files at the top, and place PEX and virtual environments at the bottom. Arguably it is more common to simply use {{--py-files}} or the {{spark.submit.pyFiles}} configuration to ship Python packages.

Let's open a PR and loop in other committers to get more reviews. Shipping packages is a somewhat hairy area, and there are many other committers who have better insight than I do, in particular about other cluster managers such as Mesos and Kubernetes.

As for referencing your own stuff, it looks fine. It's okay to mention things as an FYI reference.

{quote}
there is no way to set the archives as a config param when not running on YARN. I checked the doc and the spark code. So it seems inconsistent. Can you check or confirm?
{quote}
Yes, I think that's correct to the best of my knowledge. SPARK-13587 was not merged, so PySpark does not support it yet. So it would not be in the docs, at least for now.

> User Guide - Shipping Python Package
> ------------------------------------
>
>                 Key: SPARK-32187
>                 URL: https://issues.apache.org/jira/browse/SPARK-32187
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, PySpark
>    Affects Versions: 3.1.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> - Zipped file
> - Python files
> - PEX (?) (see also SPARK-25433)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
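
As a rough illustration of the {{--py-files}} route mentioned in the comment, the sketch below zips a local package with the Python standard library and assembles a {{spark-submit}} argument list that ships it. The package name {{mypkg}}, the archive name {{deps.zip}}, the script name {{app.py}}, and the helper functions {{zip_package}} and {{spark_submit_args}} are all hypothetical and are not taken from the draft under review. The sketch only builds the command and does not run Spark.

```python
import pathlib
import tempfile
import zipfile


def zip_package(package_dir, zip_path):
    """Zip a package directory so the package name is the top-level entry.

    Hypothetical helper: keeping "mypkg/..." as the archive layout is what
    lets "import mypkg" work once the zip is on the Python path.
    """
    package_dir = pathlib.Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the package's parent, e.g. "mypkg/util.py".
            zf.write(path, path.relative_to(package_dir.parent).as_posix())
    return zip_path


def spark_submit_args(app, py_files):
    """Build a spark-submit argument list; --py-files takes a comma-separated list."""
    return ["spark-submit", "--py-files", ",".join(str(p) for p in py_files), str(app)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create a tiny throwaway package purely for demonstration.
        pkg = pathlib.Path(tmp) / "mypkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "util.py").write_text("def double(x):\n    return 2 * x\n")

        archive = zip_package(pkg, pathlib.Path(tmp) / "deps.zip")
        print(" ".join(spark_submit_args("app.py", [archive])))
```

The same comma-separated list could instead be supplied through the {{spark.submit.pyFiles}} configuration; which of the two reads better in the user guide is a choice for the PR review.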