dongjoon-hyun commented on code in PR #44680:
URL: https://github.com/apache/spark/pull/44680#discussion_r1762087346


##########
python/docs/Makefile:
##########
@@ -16,7 +16,7 @@
 # Minimal makefile for Sphinx documentation
 
 # You can set these variables from the command line.
-SPHINXOPTS    ?= "-W"
+SPHINXOPTS    ?= "-W" "-j" "auto"

Review Comment:
   Unfortunately, this breaks Python API doc generation on many-core machines 
because `-j auto` sets the number of parallel `SparkSubmit` invocations of 
PySpark to the number of cores. In addition, given that each PySpark instance 
is currently launched with `local[*]`, this ends up with `N * N` 
`pyspark.daemon` processes on an `N`-core machine.
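   
   To make the `N * N` growth concrete, here is a rough back-of-the-envelope 
sketch. It assumes one Sphinx worker per core and one `local[*]` task slot per 
core; the helper name `estimated_daemons` and the 16- and 96-core counts are 
illustrative only, not measurements from any specific machine.

```python
def estimated_daemons(cores: int) -> int:
    """Rough upper bound on concurrent pyspark.daemon processes.

    Assumption: "-j auto" starts one Sphinx worker per core, and each
    worker's PySpark session runs with local[*], i.e. one slot per core.
    """
    sphinx_workers = cores      # "-j auto": one worker per core
    slots_per_session = cores   # local[*]: one task slot per core
    return sphinx_workers * slots_per_session


if __name__ == "__main__":
    # Illustrative core counts only.
    for cores in (4, 16, 96):
        print(f"{cores:>3} cores -> up to {estimated_daemons(cores)} daemons")
```

   The count grows quadratically, so a setting that is harmless at 4 cores can 
exhaust memory at much higher core counts.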
   
   As of today, this setting seems to work on low-core machines like GitHub 
Actions runners (4 cores). However, it breaks the Python documentation build 
even in an M3 Max environment, and it is worse on large EC2 machines 
(c7i.24xlarge). You can see the failure locally like this.
   
   ```
   $ build/sbt package -Phive-thriftserver
   $ cd python/docs
   $ make html
   ...
   java.lang.OutOfMemoryError: Java heap space
   ...
   24/09/16 14:09:55 WARN PythonRunner: Incomplete task 7.0 in stage 30 (TID 177) interrupted: Attempting to kill Python Worker
   ...
   make: *** [html] Error 2
   ```
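
   One possible way to bound the fan-out, sketched under assumptions: cap the 
Sphinx job count instead of using `auto`. The cap of 4 is only taken from the 
core count where the setting is reported to work above, and the helper names 
`bounded_jobs` and `sphinxopts` are hypothetical; this is not necessarily the 
fix adopted for this PR.

```python
import os


def bounded_jobs(cap: int = 4) -> int:
    """Return a Sphinx job count no larger than `cap` (assumed default: 4)."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(cores, cap))


def sphinxopts(cap: int = 4) -> str:
    """Build a SPHINXOPTS value with a bounded -j instead of "-j auto"."""
    return f"-W -j {bounded_jobs(cap)}"


if __name__ == "__main__":
    # Prints a value that could be passed as SPHINXOPTS on the make
    # command line, since the Makefile allows overriding that variable.
    print(sphinxopts())
```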



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

