Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21977#discussion_r207601410

--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala ---
@@ -51,6 +52,17 @@ private[spark] class PythonRDD(
   val bufferSize = conf.getInt("spark.buffer.size", 65536)
   val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
+  val memoryMb = {
--- End diff --

It's been a while since I spent much time thinking about how we launch our Python worker processes. Maybe it would make sense to add a comment here explaining the logic a bit more? Based on the documentation in `PythonWorkerFactory`, it appears we make the fork/not-fork decision not based on whether `reuseWorker` is set, but on whether we're running on Windows. Is that the logic this block was attempting to handle?
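For concreteness, here's a minimal sketch of the OS-based gating I'm describing. This is not the actual Spark source: the helper name, the `spark.python.use.daemon` conf key, and its default are my recollection/assumptions, shown only to illustrate the decision I think this block interacts with:

```scala
import org.apache.spark.SparkConf

// Sketch (not the real PythonWorkerFactory code) of the fork/not-fork decision:
// the daemon (fork-based) launch path is gated on the OS, not on
// spark.python.worker.reuse.
def useDaemon(conf: SparkConf): Boolean = {
  // Assumed conf key, for illustration; the real gating conf may differ.
  val daemonEnabled = conf.getBoolean("spark.python.use.daemon", defaultValue = true)
  // Windows can't fork, so fall back to launching a fresh worker per task there,
  // regardless of whether worker reuse is enabled.
  !System.getProperty("os.name").startsWith("Windows") && daemonEnabled
}
```

If that's the intended interaction, a comment tying `memoryMb` back to that decision would make this block much easier to follow.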