Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23055#discussion_r235226292

    --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
    @@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
       private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
       // each python worker gets an equal part of the allocation. the worker pool will grow to the
       // number of concurrent tasks, which is determined by the number of cores in this executor.
    -  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
    +  private val memoryMb = if (Utils.isWindows) {
    --- End diff --

I am doing this because:

1. Code consistency - usually configuration is handled on the JVM side if possible; otherwise we would ideally send the configuration to the Python worker and process it there. What we need here is to explicitly disable the configuration under a certain condition.

2. If we do it only on the Python side, it introduces a third state for `memoryMb` in the JVM: the configuration value exists in the JVM (meaning it's enabled) but the functionality is disabled on the Python side. Handling it on the JVM side avoids that complexity.

Why are you so against disabling it on the JVM side? I don't see a big downside to doing this. I also added a flag as you suggested. For clarification, it's a rather minor point.
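The "third state" argument above can be sketched as follows. This is a minimal illustration, not the actual Spark code: `resolveMemoryMb` is a hypothetical stand-in for the logic around `Utils.isWindows` and `conf.get(PYSPARK_EXECUTOR_MEMORY)`. Resolving the value on the JVM side means downstream code only ever sees two states - `Some(limit)` (enabled) or `None` (disabled) - rather than a present-but-ignored value:

```scala
// Hedged sketch of the JVM-side resolution discussed above.
// `resolveMemoryMb` is a hypothetical helper, not Spark's API.
object MemoryConfSketch {
  // If the platform cannot enforce the limit (e.g. Windows, where
  // Python's resource.setrlimit is unavailable), disable it here so the
  // Python worker never sees an enabled-but-inoperative configuration.
  def resolveMemoryMb(isWindows: Boolean, configured: Option[Long]): Option[Long] =
    if (isWindows) None else configured
}

object MemoryConfSketchDemo extends App {
  // On Windows the configured value is dropped entirely.
  println(MemoryConfSketch.resolveMemoryMb(isWindows = true, Some(512L)))
  // Elsewhere the configured value passes through unchanged.
  println(MemoryConfSketch.resolveMemoryMb(isWindows = false, Some(512L)))
}
```

The design point is that `Option[Long]` already encodes enabled/disabled, so collapsing the platform check into the same value avoids a separate "configured but inactive" flag on the Python side.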