Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23055#discussion_r236494667

--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
   // each python worker gets an equal part of the allocation. the worker pool will grow to the
   // number of concurrent tasks, which is determined by the number of cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I mean .. what I am disabling here is setting the Python memory limit on Windows; it looks like the YARN side can still allocate more, but I haven't tested that. Ideally I think we should disable the whole feature on Windows, but I couldn't test that either, because I don't have a Windows YARN cluster (I only have one Windows machine with a dev environment).

I was trying to at least document that it doesn't work, because Windows lacks the `resource` module and so behaves differently from other OSes (as things stand, PySpark apps that use the Python worker do not work on Windows at all, and we don't necessarily document that they do). I think it's more accurate to say it does not work because it hasn't been tested (or, at the very least, to say it's not guaranteed on Windows).

The reasons I prefer to check this on the JVM side instead are:

1. It's common to check on the JVM side when possible and keep the worker simple. It does couple the worker to the JVM, but they are already strongly coupled.
2. If the YARN code can also be tested and turns out not to work, it should be disabled on the JVM side too. If we later find a workaround on Windows, we can fix the Python worker and re-enable the feature.
3. Checking on the JVM side up front avoids an inconsistent state where `memoryMb` is enabled on the JVM side but Python skips it.
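For context, the platform issue behind the comment can be sketched in Python. This is an illustrative sketch, not Spark's actual worker code: the function names are hypothetical, but the underlying fact is that CPython's `resource` module (which provides address-space limits via `setrlimit`) is POSIX-only and does not exist on Windows, so any memory cap like the one `PYSPARK_EXECUTOR_MEMORY` configures cannot be applied there from the Python side.

```python
# Hedged sketch (hypothetical helpers, not Spark's API): how a Python
# worker could guard a memory cap. `resource` is POSIX-only, which is
# why the cap cannot be applied on Windows.
import importlib.util


def can_set_memory_limit() -> bool:
    """True iff the platform exposes the POSIX `resource` module."""
    return importlib.util.find_spec("resource") is not None


def try_set_memory_limit(limit_bytes: int) -> bool:
    """Best-effort address-space cap; returns False where it cannot be
    applied (e.g. on Windows, where importing `resource` fails)."""
    try:
        import resource
    except ImportError:
        return False
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may never exceed the hard limit.
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    return True
```

This is also why a JVM-side `Utils.isWindows` check (as in the diff) keeps the worker simple: the JVM can decline to send a limit at all, instead of every worker re-detecting that the module is missing.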