[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

HyukjinKwon Fri, 16 Nov 2018 22:36:33 -0800

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23055#discussion_r234398745
  
    --- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
    @@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
       private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
       // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
       // number of concurrent tasks, which is determined by the number of 
cores in this executor.
    -  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
    +  private val memoryMb = if (Utils.isWindows) {
    --- End diff --
    
    > My point is that if resource can't be loaded for any reason, the code 
shouldn't fail. As it is, if resource can't be loaded then that is handled, but 
if the memory limit is set then the worker will still try to use it. That's 
what I think is brittle. There should be a flag for whether to attempt to use 
the resource API, based on whether it was loaded.
    
    Ah, so the point is that the condition for the existence `resource` might 
not be clear - so we should have the flag to make it not failed in case. Yup, 
makes sense. Let me add a flag.
    
    > If the worker operates as I described, then why make any changes on the 
JVM side?
    
    I am making some changes on the JVM side so that we can explicitly disable 
on a certain condition For instance, if we don't make a change in JVM side, and 
only make the changes in Python `worker`. Later, we can have some other changes 
in JVM side referring this configuration - which can be problematic.
    
    If we keep the configuration somehow in JVM side, it basically means we 
could have another status for this configuration, rather then just being 
disabled or enabled which we should take care of.




---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

Reply via email to