Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3783#discussion_r22668329
  
    --- Diff: 
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
    @@ -315,7 +314,7 @@ private[spark] class ExecutorAllocationManager(
       private def onExecutorAdded(executorId: String): Unit = synchronized {
         if (!executorIds.contains(executorId)) {
           executorIds.add(executorId)
    -      executorIds.foreach(onExecutorIdle)
    +      onExecutorIdle(executorId)
    --- End diff --
    
    > If an idle executor (call this executor X) is not removed because the 
minimum is reached, it won't be removed. However, its expireTime isn't changed. 
So after new executor joins, schedule() will remove this idle executor since 
it's expired.
    
    But its `expireTime` actually is already removed 
[here](https://github.com/apache/spark/blob/8d45834debc6986e61831d0d6e982d5528dccc51/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L215),
 so if we don't mark it idle when a new executor joins, it will never be 
removed. I think there are two possible solutions here:
    
    1) When an executor is not removed because the lower bound is reached, we 
set the expire time of that executor to a special value (e.g. -1) instead of 
removing the expire time from `removeTimes`. Then, when we add a new executor, 
we look for this special value and mark those executors idle. This is correct 
because if executor X is marked busy in the mean time, its `-1` expire time 
will be removed from `removeTimes`.
    
    2) We maintain a set of executor IDs that are marked busy. When we add a 
new executor, we mark all executors that are *not* busy idle.
    
    I personally prefer (1) because it doesn't introduce a new data structure, 
but it adds some complexity so we need to carefully document what `-1` means. 
(We should declare it in the `object`).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to