Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17166#discussion_r107534830
  
    --- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl 
private[scheduler](
           taskState: TaskState,
           reason: TaskFailedReason): Unit = synchronized {
         taskSetManager.handleFailedTask(tid, taskState, reason)
    -    if (!taskSetManager.isZombie && taskState != TaskState.KILLED) {
    +    if (!taskSetManager.isZombie) {
    --- End diff ---
    
    I don't think you can get a request storm. The _only_ case this guards 
against is exactly the one you mentioned -- a speculative task that is killed 
because another attempt succeeded. The number of speculative tasks is always 
small, so this shouldn't be an issue. By comparison, we hit this code path far 
more often for failed tasks.
    
    If we were to add this check back, the task kill API would need an extra 
parameter saying whether revive offers should be called. That is substantial 
added complexity, and the simplification at this site removes it. I'll leave 
it to @kayousterhout to decide whether it's worth it. A rough sketch of that 
alternative is below.
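    
    For illustration only, here is a minimal sketch of the shape that 
alternative would take. The trait and parameter names are hypothetical, just 
to show the extra flag callers would have to thread through the kill path; 
this is not the actual API proposed in this PR.
    
    ```scala
    package org.apache.spark.scheduler
    
    // Hypothetical sketch only: what the kill API would look like if every
    // caller had to decide whether killing a task should also revive offers.
    trait KillWithReviveFlag {
      // The extra reviveOffersOnKill flag is the added complexity; with the
      // simplification in this diff, reviveOffers is always requested for a
      // non-zombie task set after a task ends, so no flag is needed.
      def killTaskAttempt(
          taskId: Long,
          interruptThread: Boolean,
          reason: String,
          reviveOffersOnKill: Boolean): Boolean
    }
    ```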

