Adrian Wang created SPARK-29177:
-----------------------------------

             Summary: Zombie tasks prevent executors from being released when a task 
result exceeds maxResultSize
                 Key: SPARK-29177
                 URL: https://issues.apache.org/jira/browse/SPARK-29177
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4, 2.3.4
            Reporter: Adrian Wang


When the driver fetches task results from executors and finds that the total size 
exceeds the configured maxResultSize, Spark simply aborts the stage and all dependent 
jobs. However, the task that triggered the abort actually succeeded; it just never 
posts a `CompletionEvent`, so it is never removed from 
`CoarseGrainedSchedulerBackend`. If dynamic allocation is enabled, this leaves zombie 
executor(s) registered with the resource manager, and they never die until the 
application ends.
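
A minimal reproduction sketch (assuming a spark-shell session on a cluster with 
dynamic allocation and the external shuffle service enabled, and a deliberately 
small `spark.driver.maxResultSize=1m`; the 8-partition job and 1 MB-per-task payload 
are illustrative values, not from the original report):

```scala
// Each task returns ~1 MB, so collecting a few results quickly exceeds the
// 1m driver limit and the stage is aborted with
// "Total size of serialized results ... is bigger than spark.driver.maxResultSize".
val rdd = sc.parallelize(1 to 8, 8).map(_ => new Array[Byte](1024 * 1024))

try {
  rdd.collect()
} catch {
  case e: org.apache.spark.SparkException =>
    println(s"Stage aborted as expected: ${e.getMessage}")
}

// With the bug present, the executors whose results were already fetched are
// never marked as having completed their tasks, so dynamic allocation never
// idles them out; they stay registered with the resource manager until the
// application exits.
```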


