[ https://issues.apache.org/jira/browse/SPARK-37300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wuyi resolved SPARK-37300.
--------------------------
    Fix Version/s: 3.3.0
         Assignee: hujiahua
       Resolution: Fixed

Issue resolved by https://github.com/apache/spark/pull/34578

> TaskSchedulerImpl should ignore task finished event if its task was already finished state
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-37300
>                 URL: https://issues.apache.org/jira/browse/SPARK-37300
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: hujiahua
>            Assignee: hujiahua
>            Priority: Major
>             Fix For: 3.3.0
>
>
> `TaskSchedulerImpl` handles task-finished events in `handleSuccessfulTask`
> and `handleFailedTask`, but in some cases the task is already in a finished
> state, and the task-finished event should then be ignored.
> Case description:
> When an executor finishes a task of some stage, the driver receives a
> StatusUpdate event to handle it. At the same time, the driver may find that
> the executor's heartbeat has timed out, so the driver also needs to handle an
> ExecutorLost event simultaneously. This is a race condition that can leave
> TaskSetManager.successful and TaskSetManager.tasksSuccessful with wrong
> results.
> A more detailed description and discussion can be found at
> https://issues.apache.org/jira/browse/SPARK-36575 and
> https://github.com/apache/spark/pull/33872
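
The idea described in the issue, ignoring a task-finished event when the task attempt is already in a finished state, can be illustrated with a minimal Python sketch. This is not Spark's code and not the change made in the pull request: `ToyTaskSet`, `finished_attempts`, and the method signatures are hypothetical names chosen for illustration, and the real scheduler is written in Scala and handles these events on separate threads.

```python
class ToyTaskSet:
    """Toy stand-in for a task set's bookkeeping (hypothetical, not Spark's TaskSetManager)."""

    def __init__(self, num_tasks):
        self.successful = [False] * num_tasks   # per-task success flag
        self.tasks_successful = 0               # count of successful tasks
        self.finished_attempts = set()          # attempt ids already in a finished state

    def _mark_finished(self, tid):
        # Return False if this attempt was already finished, so the caller
        # can ignore the duplicate event.
        if tid in self.finished_attempts:
            return False
        self.finished_attempts.add(tid)
        return True

    def handle_successful_task(self, tid, index):
        if not self._mark_finished(tid):
            return False  # event ignored: the attempt already finished
        if not self.successful[index]:
            self.successful[index] = True
            self.tasks_successful += 1
        return True

    def handle_failed_task(self, tid):
        if not self._mark_finished(tid):
            return False  # event ignored: the attempt already finished
        return True


# Assumed ordering for the demo: the executor-lost path fails attempt 7 first,
# then the late success event for the same attempt arrives.
task_set = ToyTaskSet(num_tasks=1)
lost_handled = task_set.handle_failed_task(tid=7)
late_success_handled = task_set.handle_successful_task(tid=7, index=0)
```

In this sketch the late success event is dropped, so the success flag and the counter stay consistent with the attempt having been treated as failed. Without the guard, the second event would mark the task successful after it had already been handled as lost.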