Github user squito commented on the issue: https://github.com/apache/spark/pull/13685

Thanks for the test and explanation @lw-lin ! That is a great walk-through and reproduction of the issue, and nice use of the latches to trigger it. :thumbsup: I have only very small comments; otherwise lgtm.

Since the PR description will become the commit msg, can you update it to also include a very short description of the race? E.g. just something like: "Before this change, if a task was killed before it was deserialized, it would be marked as FAILED instead of KILLED."

Incidentally, I was actually imagining a different race: `executor.kill()` marks `taskRunner.killed` [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L215), but before it calls `task.killed()` the worker thread throws the `TaskKilledException` [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L264). This change fixes that as well (though the test case doesn't cover it, that's ok).
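The shape of the race, and of the latch trick used in the test, could be sketched like this. This is a hypothetical minimal model in Java, not Spark's actual `Executor`/`TaskRunner` code; all names here (`TaskRunner`, `killDelivered`, `simulateKillBeforeRun`) are illustrative:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical minimal model of the race: the kill arrives before the
// task body ever starts, and the runner must still report KILLED,
// not FAILED.
class TaskRunner implements Runnable {
    final AtomicBoolean killed = new AtomicBoolean(false);
    // The latch stands in for the test's trick of delaying
    // "deserialization" until the kill has been delivered.
    final CountDownLatch killDelivered = new CountDownLatch(1);
    volatile String finalState = "UNKNOWN";

    void kill() {
        killed.set(true);          // analogous to marking taskRunner.killed
        killDelivered.countDown();
    }

    @Override
    public void run() {
        try {
            killDelivered.await(); // block until the kill has landed
            if (killed.get()) {
                // With the fix: a task killed before it ran is KILLED...
                finalState = "KILLED";
                return;
            }
            finalState = "FINISHED";
        } catch (InterruptedException e) {
            // ...whereas falling through to a generic failure path is
            // what would mislabel it as FAILED.
            finalState = "FAILED";
        }
    }

    // Runs the scenario: start the worker, deliver the kill, observe state.
    static String simulateKillBeforeRun() {
        TaskRunner r = new TaskRunner();
        Thread t = new Thread(r);
        t.start();
        r.kill();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return r.finalState;
    }
}
```

Because the latch forces the kill to land before the "task body" runs, the outcome is deterministic: the runner observes the `killed` flag and reports KILLED rather than falling into a failure path.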