Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/20490

Been having talks with colleagues last week and want to check something. How exactly do Spark executors abort speculative tasks without waiting for them to get into the ready-to-commit phase? As I was told: by interrupting the thread. If so, can someone point me to where this happens?

Assuming this really is the case, there should be a callback on task committers to tell them they have just been interrupted and to react accordingly, or maybe the task cleanup callback should be told that the cleanup is due to an interruption. For the MapReduce commit protocol, normal task cleanup should be called after the interrupt; other committers may do more.

Rationale:

* If temp space is used on the host machines, job cleanup cannot reach it, so the space will only get cleaned up as cron jobs purge old content. Ryan's staging committer is the reference example here.
* If the task consumes expensive remote resources (DB connections, etc.), then releasing them early, i.e. before the job eventually completes, could free them up for others. The same applies to expensive in-VM resources, such as a thread pool of HTTP clients.
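To make the abort-by-interrupt mechanism concrete, here is a minimal, self-contained Java sketch (not Spark's actual code; all class and method names are invented for illustration). It shows how an interrupt delivered to a task thread surfaces as an `InterruptedException`, which is the point where a committer callback could release temp space, DB connections, or an HTTP client pool before the job completes:

```java
// Hedged sketch: a task attempt that reacts to being interrupted mid-flight.
// This models the behavior described above, not Spark's real executor code.
public class InterruptAwareTask {

    // Runs a simulated task attempt, interrupts it before it reaches the
    // ready-to-commit phase, and reports whether the cleanup path ran.
    public static boolean runAndAbort() throws InterruptedException {
        final boolean[] cleanedUp = {false};
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(10_000);   // long-running work, not yet ready to commit
            } catch (InterruptedException e) {
                // The speculative abort arrived as an interrupt. This is where
                // a committer callback could delete local temp dirs, close DB
                // connections, or shut down a pool of HTTP clients.
                cleanedUp[0] = true;
            }
        });
        worker.start();
        Thread.sleep(100);              // let the task block in its "work"
        worker.interrupt();             // the executor aborting the speculative task
        worker.join();
        return cleanedUp[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cleanup ran: " + runAndAbort());
    }
}
```

The point of the sketch: without an explicit committer callback, every committer has to catch the interrupt itself; a protocol-level hook would make the cleanup path uniform.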