Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/20490
  
    I've been talking with colleagues over the last week and want to check something.
    
    How exactly do Spark executors abort speculative tasks without waiting for 
them to get into the ready-to-commit phase? As I was told: by interrupting the 
task thread. If so, can someone point me to where this happens?
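    
    To be clear about the mechanism I mean, here is a generic JVM sketch of 
interrupt-based abort. This is purely illustrative, not Spark's actual executor 
code path:

```scala
// Generic illustration of the interrupt-based abort mechanism; not Spark's
// actual executor code, just the pattern I'm asking about.
object InterruptAbortDemo extends App {
  val worker = new Thread(new Runnable {
    override def run(): Unit = {
      try {
        while (!Thread.currentThread().isInterrupted) {
          // ... one unit of speculative work ...
          Thread.sleep(100) // blocking calls throw InterruptedException on interrupt
        }
      } catch {
        case _: InterruptedException =>
          // the abort path: today, nothing here notifies the task committer
          Thread.currentThread().interrupt() // restore the flag for callers
      }
    }
  })
  worker.start()
  worker.interrupt() // kills the speculative work without waiting for commit
}
```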
    
    Assuming this really is the case, there should be a callback on task 
committers to tell them they've just been interrupted, so they can react 
accordingly; or maybe the task cleanup callback should be told that the 
cleanup is due to an interruption. For the MapReduce commit protocol, normal 
task cleanup should be called after the interrupt; other committers may want to do more.
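    
    To make the ask concrete, something along these lines, where 
`onTaskInterrupted` is invented for illustration; no such hook exists today:

```scala
import org.apache.hadoop.mapreduce.TaskAttemptContext
import org.apache.spark.internal.io.FileCommitProtocol

// Hypothetical hook, not an existing API: something the executor's kill path
// could call so the committer learns it was interrupted rather than failed.
abstract class InterruptAwareCommitProtocol extends FileCommitProtocol {

  /** Invoked when the task thread has been interrupted, e.g. because a
   *  speculative duplicate lost the race. Default: the normal task abort,
   *  which is what the MapReduce commit protocol wants; others may do more. */
  def onTaskInterrupted(taskContext: TaskAttemptContext): Unit = {
    abortTask(taskContext)
  }
}
```

    The executor's kill path would invoke this instead of (or before) plain 
task cleanup, so committers could delete local temp data immediately.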
    
    Rationale
    * if temp space is used on the executor host machines, job-level cleanup 
cannot reach it, so the space will only be reclaimed as cron jobs purge old 
content. Ryan's staging committer is the reference example here.
    * if the task consumes expensive remote resources (DB connections, etc.), 
then releasing them early, i.e. before the job eventually completes, could free 
them up for others. The same goes for expensive in-VM resources, like a thread 
pool of http clients. (See the sketch below.)
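
    For the resource case, what a task can already do is hang release off the 
task lifecycle; a minimal sketch, where `ConnectionPool` is hypothetical:

```scala
import org.apache.spark.TaskContext
import org.apache.spark.util.TaskCompletionListener

// `ConnectionPool` is a stand-in for any expensive shared resource
// (DB connections, a thread pool of http clients); not a real Spark class.
object ConnectionPool {
  def acquire(): AnyRef = new Object()
  def release(conn: AnyRef): Unit = ()
}

object EarlyRelease {
  def withPooledConnection[T](ctx: TaskContext)(body: => T): T = {
    val conn = ConnectionPool.acquire()
    // Completion listeners run whether the task succeeds, fails or is killed,
    // so the resource is freed without waiting for job commit. What this
    // cannot do is tell the committer *why* cleanup is happening.
    ctx.addTaskCompletionListener(new TaskCompletionListener {
      override def onTaskCompletion(c: TaskContext): Unit = {
        ConnectionPool.release(conn)
      }
    })
    body
  }
}
```

    That frees the resource as soon as the task ends for any reason, but the 
commit protocol itself still never learns that an interrupt, rather than a 
normal failure, triggered the cleanup.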


