Github user artemrd commented on the issue:

    https://github.com/apache/spark/pull/21114
  
    A long-running job and memory pressure alone are not enough. You need several attempts for a stage: each new attempt updates Stage._latestInfo, so the previous StageInfo and its accumulators can be GCed. After that, AccumulatorContext.get() throws an exception until the GCed accumulators are removed by ContextCleaner. It's also important to send an accumulator update for an old attempt before all tasks are finished; otherwise the stage is marked as completed, removed from DAGScheduler.stageIdToStage, and DAGScheduler.handleTaskSetFailed() is ignored.
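
    To make the failure mode concrete, here is a minimal sketch (not the actual Spark source) of a registry that tracks objects through WeakReferences: the id stays registered after the referent is collected, so a lookup can fail until a cleaner removes the entry. The object name SketchAccumulatorContext and its methods are hypothetical, used only to illustrate the pattern described above.

```scala
import java.lang.ref.WeakReference
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for AccumulatorContext, illustrating the
// weak-reference registry pattern; not the real Spark implementation.
object SketchAccumulatorContext {
  private val registry = TrieMap.empty[Long, WeakReference[AnyRef]]

  def register(id: Long, acc: AnyRef): Unit =
    registry.put(id, new WeakReference[AnyRef](acc))

  // Mirrors the failure mode above: the id is still registered because the
  // cleaner has not run yet, but the referent has already been collected.
  def get(id: Long): Option[AnyRef] =
    registry.get(id).map { ref =>
      val acc = ref.get()
      if (acc == null) {
        throw new IllegalStateException(
          s"Attempted to access garbage-collected accumulator $id")
      }
      acc
    }

  // Roughly what a ContextCleaner-style component would eventually do,
  // after which get() simply returns None instead of throwing.
  def remove(id: Long): Unit = registry.remove(id)
}
```

    In this sketch, the window between the referent being GCed and remove(id) being called is where get(id) throws, which matches the race described in the comment.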

