[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547954#comment-14547954 ]
Wilfred Spiegelenburg commented on SPARK-7705:
----------------------------------------------

I think the limitation we currently set in ApplicationMaster.scala on line [#120|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L120] is far too limiting. The {{cleanupStagingDir(fs)}} call should be moved out of the {{if (!unregistered)}} block. I have not tested this yet, but it seems far more logical, and since we are in the shutdown hook it should also catch our case:

{code}
if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
  // We only want to unregister if we don't want the RM to retry
  if (!unregistered) {
    unregister(finalStatus, finalMsg)
  }
  // Since we're done we should clean up the staging directory
  cleanupStagingDir(fs)
}
{code}

Not sure how to create a PR to check and discuss this change.

> Cleanup of .sparkStaging directory fails if application is killed
> -----------------------------------------------------------------
>
>                 Key: SPARK-7705
>                 URL: https://issues.apache.org/jira/browse/SPARK-7705
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 1.3.0
>            Reporter: Wilfred Spiegelenburg
>            Priority: Minor
>
> When a streaming application is killed while running on YARN, the .sparkStaging directory is not cleaned up. Setting spark.yarn.preserve.staging.files=false does not help and still leaves the files around.
>
> The changes in SPARK-7503 do not catch this case since there is no exception in the shutdown. When the application gets killed, the AM is told to shut down and the shutdown hook runs, but the cleanup is not triggered.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
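The proposed reordering can be illustrated with a minimal, self-contained Scala sketch. The {{unregister}} and {{cleanupStagingDir}} methods here are hypothetical stand-ins for the ApplicationMaster's real ones (which talk to the RM and HDFS); the point is only the control flow: unregistration stays conditional on {{!unregistered}}, while cleanup runs whenever the attempt is final, which covers the killed-application case.

```scala
object ShutdownCleanupSketch {
  // Hypothetical stand-ins for the AM's state and methods -- not Spark's API.
  var unregistered = false
  var stagingDirCleaned = false

  def unregister(status: String, msg: String): Unit = { unregistered = true }
  def cleanupStagingDir(): Unit = { stagingDirCleaned = true }

  // Mirrors the proposed shutdown-hook ordering: unregister only when the RM
  // should not retry, but always clean up once the attempt is final.
  def onShutdown(finalStatusSucceeded: Boolean, isLastAttempt: Boolean): Unit = {
    if (finalStatusSucceeded || isLastAttempt) {
      if (!unregistered) {
        unregister("SUCCEEDED", "done")
      }
      // Cleanup sits outside the unregister guard, so it runs even when
      // unregistration already happened (or is skipped).
      cleanupStagingDir()
    }
  }

  def main(args: Array[String]): Unit = {
    // A killed application: not SUCCEEDED, but this is the last attempt,
    // so the staging directory is still cleaned up.
    onShutdown(finalStatusSucceeded = false, isLastAttempt = true)
    println(stagingDirCleaned)
  }
}
```

With the original placement, {{stagingDirCleaned}} would only flip to true on the {{!unregistered}} path; moving it out makes cleanup unconditional for any final attempt.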