[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed
[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559055#comment-14559055 ]

Apache Spark commented on SPARK-7705:
-------------------------------------

User 'Sephiroth-Lin' has created a pull request for this issue:
https://github.com/apache/spark/pull/6409

> Cleanup of .sparkStaging directory fails if application is killed
> -----------------------------------------------------------------
>
>                 Key: SPARK-7705
>                 URL: https://issues.apache.org/jira/browse/SPARK-7705
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 1.3.0
>            Reporter: Wilfred Spiegelenburg
>            Priority: Minor
>
> When a streaming application is killed while running on YARN, the .sparkStaging directory is not cleaned up. Setting spark.yarn.preserve.staging.files=false does not help and still leaves the files around. The changes in SPARK-7503 do not catch this case since there is no exception during shutdown: when the application gets killed, the AM is told to shut down and the shutdown hook is run, but the cleanup is not triggered.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed
[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559114#comment-14559114 ]

Thomas Graves commented on SPARK-7705:
--------------------------------------

YARN will retry an application for you if it dies badly. You can't simply clean up the staging directory on the kill, or the retries will not work. I've posted this same question on the PR, but can you describe in more detail when this scenario happens and what YARN shows as the status of the application?
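The retry interaction Thomas describes can be sketched as a small predicate. This is a hypothetical illustration only, not the actual {{ApplicationMaster}} code: {{shouldCleanupStaging}}, the status type, and its parameters are made-up names.

{code}
// Hypothetical sketch: the staging directory may only be removed when
// YARN will not retry the attempt, i.e. on success or on the last
// allowed attempt. All names here are illustrative.
object StagingCleanupSketch {
  sealed trait FinalStatus
  case object Succeeded extends FinalStatus
  case object Failed extends FinalStatus
  case object Killed extends FinalStatus

  def shouldCleanupStaging(status: FinalStatus, isLastAttempt: Boolean): Boolean =
    status == Succeeded || isLastAttempt
}
{code}

Under this constraint, a killed but still retryable attempt (status {{Killed}}, {{isLastAttempt = false}}) must leave the staging files in place, which is why an unconditional cleanup in the shutdown hook would break retries.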
[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed
[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547954#comment-14547954 ]

Wilfred Spiegelenburg commented on SPARK-7705:
----------------------------------------------

I think the limitation that we currently set in ApplicationMaster.scala on line [#120|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L120] is far too limiting. The {{cleanupStagingDir(fs)}} call should be moved out of the {{if (!unregistered)}} block. I have not tested this yet, but it seems far more logical. Since we're in the shutdown hook, it should also catch our case:

{code}
if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
  // we only want to unregister if we don't want the RM to retry
  if (!unregistered) {
    unregister(finalStatus, finalMsg)
  }
  // Since we're done we should clean up the staging directory
  cleanupStagingDir(fs)
}
{code}

Not sure how to create a PR to check and discuss this change.