[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed

2015-05-26 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559055#comment-14559055 ]

Apache Spark commented on SPARK-7705:
-

User 'Sephiroth-Lin' has created a pull request for this issue:
https://github.com/apache/spark/pull/6409

 Cleanup of .sparkStaging directory fails if application is killed
 -

 Key: SPARK-7705
 URL: https://issues.apache.org/jira/browse/SPARK-7705
 Project: Spark
  Issue Type: Improvement
  Components: YARN
Affects Versions: 1.3.0
Reporter: Wilfred Spiegelenburg
Priority: Minor

 When a streaming application is killed while running on YARN, the 
 .sparkStaging directory is not cleaned up. Setting 
 spark.yarn.preserve.staging.files=false does not help; the files are still 
 left behind.
 The changes in SPARK-7503 do not catch this case because no exception is 
 thrown during shutdown. When the application is killed, the AM is told to 
 shut down and the shutdown hook runs, but the cleanup is not triggered.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed

2015-05-26 Thread Thomas Graves (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559114#comment-14559114 ]

Thomas Graves commented on SPARK-7705:
--

YARN will retry an application for you if it dies badly. You can't just 
clean up the staging directory on the kill, or the retries will not work.

I've posted this same question on the PR, but can you describe in more detail 
when this scenario happens and what YARN shows as the status of the 
application?




[jira] [Commented] (SPARK-7705) Cleanup of .sparkStaging directory fails if application is killed

2015-05-18 Thread Wilfred Spiegelenburg (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547954#comment-14547954 ]

Wilfred Spiegelenburg commented on SPARK-7705:
--

I think the condition that we currently set in ApplicationMaster.scala on line 
[#120|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L120]
is far too limiting.
The {{cleanupStagingDir(fs)}} call should be moved out of the {{if (!unregistered)}} block.

I have not tested this yet, but it seems far more logical. Since we're 
in the shutdown hook, it should also catch our case:
{code}
if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
  // we only want to unregister if we don't want the RM to retry
  if (!unregistered) {
    unregister(finalStatus, finalMsg)
  }
  // Since we're done we should clean up the staging directory
  cleanupStagingDir(fs)
}
{code}

I'm not sure how to create a PR to check and discuss this change.
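
To make the proposal and the retry concern raised in this thread concrete, here is a minimal, self-contained sketch of the shutdown-hook decision. It is a simplified model, not the actual ApplicationMaster code: the names StagingCleanupSketch, Actions, nestedCleanup and hoistedCleanup are hypothetical, and the "nested" variant is only a reading of the current behaviour as described in the comment above (cleanup sitting inside the {{if (!unregistered)}} block).

```scala
// Simplified model of the shutdown-hook decision. This is NOT the real
// ApplicationMaster; all names here are hypothetical and for illustration only.
object StagingCleanupSketch {
  sealed trait FinalStatus
  case object Succeeded extends FinalStatus
  case object Failed extends FinalStatus
  case object Killed extends FinalStatus

  // What the shutdown hook decides to do in a given situation.
  final case class Actions(unregister: Boolean, cleanup: Boolean)

  // Assumed current shape: cleanup only happens together with unregistering,
  // i.e. it sits inside the `if (!unregistered)` block.
  def nestedCleanup(status: FinalStatus, isLastAttempt: Boolean, unregistered: Boolean): Actions =
    if (status == Succeeded || isLastAttempt) {
      if (!unregistered) Actions(unregister = true, cleanup = true)
      else Actions(unregister = false, cleanup = false)
    } else {
      // Another attempt may follow, so nothing is unregistered or removed.
      Actions(unregister = false, cleanup = false)
    }

  // Proposed shape: cleanup is moved out of `if (!unregistered)`, but stays
  // inside the "succeeded or last attempt" guard.
  def hoistedCleanup(status: FinalStatus, isLastAttempt: Boolean, unregistered: Boolean): Actions =
    if (status == Succeeded || isLastAttempt) {
      Actions(unregister = !unregistered, cleanup = true)
    } else {
      // Staging files are kept so that a retry by the RM can still use them.
      Actions(unregister = false, cleanup = false)
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical scenario: last attempt, AM already unregistered when the hook runs.
    println(nestedCleanup(Killed, isLastAttempt = true, unregistered = true))
    println(hoistedCleanup(Killed, isLastAttempt = true, unregistered = true))
  }
}
```

In this model both variants leave the staging directory alone when the attempt is neither successful nor the last one, so retries by the RM still find their files. The two differ only when the AM is already unregistered by the time the hook runs; whether that is what happens on a kill is exactly what would need to be confirmed on a real cluster.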
