[jira] [Commented] (SPARK-18027) .sparkStaging not clean on RM ApplicationNotFoundException

    [ https://issues.apache.org/jira/browse/SPARK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15601231#comment-15601231 ]

David Shar commented on SPARK-18027:
------------------------------------

Yes, I believe it is safer, we cannot be sure what Yarn is doing on connection failure.

> .sparkStaging not clean on RM ApplicationNotFoundException
> ----------------------------------------------------------
>
>                 Key: SPARK-18027
>                 URL: https://issues.apache.org/jira/browse/SPARK-18027
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.6.0
>            Reporter: David Shar
>            Priority: Minor
>
> Hi,
> It seems that SPARK-7705 didn't fix all issues with .sparkStaging folder cleanup.
> in Client.scala:monitorApplication
> {code}
> val report: ApplicationReport =
>   try {
>     getApplicationReport(appId)
>   } catch {
>     case e: ApplicationNotFoundException =>
>       logError(s"Application $appId not found.")
>       return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
>     case NonFatal(e) =>
>       logError(s"Failed to contact YARN for application $appId.", e)
>       return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
>   }
>
> if (state == YarnApplicationState.FINISHED ||
>   state == YarnApplicationState.FAILED ||
>   state == YarnApplicationState.KILLED) {
>   cleanupStagingDir(appId)
>   return (state, report.getFinalApplicationStatus)
> }
> {code}
> In case of ApplicationNotFoundException, we don't cleanup the sparkStaging folder.
> I believe we should call cleanupStagingDir(appId) on the catch clause above.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-18027) .sparkStaging not clean on RM ApplicationNotFoundException

    [ https://issues.apache.org/jira/browse/SPARK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598281#comment-15598281 ]

David Shar edited comment on SPARK-18027 at 10/22/16 6:36 PM:
--------------------------------------------------------------

I believe there is a major difference between the 2 exceptions above.
1. ApplicationNotFoundException means there is no such running app according to YARN, so it is safe to clean up.
2. NonFatal means we failed to connect to YARN; we can't be sure whether the app is still running, so it is not safe to clean up.

Therefore, just add cleanup for the first exception.

was (Author: davidshar):
I believe there is a major difference between the 2 exceptions above.
1. ApplicationNotFoundException means there is no such running app according to YARN, so it is safe to clean up.
2. NonFatal means we failed to connect to YARN; we can't be sure whether the app is still running, so it is not safe to clean up.
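
The distinction drawn in this comment (clean up when YARN says the application does not exist, leave the staging files alone when YARN cannot be reached) can be illustrated with a small, self-contained sketch. This is not the Spark patch: `StagingCleanupSketch`, `AppState`, `Monitor`, `pollOnce`, and the local `ApplicationNotFoundException` are hypothetical stand-ins so the example runs without Hadoop on the classpath, and `cleanupStagingDir` only records the application id instead of deleting anything.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-ins for the YARN/Spark types named in the report.
object StagingCleanupSketch {
  sealed trait AppState
  case object Running extends AppState
  case object Finished extends AppState
  case object Failed extends AppState
  case object Killed extends AppState

  // Local stand-in for YARN's exception of the same name (assumption).
  final class ApplicationNotFoundException(msg: String) extends Exception(msg)

  // getApplicationState plays the role of getApplicationReport(appId).
  final class Monitor(getApplicationState: String => AppState) {
    private var cleanedIds: List[String] = Nil

    // Application ids whose staging directory was "cleaned", oldest first.
    def cleaned: List[String] = cleanedIds.reverse

    // Stand-in for cleanupStagingDir: records the id, touches no filesystem.
    private def cleanupStagingDir(appId: String): Unit =
      cleanedIds = appId :: cleanedIds

    def pollOnce(appId: String): AppState = {
      val state: AppState =
        try {
          getApplicationState(appId)
        } catch {
          case _: ApplicationNotFoundException =>
            // YARN knows of no such application: treated as safe to clean up.
            cleanupStagingDir(appId)
            return Killed
          case NonFatal(_) =>
            // YARN could not be reached: the app may still be running,
            // so the staging directory is left untouched.
            return Failed
        }

      // Terminal states clean up, as in the snippet quoted in the issue.
      if (state == Finished || state == Failed || state == Killed) {
        cleanupStagingDir(appId)
      }
      state
    }
  }
}
```

Constructing a `Monitor` whose lookup function throws `ApplicationNotFoundException` and calling `pollOnce` records the id in `cleaned`; a lookup that throws any other non-fatal exception leaves `cleaned` empty.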
[jira] [Commented] (SPARK-18027) .sparkStaging not clean on RM ApplicationNotFoundException

    [ https://issues.apache.org/jira/browse/SPARK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598281#comment-15598281 ]

David Shar commented on SPARK-18027:
------------------------------------

I believe there is a major difference between the 2 exceptions above.
1. ApplicationNotFoundException means there is no such running app according to YARN, so it is safe to clean up.
2. NonFatal means we failed to connect to YARN; we can't be sure whether the app is still running, so it is not safe to clean up.
[jira] [Updated] (SPARK-18027) .sparkStaging not clean on RM ApplicationNotFoundException

     [ https://issues.apache.org/jira/browse/SPARK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Shar updated SPARK-18027:
-------------------------------
    Summary: .sparkStaging not clean on RM ApplicationNotFoundException  (was: .sparkStaging not clean on error)
[jira] [Updated] (SPARK-18027) .sparkStaging not clean on error

     [ https://issues.apache.org/jira/browse/SPARK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Shar updated SPARK-18027:
-------------------------------
    Description: 
Hi,

It seems that SPARK-7705 didn't fix all issues with .sparkStaging folder cleanup.

in Client.scala:monitorApplication
{code}
val report: ApplicationReport =
  try {
    getApplicationReport(appId)
  } catch {
    case e: ApplicationNotFoundException =>
      logError(s"Application $appId not found.")
      return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
    case NonFatal(e) =>
      logError(s"Failed to contact YARN for application $appId.", e)
      return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
  }

if (state == YarnApplicationState.FINISHED ||
  state == YarnApplicationState.FAILED ||
  state == YarnApplicationState.KILLED) {
  cleanupStagingDir(appId)
  return (state, report.getFinalApplicationStatus)
}
{code}

In case of ApplicationNotFoundException, we don't cleanup the sparkStaging folder.
I believe we should call cleanupStagingDir(appId) on the catch clause above.


  was:
Hi,

It seems that SPARK-7705 didn't fix all issues with .sparkStaging folder cleanup.

in Client.scala:monitorApplication
{code}
val report: ApplicationReport =
  try {
    getApplicationReport(appId)
  } catch {
    case e: ApplicationNotFoundException =>
      logError(s"Application $appId not found.")
      return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
    case NonFatal(e) =>
      logError(s"Failed to contact YARN for application $appId.", e)
      return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
  }

if (state == YarnApplicationState.FINISHED ||
  state == YarnApplicationState.FAILED ||
  state == YarnApplicationState.KILLED) {
  cleanupStagingDir(appId)
  return (state, report.getFinalApplicationStatus)
}
{code}

In case of ApplicationNotFoundException, we don't cleanup the sparkStaging folder.
I believe we call cleanupStagingDir(appId) on the catch clause above.
[jira] [Created] (SPARK-18027) .sparkStaging not clean on error
David Shar created SPARK-18027:
----------------------------------

             Summary: .sparkStaging not clean on error
                 Key: SPARK-18027
                 URL: https://issues.apache.org/jira/browse/SPARK-18027
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.6.0
            Reporter: David Shar


Hi,

It seems that SPARK-7705 didn't fix all issues with .sparkStaging folder cleanup.

in Client.scala:monitorApplication
{code}
val report: ApplicationReport =
  try {
    getApplicationReport(appId)
  } catch {
    case e: ApplicationNotFoundException =>
      logError(s"Application $appId not found.")
      return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
    case NonFatal(e) =>
      logError(s"Failed to contact YARN for application $appId.", e)
      return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
  }

if (state == YarnApplicationState.FINISHED ||
  state == YarnApplicationState.FAILED ||
  state == YarnApplicationState.KILLED) {
  cleanupStagingDir(appId)
  return (state, report.getFinalApplicationStatus)
}
{code}

In case of ApplicationNotFoundException, we don't cleanup the sparkStaging folder.
I believe we call cleanupStagingDir(appId) on the catch clause above.