[ https://issues.apache.org/jira/browse/SPARK-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214635#comment-14214635 ]
Vijay commented on SPARK-4402:
------------------------------

Thanks for the explanation. It is clear now.

> Output path validation of an action statement resulting in runtime exception
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-4402
>                 URL: https://issues.apache.org/jira/browse/SPARK-4402
>             Project: Spark
>          Issue Type: Wish
>            Reporter: Vijay
>            Priority: Minor
>
> Output path validation happens at statement execution time, as part of the
> lazy evaluation of the action statement. If the path already exists, a
> runtime exception is thrown. All the processing completed up to that point
> is then lost, which wastes resources (processing time and CPU usage).
> If this I/O-related validation were done before the RDD action operations,
> this runtime exception could be avoided.
> I believe a similar validation feature is implemented in Hadoop as well.
> Example:
> SchemaRDD.saveAsTextFile() evaluates the path at runtime
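
The idea in the quoted description, checking the output path before any expensive work starts, can be sketched in plain Python. This is only an illustration under stated assumptions, not how Spark implements its check: it uses the local filesystem via `os.path`, and the names `ensure_output_path_free`, `run_with_precheck`, `compute`, and `save` are hypothetical. On HDFS the equivalent check would go through the Hadoop FileSystem API instead.

```python
import os
import tempfile


def ensure_output_path_free(path):
    """Raise FileExistsError if `path` already exists (local filesystem only)."""
    if os.path.exists(path):
        raise FileExistsError(f"Output path already exists: {path}")


def run_with_precheck(output_path, compute, save):
    """Validate the output path first, then compute and save.

    `compute` and `save` are hypothetical callables standing in for the
    expensive transformations and the final save action.
    """
    ensure_output_path_free(output_path)  # fail fast, before any work is done
    result = compute()
    save(result, output_path)
    return result


if __name__ == "__main__":
    # A temporary directory plays the role of an output path that already exists.
    with tempfile.TemporaryDirectory() as existing_dir:
        calls = []
        try:
            run_with_precheck(
                existing_dir,
                compute=lambda: calls.append("computed"),
                save=lambda result, path: None,
            )
        except FileExistsError as err:
            print(f"Rejected before computing: {err}")
        print(f"compute() ran {len(calls)} time(s)")
```

Because the check runs before `compute`, an existing path is rejected without spending any processing time. The check is still racy: another process could create the path between the check and the save, so the save step itself should keep its own validation.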