[ https://issues.apache.org/jira/browse/SPARK-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688502#comment-15688502 ]
Yang Li commented on SPARK-1677:
--------------------------------

Hi Spark Community,

I'm curious about the behavior of the "spark.hadoop.validateOutputSpecs" option. If I set it to 'false', will existing files in the output directory get wiped out beforehand? For example, if a Spark job is to output file Y under directory A, which already contains file X, do we expect both files X and Y under folder A, or will only Y be retained after the job completes?

Thanks!

> Allow users to avoid Hadoop output checks if desired
> ----------------------------------------------------
>
>                 Key: SPARK-1677
>                 URL: https://issues.apache.org/jira/browse/SPARK-1677
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Patrick Wendell
>            Assignee: Nan Zhu
>             Fix For: 1.0.1, 1.1.0
>
> For compatibility with older versions of Spark it would be nice to have an
> option `spark.hadoop.validateOutputSpecs` (default true) and a description
> "If set to true, validates the output specification used in saveAsHadoopFile
> and other variants. This can be disabled to silence exceptions due to
> pre-existing output directories."
>
> This would just wrap the checking done in this PR:
> https://issues.apache.org/jira/browse/SPARK-1100
> https://github.com/apache/spark/pull/11
> By first checking the spark conf.
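For illustration, here is a minimal sketch of how the option described above would be set; it assumes Spark 1.0.1 or later (per the Fix Versions), a local master, and a hypothetical output path, and it only demonstrates that the pre-existing-directory check is skipped, not what happens to any files already in the directory:

{code}
import org.apache.spark.{SparkConf, SparkContext}

object ValidateOutputSpecsSketch {
  def main(args: Array[String]): Unit = {
    // Disable the output-spec validation introduced in SPARK-1100 so that
    // saveAsTextFile / saveAsHadoopFile no longer fail fast when the
    // output directory already exists.
    val conf = new SparkConf()
      .setAppName("validateOutputSpecs-sketch")
      .setMaster("local[*]")                              // assumption: local run
      .set("spark.hadoop.validateOutputSpecs", "false")
    val sc = new SparkContext(conf)

    // With the check disabled, this call does not throw even if the
    // directory below (a hypothetical path) already contains files.
    sc.parallelize(Seq("a", "b", "c"))
      .saveAsTextFile("/tmp/validate-output-specs-demo")

    sc.stop()
  }
}
{code}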