Github user fangshil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20931#discussion_r179672722

    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -186,7 +186,9 @@ class HadoopMapReduceCommitProtocol(
           logDebug(s"Clean up default partition directories for overwriting: $partitionPaths")
           for (part <- partitionPaths) {
             val finalPartPath = new Path(path, part)
    -        fs.delete(finalPartPath, true)
    +        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
    --- End diff --

    The FileSystem API spec on `delete` says: "Code SHOULD just call delete(path, recursive) and assume the destination is no longer present". Per the detailed spec, the only case in which `delete` returns false here is that `finalPartPath` does not exist; any other failure should surface as an exception. When `finalPartPath` does not exist, which is an expected case, we only need to check whether the parent of `finalPartPath` exists, because the spec on `rename` requires the rename destination's parent to exist. Please advise if you think we should still double-check `finalPartPath` before the rename.
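    For context, here is a minimal, self-contained sketch of the delete-then-recreate-parent pattern being discussed, assuming a Hadoop FileSystem handle obtained from a plain Configuration. The paths (`/tmp/output`, the staged partition dir) and the `fs.mkdirs` body of the `if` are illustrative assumptions, since the diff above is truncated at the opening brace; this is not necessarily the exact body in the PR.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    object PartitionOverwriteSketch {
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()
        val outputRoot = new Path("/tmp/output")            // hypothetical final output root
        val stagedPartPath = new Path("/tmp/staging/p=1")   // hypothetical staged partition dir
        val fs = outputRoot.getFileSystem(conf)

        val finalPartPath = new Path(outputRoot, "p=1")

        // Per the FileSystem spec, delete(path, recursive = true) returning false
        // here means the path was already absent; genuine failures throw instead.
        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
          // The rename spec requires the destination's parent to exist, so
          // recreate it before moving the staged partition into place.
          // (Assumed body -- the original diff ends at the opening brace.)
          fs.mkdirs(finalPartPath.getParent)
        }
        fs.rename(stagedPartPath, finalPartPath)
      }
    }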