viirya commented on a change in pull request #26312:
URL: https://github.com/apache/spark/pull/26312#discussion_r506086784
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
##########

@@ -281,6 +281,10 @@ object FileFormatWriter extends Logging {
     } catch {
       case e: FetchFailedException =>
         throw e
+      case f: FileAlreadyExistsException =>

Review comment:
   I see. Thanks for the details. We have different standpoints: for your cases, the first option looks like the better choice. The customers we had are using HDFS, where `FileAlreadyExistsException` isn't recoverable, so the pain point is the extra time spent on a job that is bound to fail. I believe that even after SPARK-27194 is resolved, fast-failing a job on `FileAlreadyExistsException`, or on other errors we know in advance are unrecoverable, is still useful.

   Seems to me there are two options: one is to revert this completely; the other is to add a config for the fast-fail behavior and set it to false by default. I prefer the second one for the reason above: it relieves the pain of wasting time on a failed job for users who want that. WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
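[Editor's note] The second option discussed above (a config flag, false by default, that turns `FileAlreadyExistsException` into an immediate job failure) could be sketched roughly as below. This is a hedged illustration, not the actual Spark patch: the flag name `fastFailOnFileExists`, the `UnrecoverableTaskError` wrapper, and the `runTask` helper are all hypothetical, and `java.nio.file.FileAlreadyExistsException` stands in for Hadoop's exception of the same name so the snippet is self-contained.

```scala
import java.nio.file.FileAlreadyExistsException

object FastFailSketch {
  // Hypothetical flag; in Spark this would be a SQLConf entry,
  // defaulting to false so existing retry behavior is unchanged.
  var fastFailOnFileExists: Boolean = false

  // Marker exception the scheduler would treat as non-retriable.
  class UnrecoverableTaskError(cause: Throwable) extends RuntimeException(cause)

  def runTask(body: => Unit): Unit = {
    try body
    catch {
      case f: FileAlreadyExistsException if fastFailOnFileExists =>
        // Fast-fail path: wrap the error so the job aborts immediately
        // instead of spending time retrying a doomed task.
        throw new UnrecoverableTaskError(f)
      case f: FileAlreadyExistsException =>
        throw f // default path: keep the existing retry semantics
    }
  }
}
```

With the flag off, the exception propagates unchanged (and would be retried as today); with it on, the wrapper signals an unrecoverable failure right away.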