viirya commented on a change in pull request #26312:
URL: https://github.com/apache/spark/pull/26312#discussion_r506086784



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
##########
@@ -281,6 +281,10 @@ object FileFormatWriter extends Logging {
     } catch {
       case e: FetchFailedException =>
         throw e
+      case f: FileAlreadyExistsException =>

Review comment:
       I see. Thanks for the details. We have different standpoints: for your cases, the first option looks like the better choice. The customers we have are using HDFS, where `FileAlreadyExistsException` isn't recoverable, so the pain point is the extra time spent on a job that is bound to fail.
   
   I believe that even after SPARK-27194 is resolved, fast-failing a job on `FileAlreadyExistsException`, or on other errors we know in advance are unrecoverable, is still useful.
   
   Seems to me there are two options: one is to revert this completely; the other is to add a config for the fast-fail behavior, set to false by default. I prefer the second one for the reason above: it lets users who want it avoid wasting time on a job that is bound to fail.
   
   WDYT?
   
   



