[ https://issues.apache.org/jira/browse/SPARK-30442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009963#comment-17009963 ]
Maxim Gekk commented on SPARK-30442:
------------------------------------

> This can cause issues, particularly with aws tools, that make it impossible
> to retry.

Could you clarify how it makes retries impossible? When the mode is set to overwrite, Spark deletes the entire folder and writes new files, so there should be no clashes. In append mode, new files are added; Spark does not append to existing files. In what situation would files need to be overwritten?

> Write mode ignored when using CodecStreams
> ------------------------------------------
>
>                 Key: SPARK-30442
>                 URL: https://issues.apache.org/jira/browse/SPARK-30442
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.4
>            Reporter: Jesse Collins
>            Priority: Major
>
> Overwrite is hardcoded to false in the codec stream. This can cause issues,
> particularly with aws tools, that make it impossible to retry.
> Ideally, this should be read from the write mode set for the DataWriter that
> is writing through this codec class.
> [https://github.com/apache/spark/blame/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CodecStreams.scala#L81]
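
For reference, here is a minimal Python sketch of the general "create without overwrite" behaviour under discussion. It is not Spark's or Hadoop's code. The helper names (`create_output`, `write_part`, `demo`) and the file name `part-00000` are hypothetical, and whether this matches the reporter's retry scenario is an assumption. It only shows that a second write attempt to the same path fails when the overwrite flag is false and succeeds when it is true.

```python
import os
import tempfile


def create_output(path, overwrite):
    """Open `path` for writing, refusing to replace an existing file
    unless `overwrite` is True (stand-in for an overwrite flag)."""
    # "xb" is exclusive creation: it raises FileExistsError if the file exists.
    mode = "wb" if overwrite else "xb"
    return open(path, mode)


def write_part(path, payload, overwrite):
    """Write one output file in a single attempt."""
    with create_output(path, overwrite) as out:
        out.write(payload)


def demo():
    """Simulate a first attempt, a retry without overwrite, and a retry with it."""
    results = []
    with tempfile.TemporaryDirectory() as out_dir:
        part = os.path.join(out_dir, "part-00000")  # hypothetical file name

        # First attempt leaves a file behind at the target path.
        write_part(part, b"attempt 1", overwrite=False)

        # A retry to the same path with overwrite disabled is rejected.
        try:
            write_part(part, b"attempt 2", overwrite=False)
            results.append("retry ok")
        except FileExistsError:
            results.append("retry failed")

        # With overwrite enabled, the retry replaces the earlier file.
        write_part(part, b"attempt 3", overwrite=True)
        with open(part, "rb") as f:
            results.append(f.read().decode())
    return results


if __name__ == "__main__":
    print(demo())
```

If the writer always targets fresh file names, as described above for overwrite and append modes, the exclusive-create branch never collides. The sketch only applies when a retry reuses the same path.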