Rahij Ramsharan created SPARK-29259:
---------------------------------------

             Summary: Filesystem.exists is called even when not necessary for append save mode
                 Key: SPARK-29259
                 URL: https://issues.apache.org/jira/browse/SPARK-29259
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.4
            Reporter: Rahij Ramsharan


When saving a DataFrame to a Hadoop-compatible file system 
([https://github.com/apache/spark/blob/v2.4.4/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L93]),
 Spark checks whether the output path exists before it inspects the SaveMode to 
determine whether it should actually insert data. However, the pathExists 
variable is never used when the mode is SaveMode.Append. On some file systems 
the exists call can be expensive, so this PR makes that call only when the 
save mode actually needs its result.
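
The control flow described above can be modelled outside Spark. The following is a minimal Python sketch, not Spark's implementation: CountingFs, should_insert and the SaveMode enum here are hypothetical stand-ins invented for illustration, and the handling of the non-append modes is simplified. It only shows how deferring the existence check lets the append path skip the file system call entirely.

```python
from enum import Enum


class SaveMode(Enum):
    # Hypothetical mirror of the four save modes named in the issue.
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "error_if_exists"
    IGNORE = "ignore"


class CountingFs:
    """Hypothetical file system stub that counts exists() calls."""

    def __init__(self, existing_paths):
        self.existing_paths = set(existing_paths)
        self.exists_calls = 0

    def exists(self, path):
        self.exists_calls += 1
        return path in self.existing_paths


def should_insert(fs, path, mode):
    """Decide whether to write, calling fs.exists only when the mode needs it."""
    # Append writes regardless of whether the path exists, so no lookup.
    if mode is SaveMode.APPEND:
        return True

    # Every other mode in this sketch looks at the result.
    path_exists = fs.exists(path)

    if mode is SaveMode.ERROR_IF_EXISTS:
        if path_exists:
            raise FileExistsError(f"path {path} already exists.")
        return True
    if mode is SaveMode.OVERWRITE:
        # Simplification: a real writer would clear existing data first.
        return True
    if mode is SaveMode.IGNORE:
        return not path_exists
    raise ValueError(f"unsupported save mode {mode}")
```

With this structure, a call such as should_insert(fs, "/out", SaveMode.APPEND) returns without touching fs, while the other modes still perform exactly one exists call each.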



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
