[ 
https://issues.apache.org/jira/browse/SPARK-29259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-29259:
-------------------------------------

    Assignee: Rahij Ramsharan

> Filesystem.exists is called even when not necessary for append save mode
> ------------------------------------------------------------------------
>
>                 Key: SPARK-29259
>                 URL: https://issues.apache.org/jira/browse/SPARK-29259
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Rahij Ramsharan
>            Assignee: Rahij Ramsharan
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> When saving a DataFrame into Hadoop 
> ([https://github.com/apache/spark/blob/v2.4.4/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L93]),
>  Spark first checks whether the output path exists before inspecting the SaveMode to 
> determine if it should actually insert data. However, the pathExists variable 
> is not used in the case of SaveMode.Append. On some file systems, 
> the exists call can be expensive, so this PR makes that call only when 
> necessary.
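
The idea in the description above can be sketched in a few lines. This is only an illustrative Python model, not the Spark code or the actual patch: `CountingFileSystem` and `should_insert` are hypothetical names, the `SaveMode` enum merely mirrors Spark's save modes, and the Overwrite branch is simplified (it does not model deleting existing data). It shows the existence probe being skipped when the mode is Append.

```python
from enum import Enum


class SaveMode(Enum):
    # Mirrors the names of Spark's save modes; defined here for illustration only.
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "error"
    IGNORE = "ignore"


class CountingFileSystem:
    """Hypothetical stand-in for a file system whose exists() call is expensive."""

    def __init__(self, existing_paths):
        self.existing_paths = set(existing_paths)
        self.exists_calls = 0  # how many times exists() was invoked

    def exists(self, path):
        self.exists_calls += 1
        return path in self.existing_paths


def should_insert(fs, path, mode):
    """Decide whether to write, probing the path only when the mode needs it."""
    if mode is SaveMode.APPEND:
        # Append writes whether or not the path exists, so skip the probe.
        return True
    path_exists = fs.exists(path)
    if mode is SaveMode.ERROR_IF_EXISTS and path_exists:
        raise FileExistsError(f"path {path} already exists")
    if mode is SaveMode.IGNORE and path_exists:
        return False
    # Overwrite, or any mode where the path is absent: proceed with the write.
    return True
```

With this shape, an append against a slow object store would cost no existence check at all, while the other modes still pay for exactly one.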



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
