[jira] [Resolved] (SPARK-29259) Filesystem.exists is called even when not necessary for append save mode

Dongjoon Hyun (Jira) Thu, 26 Sep 2019 15:48:47 -0700


     [ https://issues.apache.org/jira/browse/SPARK-29259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Dongjoon Hyun resolved SPARK-29259.
-----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

This is resolved via https://github.com/apache/spark/pull/25928

> Filesystem.exists is called even when not necessary for append save mode
> ------------------------------------------------------------------------
>
>                 Key: SPARK-29259
>                 URL: https://issues.apache.org/jira/browse/SPARK-29259
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Rahij Ramsharan
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> When saving a DataFrame to a Hadoop file system 
> (https://github.com/apache/spark/blob/v2.4.4/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L93),
>  Spark first checks whether the output path exists before inspecting the 
> SaveMode to determine whether it should actually insert data. However, the 
> pathExists variable is not used in the case of SaveMode.Append. On some file 
> systems the exists call can be expensive, so this PR makes that call only 
> when necessary.
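
The idea described above can be illustrated with a minimal sketch. This is not the code from the pull request: it is a simplified Python model of the decision logic, and the names CountingFileSystem and should_insert, as well as the local SaveMode enum, are hypothetical stand-ins introduced only for illustration. It assumes that append mode writes regardless of whether the path exists, as the description states.

```python
from enum import Enum


class SaveMode(Enum):
    """Local stand-in for Spark's save modes (illustrative, not the Spark class)."""
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "errorifexists"
    IGNORE = "ignore"


class CountingFileSystem:
    """Hypothetical file system that counts how often exists() is called."""

    def __init__(self, existing_paths):
        self._existing = set(existing_paths)
        self.exists_calls = 0

    def exists(self, path):
        self.exists_calls += 1
        return path in self._existing


def should_insert(fs, path, mode):
    """Decide whether to write, probing the file system only when the mode needs it."""
    if mode is SaveMode.APPEND:
        # Append never consults the result, so skip the potentially expensive probe.
        return True
    path_exists = fs.exists(path)
    if mode is SaveMode.ERROR_IF_EXISTS and path_exists:
        raise FileExistsError(f"path {path} already exists.")
    if mode is SaveMode.IGNORE and path_exists:
        return False
    # Overwrite, or any mode when the path is absent: proceed with the write.
    # (Deleting existing data for Overwrite is omitted from this sketch.)
    return True


fs = CountingFileSystem({"/data/out"})
should_insert(fs, "/data/out", SaveMode.APPEND)
print(fs.exists_calls)  # prints 0: append did not touch the file system
```

Checking the mode before calling exists keeps the behaviour of the other save modes unchanged while removing one file system round trip from every append.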



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
