jin xing created SPARK-24240:
--------------------------------

             Summary: Add a config to control whether InMemoryFileIndex should update cache when refresh.
                 Key: SPARK-24240
                 URL: https://issues.apache.org/jira/browse/SPARK-24240
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: jin xing


In the current code
(https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L172),
after data is inserted, Spark always refreshes the file index and updates the
cache. If the target table has a very large number of files, the job can take a
long time and may run into OOM issues. Could we add a config to control whether
InMemoryFileIndex updates its cache on refresh?
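
To make the request concrete, here is a minimal toy sketch of the behaviour being asked for. It is not Spark code and does not use Spark's API: the class ToyFileIndex, the flag update_cache_on_refresh, and the listing_calls counter are all hypothetical names invented for illustration. The assumption is that refresh always invalidates the cached listing, and the flag only decides whether the expensive re-listing happens right away or is deferred until the files are next needed.

```python
from typing import Callable, Dict, List


class ToyFileIndex:
    """Toy stand-in for a cached file index (NOT Spark's InMemoryFileIndex)."""

    def __init__(
        self,
        list_files: Callable[[str], List[str]],
        root: str,
        update_cache_on_refresh: bool = True,  # hypothetical flag
    ) -> None:
        self._list_files = list_files  # the expensive listing call
        self._root = root
        self._update_cache_on_refresh = update_cache_on_refresh
        self._cache: Dict[str, List[str]] = {}
        self.listing_calls = 0  # counts how often the listing ran

    def _load(self) -> List[str]:
        # Run the listing and store the result in the cache.
        self.listing_calls += 1
        files = self._list_files(self._root)
        self._cache[self._root] = files
        return files

    def refresh(self) -> None:
        # Always drop the stale entry so later reads never see old data.
        self._cache.pop(self._root, None)
        # Re-list eagerly only when the flag asks for it.
        if self._update_cache_on_refresh:
            self._load()

    def all_files(self) -> List[str]:
        # Serve from the cache if present, otherwise list lazily.
        cached = self._cache.get(self._root)
        return cached if cached is not None else self._load()
```

With the flag set to False, a job that only inserts data and calls refresh() never pays for the listing. The cost moves to the first later call to all_files(), if there is one.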



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
