jin xing created SPARK-24240:
--------------------------------

             Summary: Add a config to control whether InMemoryFileIndex should update cache when refresh.
                 Key: SPARK-24240
                 URL: https://issues.apache.org/jira/browse/SPARK-24240
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: jin xing

In the current code (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L172), after data is inserted, Spark always refreshes the file index and updates the cache. If the target table has a very large number of files, the job can take a long time and may run out of memory (OOM).

Could we add a config to control whether InMemoryFileIndex should update the cache on refresh?
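
To make the request concrete, here is a minimal sketch of the behaviour being asked for, written as a toy Python model rather than as Spark code. Everything in it is hypothetical: `ToyFileIndex`, the `update_cache_on_refresh` flag, and the `listing_calls` counter are illustrative names, not part of Spark, and the sketch assumes that refresh always invalidates the cached listing while the flag only decides whether the expensive re-listing happens immediately or is deferred to the next read.

```python
from typing import Callable, List, Optional


class ToyFileIndex:
    """Toy stand-in for a file index that caches a directory listing.

    Hypothetical illustration only; this is not Spark's InMemoryFileIndex.
    """

    def __init__(self, list_files: Callable[[], List[str]],
                 update_cache_on_refresh: bool = True) -> None:
        self._list_files = list_files  # the expensive listing call
        self._update_cache_on_refresh = update_cache_on_refresh  # hypothetical flag
        self._cache: Optional[List[str]] = None
        self.listing_calls = 0  # counts how often the expensive listing ran

    def _load(self) -> List[str]:
        # Run the expensive listing and store a copy of the result.
        self.listing_calls += 1
        self._cache = list(self._list_files())
        return self._cache

    def all_files(self) -> List[str]:
        # List lazily on the first access after an invalidation.
        return self._cache if self._cache is not None else self._load()

    def refresh(self) -> None:
        # Always drop the stale listing so readers never see outdated files.
        self._cache = None
        # Re-list eagerly only when the flag asks for it.
        if self._update_cache_on_refresh:
            self._load()
```

With the flag off, an insert into a table with many files would pay only for the invalidation, and the listing cost would move to whichever later query actually reads the table. Whether that trade-off is acceptable in Spark itself is the open question of this issue.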