Song Jun created SPARK-19748:
--------------------------------

             Summary: refresh for InMemoryFileIndex with FileStatusCache does not work correctly
                 Key: SPARK-19748
                 URL: https://issues.apache.org/jira/browse/SPARK-19748
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Song Jun


If we refresh an InMemoryFileIndex that has a FileStatusCache, it first uses the
FileStatusCache to regenerate cachedLeafFiles etc., and only then calls
FileStatusCache.invalidateAll. These two actions are in the wrong order, so the
refresh reads the stale cached listing and does not take effect.

{code}
  override def refresh(): Unit = {
    refresh0()
    fileStatusCache.invalidateAll()
  }

  private def refresh0(): Unit = {
    val files = listLeafFiles(rootPaths)
    cachedLeafFiles =
      new mutable.LinkedHashMap[Path, FileStatus]() ++= files.map(f => f.getPath -> f)
    cachedLeafDirToChildrenFiles = files.toArray.groupBy(_.getPath.getParent)
    cachedPartitionSpec = null
  }
{code}
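
To illustrate the ordering problem, here is a minimal self-contained sketch, not
the Spark implementation. ToyStatusCache, ToyFileIndex, listFromFs and the two
refresh variants are hypothetical stand-ins; the sketch assumes that the file
listing consults the cache first and only falls back to the file system on a
miss, as described above.

{code}
import scala.collection.mutable

// Hypothetical stand-in for FileStatusCache: directory -> listed file names.
class ToyStatusCache {
  private val entries = mutable.Map.empty[String, Seq[String]]
  def get(dir: String): Option[Seq[String]] = entries.get(dir)
  def put(dir: String, files: Seq[String]): Unit = entries(dir) = files
  def invalidateAll(): Unit = entries.clear()
}

// Hypothetical stand-in for InMemoryFileIndex; listFromFs plays the file system.
class ToyFileIndex(dir: String, cache: ToyStatusCache, listFromFs: String => Seq[String]) {
  var cachedLeafFiles: Seq[String] = Seq.empty
  refresh0() // initial listing, which also populates the cache

  private def refresh0(): Unit = {
    // Use the cached listing if present; otherwise list and remember the result.
    cachedLeafFiles = cache.get(dir).getOrElse {
      val listed = listFromFs(dir)
      cache.put(dir, listed)
      listed
    }
  }

  // Order from the report: the stale entry is read before it is dropped.
  def refreshAsReported(): Unit = { refresh0(); cache.invalidateAll() }

  // Swapped order: drop cached entries first so refresh0() re-lists.
  def refreshInvalidateFirst(): Unit = { cache.invalidateAll(); refresh0() }
}

object RefreshOrderDemo {
  def main(args: Array[String]): Unit = {
    val fs = mutable.Map("/t" -> Seq("a.parquet"))
    val reported = new ToyFileIndex("/t", new ToyStatusCache, d => fs(d))
    val swapped = new ToyFileIndex("/t", new ToyStatusCache, d => fs(d))

    fs("/t") = Seq("a.parquet", "b.parquet") // a new file appears on disk

    reported.refreshAsReported()
    swapped.refreshInvalidateFirst()
    println(s"reported order: ${reported.cachedLeafFiles}")
    println(s"swapped order:  ${swapped.cachedLeafFiles}")
  }
}
{code}

In this toy model only the swapped order picks up the new file on the first
refresh, which suggests calling fileStatusCache.invalidateAll() before
refresh0() in refresh().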



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
