GitHub user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221898450
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -315,7 +315,12 @@ object InMemoryFileIndex e
GitHub user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221895032
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -315,7 +315,12 @@ object InMemoryFileIndex ext
GitHub user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221890344
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -315,7 +315,12 @@ object InMemoryFileIndex e
GitHub user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221876275
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -315,7 +315,12 @@ object InMemoryFileIndex ext
GitHub user peter-toth opened a pull request:
https://github.com/apache/spark/pull/22603
SPARK-25062: clean up BlockLocations in InMemoryFileIndex
## What changes were proposed in this pull request?
`InMemoryFileIndex` caches `FileStatus` objects keyed by path. Each `FileStatus`