GitHub user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17591#discussion_r110692150
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
    @@ -220,6 +221,32 @@ class FileIndexSuite extends SharedSQLContext {
           assert(catalog.leafDirPaths.head == fs.makeQualified(dirPath))
         }
       }
    +
    +  test("SPARK-20280 - FileStatusCache with a partition with very many files") {
    +    /* fake the size, otherwise we need to allocate 2GB of data to trigger this bug */
    +    class MyFileStatus extends FileStatus with KnownSizeEstimation {
    +      override def estimatedSize: Long = 1000 * 1000 * 1000
    +    }
    +    /* files * MyFileStatus.estimatedSize should overflow to negative integer
    +     * so, make it between 2bn and 4bn
    +     */
    +    val files = (1 to 3).map { i =>
    +      new MyFileStatus()
    +    }
    +    val fileStatusCache = FileStatusCache.getOrCreate(spark)
    +    fileStatusCache.putLeafFiles(new Path("/tmp", "abc"), files.toArray)
    +    // scalastyle:off
    --- End diff --
    
    Let's remove this comment block; the JIRA should be used for tracking these things.
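
For context, the overflow that the test in the diff is meant to trigger can be sketched in isolation. This is only an illustration and not the Spark code: `OverflowSketch`, `totalEstimate`, and `narrowed` are hypothetical names, and the sketch assumes the bug comes from narrowing a summed size estimate above 2GB to a 32-bit `Int`.

```scala
object OverflowSketch {
  // Hypothetical per-file estimate, mirroring the 1000 * 1000 * 1000 used in the test.
  val estimatedSizePerFile: Long = 1000L * 1000L * 1000L

  // Total estimate for `fileCount` files, computed as a Long so it cannot overflow here.
  def totalEstimate(fileCount: Int): Long =
    fileCount.toLong * estimatedSizePerFile

  // Narrowing to Int keeps only the low 32 bits, so totals in the range
  // [2^31, 2^32) come out negative.
  def narrowed(fileCount: Int): Int =
    totalEstimate(fileCount).toInt
}
```

With three files the total is 3,000,000,000, which lies between 2^31 and 2^32, so `OverflowSketch.narrowed(3)` is negative. With two files the total still fits in an `Int`, which is why the test comment asks for a value between 2bn and 4bn.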

