dongjoon-hyun commented on issue #24865: [SPARK-27100][SQL] Use `Array` instead of `Seq` in `FilePartition` to prevent `StackOverflowError`
URL: https://github.com/apache/spark/pull/24865#issuecomment-504296393

Without the patch, this test case fails with `IllegalArgumentException` due to `Array(partitionDirectory)` type mismatch. (Of course, it's natural.)

```
[info] - SPARK-27100 stack overflow: read data with large partitions *** FAILED *** (5 milliseconds)
[info]   java.lang.IllegalArgumentException: Can't find a private method named: createBucketedReadRDD
```

If we use `Array(partitionDirectory).toSeq` or `Seq(partitionDirectory)`, this test case cannot detect the regression. In other words, it succeeds without the patch.

The test case fails only when we use the original form `Stream(partitionDirectory)`. However, `Stream(partitionDirectory)` will raise `IllegalArgumentException` with the patch due to the same type mismatch reason.

To sum up, it seems that we cannot have a meaningful test case at this level. To prevent a future regression, we need a higher level test case. Could you try to add that, @parthchandra ?
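For readers following along, the type-mismatch failure can be reproduced in isolation. The sketch below assumes the test calls the private method reflectively, matching on the method name and the runtime type of the argument. `Planner`, `plan`, and `invokePrivate` are hypothetical stand-ins, not Spark or ScalaTest code, and the lookup is a simplified approximation of what such a helper might do.

```scala
object PrivateLookupSketch {
  // Hypothetical stand-in for the class under test. As with the patched
  // method, the private method takes an Array rather than a Seq.
  class Planner {
    private def plan(dirs: Array[String]): Int = dirs.length
  }

  // Simplified reflective lookup: match on the method name and on whether
  // the single declared parameter type accepts the runtime argument.
  def invokePrivate(target: AnyRef, name: String, arg: AnyRef): AnyRef = {
    val method = target.getClass.getDeclaredMethods.find { m =>
      m.getName == name &&
        m.getParameterCount == 1 &&
        m.getParameterTypes.head.isInstance(arg)
    }.getOrElse(
      throw new IllegalArgumentException(s"Can't find a private method named: $name"))
    method.setAccessible(true)
    method.invoke(target, arg)
  }

  def main(args: Array[String]): Unit = {
    val planner = new Planner

    // The argument is an Array, so the lookup finds the method.
    println(invokePrivate(planner, "plan", Array("a", "b")))

    // The argument is a Seq, so no declared method matches and the lookup
    // throws before any of the method's own logic runs.
    try invokePrivate(planner, "plan", Seq("a", "b"))
    catch { case e: IllegalArgumentException => println(e.getMessage) }
  }
}
```

Under these assumptions, a mismatched collection type fails at lookup time, so the call never reaches the code path that would overflow the stack. That is consistent with the observation above that a test at this level cannot tell the patched and unpatched behavior apart.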