Github user advancedxy commented on the pull request: https://github.com/apache/spark/pull/4783#issuecomment-78435831

@shivaram The reason ExternalSorter fails is that it doesn't spill files to disk in these two failing tests. (If we increase the input from `0 until 100000` to `0 until 200000`, it does spill to disk and the tests pass.) However, the input type for the sorter is `Iterator[(Int, Int)]`, and for `(Int, Int)` the old SizeEstimator gives 32 bytes and the new SizeEstimator gives the same 32 bytes (assuming a 64-bit JVM with UseCompressedOops on). So it's very weird to see different results.

Since @mateiz seems to be busy: @jerryshao, could you take a look at the failed tests, since you wrote some of them?

Also, I believe there is another bug in the current SizeEstimator: Scala specializes Tuples for Int, Long, Float, and Double, so the size of `(Int, Int)` should be 24 bytes rather than 32 bytes. I verified this with the method introduced in this article: http://www.javaworld.com/article/2077496/testing-debugging/java-tip-130--do-you-know-your-data-size-.html. I will take a look at the size problem.
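For reference, here is a minimal sketch of the layout arithmetic behind the 24-byte claim, assuming a 64-bit HotSpot JVM with UseCompressedOops on (as in the comment above). The object and constant names are illustrative, not part of Spark's SizeEstimator:

```scala
// Back-of-envelope shallow-size arithmetic for a specialized (Int, Int),
// i.e. Tuple2$mcII$sp, which stores its two elements as unboxed Int fields.
// Assumes a 64-bit JVM with compressed oops: 12-byte object header
// (8-byte mark word + 4-byte compressed class pointer), 8-byte alignment.
object TupleLayoutSketch {
  val HeaderBytes = 12 // assumed header size under compressed oops
  val AlignBytes  = 8  // HotSpot rounds object sizes up to 8 bytes

  def align(n: Int): Int = ((n + AlignBytes - 1) / AlignBytes) * AlignBytes

  // header + two unboxed 4-byte Int fields, rounded up to the alignment:
  // 12 + 4 + 4 = 20, aligned to 24.
  def specializedIntPairSize: Int = align(HeaderBytes + 4 + 4)

  def main(args: Array[String]): Unit =
    println(specializedIntPairSize) // 24
}
```

If the tuple were not specialized, the shallow object would instead hold two compressed references plus two boxed `Integer`s, which is where a larger estimate would come from.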