Github user ash211 commented on the issue:

    https://github.com/apache/spark/pull/20372

Tagging folks who have touched this code recently: @vgankidi @ericl @davies

This seems to provide a more compact packing in every scenario, which should improve execution times.

One risk is that individual partitions are no longer always contiguous, in-order ranges of files; some of them now have a gap. In the test this is the `(file1, file6)` partition. If anything depends on the old behavior it could now break, though I don't think anything should require this partition ordering.
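To make the gap concrete, here is a rough sketch of the two packing styles. It is not the code from this PR or from Spark: the function names, the file sizes, and the 10-byte limit are all made up for illustration, and the "fill from the tail" strategy is only my reading of the change. Files are assumed to be already sorted largest first, and indices 0..5 stand in for file1..file6.

```python
from collections import deque


def pack_contiguous(sizes, max_bytes):
    """Close the current partition as soon as the next file does not fit.

    Every partition is a contiguous, in-order range of file indices.
    """
    partitions, current, used = [], [], 0
    for idx, size in enumerate(sizes):
        if current and used + size > max_bytes:
            partitions.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        partitions.append(current)
    return partitions


def pack_with_tail_fill(sizes, max_bytes):
    """Hypothetical variant: when the next file from the front does not fit,
    top the partition up with the smallest remaining files from the back.

    Partitions can now skip over files, e.g. [0, 5].
    """
    remaining = deque(range(len(sizes)))
    partitions = []
    while remaining:
        first = remaining.popleft()
        current, used = [first], sizes[first]
        # Take further files from the front (largest first) while they fit.
        while remaining and used + sizes[remaining[0]] <= max_bytes:
            idx = remaining.popleft()
            current.append(idx)
            used += sizes[idx]
        # Then fill leftover space from the back (smallest first).
        while remaining and used + sizes[remaining[-1]] <= max_bytes:
            idx = remaining.pop()
            current.append(idx)
            used += sizes[idx]
        partitions.append(current)
    return partitions


# Made-up sizes for file1..file6, sorted largest first.
sizes = [9, 8, 5, 3, 2, 1]
contiguous = pack_contiguous(sizes, max_bytes=10)
tail_filled = pack_with_tail_fill(sizes, max_bytes=10)
```

With these sizes the contiguous packing gives `[[0], [1], [2, 3, 4], [5]]` (four partitions), while the tail-fill packing gives `[[0, 5], [1, 4], [2, 3]]` (three partitions). The first of those is the same shape as the `(file1, file6)` partition in the test: tighter, but no longer a contiguous range.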