Repository: spark
Updated Branches:
  refs/heads/master 3cff81615 -> 07fcbea51
[SPARK-18800][SQL] Correct the assert in UnsafeKVExternalSorter which ensures array size

## What changes were proposed in this pull request?

`UnsafeKVExternalSorter` uses `UnsafeInMemorySorter` to sort the records of `BytesToBytesMap` if it is given a map.

Currently we use the number of keys in `BytesToBytesMap` to determine whether the array used for sorting is large enough. We have an assert that ensures the size of the array is sufficient: `map.numKeys() <= map.getArray().size() / 2`.

However, each record in the map takes two entries in the array: one is the record pointer, the other is the key prefix. So the correct assert should be `map.numKeys() * 2 <= map.getArray().size() / 2`.

## How was this patch tested?

N/A

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Liang-Chi Hsieh <vii...@gmail.com>

Closes #16232 from viirya/SPARK-18800-fix-UnsafeKVExternalSorter.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/07fcbea5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/07fcbea5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/07fcbea5

Branch: refs/heads/master
Commit: 07fcbea516cda66498b9346467a34733f14e8605
Parents: 3cff816
Author: Liang-Chi Hsieh <vii...@gmail.com>
Authored: Sat Dec 24 12:05:49 2016 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Sat Dec 24 12:05:49 2016 +0000

----------------------------------------------------------------------
 .../org/apache/spark/sql/execution/UnsafeKVExternalSorter.java | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/07fcbea5/sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java
----------------------------------------------------------------------
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java b/sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java
index 0d51dc9..ee5bcfd 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java
@@ -97,7 +97,9 @@ public final class UnsafeKVExternalSorter {
         canUseRadixSort);
     } else {
       // The array will be used to do in-place sort, which require half of the space to be empty.
-      assert(map.numKeys() <= map.getArray().size() / 2);
+      // Note: each record in the map takes two entries in the array, one is record pointer,
+      // another is the key prefix.
+      assert(map.numKeys() * 2 <= map.getArray().size() / 2);
       // During spilling, the array in map will not be used, so we can borrow that and use it
       // as the underlying array for in-memory sorter (it's always large enough).
       // Since we will not grow the array, it's fine to pass `null` as consumer.
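The arithmetic behind the corrected assert can be illustrated with a minimal, standalone sketch. It is not Spark code: the class `SortArrayCapacity`, its methods, and the constant `ENTRIES_PER_RECORD` are hypothetical names introduced here, and the sketch assumes the array size is counted in entries (slots), as the commit message implies.

```java
// Hypothetical illustration only; none of these names exist in Spark.
final class SortArrayCapacity {
    // Per the commit message: each record takes two array entries,
    // one for the record pointer and one for the key prefix.
    static final long ENTRIES_PER_RECORD = 2;

    // The original check: compares the key count directly against half the array.
    static boolean oldCheck(long numKeys, long arraySize) {
        return numKeys <= arraySize / 2;
    }

    // The corrected check: compares the entries actually occupied
    // (two per key) against the half of the array that may be in use.
    static boolean correctedCheck(long numKeys, long arraySize) {
        return numKeys * ENTRIES_PER_RECORD <= arraySize / 2;
    }

    public static void main(String[] args) {
        long arraySize = 16; // assumed size in entries, chosen for illustration
        for (long numKeys : new long[] {4, 5, 8, 9}) {
            System.out.println("numKeys=" + numKeys
                + " oldCheck=" + oldCheck(numKeys, arraySize)
                + " correctedCheck=" + correctedCheck(numKeys, arraySize));
        }
    }
}
```

With a 16-entry array, the original check accepts up to 8 keys, which would occupy all 16 entries and leave nothing free for the in-place sort. The corrected check accepts at most 4 keys, so the occupied entries never exceed half of the array.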