GitHub user msiddalingaiah opened a pull request: https://github.com/apache/spark/pull/1090
    [SPARK-983] Support external sorting

    This satisfies SPARK-983 Support external sorting for RDD#sortByKey()

    It also adds a general sortPartitions() method to RDD.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/msiddalingaiah/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1090

----
commit a4f7404ac28a913b4b5cc390e3f321ae6627b941
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-05-05T03:23:46Z

    Address SPARK-1717

commit db47ac52f293b32af277fb7dfdaa6f484e7dde93
Author: Siddalingaiah Madhukesh <siddalingaiahm@ip-10-191-37-223.ec2.internal>
Date:   2014-05-05T14:55:57Z

    Merge remote-tracking branch 'upstream/master'

commit eb0d39f7603627a86ca4248a260a59fd041d7068
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-01T03:01:54Z

    Merge remote-tracking branch 'upstream/master'

commit a76b3a15bba7d23fb4d3cdd5cc57f0c8bef8efca
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-01T18:16:32Z

    Add sortPartitions(...)

commit f7dde2ecd38e099cdfc97336230d8bfc473c8179
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-11T00:50:29Z

    Merge remote-tracking branch 'upstream/master'

commit 63825ba3ff21a72a1308e97c9b5fe3368a86c68e
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-11T02:33:44Z

    Basic merge sert

commit 1d7709e5f013fc76bca92a97e7108efbc38071ec
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-12T00:30:08Z

    [SPARK-983][WIP] Add DiskBuffer/DiskBufferIterator

commit 5c2b55ab314b9c961a41910076d46820eef5e2c1
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-12T01:39:11Z

    [SPARK-983][WIP] refactor code/spill sublists

commit 38a82c4be1e5d99fce9ed50660c2062f80af3584
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-14T03:35:52Z

    Merge remote-tracking branch 'upstream/master'

commit 3dd4be624a082214113fbf3f7b2ab655cd14bf92
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-14T03:57:33Z

    [SPARK-983][WIP] fetch upstream/refactor code

commit 5ffdf96780e5a48c2542c3bb10c4459c9fbaea00
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-14T18:47:21Z

    [SPARK-983][WIP] add SizeEstimator

commit ad1a1a8b1a2e28d3bd900a0075c7a0e2ff741a20
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-15T01:51:58Z

    [SPARK-983] complete disk spill logic

commit 929955744fc401808a7c5ad72e581cf5a62ade0e
Author: msiddalingaiah <ma...@madhu.com>
Date:   2014-06-15T01:53:05Z

    Merge remote-tracking branch 'upstream/master'

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
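
For readers unfamiliar with the technique named in the pull request title, the following is a minimal, generic sketch of external sorting by key: sort bounded chunks in memory, spill each sorted run to disk, then merge the runs lazily. It is not the code from this pull request and does not use Spark. The names external_sort_by_key, _spill, _read_run and max_in_memory are hypothetical, and the fixed record-count threshold is an assumption made for simplicity (the commit log above mentions a SizeEstimator, which suggests the actual change estimates memory use instead).

```python
import heapq
import itertools
import pickle
import tempfile


def _spill(sorted_chunk):
    """Write one sorted run to a temporary file and return the open file."""
    f = tempfile.TemporaryFile()
    for record in sorted_chunk:
        pickle.dump(record, f)
    f.seek(0)  # rewind so the run can be read back from the start
    return f


def _read_run(f):
    """Lazily yield the records of one spilled run, then close its file."""
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        pass  # reached the end of the run
    finally:
        f.close()


def external_sort_by_key(records, max_in_memory=10_000):
    """Sort (key, value) pairs by key, sorting at most max_in_memory
    records in memory at a time (hypothetical threshold, see lead-in)."""
    it = iter(records)
    runs = []
    while True:
        # Take the next bounded chunk from the input.
        chunk = list(itertools.islice(it, max_in_memory))
        if not chunk:
            break
        chunk.sort(key=lambda kv: kv[0])  # in-memory sort of one chunk
        runs.append(_spill(chunk))        # spill the sorted run to disk
    # k-way merge: only one record per run is held in memory at a time.
    return heapq.merge(*(_read_run(f) for f in runs), key=lambda kv: kv[0])
```

The function returns a lazy iterator, so a caller can stream the sorted output, for example with `for key, value in external_sort_by_key(pairs, max_in_memory=1000): ...`, without materializing the whole result.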