[GitHub] spark pull request: SPARK-3655 GroupByKeyAndSortValues

koertkuipers Wed, 24 Dec 2014 17:35:52 -0800

GitHub user koertkuipers reopened a pull request:

    https://github.com/apache/spark/pull/3632


    SPARK-3655 GroupByKeyAndSortValues

    See https://issues.apache.org/jira/browse/SPARK-3655
    
    This pull request is based on the approach that uses
    repartitionAndSortWithinPartitions, but it implements only
    GroupByKeyAndSortValues, not foldLeft.
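
For readers unfamiliar with the pattern, here is a minimal plain-Scala sketch of the general "partition by key, sort within each partition, then group adjacent records" idea. It does not use Spark and is not the code in this pull request. The object and method names (`SecondarySortSketch`, `partitionByKey`, `groupSorted`) are hypothetical, and ordinary collections stand in for RDD partitions.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration only: plain collections stand in for RDD partitions.
object SecondarySortSketch {

  // Route each record to a partition using the key alone, so that every
  // value for a given key lands in the same partition.
  def partitionByKey[K, V](records: Seq[(K, V)], numPartitions: Int): Seq[Seq[(K, V)]] = {
    val byPartition = records.groupBy { case (k, _) =>
      java.lang.Math.floorMod(k.hashCode, numPartitions)
    }
    (0 until numPartitions).map(p => byPartition.getOrElse(p, Seq.empty[(K, V)]))
  }

  // Sort one partition by (key, value), then cut the sorted run into
  // per-key groups. Each group's values are held in memory.
  def groupSorted[K: Ordering, V: Ordering](partition: Seq[(K, V)]): Seq[(K, Vector[V])] = {
    val out = ArrayBuffer.empty[(K, ArrayBuffer[V])]
    partition.sorted.foreach { case (k, v) =>
      // Records with equal keys are adjacent after sorting, so comparing
      // with the last group is enough to detect a key change.
      if (out.nonEmpty && out.last._1 == k) out.last._2 += v
      else out += ((k, ArrayBuffer(v)))
    }
    out.map { case (k, vs) => (k, vs.toVector) }.toVector
  }

  // Combine both steps: the result has one entry per key, with sorted values.
  def groupByKeyAndSortValues[K: Ordering, V: Ordering](
      records: Seq[(K, V)],
      numPartitions: Int): Seq[(K, Vector[V])] =
    partitionByKey(records, numPartitions).flatMap(p => groupSorted(p))

  def main(args: Array[String]): Unit = {
    val records = Seq(("b", 3), ("a", 2), ("b", 1), ("a", 5), ("c", 4))
    groupByKeyAndSortValues(records, numPartitions = 2).foreach(println)
  }
}
```

The order in which keys appear in the result depends on how they hash into partitions, but the values within each key are always sorted. In this sketch the grouped values are materialized in memory for simplicity.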

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tresata/spark feat-group-by-key-and-sort-values

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3632.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3632
    
----
commit 7e3cde989ec93849d60988e6d9fae729ca0c46a4
Author: Koert Kuipers <ko...@tresata.com>
Date:   2014-12-07T20:16:53Z

    works but Iterables in signature are not right

commit 42075338a32c40e4b962b547dbc74aad89351207
Author: Koert Kuipers <ko...@tresata.com>
Date:   2014-12-07T21:57:25Z

    change groupByKeyAndSortValues to return RDD[(K, TraversableOnce[V])]
    instead of RDD[(K, Iterable[V])]. I don't think the Iterable version
    can be implemented efficiently

commit 4f7defe86c514f3d153feaed804cf77f1d402f63
Author: Koert Kuipers <ko...@tresata.com>
Date:   2014-12-10T14:44:18Z

    change groupByKeyAndSortValues to return RDD[(K, Iterable[V])], where
    the values (the iterables) are in-memory arrays

----


