[ 
https://issues.apache.org/jira/browse/CRUNCH-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478274#comment-13478274
 ] 

Josh Wills commented on CRUNCH-96:
----------------------------------

I ran into it on a machine learning project I was working on, and it seems to 
come up fairly often in sessionization applications (e.g., group by user ID, 
then sort each user's events by timestamp); see 
https://www.google.com/search?q=mapreduce+secondary+sort

Your point on naming is well taken: this isn't a total ordering on the keys, 
it's just a sort on the values going into the reducer. Something more like 
GroupByKeyWithSecondarySort would be more accurate (albeit more verbose). 
Recommendations?
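
To make the distinction concrete, here is a minimal in-memory sketch in plain 
Java of the semantics I mean: values are grouped by key and sorted within each 
group, while the keys themselves are left unordered. It does not use the 
Crunch API or the attached patch; the Event class and the 
groupByUserSortedByTime method are hypothetical names made up for this 
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    /** Hypothetical event type: userId is the grouping key, timestamp the secondary sort field. */
    public static final class Event {
        public final String userId;
        public final long timestamp;
        public final String action;

        public Event(String userId, long timestamp, String action) {
            this.userId = userId;
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    /** Groups events by user ID and sorts each user's events by timestamp. */
    public static Map<String, List<Event>> groupByUserSortedByTime(List<Event> events) {
        // LinkedHashMap keeps keys in first-seen order: no ordering is imposed on the keys.
        Map<String, List<Event>> grouped = new LinkedHashMap<>();
        for (Event e : events) {
            grouped.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
        }
        // The "secondary sort": order only the values inside each group.
        for (List<Event> userEvents : grouped.values()) {
            userEvents.sort(Comparator.comparingLong((Event e) -> e.timestamp));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("u2", 30L, "logout"),
            new Event("u1", 20L, "click"),
            new Event("u2", 10L, "login"),
            new Event("u1", 5L, "login"));
        for (Map.Entry<String, List<Event>> entry : groupByUserSortedByTime(events).entrySet()) {
            for (Event e : entry.getValue()) {
                System.out.println(entry.getKey() + " " + e.timestamp + " " + e.action);
            }
        }
    }
}
```

In an actual MapReduce job the per-group sort would typically be pushed into 
the shuffle (e.g., via a composite key and a grouping comparator) rather than 
done in memory as above; the sketch only shows what the reducer ends up seeing.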
                
> Add secondary sort functionality to o.a.c.lib
> ---------------------------------------------
>
>                 Key: CRUNCH-96
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-96
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core, MapReduce Patterns
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.4.0
>
>         Attachments: CRUNCH-96.patch
>
>
> I've been working on a problem that required a secondary sorting pattern that 
> was very similar to the example that Alex Kozlov created in CRUNCH-78, so it 
> would be good to extract the pattern from the example and move it to 
> o.a.c.lib so it can be easily available to clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira