[ https://issues.apache.org/jira/browse/HADOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493233 ]
Hadoop QA commented on HADOOP-485:
----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12356672/485.patch applied and
successfully tested against trunk revision r534624.
Test results:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/105/testReport/
Console output:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/105/console
> allow a different comparator for grouping keys in calls to reduce
> -----------------------------------------------------------------
>
> Key: HADOOP-485
> URL: https://issues.apache.org/jira/browse/HADOOP-485
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Affects Versions: 0.5.0
> Reporter: Owen O'Malley
> Assigned To: Tahir Hashmi
> Attachments: 485.patch, 485.patch, 485.patch, 485.patch, 485.patch,
> Hadoop-485-pre.patch, TestUserValueGrouping.java.patch
>
>
> Some algorithms require that the values passed to reduce be sorted in a
> particular order, but extending the key with the additional sort fields
> causes those values to be handled by different calls to reduce. (The user
> then has to collect the values until a "real" key change is detected and
> only then process them.)
> It would be much easier if the framework let you define a second comparator
> that groups the values for each call to reduce. So if your reduce inputs
> look like:
> A1, V1
> A2, V2
> A3, V3
> B1, V4
> B2, V5
> instead of getting calls to reduce that look like:
> reduce(A1, {V1}); reduce(A2, {V2}); reduce(A3, {V3}); reduce(B1, {V4});
> reduce(B2, {V5});
> you could define the grouping comparator to just compare the letters and end
> up with:
> reduce(A1, {V1,V2,V3}); reduce(B1, {V4,V5});
> which is the desired outcome. Note that this assumes that the "extra" part of
> the key is used only for sorting, because reduce will see only the key of the
> first representative of each equivalence class.
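
The grouping behaviour described above can be illustrated with a small
standalone simulation. This is only a sketch in plain Java: it does not use
the Hadoop API or the attached patch, and the names GroupingSketch, Pair and
simulate are hypothetical, chosen for illustration. It sorts the pairs with
one comparator and starts a new reduce call only when a second (grouping)
comparator reports that the key has changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingSketch {
    /** Hypothetical key/value pair as it would arrive at the reducer. */
    public static final class Pair {
        final String key;
        final String value;

        public Pair(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Sorts the pairs by key with sortCmp, then walks them in order and
     * closes the current group whenever groupCmp says the next key differs
     * from the first key of that group. Each group becomes one reduce call,
     * labelled with the first key seen in the group.
     */
    public static List<String> simulate(List<Pair> input,
                                        Comparator<String> sortCmp,
                                        Comparator<String> groupCmp) {
        List<Pair> sorted = new ArrayList<>(input);
        sorted.sort((a, b) -> sortCmp.compare(a.key, b.key));

        List<String> calls = new ArrayList<>();
        String groupKey = null;                  // first key of the open group
        List<String> values = new ArrayList<>(); // values collected so far
        for (Pair p : sorted) {
            if (groupKey != null && groupCmp.compare(groupKey, p.key) != 0) {
                calls.add(format(groupKey, values));
                values = new ArrayList<>();
                groupKey = null;
            }
            if (groupKey == null) {
                groupKey = p.key;
            }
            values.add(p.value);
        }
        if (groupKey != null) {
            calls.add(format(groupKey, values));
        }
        return calls;
    }

    private static String format(String key, List<String> values) {
        return "reduce(" + key + ", {" + String.join(",", values) + "})";
    }

    public static void main(String[] args) {
        List<Pair> input = Arrays.asList(
            new Pair("B2", "V5"), new Pair("A1", "V1"), new Pair("A3", "V3"),
            new Pair("B1", "V4"), new Pair("A2", "V2"));

        // Sort on the full key; group either on the full key or on the letter only.
        Comparator<String> fullKey = Comparator.naturalOrder();
        Comparator<String> letterOnly =
            (a, b) -> Character.compare(a.charAt(0), b.charAt(0));

        System.out.println("Grouping on the full key:");
        simulate(input, fullKey, fullKey).forEach(System.out::println);
        System.out.println("Grouping on the letter only:");
        simulate(input, fullKey, letterOnly).forEach(System.out::println);
    }
}
```

With the letter-only grouping comparator the simulation produces the two
calls from the description, each labelled with the first key of its group.
The grouping comparator has to be consistent with the sort order (keys that
group together must be adjacent after sorting), otherwise a group would be
split across several calls.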