[ https://issues.apache.org/jira/browse/HADOOP-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537052 ]
Doug Cutting commented on HADOOP-1981:
--------------------------------------

I'd rather keep this separate from HADOOP-2046, since it is not just documentation, but an incompatible code change.

As for names, I still like having 'output' in them, to remove potential confusion with join-like stuff that operates on inputs. We probably don't need 'key' in their names, since only keys are comparable anyway. So I'd vote for outputSortComparator and outputGroupComparator.

Perhaps in HADOOP-2046 we should document "grouping" as a primary mapreduce pipeline stage: map, (combine), sort, group, reduce?

> Need to document the controls for sorting and grouping into the reduce
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-1981
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1981
>             Project: Hadoop
>          Issue Type: Task
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>
> The JavaDoc for the Reducer should document how to control the sort order of
> keys and values via the JobConf methods:
> {code}
> setOutputKeyComparatorClass
> setOutputValueGroupingComparator
> {code}
> Both methods desperately need better names. (I'd vote for
> setKeySortingComparator and setKeyGroupingComparator.)
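
The distinction between the sort and group stages discussed above can be illustrated with a small sketch. It is plain Java and does not use the Hadoop API: the names SortGroupSketch, Key, SORT, GROUP and sortThenGroup are hypothetical, and ordinary java.util.Comparator instances stand in for the comparator classes that would be registered through the JobConf methods named in the issue. The assumption is a composite key (a name plus a sequence number) where the sort comparator orders on both fields while the group comparator looks at the name only, so each reduce call would see one name with its keys already in sequence order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortGroupSketch {
    /** Hypothetical composite map-output key: a natural key plus a secondary field. */
    static final class Key {
        final String name;
        final int seq;

        Key(String name, int seq) {
            this.name = name;
            this.seq = seq;
        }

        @Override
        public String toString() {
            return name + ":" + seq;
        }
    }

    /** Stand-in for the sort comparator: orders by name, then by seq. */
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.name).thenComparingInt(k -> k.seq);

    /** Stand-in for the group comparator: compares the name only. */
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.name);

    /** Sorts the keys with SORT, then starts a new group wherever GROUP reports a change. */
    static List<List<Key>> sortThenGroup(List<Key> mapOutput) {
        List<Key> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);

        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || GROUP.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> mapOutput = Arrays.asList(
            new Key("b", 2), new Key("a", 3), new Key("b", 1), new Key("a", 1));
        // Each printed line corresponds to the keys one reduce call would cover.
        for (List<Key> group : sortThenGroup(mapOutput)) {
            System.out.println(group);
        }
        // prints:
        // [a:1, a:3]
        // [b:1, b:2]
    }
}
```

In this sketch the group comparator is coarser than the sort comparator, which is the case where treating "group" as its own stage between sort and reduce makes a visible difference.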