[
https://issues.apache.org/jira/browse/HADOOP-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Owen O'Malley resolved HADOOP-1827.
-----------------------------------
Resolution: Won't Fix
> Reducer.reduce method's OutputCollector is too strict, it shouldn't need the
> key to be WritableComparable
> --------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1827
> URL: https://issues.apache.org/jira/browse/HADOOP-1827
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.14.0
> Reporter: Arun C Murthy
>
> The output of the {{Reducer}}'s reduce method is *not* sorted, hence the
> {{OutputCollector}} passed to it shouldn't require the *key* to be
> {{WritableComparable}}; passing a {{Writable}} should suffice.
> Thus
> {code: title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable,
>                          K3 extends WritableComparable, V3 extends Writable>
>     extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>     throws IOException;
> }
> {code}
> should, technically, be:
> {code: title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable,
>                          K3 extends Writable, V3 extends Writable>
>     extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>     throws IOException;
> }
> {code}
> Pros:
> It removes an artificial limitation that forces applications to emit a
> <{{WritableComparable}}, {{Writable}}> pair rather than a <{{Writable}},
> {{Writable}}> pair, thereby simplifying some applications (I ran into a few
> recently... admittedly trivial ones).
> Cons:
> 1. We now need a separate {{Combiner}} interface, since the combiner's
> {{OutputCollector}} *needs* to be able to sort keys, hence requires a
> {{WritableComparable}} - same as the {{Mapper}}.
> 2. We need a separate {{SortableOutputCollector}} (for
> {{Mapper}}/{{Combiner}}) and a {{NonSortableOutputCollector}} (for
> {{Reducer}}).
> 3. Alas! As a consequence of (1) & (2), we can no longer use the same class as
> both a {{Reducer}} and a {{Combiner}}, a serious compatibility issue.
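
The split described in Cons (1) to (3) can be sketched roughly as follows. This is only an illustration, not the real Hadoop API: `Writable`, `WritableComparable`, `Text` and `IntWritable` are simplified stand-ins defined locally, the `Combiner` interface and its `combine` method are hypothetical, and `JobConfigurable`, `Closeable`, `Reporter` and `IOException` are left out for brevity. The point is that a class which used to implement only `Reducer` would have to implement a second interface to keep serving as a combiner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CombinerSplitSketch {
    // Simplified stand-ins, NOT the real org.apache.hadoop.io types.
    interface Writable {}
    interface WritableComparable extends Writable, Comparable<Object> {}

    static final class Text implements WritableComparable {
        final String value;
        Text(String value) { this.value = value; }
        @Override public int compareTo(Object other) {
            return value.compareTo(((Text) other).value);
        }
    }

    static final class IntWritable implements Writable {
        final int value;
        IntWritable(int value) { this.value = value; }
    }

    // Collector for Mapper/Combiner output: keys must be sortable.
    interface SortableOutputCollector<K extends WritableComparable, V extends Writable> {
        void collect(K key, V value);
    }

    // Collector for Reducer output: keys only need to be Writable.
    interface NonSortableOutputCollector<K extends Writable, V extends Writable> {
        void collect(K key, V value);
    }

    // Hypothetical separate combiner interface from Cons (1).
    interface Combiner<K extends WritableComparable, V extends Writable> {
        void combine(K key, Iterator<V> values, SortableOutputCollector<K, V> output);
    }

    // Reducer with the relaxed bound on K3 proposed in the issue.
    interface Reducer<K2 extends WritableComparable, V2 extends Writable,
                      K3 extends Writable, V3 extends Writable> {
        void reduce(K2 key, Iterator<V2> values, NonSortableOutputCollector<K3, V3> output);
    }

    // To play both roles, one class now has to implement both interfaces.
    static final class Sum implements Combiner<Text, IntWritable>,
                                      Reducer<Text, IntWritable, Text, IntWritable> {
        private static IntWritable total(Iterator<IntWritable> values) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().value;
            }
            return new IntWritable(sum);
        }

        @Override public void combine(Text key, Iterator<IntWritable> values,
                                      SortableOutputCollector<Text, IntWritable> output) {
            output.collect(key, total(values));
        }

        @Override public void reduce(Text key, Iterator<IntWritable> values,
                                     NonSortableOutputCollector<Text, IntWritable> output) {
            output.collect(key, total(values));
        }
    }

    // Runs the same class once as a combiner and once as a reducer.
    static List<String> run() {
        List<String> out = new ArrayList<>();
        Sum sum = new Sum();
        List<IntWritable> values =
            Arrays.asList(new IntWritable(1), new IntWritable(2), new IntWritable(3));
        sum.combine(new Text("a"), values.iterator(),
                    (k, v) -> out.add("combine " + k.value + "=" + v.value));
        sum.reduce(new Text("a"), values.iterator(),
                   (k, v) -> out.add("reduce " + k.value + "=" + v.value));
        return out;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch, an existing class that implements only `Reducer` could not be handed to anything expecting a `Combiner`, which is the compatibility break that Cons (3) worries about.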
> The purpose of this issue is two-fold:
> 1. Spark a discussion among folks, both hadoop-dev & hadoop-users, to figure
> out whether this really is a problem, i.e. do folks really care about this
> anomaly in the existing {{Reducer}} interface? Also, is it worth the pain
> (@see 'Cons') to go fix it?
> 2. Even if we decide to live with it, this issue could record for posterity
> why we love hadoop, warts and all. *smile*
> Let's discuss...