[ https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated HADOOP-2399:
----------------------------------
Attachment: 2399-3.patch
This patch fixes the value iterator to reuse the key and value objects between
iterations. Aggregation was assuming that the reduce inputs were not reused,
so I stringified the value. Is that ok, Runping? I got a minor speedup, 2:33
instead of 2:37, on a simple single-node word count.
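
To illustrate why holding references breaks once inputs are reused, here is a
minimal sketch in plain Java. It is not code from the patch: MutableText and
ReusingIterator are hypothetical stand-ins for a Writable value and the value
iterator, and copying via toString() only mirrors the "stringify" idea above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable value holder, standing in for a Writable. */
    static final class MutableText {
        private String content = "";
        void set(String s) { content = s; }
        @Override public String toString() { return content; }
    }

    /** Hypothetical iterator that refills one shared instance on every next(). */
    static final class ReusingIterator implements Iterator<MutableText> {
        private final Iterator<String> source;
        private final MutableText shared = new MutableText();
        ReusingIterator(List<String> raw) { source = raw.iterator(); }
        @Override public boolean hasNext() { return source.hasNext(); }
        @Override public MutableText next() {
            shared.set(source.next());
            return shared; // same object each time, new contents
        }
    }

    /** Breaks under reuse: every stored reference aliases the same object. */
    static List<String> holdReferences(List<String> raw) {
        List<MutableText> held = new ArrayList<>();
        Iterator<MutableText> it = new ReusingIterator(raw);
        while (it.hasNext()) {
            held.add(it.next());
        }
        List<String> out = new ArrayList<>();
        for (MutableText t : held) {
            out.add(t.toString());
        }
        return out;
    }

    /** Safe under reuse: copy the contents before advancing the iterator. */
    static List<String> copyValues(List<String> raw) {
        List<String> out = new ArrayList<>();
        Iterator<MutableText> it = new ReusingIterator(raw);
        while (it.hasNext()) {
            out.add(it.next().toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(holdReferences(raw)); // [c, c, c]
        System.out.println(copyValues(raw));     // [a, b, c]
    }
}
```

Code that only reads each value inside the loop is unaffected; only code that
keeps a value past the next call to next() needs to copy it first.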
> Input key and value to combiner and reducer should be reused
> ------------------------------------------------------------
>
> Key: HADOOP-2399
> URL: https://issues.apache.org/jira/browse/HADOOP-2399
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.1
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Attachments: 2399-3.patch, reuse-obj-2.patch, reuse-obj.patch
>
>
> Currently, the input key and value are recreated on every iteration for input
> to the combiner and reducer. It would speed up the system substantially if we
> reused the keys and values. The downside is that it may break applications
> that count on holding references to previous keys and values, but I think it
> is worth doing.