[
https://issues.apache.org/jira/browse/HADOOP-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Bowen updated HADOOP-1146:
--------------------------------
Attachment: 1146.patch
This patch:
1. Renames the counter Reduce Input Records to Reduce Input Groups since
that is what it counts.
2. Adds a new counter called Reduce Input Records that does count the
records.
3. Then when testing on Wordcount, I noticed that Map Output Records and
Reduce Input Records were not the same because of the use of a Combiner. So I
added two new counters to show this: Combine Input Records and Combine Output
Records.
I'm not sure if we really need these Combine Input/Output record counters. At
the end of the job, they should be the same as Map Output Records and Reduce
Input Records respectively, but they are possibly interesting to watch as the
job proceeds.
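
To illustrate how these counters relate, here is a minimal sketch in plain
Python. It is not Hadoop code and not part of the patch: the function name
run_wordcount and the in-memory "splits" are hypothetical, and it assumes the
combiner runs exactly once per map task, which is a simplification.

```python
from collections import Counter, defaultdict


def run_wordcount(splits, use_combiner=True):
    """Simulate a wordcount job and tally the counters discussed above.

    `splits` is a list of input splits, each a list of text lines.
    Assumption: one map task per split, combiner applied once per map task.
    """
    counters = Counter()
    shuffled = []  # (word, count) records delivered to the reduce side

    for split in splits:
        # Map phase: emit one (word, 1) record per word.
        map_output = [(word, 1) for line in split for word in line.split()]
        counters["Map Output Records"] += len(map_output)

        if use_combiner:
            # Combine phase: pre-sum counts per word within this map task.
            counters["Combine Input Records"] += len(map_output)
            partial = defaultdict(int)
            for word, n in map_output:
                partial[word] += n
            map_output = sorted(partial.items())
            counters["Combine Output Records"] += len(map_output)

        shuffled.extend(map_output)

    # Reduce phase: group records by key, then sum each group.
    groups = defaultdict(list)
    for word, n in shuffled:
        groups[word].append(n)

    counters["Reduce Input Groups"] = len(groups)     # distinct keys
    counters["Reduce Input Records"] = len(shuffled)  # individual records
    totals = {word: sum(values) for word, values in groups.items()}
    return totals, counters


if __name__ == "__main__":
    totals, counters = run_wordcount([["a b a", "b a"], ["a c"]])
    for name in sorted(counters):
        print(name, counters[name])
```

With the combiner enabled in this sketch, Combine Input Records matches Map
Output Records and Combine Output Records matches Reduce Input Records, while
Reduce Input Groups is the number of distinct words. With use_combiner=False,
Map Output Records and Reduce Input Records are equal.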
Comments welcome.
> "Reduce input records" counter name is misleading
> -------------------------------------------------
>
> Key: HADOOP-1146
> URL: https://issues.apache.org/jira/browse/HADOOP-1146
> Project: Hadoop
> Issue Type: Bug
> Reporter: David Bowen
> Assigned To: David Bowen
> Attachments: 1146.patch
>
>
> It has been pointed out that the counter name "reduce input records" is
> misleading; this number should be called "reduce input keys" or "reduce input
> groups". It could also be useful to have the actual number of reduce input
> records, which should be the same as the number of map output records.