[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308171#comment-14308171
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---------------------------------------

If the counters are wrong, then that's a separate JIRA that I think would be 
very well worth fixing in 2.x.  However, IIUC, this isn't about fixing 
incorrect counter values; rather, it's about removing counters.

I can see the value of storing the separate counters, since they are not 
exactly equivalent.  One of them records the number of bytes written to the 
filesystem overall during the life of the task, while the other records the 
number of bytes written to the filesystem during the output collector's write 
method.  For many jobs these will be the same values; however, if the task 
does out-of-band I/O with the filesystem outside of the output collector's 
write method, then they will not be equivalent.  Comparing these counters could 
be used to audit tasks that aren't writing data through the normal framework 
channels.
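
As a rough illustration of that kind of audit, here is a minimal sketch.  It 
assumes the two counter values have already been read out of a finished 
task's counters and are passed in as plain longs.  The class name 
CounterAudit and both method names are hypothetical and are not part of 
Hadoop.  It also assumes both counters measure bytes in the same way, which 
may not hold exactly in practice.

```java
// Hypothetical helper, not part of Hadoop: compares the two byte counters
// discussed above for a single task.
public class CounterAudit {

    // fsBytesWritten:        bytes written to the filesystem over the life of the task
    // collectorBytesWritten: bytes written during the output collector's write method
    // Returns the bytes that did not go through the output collector.
    public static long outOfBandBytes(long fsBytesWritten, long collectorBytesWritten) {
        return fsBytesWritten - collectorBytesWritten;
    }

    // True when the task wrote more to the filesystem than the output
    // collector accounts for, i.e. it did some out-of-band I/O.
    public static boolean wroteOutOfBand(long fsBytesWritten, long collectorBytesWritten) {
        return outOfBandBytes(fsBytesWritten, collectorBytesWritten) > 0;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        long fsBytes = 1500L;
        long collectorBytes = 1000L;
        System.out.println("out-of-band bytes: " + outOfBandBytes(fsBytes, collectorBytes));
        System.out.println("flag for audit: " + wroteOutOfBand(fsBytes, collectorBytes));
    }
}
```

A task whose two values match would not be flagged; one that writes side 
files directly through the filesystem API would show a positive difference.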

> Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
> --------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5847
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2, task
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch
>
>
> Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
> counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)