[ https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308171#comment-14308171 ]
Jason Lowe commented on MAPREDUCE-5847:
---------------------------------------

If the counters are wrong, then that's a separate JIRA that I think would be very well worth fixing in 2.x. However, IIUC, this isn't about fixing incorrect counter values; rather, it's about removing counters.

I can see the value of storing the separate counters, since they are not exactly equivalent. One of them records the number of bytes written to the filesystem overall during the life of the task, while the other records the amount of data written to the filesystem during the output collector's write method. For many jobs these will be the same values. However, if the task does out-of-band I/O with the filesystems outside of the output collector's write method, then they will not be equivalent. Comparing these counters could be used to audit tasks that aren't writing data through the normal framework channels.

> Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5847
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2, task
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch
>
>
> Both MapTask and ReduceTask carry redundant code to update the BYTES_WRITTEN
> counter. However, {{Task.updateCounters}} uses file system stats for this.
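
To illustrate the distinction drawn in the comment, here is a minimal sketch of two counters that agree for normal framework output and diverge when a task writes out-of-band. It is a plain Java model, not Hadoop code: the class CounterAuditSketch and its methods writeToFileSystem, collect, and outOfBandBytes are hypothetical names chosen for this example, and the way the collector counter is derived (filesystem bytes before and after the write) is an assumption made for the sketch.

```java
// Hypothetical model, not Hadoop code: two byte counters fed by one task.
final class CounterAuditSketch {
    // Bytes written to the filesystem over the whole life of the task.
    private long fileSystemBytesWritten = 0;
    // Bytes written to the filesystem while inside the collector's write.
    private long outputCollectorBytesWritten = 0;

    // Every write that reaches the filesystem passes through here,
    // no matter which part of the task issued it.
    void writeToFileSystem(byte[] data) {
        fileSystemBytesWritten += data.length;
    }

    // Framework path: attribute the filesystem growth seen during this
    // call to the output collector counter (assumed measurement scheme).
    void collect(byte[] record) {
        long before = fileSystemBytesWritten;
        writeToFileSystem(record);
        outputCollectorBytesWritten += fileSystemBytesWritten - before;
    }

    // Audit value: bytes that bypassed the output collector.
    long outOfBandBytes() {
        return fileSystemBytesWritten - outputCollectorBytesWritten;
    }

    public static void main(String[] args) {
        CounterAuditSketch task = new CounterAuditSketch();
        task.collect(new byte[100]);           // normal framework output
        task.writeToFileSystem(new byte[40]);  // side file written directly
        System.out.println("out-of-band bytes: " + task.outOfBandBytes());
    }
}
```

A nonzero difference between the two values is the signal the comment refers to: the task wrote data through some path other than the output collector.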