[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308116#comment-14308116
 ] 

Allen Wittenauer commented on MAPREDUCE-5847:
-

Incompatible changes can go into trunk.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308129#comment-14308129
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

bq. Incompatible changes can go into trunk.

Understood, but I'm arguing we shouldn't break incompatibility without 
sufficient merit.  Each incompatibility instance is a hurdle someone needs to 
jump to move from Hadoop 2.x to Hadoop 3.x.  Hence I'm wondering if others feel 
this is worth adding another hurdle or not.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308140#comment-14308140
 ] 

Allen Wittenauer commented on MAPREDUCE-5847:
-

This seems like such a low risk and, as it is today, aren't we actually 
reporting wrong information? That's significantly worse!  (I know of one vendor 
that is actually mentions that they report correct values for some metrics 
since we blow it so badly in lots of places...)

While I understand the concerns about moving from 2.x to 3.x, users should 
expect some degree of pain when moving major versions. 

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308171#comment-14308171
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

If the counters are wrong then that's a separate JIRA that I think would be 
very well worth fixing in 2.x.  However IIUC this isn't about fixing incorrect 
counter values, rather it's about removing counters.

I can see the value of storing the separate counters, since they are not 
exactly equivalent.  One of them records the amount of bytes written to the 
filesystem overall during the life of the task, while the other records the 
amount of data written to the filesystem during the output collector's write 
method.  For many jobs these will be the same values, however if the task was 
doing out-of-band I/O with the filesystems outside of the output collector 
write method then they will not be equivalent.  Comparing these counters could 
be used to audit tasks that aren't writing data through the normal framework 
channels.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308114#comment-14308114
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

Looks like the patch still applies, but I'm not sure this should go in per the 
incompatibility concerns I raised earlier.  I don't think the benefits of this 
change are worth that cost, even if this just goes into trunk.  Thoughts?

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308132#comment-14308132
 ] 

Gera Shegalov commented on MAPREDUCE-5847:
--

Agreed, let us close it 'Won't fix'

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2015-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307945#comment-14307945
 ] 

Hadoop QA commented on MAPREDUCE-5847:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12640868/MAPREDUCE-5847.v02.patch
  against trunk revision e1990ab.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5166//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5166//console

This message is automatically generated.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2014-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973947#comment-13973947
 ] 

Hadoop QA commented on MAPREDUCE-5847:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12640794/MAPREDUCE-5847.v01.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4533//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4533//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4533//console

This message is automatically generated.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2014-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974443#comment-13974443
 ] 

Hadoop QA commented on MAPREDUCE-5847:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12640868/MAPREDUCE-5847.v02.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4535//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4535//console

This message is automatically generated.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2014-04-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974577#comment-13974577
 ] 

Jason Lowe commented on MAPREDUCE-5847:
---

Isn't this an incompatible change?  It looks like this will no longer compute 
anything for the 
org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter.BYTES_WRITTEN 
counter which will break clients who are expecting this to be present on the 
job.

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5847) Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask

2014-04-18 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974625#comment-13974625
 ] 

Gera Shegalov commented on MAPREDUCE-5847:
--

That's a good point, [~jlowe]

 Remove redundant code for fileOutputByteCounter in MapTask and ReduceTask 
 --

 Key: MAPREDUCE-5847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5847
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5847.v01.patch, MAPREDUCE-5847.v02.patch


 Both MapTask and ReduceTask carry redundant code to update BYTES_WRITTEN 
 counter. However, {{Task.updateCounters}} uses file system stats for this. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)