[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632742#comment-13632742
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Yarn-trunk #185 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/185/])
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh 
via bobby) (Revision 1468232)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1468232
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: trunk, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-16 Thread Sonu Prathap (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632808#comment-13632808
 ] 

Sonu Prathap commented on MAPREDUCE-4974:
-

Could somebody kindly do a check, share , update with the Test Case with 
compressed input in TestLineRecordReader, and recheck MAPREDUCE 4974.

please refer, MAPREDUCE-5143

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: trunk, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632815#comment-13632815
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Hdfs-trunk #1374 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1374/])
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh 
via bobby) (Revision 1468232)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1468232
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: trunk, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632851#comment-13632851
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Mapreduce-trunk #1401 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1401/])
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh 
via bobby) (Revision 1468232)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1468232
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: trunk, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-15 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631916#comment-13631916
 ] 

Robert Joseph Evans commented on MAPREDUCE-4974:


The code changes look fine to me too, but I want to run a few more tests than 
the precommit build does.  I just don't want to have to pull it out yet again.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-15 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632220#comment-13632220
 ] 

Robert Joseph Evans commented on MAPREDUCE-4974:


The changes look fine and the tests passed, so I am going to check this into 
trunk and branch-2.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632245#comment-13632245
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-trunk-Commit #3614 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3614/])
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh 
via bobby) (Revision 1468232)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1468232
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: trunk, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-11 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13628715#comment-13628715
 ] 

Gera Shegalov commented on MAPREDUCE-4974:
--

[~gelesh], I am not a committer. It looks fine to me. While waiting for 
committers' review, if you have not done so yet you probably should rerun 
locally the inputformat-related test suites to make sure nothing is broken. And 
 you should maybe also try to quantify the performance improvement using some 
local MR job.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-11 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13628855#comment-13628855
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~jira.shegalov] .. thanks for sharing  your thoughts,

I have tested using JUnit run of TestLineRecordReader , but as of now, for 
compressed input test case is not incorporated in TestLineRecordReader. Thats a 
place we need to cross check, but hope the code would hold good, because 
modification in this area is minimal.

The aim was to perfomance enhance, by removing the null check ..  but the 
incompatibility with any build happen upon the existing may give NPE , as 
discussed above ([~snihalani]'s comments,

The patch was limited to
1) removing the null assignments for the key  Value  
2) limiting CompressionCodecFactory ,  and Codec to method local scope
3) removing line 170-173

 if (newSize == 0) {
   break;
  }
Unnecessary ==0 check inside a look. ... Because the code to handle this is 
there iut side the loop, and the code which does the same seems of no value add.

4)  in order to achieve point 2 , private boolen isCompressedInput variable was 
introduces instead if 
private boolean isCompressedInput();
 method.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-08 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626095#comment-13626095
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~jlowe], [~revans2], [~jira.shegalov] , [~snihalani], [~kkambatl] 
Could any body please share views / suggestions ...  

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13622452#comment-13622452
 ] 

Gelesh commented on MAPREDUCE-4974:
---

Since this is just an optimization and existing test case would suffice, hope 
this is +1
Could some body kindly review.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-03 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620877#comment-13620877
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~hadoopqa] Automated check is yet to act over patch .5. Kindly advice.
Please refer the review board https://reviews.apache.org/r/9440/diff/3/

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621241#comment-13621241
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576759/MAPREDUCE-4974.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3494//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3494//console

This message is automatically generated.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619696#comment-13619696
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Yarn-trunk #173 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/173/])
Reverted MAPREDUCE-4974 because of test failures. (Revision 1463359)
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh via 
bobby) (Revision 1463221)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463359
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463221
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619754#comment-13619754
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Hdfs-0.23-Build #571 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/571/])
Reverted MAPREDUCE-4974 because of test failures. (Revision 1463361)
svn merge -c 1463221 FIXES: MAPREDUCE-4974. Optimising the LineRecordReader 
initialize() method (Gelesh via bobby) (Revision 1463224)

 Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463361
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463224
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619767#comment-13619767
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Hdfs-trunk #1362 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1362/])
Reverted MAPREDUCE-4974 because of test failures. (Revision 1463359)
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh via 
bobby) (Revision 1463221)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463359
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463221
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619812#comment-13619812
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~jira.shegalov], I too apologise for not noticing you comments over review 
board. I had not much idea about review board, and was expecting the review 
comments over here(Jira).

Thanks for sharing your thoughts.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619813#comment-13619813
 ] 

Gelesh commented on MAPREDUCE-4974:
---


[~jira.shegalov], [~revans2],
I would suggest to have isCompressedInput a private boolean variable by default 
false, instead of isCompressedInput() method.
This would help us to reduce the scope of Codec object along with 
CompressionCodecFactory object, to local. Which as of now is a class variable ?

I would be patching this modification shortly.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619823#comment-13619823
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-Mapreduce-trunk #1389 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1389/])
Reverted MAPREDUCE-4974 because of test failures. (Revision 1463359)
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh via 
bobby) (Revision 1463221)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463359
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463221
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-01 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13618961#comment-13618961
 ] 

Robert Joseph Evans commented on MAPREDUCE-4974:


Sorry it has taken me so long to take a look.

The patch looks fine to me too.  All of the optimizations look fine and seem to 
not change anything with the existing behavior.  +1. I'll check it in.  If you 
want similar changes in the 1.0 line you will need to post a separate patch for 
that.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619037#comment-13619037
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-trunk-Commit #3542 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3542/])
MAPREDUCE-4974. Optimising the LineRecordReader initialize() method (Gelesh 
via bobby) (Revision 1463221)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463221
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-01 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619448#comment-13619448
 ] 

Gera Shegalov commented on MAPREDUCE-4974:
--

I already wondered why my comment on reviewboard about this is being ignored.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-01 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619455#comment-13619455
 ] 

Robert Joseph Evans commented on MAPREDUCE-4974:


I pulled the patch out.  Sorry [~gera] I did not see the review board comments. 
 I will try to be more complete next time. All of the other times that I have 
used review board the comments were posted to the JIRA, but I guess that is not 
the case any more. Either way it is my fault.

The test that caught this too is TestMRKeyValueTextInputFormat.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-04-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13619548#comment-13619548
 ] 

Hudson commented on MAPREDUCE-4974:
---

Integrated in Hadoop-trunk-Commit #3546 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3546/])
Reverted MAPREDUCE-4974 because of test failures. (Revision 1463359)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1463359
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.23.7, 2.0.5-beta

 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-03-31 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13618339#comment-13618339
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~revans2], [~jlowe],
Could any of you please act on this ? 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-03-04 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592160#comment-13592160
 ] 

Arun A K commented on MAPREDUCE-4974:
-

@ All, Can we mark this issue as resolved? 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-03-04 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592344#comment-13592344
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


It doesn't work that way. One of the committers  will take a look at this, 
commit it to the codebase and then it will be marked as resolved. 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-03-01 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590584#comment-13590584
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


+1

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-03-01 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590639#comment-13590639
 ] 

Karthik Kambatla commented on MAPREDUCE-4974:
-

+1

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-28 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589433#comment-13589433
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~ak.a...@aol.com] has put this patch on review board.(Thanks AK)
[~snihalani], Please reffer this link, to visualize the patch diffrence
https://reviews.apache.org/r/9440/diff/#index_header

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-28 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589434#comment-13589434
 ] 

Arun A K commented on MAPREDUCE-4974:
-

Updated the review request url with the latest patch. Please find the same at - 
https://reviews.apache.org/r/9440/


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-27 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588418#comment-13588418
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


[~gelesh], Can you put this to review board give me a link to look at the final 
copy of the file like last time?

Thanks.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
 MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-26 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586946#comment-13586946
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~snihalani], I think you reffred an old patch,
Please look at  MAPREDUCE-4974.4.patch

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-25 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585797#comment-13585797
 ] 

Arun A K commented on MAPREDUCE-4974:
-

As [~gelesh] has mentioned, we had in mind, elimination of repeated null 
checks, while trying to optimize the code. If it is of not much significance, 
please go ahead with the latest available patch containing the rest of changes.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-25 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586823#comment-13586823
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


The one line of code that seems to be missing is key.set(pos). where is that 
being handled?

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584134#comment-13584134
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12570453/MAPREDUCE-4974.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3355//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3355//console

This message is automatically generated.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-22 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584291#comment-13584291
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~snihalani],
Please review,

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-22 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584384#comment-13584384
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


Is this is the final patch you are proposing to submit? It seems like it's a 
diff of only last comment's changes. 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-22 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585001#comment-13585001
 ] 

Gelesh commented on MAPREDUCE-4974:
---

My self and [~ak.a...@aol.com] , are of.the opinion that we should also do 
something upon therepeated null check, 
Ans per the discussions over here ,  that part of optimization, seems to be non 
atracrtive. Hence the latest patch , we had eliminated null chech change. The 
remaining changes done, are mentioned iin comment. Please review


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-21 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583195#comment-13583195
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


I don't think we return null if there is no key. So it no longer follows the 
contract at 
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/RecordReader.html#getCurrentKey%28%29

Can you confirm it works?

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-21 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583403#comment-13583403
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~snihalani],
Thanks for bring up that very valid point.
In That Case, What if we eliminate the null check for Value alone,
And keep the Null Check for Key as such .. ?


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-21 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583406#comment-13583406
 ] 

Gelesh commented on MAPREDUCE-4974:
---

And Also, as [~ak.a...@aol.com] has mentioned,
1) To avoid ' if (newSize == 0) ' check inside the loop,
2) if we have ' compressionCodecs  codec instantiated only if its a compressed 
input. '

Hope These two points are valid,
Please share your thoughts...

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-21 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583592#comment-13583592
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


I don't see the null check for key/value being the beneficial part of the 
optimization. Can you post the patch and I'll review it?

I agree with the change 
bq. 2) if we have ' compressionCodecs  codec instantiated only if its a 
compressed input. '


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-19 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581170#comment-13581170
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon], [~snihalani], [~kkambatl]
Please share your thoughts

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-15 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13579453#comment-13579453
 ] 

Arun A K commented on MAPREDUCE-4974:
-

Please find the review request. https://reviews.apache.org/r/9440/

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-13 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13577552#comment-13577552
 ] 

Gelesh commented on MAPREDUCE-4974:
---

The  Existing test case is enough , because its just a code optimization,
Could any body, have a look and comment please .. ?


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576485#comment-13576485
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12568943/MAPREDUCE-4974.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3327//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3327//console

This message is automatically generated.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
 MAPREDUCE-4974.3.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575745#comment-13575745
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12568811/MAPREDUCE-4974.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3323//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3323//console

This message is automatically generated.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-11 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575754#comment-13575754
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~hadoopqa],
 as mentioned before, this is just an improvement.
 No new features added or removed.
 Existing test case holds for this as well.


 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575906#comment-13575906
 ] 

Karthik Kambatla commented on MAPREDUCE-4974:
-

The latest approach seems correct at a first glance. However, the patch seems 
to be modifying the entire file, may be due to re-formatting. It would really 
help if it could only show the changed code.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-11 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576007#comment-13576007
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~ak.a...@aol.com] Seems like reformatting has shown entire LOC as new LOC, 
instead of changes alone.
[~kkambatl], do we need to really re put the patch, so that the patch size 
would reduce. If not, could you please act or advice over the course of action.
In case, if some changes over the code is required, please mention that too, 
our next patch would incorporate the same.
  

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-08 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574334#comment-13574334
 ] 

Arun A K commented on MAPREDUCE-4974:
-

If someone could add their review comments, we could look on for the mentioned 
changes. https://reviews.apache.org/r/9287/

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-08 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574347#comment-13574347
 ] 

Karthik Kambatla commented on MAPREDUCE-4974:
-

Looks like calling {{#nextKeyValue()}} after it has already returned false will 
throw a NPE. IIUC, it should continue to return false.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13572436#comment-13572436
 ] 

Arun A K commented on MAPREDUCE-4974:
-

Kindly advice if the optimization is worth. 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13572534#comment-13572534
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


One advantage we also might not be able to quantify easily would be in garbage 
collection. 

I was looking through implementation of {{LineRecordReader.nextKeyValue}}, in 
the end of the parsing, we set key and value to null. In the next call, we will 
have them as null and if the while condition of {{getFilePosition = end}} 
evaluates to {{true}}, then, we'll hit NPE because in.readLine(Text t..) does 
t.clear() first. Would there be any case in which the condition would evaluate 
to true even when we are done?
I know this probably won't happen because user will wrap this in a 
{{while(nextKeyValue())}}, I just wanted to be sure that we won't hit NPEs 
after this change, even for buggy programs.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13572627#comment-13572627
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~snihalani],

 .. while condition of getFilePosition = end evaluates to true, then, we'll 
hit NPE ..
The Text object value, which is pased to readLine, would not be null, since 
that is taken care at initialize method, which is called prior to 
nextKeyValue().

While(nextKeyValue()) loop would end at once, the newSize (the size of newly 
fetched value equals zero.
Here Key And Value , are set to null.
But they aren't referred any more after While(nextKeyValue()) loop, and so NPE 
is not likely to occur.

Please verify, and kindly correct me if we have gone wrong, some where.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Surenkumar Nihalani (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573022#comment-13573022
 ] 

Surenkumar Nihalani commented on MAPREDUCE-4974:


I am referring to a buggy program that calls nextKeyValue even after we return 
false. I just wanted to be sure that the check in the while loop will guard us 
from calling in.readLine(). in.readLine when passed a null Text type, we will 
hit NPE. That's the case I am trying to be safe about.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573199#comment-13573199
 ] 

Gelesh commented on MAPREDUCE-4974:
---

As [~snihalani] has mentioned, a buggy programs that may call next KeyValue..
condition though being a little hypothetical, but still possible.

1) Inorder to avoid that, shall we have the null assignment of key  value in 
close() method.?
2) Also shall, we have compressionCodecs also assigned as null  ?

Either me or [~ak.a...@aol.com] would upload a re work on the same.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-06 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573202#comment-13573202
 ] 

Gelesh commented on MAPREDUCE-4974:
---


Also, this change has instantiated objects related to compression, only if its 
a compressed file

Inorder to ship the first line, a readLine is called, and this change would not 
create a new Text, but use the available 'value' for the method call. 

Hope some body could share their thoughts on this two changes as well.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570140#comment-13570140
 ] 

Gelesh commented on MAPREDUCE-4974:
---

Some body please review the patch,
I couldnt even see the hadoop QA running on this.
Kindly advice

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570145#comment-13570145
 ] 

Hadoop QA commented on MAPREDUCE-4974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12567831/MAPREDUCE-4974.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3297//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3297//console

This message is automatically generated.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) optimising the LineRecordReader initialize method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570154#comment-13570154
 ] 

Gelesh commented on MAPREDUCE-4974:
---

Its a improvement to the existing, no new features added or deleted,
And hence, existing test case would suffice.

 optimising the LineRecordReader initialize method
 -

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570161#comment-13570161
 ] 

Arun A K commented on MAPREDUCE-4974:
-

Quoting the review request url for this issue - 
https://reviews.apache.org/r/9287/

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570547#comment-13570547
 ] 

Todd Lipcon commented on MAPREDUCE-4974:


Do you have any benchmark that shows this helps? Null checks can often be 
completely optimized out by the JIT, or at least hoisted out of the tight loop.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570567#comment-13570567
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon]
nextKeyValue() is called as many number of times, the delimiter, or the new 
line has occurred, with in a given split.
Each Time, it executes the below code,

-if (key == null) {
-  key = new LongWritable();
-}
-key.set(pos);
-if (value == null) {
-  value = new Text();
-}

Only at the first iteration, the condition would hold true, and Key Value 
objects would be created.
This could also be done, if we have Key  Value objects created at the 
initialize phase, and we can skip this null check.

Also,
-compressionCodecs = new CompressionCodecFactory(job);
-codec = compressionCodecs.getCodec(file);
Need to be done , only when it uses a compressed input file. This change is 
also brought. 

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571123#comment-13571123
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon]
I tried out an estimation,on Local, with small data, subtracting the the long 
value obtained from System.nanoTime() at the beginning and at the end of the 
method.

Average time difference was 200 Nano Seconds per each anomic call made to 
nextKeyValue(), excluding the very first call, since it involves the object 
creation.

The total time difference would be 200 * number of Key Value pairs generated 
per each Map Task.

 Optimising the LineRecordReader initialize() method
 ---

 Key: MAPREDUCE-4974
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
 Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
  Labels: patch, performance
 Fix For: 0.20.204.0, 0.24.0

 Attachments: MAPREDUCE-4974.1.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 I found there is a a scope of optimizing the code, over initialize() if we 
 have compressionCodecs  codec instantiated only if its a compressed input.
 Mean while Gelesh George Omathil, added if we could avoid the null check of 
 key  value. This would time save, since for every next key value generation, 
 null check is done. The intention being to instantiate only once and avoid 
 NPE as well. Hope both could be met if initialize key  value over  
 initialize() method. We both have worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira