[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gelesh updated MAPREDUCE-4974:
------------------------------
    Attachment: MAPREDUCE-4974.4.patch

Two changes:

1) The loop previously contained two consecutive checks:

       if (newSize == 0) { break; }
       if (newSize < maxLineLength) { break; }

   The (newSize == 0) check is eliminated, since the
   (newSize < maxLineLength) check covers that condition as well.
   The (newSize == 0) check outside the loop is retained as is.

2) The lines

       compressionCodecs = new CompressionCodecFactory(job);
       codec = compressionCodecs.getCodec(file);

   are placed inside the if (isCompressedInput()) { } block, so that
   these objects are instantiated only if the input file is in a
   compressed format.

> Optimising the LineRecordReader initialize() method
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-4974
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, mrv2, performance
>    Affects Versions: 2.0.2-alpha, 0.23.5
>         Environment: Hadoop Linux
>            Reporter: Arun A K
>            Assignee: Gelesh
>              Labels: patch, performance
>         Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch,
>                      MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I found there is scope for optimizing the code in initialize() if
> compressionCodecs and codec are instantiated only when the input is
> compressed.
> Meanwhile, Gelesh George Omathil suggested avoiding the null check of
> key and value. This would save time, since the null check is currently
> done for every next key/value generation. The intention is to
> instantiate them only once and avoid an NPE as well. Hopefully both
> goals can be met if key and value are initialized in the initialize()
> method. We have both worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
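
As a rough illustration of the two changes listed in the comment above, here is a minimal, self-contained sketch. It does not use the Hadoop classes: LineRecordReaderSketch, CodecFactoryStub, factoryCreations, shouldBreakOriginal, shouldBreakSimplified and the file-suffix test in isCompressedInput are all hypothetical stand-ins chosen for this example. One assumption worth stating: the single (newSize < maxLineLength) check only covers the (newSize == 0) case while maxLineLength is positive.

```java
/**
 * Illustrative sketch only: every name here is a hypothetical stand-in,
 * not the Hadoop LineRecordReader or its codec classes.
 */
final class LineRecordReaderSketch {

    /** Counts constructions of the stand-in factory, to make laziness observable. */
    static int factoryCreations = 0;

    /** Stand-in for a codec factory (assumption: lookup by file suffix). */
    static final class CodecFactoryStub {
        CodecFactoryStub() {
            factoryCreations++;
        }

        String getCodec(String file) {
            return file.endsWith(".gz") ? "gzip" : null;
        }
    }

    /** Assumption: compression is detected from the file name alone. */
    static boolean isCompressedInput(String file) {
        return file.endsWith(".gz");
    }

    /** Change 2: build the factory and codec only for compressed input. */
    static String initializeCodec(String file) {
        String codec = null;
        if (isCompressedInput(file)) {
            CodecFactoryStub compressionCodecs = new CodecFactoryStub();
            codec = compressionCodecs.getCodec(file);
        }
        return codec;
    }

    /** Loop exit before the patch: two separate checks. */
    static boolean shouldBreakOriginal(int newSize, int maxLineLength) {
        if (newSize == 0) {
            return true;
        }
        return newSize < maxLineLength;
    }

    /** Change 1: a single check, equivalent only while maxLineLength > 0. */
    static boolean shouldBreakSimplified(int newSize, int maxLineLength) {
        return newSize < maxLineLength;
    }

    public static void main(String[] args) {
        System.out.println("codec for plain file: " + initializeCodec("part-00000.txt"));
        System.out.println("codec for gzip file:  " + initializeCodec("part-00000.gz"));
        System.out.println("factories created:    " + factoryCreations);
    }
}
```

With these stand-ins, a plain-text input never constructs the factory, and the two loop-exit variants agree for every positive maxLineLength. They differ when maxLineLength is 0, which is the case to check before dropping the explicit (newSize == 0) test.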