[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sachin Jose updated MAPREDUCE-4974:
---
    Affects Version/s: (was: 0.23.5)

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: trunk, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I found there is scope for optimizing initialize() if compressionCodecs and
> codec are instantiated only when the input is compressed.
> Meanwhile, Gelesh George Omathil suggested avoiding the null check of key and
> value. This would save time, since the null check is currently done for every
> next key/value generation. The intention is to instantiate them only once and
> avoid an NPE as well. Both goals could be met by initializing key and value in
> the initialize() method. We have both worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4974:
---
    Resolution: Fixed
    Fix Version/s: (was: 0.23.7)
                   trunk
    Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 1.0.0 (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
    Status: Resolved (was: Patch Available)

Thanks Gelesh, I put this in trunk and branch-2.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: trunk, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Attachment: MAPREDUCE-4974.5.patch

The CompressionCodecFactory compressionCodecs and CompressionCodec codec objects are now local to initialize(). A private boolean field isCompressedInput is introduced in place of the private boolean isCompressedInput() method.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.23.7, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
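The change described for patch 5 can be illustrated with a minimal sketch. It is not the attached patch: SketchCodec, SketchCodecFactory, SketchLineReader and the compressed() accessor are hypothetical stand-ins for the Hadoop types, chosen so the example compiles without Hadoop on the classpath, and the ".gz" rule is an assumption made only for the demonstration.

```java
// Hypothetical stand-in for Hadoop's CompressionCodec.
interface SketchCodec {
    String name();
}

// Hypothetical stand-in for Hadoop's CompressionCodecFactory.
final class SketchCodecFactory {
    // Returns a codec for a known compressed extension, or null for plain input.
    SketchCodec getCodec(String fileName) {
        if (fileName.endsWith(".gz")) {
            return () -> "gzip";
        }
        return null;
    }
}

final class SketchLineReader {
    // A plain field, set once in initialize(), takes the place of an
    // isCompressedInput() method that was evaluated on each call.
    private boolean isCompressedInput;

    void initialize(String fileName) {
        // The factory and the codec are locals: nothing outside
        // initialize() needs to hold on to them in this sketch.
        SketchCodec codec = new SketchCodecFactory().getCodec(fileName);
        isCompressedInput = (codec != null);
    }

    // Accessor used only to observe the flag from outside.
    boolean compressed() {
        return isCompressedInput;
    }
}
```

Keeping the two objects local narrows their scope to the one method that uses them, and the boolean field records the outcome for later calls.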
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 1.0.0 (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
    Status: Patch Available (was: Reopened)

Reduced the scope of compressionCodecs and codec.
Introduced a boolean field isCompressedInput in place of the boolean isCompressedInput() method.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 0.23.5, 2.0.2-alpha
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.23.7, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4974:
---
    Resolution: Fixed
    Fix Version/s: 2.0.5-beta
                   0.23.7
    Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 1.0.0 (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
    Status: Resolved (was: Patch Available)

Thanks Gelesh, I put this into trunk, branch-2, and branch-0.23.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.23.7, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Attachment: (was: MAPREDUCE-4974.1.patch)

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Attachment: MAPREDUCE-4974.4.patch

Two changes:

1)
    if (newSize == 0) {
      break;
    }
    if (newSize < maxLineLength) {
      break;
    }
The newSize == 0 check inside the loop is eliminated, since the (newSize < maxLineLength) check covers that condition as well. The (newSize == 0) check outside the loop is retained as is.

2)
    compressionCodecs = new CompressionCodecFactory(job);
    codec = compressionCodecs.getCodec(file);
These lines are placed inside the
    if (isCompressedInput()) { }
block, so that these objects are instantiated only if the input file is in a compressed format.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
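The first change can be checked with a small sketch. It is an illustration, not the patch itself: LoopCheckSketch and nextRecordSize are hypothetical names, the sizes array stands in for successive readLine() return values, and the argument assumes maxLineLength is greater than zero, which is what makes a zero-length read satisfy newSize < maxLineLength.

```java
final class LoopCheckSketch {
    // Returns the last size read: 0 at end of input, otherwise the size of
    // the first line shorter than maxLineLength (or of the final line read
    // if every line was too long).
    // sizes[] is a hypothetical stand-in for successive readLine() results.
    static int nextRecordSize(int[] sizes, int maxLineLength) {
        int newSize = 0;
        int i = 0;
        while (i < sizes.length) {
            newSize = sizes[i++];
            // Assuming maxLineLength > 0, newSize == 0 also satisfies this
            // test, so a separate in-loop zero check is redundant.
            if (newSize < maxLineLength) {
                break;
            }
            // Line too long: skip it and try the next one.
        }
        // The caller still compares the result with 0 after the loop.
        return newSize;
    }
}
```

For example, nextRecordSize(new int[] {100, 5}, 10) skips the over-long line and stops at the line of size 5, while a read of size 0 leaves the loop through the same branch.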
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Attachment: MAPREDUCE-4974.3.patch

[~ak.a...@aol.com]'s patch 4974.2 showed every line as a new line because of code reformatting. The same changes were captured, and a patch was built against the previous commit. This time the size of the patch is 3+KB. Please review.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun A K updated MAPREDUCE-4974:
---
    Attachment: MAPREDUCE-4974.2.patch

The null assignment of key and value, previously in nextKeyValue(), is moved to close() to avoid an NPE, as per the review comments.
Also, the if (newSize == 0) check inside the loop is removed, since if (newSize < maxLineLength) includes the same check. However, the if (newSize == 0) condition is still checked outside the while loop. Hopefully this also improves performance. Combined effort with Gelesh.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
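The key/value lifecycle described for patch 2 might look roughly like the sketch below. It is only an illustration under stated assumptions: KeyValueLifecycleSketch and MutableLong are hypothetical names, MutableLong and StringBuilder stand in for Hadoop's LongWritable and Text, and the input is an in-memory array rather than a file split.

```java
// Hypothetical stand-in for Hadoop's LongWritable.
final class MutableLong {
    long v;
}

final class KeyValueLifecycleSketch {
    private MutableLong key;      // stand-in for the LongWritable key
    private StringBuilder value;  // stand-in for the Text value
    private String[] lines;
    private int index;

    void initialize(String[] input) {
        lines = input;
        index = 0;
        // Allocate key and value once, so nextKeyValue() needs no null check.
        key = new MutableLong();
        value = new StringBuilder();
    }

    boolean nextKeyValue() {
        if (index >= lines.length) {
            return false;
        }
        // Reuse the same objects for every record.
        key.v = index;
        value.setLength(0);
        value.append(lines[index++]);
        return true;
    }

    MutableLong getCurrentKey() {
        return key;
    }

    StringBuilder getCurrentValue() {
        return value;
    }

    void close() {
        // The references are released here rather than in nextKeyValue().
        key = null;
        value = null;
    }
}
```

One consequence of this layout is that nextKeyValue() must not be called before initialize() or after close(), since it dereferences key and value without checking them.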
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4974:
---
    Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 1.0.0 (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
    Fix Version/s: (was: 0.24.0)
                   (was: 0.20.204.0)

Removing fix versions - usually, the committer sets these at commit time.

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Attachments: MAPREDUCE-4974.1.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun A K updated MAPREDUCE-4974:
---
    Summary: Optimising the LineRecordReader initialize() method (was: optimising the LineRecordReader initialize method)

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.20.204.0, 0.24.0
>
> Attachments: MAPREDUCE-4974.1.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) optimising the LineRecordReader initialize method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Attachment: MAPREDUCE-4974.1.patch

Combined thoughts of mine and Arun A K's.

> optimising the LineRecordReader initialize method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.20.204.0, 0.24.0
>
> Attachments: MAPREDUCE-4974.1.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (MAPREDUCE-4974) optimising the LineRecordReader initialize method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4974:
---
    Assignee: Gelesh (was: Arun A K)
    Target Version/s: 0.23.5, 0.23.4, 2.0.1-alpha, 2.0.0-alpha, 1.1.1, 1.0.4, 1.0.0 (was: 1.0.0, 1.0.4, 1.1.1, 2.0.0-alpha, 2.0.1-alpha, 0.23.4, 0.23.5)
    Status: Patch Available (was: Open)

> optimising the LineRecordReader initialize method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 0.23.5, 2.0.2-alpha
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.20.204.0, 0.24.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h