[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632808#comment-13632808 ]

Sonu Prathap commented on MAPREDUCE-4974:
-----------------------------------------

Could somebody kindly check, share, or add a test case with compressed input in TestLineRecordReader, and then recheck MAPREDUCE-4974? Please refer to MAPREDUCE-5143.

Optimising the LineRecordReader initialize() method
---------------------------------------------------

Key: MAPREDUCE-4974
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1, mrv2, performance
Affects Versions: 2.0.2-alpha, 0.23.5
Environment: Hadoop Linux
Reporter: Arun A K
Assignee: Gelesh
Labels: patch, performance
Fix For: trunk, 2.0.5-beta
Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
Original Estimate: 1h
Remaining Estimate: 1h

I found that there is scope for optimizing the code in initialize() if compressionCodecs and codec are instantiated only when the input is compressed.
Meanwhile, Gelesh George Omathil suggested that we could also avoid the null check of key and value. This would save time, since the null check is currently done for every next key/value generation.
The intention is to instantiate them only once and to avoid an NPE as well. Hopefully both goals can be met if key and value are initialized in the initialize() method.
We have both worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
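The two ideas described in MAPREDUCE-4974 (set up compression handling only for compressed input, and allocate key and value once in initialize() so that the per-record path needs no null check) can be illustrated with a minimal sketch. This is not the Hadoop LineRecordReader and not the attached patch: the class SketchRecordReader, the suffix-based looksCompressed() check, and the codecLookupsBuilt counter are all hypothetical names chosen for illustration, and the input is an in-memory list rather than a file split.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical, simplified reader; none of these names come from Hadoop. */
class SketchRecordReader {
    /** Counts how many times the (pretend-expensive) codec setup was performed. */
    static int codecLookupsBuilt = 0;

    private long[] key;            // reusable holder for the record offset
    private StringBuilder value;   // reusable holder for the record text
    private Iterator<String> lines;
    private long nextPos;

    /** Assumed heuristic: treat a few well-known suffixes as compressed input. */
    static boolean looksCompressed(String fileName) {
        return fileName.endsWith(".gz") || fileName.endsWith(".bz2");
    }

    void initialize(String fileName, List<String> content) {
        if (looksCompressed(fileName)) {
            codecLookupsBuilt++;   // pay for codec setup only when it is needed
        }
        // Allocate key and value once here, so nextKeyValue() needs no null checks.
        key = new long[1];
        value = new StringBuilder();
        lines = content.iterator();
        nextPos = 0;
    }

    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        String line = lines.next();
        key[0] = nextPos;              // offset of this record
        value.setLength(0);            // reuse the same buffer for every record
        value.append(line);
        nextPos += line.length() + 1;  // +1 for the newline separating records
        return true;
    }

    long getCurrentKey() { return key[0]; }

    String getCurrentValue() { return value.toString(); }

    public static void main(String[] args) {
        SketchRecordReader reader = new SketchRecordReader();
        reader.initialize("part-00000.txt", Arrays.asList("alpha", "beta"));
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```

Whether a suffix check is an acceptable way to detect compressed input in the real reader is exactly the kind of question a compressed-input test case (see MAPREDUCE-5143) would help answer.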
[jira] [Created] (MAPREDUCE-5143) TestLineRecordReader was no test case for compressed files
Sonu Prathap created MAPREDUCE-5143:
------------------------------------

Summary: TestLineRecordReader was no test case for compressed files
Key: MAPREDUCE-5143
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5143
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Sonu Prathap
Priority: Minor

TestLineRecordReader has no test case for compressed files.
[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence
[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428988#comment-13428988 ]

Sonu Prathap commented on MAPREDUCE-4512:
-----------------------------------------

I am also facing a similar issue. Please help me re-create the fixed code using the patch.

TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence
----------------------------------------------------------------------------------------------------------

Key: MAPREDUCE-4512
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 0.20.204.0, 0.21.0, 1.0.3, 2.0.0-alpha
Environment: Linux
Reporter: Gelesh
Labels: patch
Fix For: 0.20.204.0
Attachments: MAPREDUCE-4512.txt
Original Estimate: 1m
Remaining Estimate: 1m

TextInputFormat delimiter bug scenario: a character sequence in the input text in which the first character matches the first character of the delimiter, and the remaining characters match the entire delimiter from the delimiter's starting position.

For example, with delimiter = record and Text =

record 1:- name = Gelesh e mail = gelesh.had...@gmail.com Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name

Here the string "=Bangalorrecord 3:" satisfies two conditions:
1) It contains the delimiter "record".
2) The character (or character sequence) immediately before the delimiter (i.e. 'r') matches the first character (or character sequence) of the delimiter. That is, "=Bangalor" ends with, and the delimiter starts with, the same character 'r'.

In this case the program does not detect the delimiter, so the map receives an improper value text that still contains the delimiter.
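The scenario reported in MAPREDUCE-4512 can be reproduced outside Hadoop with a small sketch. It shows one plausible way a scanner can miss the delimiter (resetting the match position on a mismatch without re-examining the input) next to a backtracking variant that finds it. The class DelimiterSplitSketch and both method names are hypothetical; this is not the Hadoop code and not the attached patch, and the delimiter is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the delimiter-matching scenario; not Hadoop code. */
class DelimiterSplitSketch {

    /** Naive scan: on a mismatch it resets the match position and moves on. */
    static List<String> splitNaive(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int recordStart = 0;
        int matched = 0; // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(recordStart, i + 1 - matched));
                    recordStart = i + 1;
                    matched = 0;
                }
            } else {
                // Bug: the characters consumed by the failed partial match,
                // and the current one, are never retried as a new match start.
                matched = 0;
            }
        }
        records.add(text.substring(recordStart));
        return records;
    }

    /** Same scan, but a failed partial match rewinds to just after its start. */
    static List<String> splitBacktracking(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int recordStart = 0;
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(recordStart, i + 1 - matched));
                    recordStart = i + 1;
                    matched = 0;
                }
            } else {
                i -= matched; // the loop's i++ then resumes one past the failed start
                matched = 0;
            }
        }
        records.add(text.substring(recordStart));
        return records;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        System.out.println(splitNaive(text, "record"));
        System.out.println(splitBacktracking(text, "record"));
    }
}
```

With the text from the report, the naive variant returns the whole input as a single record, while the backtracking variant splits it into "location =Bangalor" and " 3: name". Backtracking is the simplest correction to show here; a production reader working on byte buffers that span reads would need more care.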