[jira] [Commented] (HADOOP-8654) TextInputFormat delimiter bug:- Input Text portion ends with & Delimiter starts with same char/char sequence

Gelesh (JIRA) Mon, 13 Aug 2012 07:46:39 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-8654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433184#comment-13433184 ]


Gelesh commented on HADOOP-8654:
--------------------------------

I could write a MapReduce job for testing,
with the code below in the MapReduce driver:

    Path inputDirectory = new Path("TestDirectory", "input");
    Path file = new Path(inputDirectory, "InputFile.txt");
    Writer writer = new OutputStreamWriter(localFs.create(file));
    writer.write("The Required Very Big Input String");  // Fingers crossed
    writer.close();  // flush the input file before the job reads it

    Path outFile  =  new Path(outputTestDirectory, "part-r-00000");
    Reader reader =  new InputStreamReader(localFs.open(outFile));

Is this okay?
                
> TextInputFormat delimiter  bug:- Input Text portion ends with & Delimiter 
> starts with same char/char sequence
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8654
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8654
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.20.204.0, 1.0.3, 0.21.0, 2.0.0-alpha
>         Environment: Linux
>            Reporter: Gelesh
>              Labels: patch
>         Attachments: MAPREDUCE-4512.txt
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> The TextInputFormat delimiter bug occurs when the input text immediately 
> before a delimiter ends with the same character (or character sequence) that 
> the delimiter starts with.
> e.g. delimiter = "record";
> and Text = " record 1:- name = Gelesh e mail = gelesh.had...@gmail.com 
> Location Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name 
> .... " 
> Here the string "=Bangalorrecord 3: " satisfies two conditions:
> 1) It contains the delimiter "record".
> 2) The character (or character sequence) immediately before the delimiter 
> (i.e. 'r') matches the first character (or character sequence) of the 
> delimiter. That is, "=Bangalor" ends with, and the delimiter starts with, the 
> same character 'r'.
> In this case the program does not detect the delimiter, so the value text 
> passed to the map is wrong: it still contains the delimiter.
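
The scenario described above can be reproduced outside Hadoop with a small
plain-Java sketch. This is a simplified model and not the Hadoop source or the
attached patch: the class DelimiterScanSketch and the methods splitNaive and
splitWithRecheck are hypothetical names, and the assumption is that the failure
comes from a scan that resets its match counter on a mismatch without
re-examining the characters of the failed partial match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Hadoop LineReader implementation.
class DelimiterScanSketch {

    // Naive scan: on a mismatch the match counter is reset, but the
    // characters consumed by the failed partial match are never re-checked.
    static List<String> splitNaive(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int start = 0;    // start of the current record
        int matched = 0;  // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(start, i + 1 - matched));
                    start = i + 1;
                    matched = 0;
                }
            } else {
                matched = 0;  // assumed source of the bug
            }
        }
        records.add(text.substring(start));
        return records;
    }

    // One possible correction: after a failed partial match, rewind so the
    // scan restarts one character after the position where that match began.
    static List<String> splitWithRecheck(String text, String delimiter) {
        List<String> records = new ArrayList<>();
        int start = 0;
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter.charAt(matched)) {
                matched++;
                if (matched == delimiter.length()) {
                    records.add(text.substring(start, i + 1 - matched));
                    start = i + 1;
                    matched = 0;
                }
            } else {
                i -= matched;  // the loop's i++ then moves to the next candidate
                matched = 0;
            }
        }
        records.add(text.substring(start));
        return records;
    }

    public static void main(String[] args) {
        String text = "name = sdf location =Bangalorrecord 3: name";
        System.out.println(splitNaive(text, "record"));
        System.out.println(splitWithRecheck(text, "record"));
    }
}
```

With the text from the description, the naive scan treats the trailing 'r' of
"Bangalor" as the start of the delimiter, fails on the next 'r', and never
recognises "record". The rewinding variant finds it and splits the text there.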

