[ https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gelesh updated MAPREDUCE-4512:
------------------------------

          Description: 
TextInputFormat delimiter bug scenario: the input text contains a character 
sequence whose first character matches the first character of the delimiter, 
and whose remaining characters match the entire delimiter character sequence 
from the starting position of the delimiter.

e.g. delimiter = "record";
and text = " record 1:- name = Gelesh e mail = gelesh.had...@gmail.com Location 
Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name .... " 

Here the string "=Bangalorrecord 3: " satisfies two conditions:
1) It contains the delimiter "record".
2) The character (or character sequence) immediately before the delimiter 
(i.e. 'r') matches the first character (or character sequence) of the 
delimiter; that is, "=Bangalor" ends with the same character (or character 
sequence) that the delimiter starts with, 'r'.

In this case the program does not detect the delimiter, so the value text 
passed to the map is wrong: it still contains the delimiter.

  was:
TextInputFormat delimiter  bug scenario , a character sequence of the input 
text,  in which the first character matches with the first character of 
delimiter, and reaming input text character sequence  matches with the entire 
delimiter character sequence from the  starting position of the delimiter.

eg   delimiter ="record";
and Text = record 1:- name = "Gelesh" e mail = gelesh.had...@gmail.com Location 
Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name .... 

Here string "=Bangalorrecord 3: " satisfy two condition 
1) contains the delimiter "record"
2) The character / character sequence immediately b4 the delimiter (ie 'r') 
matches with first character (or character sequence ) of delimiter.  (ie 
"=Bangalor" ends with and Delimiter starts with same character/char sequence 
'r' ),

Hear the delimiter is skipped

          Environment: Linux  (was: Lynux)
    Affects Version/s: 0.20.204.0
                       0.21.0
                       1.0.3

Test case:

Input file text:
record 1 name: Java Location:UAErecord 2 name:Gelesh Location:Bangalorrecord 3 
name Hadoop Location:Kerala

Delimiter = "record"

Expected values in map:
 1 name: Java Location:UAE
 2 name:Gelesh Location:Bangalor
 3 name Hadoop Location:Kerala

Actual values received in map:
 1 name: Java Location:UAE
 2 name:Gelesh Location:Bangalorrecord 3 name Hadoop Location:Kerala
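
The symptom above can be reproduced outside Hadoop with a small stand-alone 
sketch. This is not the Hadoop record reader code; the class and method names 
(DelimiterScanSketch, splitNaive, splitWithBacktrack) are hypothetical and 
exist only for illustration. It assumes a single-pass matcher that, on a 
mismatch, resets its match counter without re-examining the characters it has 
already consumed, which is one plausible way to get the "Actual values" shown 
above. The second method rewinds to just after the start of the failed partial 
match, so a delimiter that begins inside that partial match is still found.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Hadoop sources.
class DelimiterScanSketch {

    // Single-pass scan that forgets a partial match on mismatch.
    // Assumed to resemble the faulty behaviour described in this issue.
    static List<String> splitNaive(String text, String delim) {
        List<String> records = new ArrayList<>();
        int start = 0; // index where the current record begins
        int d = 0;     // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delim.charAt(d)) {
                d++;
                if (d == delim.length()) {
                    records.add(text.substring(start, i + 1 - delim.length()));
                    start = i + 1;
                    d = 0;
                }
            } else {
                // Problem: the current character is never retried against
                // the first delimiter character, so "r" + "record" is missed.
                d = 0;
            }
        }
        records.add(text.substring(start));
        return records;
    }

    // Same scan, but a failed partial match rewinds the input position.
    static List<String> splitWithBacktrack(String text, String delim) {
        List<String> records = new ArrayList<>();
        int start = 0;
        int d = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delim.charAt(d)) {
                d++;
                if (d == delim.length()) {
                    records.add(text.substring(start, i + 1 - delim.length()));
                    start = i + 1;
                    d = 0;
                }
            } else if (d > 0) {
                // Go back to where the partial match began; the loop's i++
                // then resumes one character after that position.
                i -= d;
                d = 0;
            }
        }
        records.add(text.substring(start));
        return records;
    }

    public static void main(String[] args) {
        String input = "record 1 name: Java Location:UAErecord 2 name:Gelesh "
                + "Location:Bangalorrecord 3 name Hadoop Location:Kerala";
        System.out.println("naive:     " + splitNaive(input, "record"));
        System.out.println("backtrack: " + splitWithBacktrack(input, "record"));
    }
}
```

Because the sample input begins with the delimiter, both methods in this 
sketch also return an empty first element, which the listings above do not 
show. The naive method then yields the two "actual" values and the 
backtracking method yields the three "expected" values.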


                
> TextInputFormat delimiter  bug:- Input Text portion ends with & Delimiter 
> starts with same char/char sequence
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4512
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/mumak, mr-am, mrv1, mrv2, task
>    Affects Versions: 0.20.204.0, 0.21.0, 1.0.3, 2.0.0-alpha
>         Environment: Linux
>            Reporter: Gelesh
>              Labels: patch
>             Fix For: 0.20.204.0
>
>         Attachments: MAPREDUCE-4512.txt
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
