[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998641#comment-12998641
 ] 

Todd Lipcon commented on MAPREDUCE-2254:
----------------------------------------

test-patch results:
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec] 
     [exec]     +1 system test framework.  The patch passed system test framework compile.

Kicking off unit tests now.

> Allow setting of end-of-record delimiter for TextInputFormat
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-2254
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2254
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Ahmed Radwan
>         Attachments: MAPREDUCE-2245.patch, MAPREDUCE-2254_r2.patch, 
> MAPREDUCE-2254_r3.patch
>
>
> It would be useful to allow setting the end-of-record delimiter for 
> TextInputFormat. The current implementation hardcodes '\n', '\r' or '\r\n' as 
> the only possible record delimiters. This is a problem if users have embedded 
> newlines in their data fields (which is pretty common). This is also a 
> problem for other tools using this TextInputFormat (See for example: 
> https://issues.apache.org/jira/browse/PIG-836 and 
> https://issues.cloudera.org/browse/SQOOP-136).
> I have written a patch to address this issue. This patch allows users to 
> specify any custom end-of-record delimiter using a newly added configuration 
> property. For backward compatibility, if this new configuration property is 
> absent, then exactly the same delimiters as before are used (i.e., '\n', '\r', or 
> '\r\n').
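
For illustration only, here is a minimal sketch of the behavior the description asks for: split on a custom end-of-record delimiter when one is supplied, and fall back to '\n', '\r' or '\r\n' when it is absent. This is not the code from the attached patch. The class name RecordSplitSketch and the method splitRecords are hypothetical, and the sketch works on an in-memory String rather than on Hadoop input splits or streams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not taken from the MAPREDUCE-2254 patch.
class RecordSplitSketch {

    /**
     * Splits data into records. A null or empty delimiter stands for
     * "property absent" and selects the default '\n', '\r' or '\r\n' handling.
     */
    static List<String> splitRecords(String data, String delimiter) {
        List<String> records = new ArrayList<>();
        int start = 0;

        if (delimiter == null || delimiter.isEmpty()) {
            // Default behavior: any of '\n', '\r' or '\r\n' ends a record.
            for (int i = 0; i < data.length(); i++) {
                char c = data.charAt(i);
                if (c == '\n' || c == '\r') {
                    records.add(data.substring(start, i));
                    // Treat "\r\n" as one delimiter, not two.
                    if (c == '\r' && i + 1 < data.length()
                            && data.charAt(i + 1) == '\n') {
                        i++;
                    }
                    start = i + 1;
                }
            }
        } else {
            // Custom behavior: only the exact delimiter string ends a record,
            // so newlines embedded in a field stay inside that record.
            int idx;
            while ((idx = data.indexOf(delimiter, start)) >= 0) {
                records.add(data.substring(start, idx));
                start = idx + delimiter.length();
            }
        }

        // Emit any trailing record that has no terminating delimiter.
        if (start < data.length()) {
            records.add(data.substring(start));
        }
        return records;
    }
}
```

With a custom delimiter such as "|#|", an input like "x\ny|#|z" yields two records and the embedded newline stays inside the first one; with no delimiter given, the same input is split at the newline as before.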

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
