[ https://issues.apache.org/jira/browse/MAPREDUCE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981447#action_12981447 ]
Ahmed Radwan commented on MAPREDUCE-2254:
-----------------------------------------

Hi Todd. I agree that the changes could go directly into the LineReader. My motivation was to keep the LineReader mostly unchanged, in case it is used in other contexts. The LineReader breaks the input stream on newlines, which is fine and does exactly what its name suggests. This is why I thought of encapsulating the changes within the RecordReader (where, conceptually, these changes are required). However, I see your point that it looks a little odd. I can move the changes to LineReader, but then its name will no longer convey its functionality, and renaming it could cause other problems. What do you think?

> Allow setting of end-of-record delimiter for TextInputFormat
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-2254
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2254
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Ahmed Radwan
>         Attachments: MAPREDUCE-2245.patch
>
>
> It would be useful to allow setting the end-of-record delimiter for
> TextInputFormat. The current implementation hardcodes '\n', '\r' or '\r\n' as
> the only possible record delimiters. This is a problem if users have embedded
> newlines in their data fields (which is pretty common). This is also a
> problem for other tools that use TextInputFormat (see, for example,
> https://issues.apache.org/jira/browse/PIG-836 and
> https://issues.cloudera.org/browse/SQOOP-136).
> I have written a patch to address this issue. The patch allows users to
> specify any custom end-of-record delimiter using a newly added configuration
> property. For backward compatibility, if this new configuration property is
> absent, then exactly the same delimiters as before are used (i.e., '\n', '\r' or
> '\r\n').

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
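
To make the behaviour described in the issue concrete, here is a minimal, self-contained sketch of record splitting with an optional custom delimiter. It is not the attached patch and does not use Hadoop's LineReader or RecordReader; the class name DelimitedRecordSplitter and its split method are hypothetical, invented for illustration. The message does not name the new configuration property, so the sketch takes the delimiter as a plain argument and treats a null or empty value as "property absent", falling back to '\n', '\r' or '\r\n'.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Hadoop or the patch. */
final class DelimitedRecordSplitter {

    private DelimitedRecordSplitter() {}

    /**
     * Splits text into records.
     * A null or empty delimiter stands in for "configuration property absent"
     * and selects the default line endings: "\n", "\r" or "\r\n".
     */
    static List<String> split(String text, String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) {
            return splitOnNewlines(text);
        }
        List<String> records = new ArrayList<>();
        int start = 0;
        int idx;
        // Newlines inside a record are preserved; only the custom delimiter ends it.
        while ((idx = text.indexOf(delimiter, start)) >= 0) {
            records.add(text.substring(start, idx));
            start = idx + delimiter.length();
        }
        if (start < text.length()) {
            // Trailing record that is not terminated by a delimiter.
            records.add(text.substring(start));
        }
        return records;
    }

    /** Default behaviour: a record ends at "\n", "\r" or "\r\n". */
    private static List<String> splitOnNewlines(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\n' || c == '\r') {
                records.add(current.toString());
                current.setLength(0);
                // Treat "\r\n" as a single terminator.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                current.append(c);
            }
            i++;
        }
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }
}
```

With a custom delimiter such as "|", an input like "a\nb|c\nd|" yields two records that each keep their embedded newline, which is the case the issue is concerned with. A real implementation would also have to work on byte streams and handle delimiters that straddle buffer and input-split boundaries, which this sketch ignores.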