[
https://issues.apache.org/jira/browse/HADOOP-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487898
]
Hadoop QA commented on HADOOP-819:
----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12355260/patch-819.txt applied
and successfully tested against trunk revision r527100.
Results are at
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/20/console
> LineRecordWriter should not always insert tab char between key and value
> ------------------------------------------------------------------------
>
> Key: HADOOP-819
> URL: https://issues.apache.org/jira/browse/HADOOP-819
> Project: Hadoop
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assigned To: Runping Qi
> Attachments: patch-819.txt
>
>
> With the current implementation of LineRecordWriter in TextOutputFormat, the
> client cannot pass a null key or value to the write function, and a tab char is
> always inserted between the key and value. This works fine most of the time.
> However, in some
> cases, one just does not want the extra tab char. A common example:
> if I need to implement a utility similar
> to the unix sort with some fields in the lines as the sort key, I can have my
> map extract the sort key from each line and pass the whole line as the
> value. The reducer just outputs the values and ignores the keys. However, if I
> use TextOutputFormat, my output will have an extra tab char in each of the
> lines, which is annoying.
> A simple solution is to let the write function of LineRecordWriter accept
> a null key argument and, when the key is null, write out only the value.