On 11 Sep 2014, at 0:47, Felix Chern <idry...@gmail.com> wrote:

> If you don’t want anything to be inserted, just set your output to key only or 
> value only.
> TextOutputFormat$LineRecordWriter won’t insert anything unless both the key 
> and the value are set:


If I output the value only, for instance, and my line contains a TAB, will 
everything before the TAB be lost?
And if I output the key only and my line contains a TAB, will everything after 
the TAB be lost?


> 
>     public synchronized void write(K key, V value)
>       throws IOException {
> 
>       boolean nullKey = key == null || key instanceof NullWritable;
>       boolean nullValue = value == null || value instanceof NullWritable;
>       if (nullKey && nullValue) {
>         return;
>       }
>       if (!nullKey) {
>         writeObject(key);
>       }
>       if (!(nullKey || nullValue)) {
>         out.write(keyValueSeparator);
>       }
>       if (!nullValue) {
>         writeObject(value);
>       }
>       out.write(newline);
>     }
> 
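
For reference, here is a minimal standalone sketch of the branching in the 
quoted write(). Plain strings stand in for the Writable key and value, null 
stands in for both null and NullWritable, and the class and method names are 
hypothetical, not Hadoop's. It models only the writer: a key-only or 
value-only record goes through unchanged, embedded tabs included, and the 
separator appears only when both are set. It says nothing about how a line is 
split into key and value before it reaches the writer.

```java
// Hypothetical stand-in for the quoted LineRecordWriter.write(); not Hadoop code.
final class LineWriterSketch {
    // Assumption: the separator is a single tab, as in the thread's examples.
    private static final String KEY_VALUE_SEPARATOR = "\t";
    private static final String NEWLINE = "\n";

    // Returns the text that one write(key, value) call would emit.
    static String format(String key, String value) {
        boolean nullKey = key == null;
        boolean nullValue = value == null;
        if (nullKey && nullValue) {
            return ""; // nothing is written, not even a newline
        }
        StringBuilder out = new StringBuilder();
        if (!nullKey) {
            out.append(key);
        }
        if (!(nullKey || nullValue)) {
            out.append(KEY_VALUE_SEPARATOR); // only when both are present
        }
        if (!nullValue) {
            out.append(value);
        }
        out.append(NEWLINE);
        return out.toString();
    }

    public static void main(String[] args) {
        // Key only: the embedded tabs are kept and no separator is added.
        System.out.print(format("a\t\t\tb", null));
        // Value only: same behaviour.
        System.out.print(format(null, "a\t\t\tb"));
        // Both set: exactly one separator is inserted between them.
        System.out.print(format("key", "value"));
    }
}
```
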
> On Sep 10, 2014, at 1:37 PM, Dmitry Sivachenko <trtrmi...@gmail.com> wrote:
> 
>> 
>> On 10 Sep 2014, at 22:33, Felix Chern <idry...@gmail.com> wrote:
>> 
>>> Use ‘tr -s’ to squeeze out the repeated tabs?
>>> 
>>> $ echo -e "a\t\t\tb"
>>> a                   b
>>> 
>>> $ echo -e "a\t\t\tb" | tr -s "\t"
>>> a   b
>>> 
>> 
>> There can be tabs in the input, and I want to keep the input lines without 
>> any modification.
>> 
>> Actually, it is a rather standard task: process lines one by one without 
>> inserting extra characters.  There should be a standard solution for it IMO.
>> 
> 
