Re: LZO Compression Libraries don't appear to work properly with MultipleOutputs

Todd Lipcon Thu, 21 Oct 2010 15:32:29 -0700

Hi Ed,

Sounds like this might be a bug, either in MultipleOutputs or in LZO.


Does it work properly with gzip compression? Which LZO implementation
are you using: the one from Google Code, or the more up-to-date one
from GitHub (either kevinweil's or mine)?

Any chance you could write a unit test that shows the issue?
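
As a starting point for such a test, here is a minimal sketch that uses
only the JDK's gzip classes, so it runs without Hadoop or the LZO native
libraries. It checks one hypothesis, which is an assumption and not a
confirmed diagnosis: that the "unexpected end of file" error comes from a
compressed stream that was never closed (for example, if
MultipleOutputs.close() is not called from the reducer's cleanup()). The
class and method names below are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compares a properly closed compressed stream
// with one that is abandoned without close().
class TruncatedStreamCheck {

    // Compress data in memory; optionally skip close() to mimic a
    // writer that is never closed.
    static byte[] compress(byte[] data, boolean close) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(sink);
            gz.write(data);
            if (close) {
                gz.close(); // flushes remaining blocks and writes the trailer
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true only if the whole stream can be read to its end.
    static boolean decompressesCleanly(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // discard the data; only stream integrity matters here
            }
            return true;
        } catch (IOException e) {
            return false; // a truncated stream ends up here
        }
    }

    public static void main(String[] args) {
        byte[] data = "some reducer output\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("closed:   " + decompressesCleanly(compress(data, true)));
        System.out.println("unclosed: " + decompressesCleanly(compress(data, false)));
    }
}
```

If the real job shows the same pattern (files from context.write() read
fine, files from MultipleOutputs are cut short), that would point toward
a missing close rather than the codec. A Hadoop-level test would still
be needed to confirm it.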

Thanks
-Todd

On Thu, Oct 21, 2010 at 2:52 PM, ed <hadoopn...@gmail.com> wrote:
> Hello everyone,
>
> I am having problems using MultipleOutputs with LZO compression (could be a
> bug or something wrong in my own code).
>
> In my driver I set
>
>     MultipleOutputs.addNamedOutput(job, "test", TextOutputFormat.class,
>         NullWritable.class, Text.class);
>
> In my reducer I have:
>
>     MultipleOutputs<NullWritable, Text> mOutput =
>         new MultipleOutputs<NullWritable, Text>(context);
>
>     public String generateFileName(Key key){
>        return "custom_file_name";
>     }
>
> Then in the reduce() method I have:
>
>     mOutput.write(mNullWritable, mValue, generateFileName(key));
>
> This produces LZO files that do not decompress properly (lzop -d
> throws the error "lzop: unexpected end of file: outputFile.lzo").
>
> If I switch back to the regular context.write(mNullWritable, mValue);
> everything works fine.
>
> Am I forgetting a step needed when using MultipleOutputs, or is this a
> bug/non-feature of using LZO compression in Hadoop?
>
> Thank you!
>
>
> ~Ed
>



-- 
Todd Lipcon
Software Engineer, Cloudera
