Another approach, and in my opinion a somewhat better one, is to use the hadoop-gpl-compression project (http://code.google.com/p/hadoop-gpl-compression/). It also incorporates Johan Oskarsson's HADOOP-4640 patch. A detailed description of how to use it with an LZO-less Hadoop distribution can be found at: http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ.

Thanks, Hong

On Jul 21, 2009, at 8:05 PM, Gross, Danny wrote:

Thanks Aaron, for the quick response.

Best Regards,

Danny

-----Original Message-----
From: Aaron Kimball [mailto:aa...@cloudera.com]
Sent: Tuesday, July 21, 2009 9:10 PM
To: common-user@hadoop.apache.org
Subject: Re: native-lzo library not available issue with terasort

Native LZO support was removed from Hadoop due to licensing
restrictions. See
http://www.cloudera.com/blog/2009/06/24/parallel-lzo-splittable-compression-for-hadoop/
for a writeup on how to reenable it in your local build.

- Aaron

On Tue, Jul 21, 2009 at 7:02 PM, Gross, Danny <danny.gr...@spansion.com> wrote:
Hello,



I've been running terasort on multiple cluster configurations, and
attempted to duplicate some of the configuration settings that Yahoo!
used for the Minute Sort.



In particular, I set the mapred.map.output.compression.codec property to the value "org.apache.hadoop.io.compress.LzoCodec" in hadoop-site.xml. I
am using hadoop-0.19.1.
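
For reference, a minimal sketch of how that setting might look in hadoop-site.xml. The mapred.compress.map.output entry is an assumption: the message only names the codec property, but map output compression generally has to be switched on for the codec to be used.

```xml
<!-- Sketch only: mapred.compress.map.output is assumed, not quoted from the message -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <!-- Codec property and value as described in the message -->
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.LzoCodec</value>
</property>
```

Pointing the codec value at org.apache.hadoop.io.compress.DefaultCodec instead should avoid the dependency on the native LZO library, at the cost of a different compression speed and ratio.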



The teragen program runs fine and completes faster with my new settings. However, when I run the terasort program, the following
error is thrown from the map tasks, and the job ultimately fails:



"java.lang.RuntimeException: native-lzo library not available
    at org.apache.hadoop.io.compress.LzoCodec.getCompressorType(LzoCodec.java:130)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:98)
    at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:93)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:961)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.Child.main(Child.java:158)"



I've searched other places for an answer, and am coming up short. Any
help out there would be greatly appreciated.



Best regards,



Danny



Danny B. Gross

Solutions Engineering

Spansion,  Inc.

email:  danny.gr...@spansion.com <mailto:danny.gr...@spansion.com>




