Bobby,

We're working hard to make compression easier. The biggest hurdle currently is the licensing of the LZO codec libraries (GPL, which is not compatible with the ASF BSD-style license).
Outside of making the changes to the mapred-site.xml file, what do you view as the biggest pain point with your setup?

Josh Patterson
Cloudera

On Thu, Aug 5, 2010 at 6:52 PM, Bobby Dennett <bdennett+softw...@gmail.com> wrote:
> We are looking to enable LZO compression of the map outputs on our
> Cloudera 0.20.1 cluster. It seems there are various sets of
> instructions available and I am curious what your thoughts are
> regarding which one would be best for our Hadoop distribution and OS
> (Ubuntu 8.04 64-bit). In particular, hadoop-gpl-compression
> (http://code.google.com/p/hadoop-gpl-compression) vs. hadoop-lzo
> (http://github.com/kevinweil/hadoop-lzo).
>
> Some of what appear to be the better instructions/guides out there:
> * Josh Patterson's reply on June 25th to the "Newbie to HDFS
>   compression" thread --
>   http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201006.mbox/%3caanlktileo-q8useip8y3na9pdyhlyufippr0in0lk...@mail.gmail.com%3e
> * hadoop-gpl-compression FAQ --
>   http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ
> * "Hadoop at Twitter (part 1): Splittable LZO Compression" blog post
>   --
>   http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
>
> Thanks in advance,
> -Bobby