On Thu, Aug 5, 2010 at 4:52 PM, Bobby Dennett <bdenn...@gmail.com> wrote:

> Hi Josh,
>
> No real pain points... just trying to investigate/research the "best"
> way to create the necessary libraries and jar files to support LZO
> compression in Hadoop. In particular, there are the 2 "repositories"
> to build from and I am trying to find out if one should be used over
> the other. For instance, in your previous posting, you refer to
> hadoop-gpl-compression while the Twitter blog post from last year
> mentions the Hadoop-LZO project. Briefly looking, it seems Hadoop-LZO
> is preferable but we're curious if there are any caveats/gotchas we
> should be aware of.
>

Yes, definitely use the hadoop-lzo project from github -- either from my
repo or from kevinweil's (the two are kept in sync).

The repo on Google Code has a number of known bugs, which is why we forked
it over to github last year.

-Todd

On Thu, Aug 5, 2010 at 15:59, Josh Patterson <j...@cloudera.com> wrote:
> > Bobby,
> >
> > We're working hard to make compression easier; the biggest hurdle
> > currently is the licensing issue around the LZO codec libs (GPL,
> > which is not compatible with the ASF's BSD-style license).
> >
> > Outside of making the changes to the mapred-site.xml file, what do you
> > view as the biggest pain point with your setup?
> >
> > Josh Patterson
> > Cloudera
> >
> > On Thu, Aug 5, 2010 at 6:52 PM, Bobby Dennett
> > <bdennett+softw...@gmail.com> wrote:
> >> We are looking to enable LZO compression of the map outputs on our
> >> Cloudera 0.20.1 cluster. It seems there are various sets of
> >> instructions available and I am curious what your thoughts are
> >> regarding which one would be best for our Hadoop distribution and OS
> >> (Ubuntu 8.04 64-bit). In particular, hadoop-gpl-compression
> >> (http://code.google.com/p/hadoop-gpl-compression) vs. hadoop-lzo
> >> (http://github.com/kevinweil/hadoop-lzo).
> >>
> >> Some of what appear to be the better instructions/guides out there:
> >> * Josh Patterson's reply on June 25th to the "Newbie to HDFS
> >> compression" thread --
> >> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201006.mbox/%3caanlktileo-q8useip8y3na9pdyhlyufippr0in0lk...@mail.gmail.com%3e
> >> * hadoop-gpl-compression FAQ --
> >> http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ
> >> * "Hadoop at Twitter (part 1): Splittable LZO Compression" blog post
> >> --
> >> http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
> >>
> >> Thanks in advance,
> >> -Bobby
> >>
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
