Hi Utku,

Apache Hadoop 0.20 cannot support Sqoop as-is. Sqoop makes use of DataDrivenDBInputFormat (among other APIs), which is not shipped with Apache's 0.20 release. To get Sqoop working on 0.20, you'd need to apply a lengthy list of patches from the project source repository to your copy of Hadoop and recompile. Or you could just download it all from Cloudera, where we've done that work for you :)
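If you want to check whether a given Hadoop build already carries that class, a minimal sketch along these lines may help. It only looks inside the jar (which is a zip archive) for the class file. The package path used here, org.apache.hadoop.mapreduce.lib.db, is an assumption about where the class lives in your build, and the jar path you pass in is whatever your install uses; neither comes from this thread.

```python
import sys
import zipfile

# Assumed location of the class inside the jar; adjust if your build differs.
CLASS_ENTRY = "org/apache/hadoop/mapreduce/lib/db/DataDrivenDBInputFormat.class"


def jar_has_class(jar_path, class_entry=CLASS_ENTRY):
    """Return True if the jar at jar_path contains the given .class entry."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry in jar.namelist()


if __name__ == "__main__":
    # Pass one or more jar paths on the command line (paths are up to you).
    for path in sys.argv[1:]:
        status = "found" if jar_has_class(path) else "missing"
        print(path, "->", status)
```

A "missing" result for your Hadoop core jar would be consistent with the situation described above, though it is not a full compatibility check, since Sqoop depends on other APIs as well.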
So as it stands, Sqoop won't be able to run on 0.20 unless you choose to use Cloudera's distribution.

Do note that your use of the term "fork" is a bit strong here; with the exception of (minor) modifications to make it interact in a more compatible manner with the external Linux environment, our distribution only includes code that's available to the project at large. But some of that code has not been rolled into a binary release from Apache yet. If you choose to go with Cloudera's distribution, it just means that you get publicly-available features (like Sqoop, MRUnit, etc.) a year or so ahead of what Apache has formally released, but our codebase isn't radically diverging; CDH is just somewhere ahead of the Apache 0.20 release, but behind Apache's svn trunk. (All of Sqoop, MRUnit, etc. are available in the Hadoop source repository on the trunk branch.)

If you install our distribution, then Sqoop will be installed in /usr/lib/hadoop-0.20/contrib/sqoop and /usr/bin/sqoop for you. There isn't a separate package to install Sqoop independent of the rest of CDH; thus no extra download link on our site.

I hope this helps!

Good luck,
- Aaron

On Wed, Mar 17, 2010 at 4:30 AM, Reik Schatz <reik.sch...@bwin.org> wrote:

> At least for MRUnit, I was not able to find it outside of the Cloudera
> distribution (CDH). What I did: installing CDH locally using apt (Ubuntu),
> searched for and copied the mrunit library into my local Maven repository,
> and removed CDH after. I guess the same is somehow possible for Sqoop.
>
> /Reik
>
> Utku Can Topçu wrote:
>
>> Dear All,
>>
>> I'm trying to run tests using MySQL as some kind of a datasource, so I
>> thought cloudera's sqoop would be a nice project to have in the
>> production.
>> However, I'm not using the cloudera's hadoop distribution right now, and
>> actually I'm not thinking of switching from a main project to a fork.
>>
>> I read the documentation on sqoop at
>> http://www.cloudera.com/developers/downloads/sqoop/ but there are
>> actually no links for downloading the sqoop itself.
>>
>> Has anyone here know, and tried to use sqoop with the latest apache
>> hadoop?
>> If so can you give me some tips and tricks on it?
>>
>> Best Regards,
>> Utku
>
> --
> Reik Schatz
> Technical Lead, Platform
> bwin Games AB