Hi Utku,

hiho currently works only with MySQL, and with Hadoop 0.20. It uses
features of the MySQL Connector/J driver to load data in parallel into the
database through multiple map tasks. The number of tasks is determined by
the number of files you wish to load into the database. The current release
supports output from HDFS to the database. I am hoping to release the
import portion next week. If you try it, please feel free to send me any
feedback. If you need any other help, please shoot me a direct email.
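
For the curious, the idea can be sketched roughly as below. This is only an
illustration of the scheme described above, not hiho's actual API; the class
and method names here are made up, and the JDBC part is shown only in
comments:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- these names are not hiho's real API.
public class ParallelLoadSketch {

    // One map task is planned per input file, so the number of tasks
    // equals the number of files being loaded into the database.
    public static List<String> planTasks(List<String> inputFiles) {
        return new ArrayList<>(inputFiles);
    }

    // Each map task would then push its own file to MySQL over a separate
    // Connector/J connection, along the lines of (details omitted):
    //   Connection c = DriverManager.getConnection(jdbcUrl, user, password);
    //   c.createStatement().execute(
    //       "LOAD DATA LOCAL INFILE '" + localFile + "' INTO TABLE target");

    public static void main(String[] args) {
        List<String> files = List.of("part-00000", "part-00001", "part-00002");
        System.out.println("map tasks: " + planTasks(files).size());
    }
}
```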

I hope this helps you.

Thanks and Regards,
Sonal


On Fri, Mar 19, 2010 at 4:12 PM, Utku Can Topçu <u...@topcu.gen.tr> wrote:

> Thank you both, Aaron and Sonal, for your valuable comments and
> contributions.
>
> I'll check both the projects and try to make a design decision.
>
> I'm familiar with sqoop and have just heard about hiho.
>
> Sonal: I guess hiho is a single map/reduce job handling the MySQL
> Hadoop integration. Is it also possible to use it with other JDBC
> connectors too?
>
> Best Regards,
> Utku
>
> On Fri, Mar 19, 2010 at 5:07 AM, Sonal Goyal <sonalgoy...@gmail.com>
> wrote:
>
> > Hi Utku,
> >
> > If MySQL is your target database, you may check Meghsoft's hiho:
> >
> > http://code.google.com/p/hiho/
> >
> > The current release supports transferring data from Hadoop to MySQL.
> > We will be releasing the transfer from MySQL to Hadoop soon, sometime
> > next week.
> >
> > Thanks and Regards,
> > Sonal
> > www.meghsoft.com
> >
> >
> > On Thu, Mar 18, 2010 at 5:31 AM, Aaron Kimball <aa...@cloudera.com>
> > wrote:
> >
> > > Hi Utku,
> > >
> > > Apache Hadoop 0.20 cannot support Sqoop as-is. Sqoop makes use of
> > > DataDrivenDBInputFormat (among other APIs) that are not shipped with
> > > Apache's 0.20 release. In order to get Sqoop working on 0.20, you'd
> > > need to apply a lengthy list of patches from the project source
> > > repository to your copy of Hadoop and recompile. Or you could just
> > > download it all from Cloudera, where we've done that work for you :)
> > >
> > > So as it stands, Sqoop won't be able to run on 0.20 unless you choose
> > > to use Cloudera's distribution. Do note that your use of the term
> > > "fork" is a bit strong here; with the exception of (minor)
> > > modifications to make it interact more compatibly with the external
> > > Linux environment, our distribution only includes code that's
> > > available to the project at large. But some of that code has not been
> > > rolled into a binary release from Apache yet. If you choose to go
> > > with Cloudera's distribution, it just means that you get publicly
> > > available features (like Sqoop, MRUnit, etc.) a year or so ahead of
> > > what Apache has formally released. Our codebase isn't radically
> > > diverging; CDH is just somewhat ahead of the Apache 0.20 release, but
> > > behind Apache's svn trunk. (Sqoop, MRUnit, etc. are all available in
> > > the Hadoop source repository on the trunk branch.)
> > >
> > > If you install our distribution, Sqoop will be installed in
> > > /usr/lib/hadoop-0.20/contrib/sqoop and /usr/bin/sqoop for you. There
> > > isn't a separate package to install Sqoop independent of the rest of
> > > CDH; thus there is no extra download link on our site.
> > >
> > > I hope this helps!
> > >
> > > Good luck,
> > > - Aaron
> > >
> > >
> > > On Wed, Mar 17, 2010 at 4:30 AM, Reik Schatz <reik.sch...@bwin.org>
> > > wrote:
> > >
> > > > At least for MRUnit, I was not able to find it outside of the
> > > > Cloudera distribution (CDH). What I did: installed CDH locally
> > > > using apt (Ubuntu), searched for and copied the mrunit library into
> > > > my local Maven repository, and removed CDH afterwards. I guess the
> > > > same is somehow possible for Sqoop.
> > > >
> > > > /Reik
> > > >
> > > >
> > > > Utku Can Topçu wrote:
> > > >
> > > >> Dear All,
> > > >>
> > > >> I'm trying to run tests using MySQL as some kind of a data
> > > >> source, so I thought Cloudera's sqoop would be a nice project to
> > > >> have in production. However, I'm not using Cloudera's hadoop
> > > >> distribution right now, and actually I'm not thinking of switching
> > > >> from a main project to a fork.
> > > >>
> > > >> I read the documentation on sqoop at
> > > >> http://www.cloudera.com/developers/downloads/sqoop/ but there is
> > > >> actually no link for downloading sqoop itself.
> > > >>
> > > >> Does anyone here know about, or has anyone tried, using sqoop
> > > >> with the latest Apache hadoop? If so, can you give me some tips
> > > >> and tricks on it?
> > > >>
> > > >> Best Regards,
> > > >> Utku
> > > >>
> > > >>
> > > >
> > > > --
> > > >
> > > > *Reik Schatz*
> > > > Technical Lead, Platform
> > > > P: +46 8 562 470 00
> > > > M: +46 76 25 29 872
> > > > F: +46 8 562 470 01
> > > > E: reik.sch...@bwin.org <mailto:reik.sch...@bwin.org>
> > > > */bwin/* Games AB
> > > > Klarabergsviadukten 82,
> > > > 111 64 Stockholm, Sweden
> > > >
> > > >
> > > >
> > >
> >
>
