Hi Hadoop, Hive, and Sqoop users,

For the past year, the Apache Hadoop MapReduce project has played host to
Sqoop, a command-line tool that performs parallel imports and exports
between relational databases and HDFS. We've developed a lot of features and
gotten a lot of great feedback from users. As a contrib project in Hadoop,
Sqoop has steadily improved and grown.

But the contrib directory is a home for new or small projects incubating
underneath Hadoop's umbrella. Sqoop is starting to look less like a small
project these days. In particular, a feature that has been growing in
importance for Sqoop is its ability to integrate with Hive. In order to
facilitate this integration from a compilation and testing standpoint, we've
pulled Sqoop out of contrib and into its own repository hosted on GitHub.

You can download all the relevant bits here:
http://www.github.com/cloudera/sqoop

The code there will run in conjunction with the Apache Hadoop trunk source.
(Compatibility with other distributions/versions is forthcoming.)

While we've changed hosts, Sqoop will keep the same license -- future
improvements will remain Apache 2.0-licensed. We welcome the
contributions of all in the open source community; there's a lot of exciting
work still to be done! If you'd like to help out but aren't sure where to
start, send me an email and I can recommend a few areas where improvements
would be appreciated.

Want some more information about Sqoop? An introduction is available here:
http://www.cloudera.com/sqoop
A ready-to-run release of Sqoop is included with Cloudera's Distribution for
Hadoop: http://archive.cloudera.com
And its reference manual is available for browsing at
http://archive.cloudera.com/docs/sqoop

If you have any questions about the move, please ask me.

Regards,
- Aaron Kimball
Cloudera, Inc.
