Hi Hadoop, Hive, and Sqoop users,

For the past year, the Apache Hadoop MapReduce project has played host to Sqoop, a command-line tool that performs parallel imports and exports between relational databases and HDFS. We've developed a lot of features and gotten a lot of great feedback from users. During its time as a Hadoop contrib project, Sqoop has steadily improved and grown.

But the contrib directory is a home for new or small projects incubating underneath Hadoop's umbrella, and Sqoop is starting to look less like a small project these days. In particular, a feature that has been growing in importance for Sqoop is its ability to integrate with Hive. To facilitate this integration from a compilation and testing standpoint, we've pulled Sqoop out of contrib and into its own repository hosted on GitHub. You can download all the relevant bits here:

http://www.github.com/cloudera/sqoop

The code there will run in conjunction with the Apache Hadoop trunk source. (Compatibility with other distributions/versions is forthcoming.)

While we've changed hosts, Sqoop will keep the same license; future improvements will remain Apache 2.0-licensed. We welcome the contributions of all in the open source community; there's a lot of exciting work still to be done! If you'd like to help out but aren't sure where to start, send me an email and I can recommend a few areas where improvements would be appreciated.

Want some more information about Sqoop? An introduction is available here:

http://www.cloudera.com/sqoop

A ready-to-run release of Sqoop is included with Cloudera's Distribution for Hadoop:

http://archive.cloudera.com

Its reference manual is available for browsing at:

http://archive.cloudera.com/docs/sqoop

If you have any questions about this move, please ask me.

Regards,
- Aaron Kimball
Cloudera, Inc.