Hi Simone,

I was wondering: is it possible to write Avro files to Hadoop straight from your lib (mixed with the Avro libs, of course)? I'm currently trying to come up with a way to read from MySQL (something more complicated than Sqoop can handle) and write it out as Avro files on HDFS. Is something like this feasible with Pydoop? How do you see it?
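
To make it a bit more concrete, this is roughly what I have in mind (just an untested sketch; I'm assuming hdfs.open gives me a writable file-like object that avro's DataFileWriter can use, and the schema, query and paths are made up):

    import MySQLdb
    import pydoop.hdfs as hdfs
    from avro.datafile import DataFileWriter
    from avro.io import DatumWriter
    from avro.schema import parse

    # made-up schema, just for illustration
    SCHEMA = parse("""
    {"type": "record", "name": "User", "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"}
    ]}
    """)

    db = MySQLdb.connect(host="localhost", user="bart", passwd="...", db="mydb")
    cur = db.cursor()
    cur.execute("SELECT id, name FROM users")  # in reality too complex for Sqoop

    # write an Avro container file directly to HDFS
    f = hdfs.open("/user/bart/users.avro", "w")
    writer = DataFileWriter(f, DatumWriter(), SCHEMA)
    for id_, name in cur.fetchall():
        writer.append({"id": id_, "name": name})
    writer.close()  # should also flush/close the underlying HDFS file

Would that be the way to go, or is there a better hook in Pydoop for this?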

Thanks!

Bart

Simone Leo wrote on 13.11.2012 14:11:
Hello everyone,

we're happy to announce that we have just released Pydoop 0.7.0-rc1
(http://pydoop.sourceforge.net).

The main changes with respect to the previous version are:

 * support for CDH4 (MapReduce v1 only)
 * tested with the latest releases of other supported Hadoop versions
 * simpler build process
 * Pydoop scripts can now accept user-defined configuration parameters
 * new wrapper object makes it easier to interact with the JobConf
 * new hdfs.path functions: isdir, isfile, kind (quick example below)
 * HDFS: chmod now also accepts permission modes given as strings
 * several bug fixes
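
Here is a quick taste of the new HDFS bits (an untested sketch; the
paths are placeholders, and the exact return values and accepted mode
strings are documented in the API reference):

    import pydoop.hdfs as hdfs
    import pydoop.hdfs.path as hpath

    # new path helpers
    print hpath.isdir("/user/simleo")              # True for directories
    print hpath.isfile("/user/simleo/part-00000")  # True for regular files
    print hpath.kind("/user/simleo/part-00000")    # kind of object at the path

    # chmod now also understands a string description of the mode, e.g.:
    hdfs.chmod("/user/simleo/part-00000", "a+r")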

This is a release candidate.  We're working on binary packages for
the final release.  As usual, we're happy to receive your feedback on
the forum:

http://sourceforge.net/projects/pydoop/forums/forum/990018

Pydoop is a Python MapReduce and HDFS API for Hadoop, built upon the C++
Pipes and the C libhdfs APIs, that lets you write full-fledged
MapReduce applications with HDFS access. Pydoop has been maturing
nicely and is currently in production use at CRS4, where several of
our scientific projects are based on it, including Seal
(http://biodoop-seal.sourceforge.net), Biodoop-BLAST
(http://biodoop.sourceforge.net/blast), and more yet to be released.
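
If you have not tried it yet, a Pydoop application is just a Python
module; a word count looks more or less like this (a sketch from
memory -- the context method names come from the pipes API, so please
double-check against the tutorial in the docs):

    import pydoop.pipes as pp

    class Mapper(pp.Mapper):
        def map(self, context):
            # the input value is a line of text
            for word in context.getInputValue().split():
                context.emit(word, "1")

    class Reducer(pp.Reducer):
        def reduce(self, context):
            # sum the counts emitted for each word
            total = 0
            while context.nextValue():
                total += int(context.getInputValue())
            context.emit(context.getInputKey(), str(total))

    if __name__ == "__main__":
        pp.runTask(pp.Factory(Mapper, Reducer))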

Links:

 * full release notes: http://pydoop.sourceforge.net/docs/news.html
 * download page on SourceForge: http://sourceforge.net/projects/pydoop/files
 * download page on PyPI: http://pypi.python.org/pypi/pydoop/0.7.0-rc1
 * git repo:

http://pydoop.git.sourceforge.net/git/gitweb.cgi?p=pydoop/pydoop;a=summary

Happy pydooping!


The Pydoop Team
