Hi Simone,
I was wondering, is it possible to write Avro files to Hadoop straight
from your lib (mixed with the Avro libs, of course)? I'm currently
trying to come up with a way to read from MySQL (in ways more
complicated than Sqoop can handle) and write the data out to Avro files
on HDFS. Is something like this feasible with Pydoop? How do you see it?
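To make the question concrete, this is the kind of pipeline I'm picturing (an untested sketch on my side: the schema, table, and paths are made up, and I'm assuming hdfs.open gives back a file-like object that avro's DataFileWriter can write to):

```python
# Untested sketch: dump a MySQL table to an Avro container file on HDFS.
# Assumes the 'avro' and 'MySQLdb' packages plus Pydoop's hdfs module;
# the schema, table, and paths below are made up for illustration.

SCHEMA_JSON = """
{"type": "record", "name": "Row",
 "fields": [{"name": "id", "type": "long"},
            {"name": "name", "type": "string"}]}
"""

def row_to_record(row):
    """Map an (id, name) result tuple to an Avro record dict."""
    id_, name = row
    return {"id": id_, "name": name}

def export_to_hdfs(rows, hdfs_path):
    """Write rows as an Avro container file directly to HDFS via Pydoop."""
    import avro.schema
    from avro.datafile import DataFileWriter
    from avro.io import DatumWriter
    import pydoop.hdfs as hdfs
    schema = avro.schema.parse(SCHEMA_JSON)
    # Assumption: hdfs.open in "w" mode returns a writable file-like
    # object that DataFileWriter can use directly
    f = hdfs.open(hdfs_path, "w")
    writer = DataFileWriter(f, DatumWriter(), schema)
    for row in rows:
        writer.append(row_to_record(row))
    writer.close()  # also closes the underlying HDFS file

def main():
    import MySQLdb
    conn = MySQLdb.connect(host="localhost", db="mydb", user="me")
    cursor = conn.cursor()
    cursor.execute("SELECT id, name FROM mytable")
    export_to_hdfs(cursor, "/user/bart/mytable.avro")
```

The more complicated MySQL logic would of course go where the plain SELECT is now.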
Thanks!
Bart
Simone Leo wrote on 13.11.2012 14:11:
Hello everyone,
we're happy to announce that we have just released Pydoop 0.7.0-rc1
(http://pydoop.sourceforge.net).
The main changes with respect to the previous version are:
* support for CDH4 (MapReduce v1 only)
* tested with the latest releases of other supported Hadoop versions
* simpler build process
* Pydoop scripts can now accept user-defined configuration parameters
* new wrapper object makes it easier to interact with the JobConf
* new hdfs.path functions: isdir, isfile, kind
* HDFS: support for string description of permission modes in chmod
* several bug fixes
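As a quick illustration of the new HDFS bits, usage would look roughly like this (a minimal sketch, assuming a reachable HDFS and a directory path of your own; not meant as complete documentation):

```python
def hdfs_demo(path):
    """Try out the new hdfs.path functions and string chmod modes.
    Requires a running HDFS; 'path' should be a directory you own."""
    import pydoop.hdfs as hdfs
    assert hdfs.path.isdir(path)       # new in 0.7.0
    assert not hdfs.path.isfile(path)  # new in 0.7.0
    kind = hdfs.path.kind(path)        # e.g. "directory" for a directory
    hdfs.chmod(path, "g+w")            # string permission modes now accepted
    return kind
```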
This is a release candidate. We're working on binary packages for
the final release. As usual, we're happy to receive your feedback on
the forum:
http://sourceforge.net/projects/pydoop/forums/forum/990018
Pydoop is a Python MapReduce and HDFS API for Hadoop, built upon the
C++ Pipes and C libhdfs APIs, that allows you to write full-fledged
MapReduce applications with HDFS access. Pydoop has been maturing
nicely and is currently in production use at CRS4, where several
scientific projects are based on it, including Seal
(http://biodoop-seal.sourceforge.net), Biodoop-BLAST
(http://biodoop.sourceforge.net/blast), and more yet to be released.
Links:
* full release notes: http://pydoop.sourceforge.net/docs/news.html
* download page on sf: http://sourceforge.net/projects/pydoop/files
* download page on PyPI:
http://pypi.python.org/pypi/pydoop/0.7.0-rc1
* git repo:
http://pydoop.git.sourceforge.net/git/gitweb.cgi?p=pydoop/pydoop;a=summary
Happy pydooping!
The Pydoop Team