[ https://issues.apache.org/jira/browse/HADOOP-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Klaas Bosteels updated HADOOP-4304:
-----------------------------------
Status: Patch Available (was: In Progress)
> Add Dumbo to contrib
> --------------------
>
> Key: HADOOP-4304
> URL: https://issues.apache.org/jira/browse/HADOOP-4304
> Project: Hadoop Core
> Issue Type: New Feature
> Reporter: Klaas Bosteels
> Assignee: Klaas Bosteels
> Priority: Minor
> Attachments: hadoop-4304-v2.patch, hadoop-4304-v3.patch,
> hadoop-4304.patch
>
>
> Originally, Dumbo was a simple Python module developed at Last.fm to make
> writing and running Hadoop Streaming programs very easy, but it now also
> includes some (so far unreleased) helper code in Java, although it can
> still be used without the Java code. We propose to add Dumbo to
> "src/contrib" so that the Java classes get built and installed together with
> the rest of Hadoop, while the Python module can be installed separately at
> will. A tar.gz of the directory that would have to be added to "src/contrib"
> is available at
> http://static.last.fm/dumbo/dumbo-contrib.tar.gz
> and more info about Dumbo can be found here:
> * Basic documentation: http://github.com/klbostee/dumbo/wikis
> * Presentation at HUG (where it was first suggested to add Dumbo to contrib):
> http://skillsmatter.com/podcast/home/dumbo-hadoop-streaming-made-elegant-and-easy
> * Initial announcement:
> http://blog.last.fm/2008/05/29/python-hadoop-flying-circus-elephant
> For some of the more advanced features of Dumbo (in particular the ones for
> which the Java classes are needed) there is no public documentation yet, but
> we could easily fill that gap by moving some of the internal Last.fm
> documentation to the Hadoop wiki.
--
This message is automatically generated by JIRA.