[ https://issues.apache.org/jira/browse/DATAFU-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthew Hayes closed DATAFU-51.
-------------------------------
    Resolution: Won't Do

Closing this as it is quite old and there have been no updates.

> Add DataFu MR project, a lightweight framework for implementing Java/Scala MapReduce
> jobs
> --------------------------------------------------------------------------------
>
>                 Key: DATAFU-51
>                 URL: https://issues.apache.org/jira/browse/DATAFU-51
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Mathieu Bastian
>            Assignee: Mathieu Bastian
>            Priority: Major
>         Attachments: DATAFU-51-v10.patch, DATAFU-51-v11.patch,
> DATAFU-51-v2.patch, DATAFU-51-v3.patch, DATAFU-51-v4.patch,
> DATAFU-51-v5.patch, DATAFU-51-v6.patch, DATAFU-51-v7.patch,
> DATAFU-51-v8.patch, DATAFU-51-v9.patch, DATAFU-51.patch
>
>
> New lightweight framework to develop Java/Scala MapReduce jobs. Inspired by
> Matt's work on Hourglass and my experience in developing Java jobs on Hadoop.
> It's a thin layer on top of the Hadoop API which mostly reduces boilerplate
> code and automates configuration.
> The core feature is the `AbstractJob` abstract class. Developers get all the
> benefits of DataFu MR simply by using this abstract class for each MapReduce
> job.
> * If nested, Mapper and Reducer classes are automatically inferred
> * Mapper, reducer and intermediate key/value classes are inferred when
> possible
> * The number of reducers needed is estimated if not provided
> * Output is staged to avoid deleting the existing files if the job fails
> DataFu MR also plays well with Avro input/output and provides additional
> features through the `AbstractAvroJob` abstract class.
> * Built-in support for Avro input and output formats
> * Avro schemas are inferred when using POJO objects



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
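
For readers unfamiliar with two of the features listed in the quoted description (reducer estimation and staged output), here is a minimal plain-Java sketch of the general ideas. It is not taken from the DATAFU-51 patches: the class `JobSketch`, the interface `OutputWriter`, both method names, and the bytes-per-reducer sizing rule are all hypothetical, and the local filesystem (`java.nio.file`) stands in for HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: none of these names come from the DATAFU-51 patches.
final class JobSketch {

    /** A unit of work that writes its result to the given path. */
    interface OutputWriter {
        void writeTo(Path target) throws IOException;
    }

    // Assumed sizing rule: one reducer per bytesPerReducer of input, minimum one.
    static int estimateReducers(long inputBytes, long bytesPerReducer) {
        if (inputBytes < 0 || bytesPerReducer <= 0) {
            throw new IllegalArgumentException(
                    "inputBytes must be >= 0 and bytesPerReducer > 0");
        }
        long reducers = inputBytes / bytesPerReducer
                + (inputBytes % bytesPerReducer == 0 ? 0 : 1); // ceiling division
        return (int) Math.max(1L, Math.min(reducers, (long) Integer.MAX_VALUE));
    }

    // Write to a sibling staging path first; replace the final output only
    // after the work succeeded, so a failure leaves the previous output intact.
    static void writeStaged(Path finalOutput, OutputWriter job) throws IOException {
        Path staging = finalOutput.resolveSibling(finalOutput.getFileName() + ".staging");
        try {
            job.writeTo(staging);
            Files.move(staging, finalOutput, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Removes leftovers after a failure; a no-op after a successful move.
            Files.deleteIfExists(staging);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(estimateReducers(250L, 100L));
        Path out = Files.createTempDirectory("staged-demo").resolve("part-00000");
        writeStaged(out, p -> Files.write(p, "ok".getBytes(StandardCharsets.UTF_8)));
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

A real MapReduce job writes a directory of part files rather than a single file, so an actual implementation would stage and swap a whole output directory; the single-file version above only illustrates the ordering of the steps.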