[ 
https://issues.apache.org/jira/browse/DATAFU-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Hayes closed DATAFU-51.
-------------------------------
    Resolution: Won't Do

Closing this as it is quite old and there have been no updates.

> Add DataFu MR project, a lightweight framework for implementing Java/Scala 
> MapReduce jobs
> --------------------------------------------------------------------------------
>
>                 Key: DATAFU-51
>                 URL: https://issues.apache.org/jira/browse/DATAFU-51
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Mathieu Bastian
>            Assignee: Mathieu Bastian
>            Priority: Major
>         Attachments: DATAFU-51-v10.patch, DATAFU-51-v11.patch, 
> DATAFU-51-v2.patch, DATAFU-51-v3.patch, DATAFU-51-v4.patch, 
> DATAFU-51-v5.patch, DATAFU-51-v6.patch, DATAFU-51-v7.patch, 
> DATAFU-51-v8.patch, DATAFU-51-v9.patch, DATAFU-51.patch
>
>
> New lightweight framework to develop Java/Scala MapReduce jobs. Inspired by 
> Matt's work on Hourglass and my experience developing Java jobs on Hadoop. 
> It's a thin layer on top of the Hadoop API which mostly reduces boilerplate 
> code and automates configuration.
> The core feature is the `AbstractJob` abstract class. Developers get all the 
> benefits of DataFu MR simply by using this abstract class for each MapReduce 
> job.
> * If nested, Mapper and Reducer classes are automatically inferred
> * Mapper, reducer and intermediate key/value classes are inferred when 
> possible
> * Estimate the number of reducers needed if not provided
> * Staged output to avoid deleting the existing files if the job fails
> DataFu MR also plays well with Avro input/output and provides additional 
> features through the `AbstractAvroJob` abstract class.
> * Built-in support for Avro input and output formats
> * Avro schemas are inferred when using POJO objects



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
