[ https://issues.apache.org/jira/browse/DATAFU-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathieu Bastian reassigned DATAFU-51:
-------------------------------------

    Assignee: Mathieu Bastian

> Add DataFu MR project, a lightweight for implementing Java/Scala MapReduce
> jobs
> --------------------------------------------------------------------------------
>
>                 Key: DATAFU-51
>                 URL: https://issues.apache.org/jira/browse/DATAFU-51
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Mathieu Bastian
>            Assignee: Mathieu Bastian
>         Attachments: DATAFU-51.patch
>
>
> New lightweight framework to develop Java/Scala MapReduce jobs. Inspired by
> Matt's work on Hourglass and my experience in developing Java jobs on Hadoop.
> It's a thin layer on top of the Hadoop API that mostly reduces boilerplate
> code and automates configuration.
> Features (see details in the README):
> * Built-in support for Avro input and output formats
> * Though we recommend using Avro, one can use any input/output format class
> * Mapper, reducer and intermediate key/value classes are inferred when
> possible
> * Avro schemas are inferred when using POJO objects
> * Staged output to avoid deleting the existing file if the job fails
> * Estimates the number of reducers needed if not provided
> * Supports `#LATEST` suffix in input paths to work with timestamped folders



--
This message was sent by Atlassian JIRA
(v6.2#6252)