[ https://issues.apache.org/jira/browse/MAPREDUCE-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nicu marasoiu reassigned MAPREDUCE-5287:
----------------------------------------

    Assignee: nicu marasoiu  (was: Owen O'Malley)

> Create a generic InputFormat wrapping any other InputFormat, to control the number of map tasks
> -----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5287
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5287
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, performance
>            Reporter: nicu marasoiu
>            Assignee: nicu marasoiu
>            Priority: Minor
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> I wrote a generic InputFormat that wraps any other InputFormat and creates
> CompositeInputSplits to reduce the number of map tasks in a controllable
> manner while preserving data locality. A corresponding CompositeRecordReader
> iterates through the underlying RecordReaders, which the underlying
> InputFormat creates for each underlying raw split.
> One application is grouping TableSplits when the raw splits come from
> multiple regions and are filtered by key ranges. We use this to
> shard/distribute time-based incremental access to an HBase table.
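
The grouping idea in the description can be illustrated with a minimal sketch. It is not the code attached to this issue and it does not use the Hadoop API: `RawSplit`, `SplitGrouping`, `group` and `maxPerGroup` are hypothetical names, and the sketch assumes each raw split has a single preferred host. Splits are bucketed by host so that each group (one would-be composite split, hence one map task) stays local to one host, and the cap per group controls how many map tasks result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitGrouping {

    /** Hypothetical stand-in for a raw split: an id plus one preferred host. */
    static final class RawSplit {
        final String id;
        final String host;

        RawSplit(String id, String host) {
            this.id = id;
            this.host = host;
        }
    }

    /**
     * Groups raw splits so that every group contains splits from one host only
     * and holds at most maxPerGroup splits. Each returned group stands for one
     * composite split, i.e. one map task.
     */
    static List<List<RawSplit>> group(List<RawSplit> splits, int maxPerGroup) {
        if (maxPerGroup < 1) {
            throw new IllegalArgumentException("maxPerGroup must be >= 1");
        }
        // Bucket by host, keeping hosts in first-seen order.
        Map<String, List<RawSplit>> byHost = new LinkedHashMap<>();
        for (RawSplit s : splits) {
            byHost.computeIfAbsent(s.host, h -> new ArrayList<>()).add(s);
        }
        // Cut each host bucket into chunks of at most maxPerGroup splits.
        List<List<RawSplit>> groups = new ArrayList<>();
        for (List<RawSplit> hostSplits : byHost.values()) {
            for (int i = 0; i < hostSplits.size(); i += maxPerGroup) {
                int end = Math.min(i + maxPerGroup, hostSplits.size());
                groups.add(new ArrayList<>(hostSplits.subList(i, end)));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<RawSplit> raw = Arrays.asList(
                new RawSplit("s1", "hostA"),
                new RawSplit("s2", "hostB"),
                new RawSplit("s3", "hostA"),
                new RawSplit("s4", "hostA"),
                new RawSplit("s5", "hostB"));
        // Print each group with its host and member split ids.
        for (List<RawSplit> g : group(raw, 2)) {
            StringBuilder ids = new StringBuilder();
            for (RawSplit s : g) {
                ids.append(s.id).append(' ');
            }
            System.out.println(g.get(0).host + ": " + ids.toString().trim());
        }
    }
}
```

In a real wrapper, each group would presumably be wrapped in a composite split whose record reader opens the underlying readers one after another; that part depends on the Hadoop classes involved and is left out here.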