nicu marasoiu created MAPREDUCE-5287:
----------------------------------------
Summary: Create a generic InputFormat wrapping any other InputFormat, to control the number of map tasks
Key: MAPREDUCE-5287
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5287
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1, performance
Reporter: nicu marasoiu

I wrote a generic InputFormat that wraps any other InputFormat and creates CompositeInputSplits, reducing the number of map tasks in a controllable manner while preserving data locality. A corresponding CompositeRecordReader iterates through the underlying RecordReaders, which the underlying InputFormat creates for each raw split.

One application is grouping TableSplits when the raw splits come from multiple regions and are filtered by key ranges. We use this to shard and distribute time-based incremental access to an HBase table.
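The grouping idea described above can be sketched roughly as follows. This is a minimal illustration in plain Java with no Hadoop dependency, not the implementation proposed in this issue: `RawSplit`, `CompositeSplit`, `group`, and `chain` are hypothetical names, and the per-composite cap `maxPerComposite` is an assumed control knob. Hadoop's `InputSplit.getLocations()` returns several hosts per split; the sketch simplifies that to a single preferred host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

public class CompositeSplitSketch {

    /** Hypothetical stand-in for a raw split: an id and one preferred host. */
    static final class RawSplit {
        final String id;
        final String host;
        RawSplit(String id, String host) { this.id = id; this.host = host; }
    }

    /** Hypothetical composite split: raw splits that all share one host. */
    static final class CompositeSplit {
        final String host;
        final List<RawSplit> parts;
        CompositeSplit(String host, List<RawSplit> parts) {
            this.host = host;
            this.parts = parts;
        }
    }

    /**
     * Groups raw splits by host, then packs each host's splits into
     * composites of at most maxPerComposite parts. Every composite stays
     * on a single host, so locality is kept while the task count drops.
     */
    static List<CompositeSplit> group(List<RawSplit> raw, int maxPerComposite) {
        if (maxPerComposite < 1) {
            throw new IllegalArgumentException("maxPerComposite must be >= 1");
        }
        // LinkedHashMap keeps hosts in first-seen order for stable output.
        Map<String, List<RawSplit>> byHost = new LinkedHashMap<>();
        for (RawSplit s : raw) {
            byHost.computeIfAbsent(s.host, h -> new ArrayList<>()).add(s);
        }
        List<CompositeSplit> out = new ArrayList<>();
        for (Map.Entry<String, List<RawSplit>> e : byHost.entrySet()) {
            List<RawSplit> splits = e.getValue();
            for (int i = 0; i < splits.size(); i += maxPerComposite) {
                int end = Math.min(i + maxPerComposite, splits.size());
                out.add(new CompositeSplit(e.getKey(),
                        new ArrayList<>(splits.subList(i, end))));
            }
        }
        return out;
    }

    /**
     * Drains the underlying readers one after another, skipping empty ones,
     * in the way a composite record reader would walk its raw splits.
     */
    static <T> Iterator<T> chain(final List<Iterator<T>> readers) {
        return new Iterator<T>() {
            private int current = 0;

            @Override
            public boolean hasNext() {
                while (current < readers.size() && !readers.get(current).hasNext()) {
                    current++;
                }
                return current < readers.size();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return readers.get(current).next();
            }
        };
    }

    public static void main(String[] args) {
        List<RawSplit> raw = Arrays.asList(
                new RawSplit("r1", "hostA"), new RawSplit("r2", "hostB"),
                new RawSplit("r3", "hostA"), new RawSplit("r4", "hostA"),
                new RawSplit("r5", "hostB"));
        for (CompositeSplit c : group(raw, 2)) {
            StringBuilder ids = new StringBuilder();
            for (RawSplit p : c.parts) {
                ids.append(p.id).append(' ');
            }
            System.out.println(c.host + ": " + ids.toString().trim());
        }
    }
}
```

With the sample input in `main`, five raw splits collapse into three composites, each confined to one host. A real wrapper would also need to weigh split sizes and multiple candidate hosts, which this sketch leaves out.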