Dave Beech created CRUNCH-165:
---------------------------------
Summary: Pipelines should automatically use CombineFileInputFormat where input consists of many small files
Key: CRUNCH-165
URL: https://issues.apache.org/jira/browse/CRUNCH-165
Project: Crunch
Issue Type: Improvement
Components: Core
Affects Versions: 0.4.0
Reporter: Dave Beech
Assignee: Josh Wills
HIVE-74 introduced a feature in Hive whereby CombineFileInputFormat is used
when the input data consists of many small files. This makes the resulting
MapReduce jobs more efficient by giving individual mappers more data to
process. This would be a nice feature for Crunch to have, too.
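
To illustrate why combining helps, here is a minimal, self-contained sketch of
the underlying idea: packing many small files into fewer, larger splits so that
each mapper gets more data. It is not Hadoop's or Hive's actual code. The class
name SmallFileCombiner, the method combine, and the 128 MiB cap are hypothetical
and chosen only for this example, and the real CombineFileInputFormat also takes
node and rack locality into account, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Crunch, Hive, or Hadoop.
public class SmallFileCombiner {

    /**
     * Greedily packs file sizes (in bytes) into splits, in input order,
     * starting a new split whenever adding the next file would exceed
     * maxSplitBytes. A single file larger than the cap gets its own split.
     */
    public static List<List<Long>> combine(List<Long> fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current split if this file would push it over the cap.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed workload: 1000 input files of 1 MiB each.
        List<Long> sizes = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            sizes.add(1L << 20);
        }
        long maxSplitBytes = 128L << 20; // assumed 128 MiB cap per split
        List<List<Long>> splits = combine(sizes, maxSplitBytes);
        // Without combining, each file would typically become its own map task.
        System.out.println(sizes.size() + " files -> " + splits.size() + " splits");
    }
}
```

With one split per file, this workload would launch 1000 map tasks. With the
assumed cap, the same input is covered by 8 splits, so far less time goes to
per-task startup overhead.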