Hi,
Our intention is to solve this in a generic context, not just file input.
Thus the split class should be generic (very similar to CompositeInputSplit
from mapred).
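
A minimal sketch of what such a generic split could look like, written in plain Java so it compiles without Hadoop on the classpath. The `Split` interface is a hypothetical stand-in for Hadoop's `InputSplit`, and `SimpleSplit` and `CombinedSplit` are illustrative names, not classes from this thread or from Hadoop:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for Hadoop's InputSplit (assumption: only the
// length and the preferred hosts matter for this illustration).
interface Split {
    long getLength();
    String[] getLocations();
}

// A leaf split of any kind (file, table range, ...), used only for the demo.
final class SimpleSplit implements Split {
    private final long length;
    private final String[] locations;

    SimpleSplit(long length, String... locations) {
        this.length = length;
        this.locations = locations.clone();
    }

    public long getLength() { return length; }
    public String[] getLocations() { return locations.clone(); }
}

// Generic composite: wraps arbitrary child splits, not just file splits.
final class CombinedSplit implements Split {
    private final List<Split> children = new ArrayList<>();

    void add(Split child) { children.add(child); }

    List<Split> getChildren() { return children; }

    // Total length is the sum of the child lengths.
    public long getLength() {
        long total = 0;
        for (Split child : children) {
            total += child.getLength();
        }
        return total;
    }

    // Distinct hosts of all children, in first-seen order.
    public String[] getLocations() {
        Set<String> hosts = new LinkedHashSet<>();
        for (Split child : children) {
            for (String host : child.getLocations()) {
                hosts.add(host);
            }
        }
        return hosts.toArray(new String[0]);
    }
}
```

Because the composite only depends on the `Split` interface, it does not care which input format produced its children.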
We also already implement getRecordReader by iterating over record readers
created by the decorated input format (this method i
This sounds similar to MultiFileInputFormat
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MultiFileInputFormat.java?revision=1239482&view=markup
It would be nice if you could
Hi,
When running map-reduce with many splits, it would be nice from a performance
perspective to have fewer splits while maintaining data locality, so that the
overhead of running a map task (JVM creation, map executor ramp-up, e.g. Spring
context, etc.) is less impactful when frequently running m