This should be the default behavior for configurations in which locality (processing the data on the same node where the input resides) cannot be achieved anyway.

For example, when a MapReduce job runs on a (small) subset of the DFS nodes.



Runping Qi (JIRA) wrote:
It is nice if hadoop provides NonSplitable TextInputFormat and 
SequenceFileInputFormat
--------------------------------------------------------------------------------------

                 Key: HADOOP-1617
                 URL: https://issues.apache.org/jira/browse/HADOOP-1617
             Project: Hadoop
          Issue Type: Improvement
            Reporter: Runping Qi



As more applications find the -reduce NONE option useful, they also want to 
control the splitability of their inputs.
A simple way to do this is to implement a class that extends the 
SequenceFileInputFormat or TextInputFormat class.
It would be nice if the framework provided such classes.
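
To illustrate the subclassing approach described above, here is a minimal, self-contained sketch of the override pattern. It does not use Hadoop's classes: ModelInputFormat, NonSplittableModelInputFormat, and NonSplittableDemo are hypothetical stand-ins invented for this example, and the split calculation is a simplification. In Hadoop itself the corresponding hook is, as far as I know, the protected isSplitable(FileSystem, Path) method on the file-based input formats, which a subclass of TextInputFormat or SequenceFileInputFormat would override to return false.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a file-based input format (not a Hadoop class).
// It cuts a file of a given length into byte ranges of at most splitSize bytes.
class ModelInputFormat {
    // Hook that subclasses override to forbid splitting a file.
    protected boolean isSplitable(String file) {
        return true;
    }

    // Returns {offset, length} pairs describing the splits of one file.
    // Assumes splitSize > 0.
    List<long[]> getSplits(String file, long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        if (!isSplitable(file) || fileLength <= splitSize) {
            // Whole file goes to a single map task.
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        for (long off = 0; off < fileLength; off += splitSize) {
            splits.add(new long[] {off, Math.min(splitSize, fileLength - off)});
        }
        return splits;
    }
}

// The non-splittable variant only needs to override the hook.
class NonSplittableModelInputFormat extends ModelInputFormat {
    @Override
    protected boolean isSplitable(String file) {
        return false;
    }
}

class NonSplittableDemo {
    public static void main(String[] args) {
        long fileLength = 250;
        long splitSize = 100;
        // Splittable: ranges [0,100), [100,200), [200,250) -> prints 3
        System.out.println(
            new ModelInputFormat().getSplits("part-00000", fileLength, splitSize).size());
        // Non-splittable: one range covering the whole file -> prints 1
        System.out.println(
            new NonSplittableModelInputFormat().getSplits("part-00000", fileLength, splitSize).size());
    }
}
```

With a map-only job (-reduce NONE), one split per input file means each output file corresponds to exactly one input file, which is presumably why applications want this control.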



