Map tasks are generated from InputSplits. An InputSplit is a logical
description of the input data that a single Map task should process. The
array of InputSplit objects is created on the client by the InputFormat.
org.apache.hadoop.mapreduce.InputSplit has an abstract method:

/**
 * Get the list of nodes by name where the data for the split would be
 * local. The locations do not need to be serialized.
 */
public abstract String[] getLocations()
    throws IOException, InterruptedException;
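To make the client-side split computation concrete, here is a minimal,
self-contained sketch of what an InputFormat does when it divides a file
into block-sized splits, each tagged with its preferred hosts. This is a
toy, not the real FileInputFormat: the Split class, computeSplits, and the
host names are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Minimal stand-in for an InputSplit: a byte range plus the hosts
    // where that range of the file is stored (what getLocations returns).
    static class Split {
        final long offset, length;
        final String[] hosts;
        Split(long offset, long length, String[] hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = hosts;
        }
    }

    // Divide a file of fileLen bytes into chunks of at most splitSize
    // bytes; the last split covers whatever remains.
    static List<Split> computeSplits(long fileLen, long splitSize,
                                     String[] hosts) {
        List<Split> splits = new ArrayList<>();
        for (long off = 0; off < fileLen; off += splitSize) {
            long len = Math.min(splitSize, fileLen - off);
            splits.add(new Split(off, len, hosts));
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 200-byte file with a 64-byte split size yields 4 splits,
        // the last one only 8 bytes long.
        List<Split> splits =
            computeSplits(200L, 64L, new String[] {"node1", "node2"});
        for (Split s : splits) {
            System.out.println(s.offset + "+" + s.length);
        }
    }
}
```

The scheduler then tries to run each Map task on one of the hosts its
split reports, which is why getLocations matters for locality.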
Hi,
I'm running Hadoop 0.19.1 on 19 nodes. I've been benchmarking a Hadoop
workload with 115 Map tasks on two different distributed filesystems
(KFS and PVFS); in some tests, I also have a write-intensive non-Hadoop
job running in the background (an HPC checkpointing benchmark). I've
found t