Re: Locality when placing Map tasks

2009-10-06 Thread Aaron Kimball
Map tasks are generated based on InputSplits. An InputSplit is a logical description of the work that a task should perform. The array of InputSplit objects is created on the client by the InputFormat. org.apache.hadoop.mapreduce.InputSplit has an abstract method: /** * Get the list of nodes by n

Locality when placing Map tasks

2009-10-03 Thread Esteban Molina-Estolano
Hi, I'm running Hadoop 0.19.1 on 19 nodes. I've been benchmarking a Hadoop workload with 115 Map tasks, on two different distributed filesystems (KFS and PVFS); in some tests, I also have a write-intensive non-Hadoop job running in the background (an HPC checkpointing benchmark). I've found t