Well,
I think you are asking:
if you have 3 machines and want to start 3 maps, one for each
input file, will each map run on a different machine?

The answer is: not necessarily.
When responding to a heartbeat from a task tracker, the task scheduler
tries to assign one local map task for each job in the queue. If no
local map task is available, the scheduler assigns a non-local map
task. So, no matter whether you have 1 or 3 replicas of your file in
HDFS, there is a chance that one machine takes 2 (or 3) maps.

Usually, your 3 machines heartbeat to the job tracker at almost the
same time and get a local map each. This is the most likely outcome.
But if one of your machines is stuck for a while for some reason, then
depending on how long a map takes, another machine may take its map.
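
To make the two cases concrete, here is a rough toy model of that
assignment rule. It is only a sketch and not Hadoop's scheduler code:
the names assign_map, simulate, m1..m3 and map_f1..map_f3 are made up
for illustration, and it assumes replication 1, a single job, and one
map handed out per heartbeat.

```python
# Toy model of the rule described above; NOT Hadoop's actual scheduler.
# All names here (assign_map, simulate, m1..m3, map_f1..map_f3) are hypothetical.

def assign_map(tracker, pending):
    """Pick a map task for the tracker that just sent a heartbeat.

    pending maps a task name to the set of hosts that hold its input.
    A data-local task is preferred; if none exists, any pending task
    is handed out as a non-local one.
    """
    if not pending:
        return None
    for task in list(pending):
        if tracker in pending[task]:
            del pending[task]
            return task, "local"
    task = next(iter(pending))
    del pending[task]
    return task, "non-local"


def simulate(heartbeats, locations):
    """Replay heartbeats in order; return {task: (tracker, locality)}."""
    pending = dict(locations)  # copy, so the caller's dict is untouched
    placement = {}
    for tracker in heartbeats:
        result = assign_map(tracker, pending)
        if result is not None:
            task, kind = result
            placement[task] = (tracker, kind)
    return placement


# Assumption: replication 1, so each input file lives on exactly one machine.
locations = {"map_f1": {"m1"}, "map_f2": {"m2"}, "map_f3": {"m3"}}

# Common case: all three machines heartbeat at about the same time.
even = simulate(["m1", "m2", "m3"], locations)

# m3 is stuck, and m1 heartbeats again before m3 does.
skewed = simulate(["m1", "m2", "m1"], locations)

print(even)
print(skewed)
```

In the first run every map lands on the machine that holds its file. In
the second run m1 ends up with two maps, the second one non-local,
because m3 never asked for work while map_f3 was still pending.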

On Fri, Mar 4, 2011 at 9:36 AM, maha <m...@umail.ucsb.edu> wrote:
> Hi,
>
>  Using 3 Machines, each has an input-File  ' f ' in its local disk in 
> addition to HDFS , assuming my program spawns a mapper/file .
>
> Does that mean that  mappers will be running on different machines?
>
> Thank you,
> Maha
