Just a follow-up to see if anyone can shed some light on this:
My understanding is that after each block is replicated 3 times, a map task 
is run on each of the replicas in parallel.
What I am trying to verify is this: in a scenario where a file is split into 
10K, 100K, or more blocks, this would mean 30K, 300K, or more map tasks being 
run, which looks like overkill from a performance or even just a logical 
perspective.
I would appreciate any thoughts on this.
Thanks
Sai

________________________________
 From: Sai Sai <saigr...@yahoo.in>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>; Sai Sai 
<saigr...@yahoo.in> 
Sent: Friday, 12 April 2013 1:37 PM
Subject: Re: Does a Map task run 3 times on 3 TTs or just once
 


Just wondering if it is right to assume that a Map task is run 3 times on 3 
different TTs in parallel, and that the output of whichever one completes first 
is picked up and written to the intermediate location.
Or is it true that a map task, even though its data is replicated 3 times, will 
run only once, with the other 2 replicas on standby, so that if it fails a 
second mapper will run, followed by a third if the second also fails?
Please shed some light.
Thanks
Sai
