First, I should say that although I am not new to Hadoop, I am not an expert on it either. I would like to ask about an issue with the MapReduce framework. The scenario is described below.

When a reduce task fails on one node, the JobTracker schedules the task on another node and the job continues running. My question is: how does the new node get the data that was assigned to the failed reduce task? When the map phase is done, the map output is copied to the reducer responsible for each partition. If the reduce task is then recreated on a new node, what does that node do to get all of its map output back?
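
To make the scenario concrete, here is a minimal sketch of how I understand map output gets assigned to a reducer partition. The class name PartitionSketch and the method partitionFor are hypothetical names of mine, not Hadoop classes; the formula is my understanding of what the default hash partitioning does. As far as I can tell, the partition number depends only on the key and the number of reduce tasks, not on which node runs the reduce task.

```java
public class PartitionSketch {
    // Hypothetical helper: maps a key to a reduce partition number.
    // Assumption: this mirrors default hash partitioning, where the sign bit
    // is masked off so the result is never negative.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry"};
        int numReduceTasks = 4; // illustrative value only
        for (String key : keys) {
            // The same key always lands in the same partition,
            // regardless of the node the reduce task is scheduled on.
            System.out.println(key + " -> partition " + partitionFor(key, numReduceTasks));
        }
    }
}
```

So a rescheduled reduce task would still be responsible for the same partition number; what I do not understand is how it obtains that partition's data again.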
Thanks in advance. James.