Re: MapRed ports

Arun C Murthy Wed, 10 Feb 2010 01:17:54 -0800


On Feb 9, 2010, at 9:47 PM, psdc1978 wrote:

Hi,
I have some questions about the MapRed ports and how a reduce knows where to fetch the map output from.
I know that MapRed uses Jetty as a web server.
- Does the JobTracker send tasks to the TaskTracker for execution through port 50060?

The TT sends a heartbeat RPC to the JT periodically; the response to it contains the new tasks to be launched.
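
To make the direction of that exchange concrete, here is a minimal toy model of pull-based task assignment. It is only a sketch: ToyJobTracker, heartbeat(), and the task names are hypothetical and are not Hadoop's classes or RPC signatures. The point it illustrates is that the tracker initiates every call and the tasks come back in the reply.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model only: these are NOT Hadoop's classes or method signatures.
class ToyJobTracker {
    private final Queue<String> pendingTasks = new ArrayDeque<>();

    void submit(String taskId) {
        pendingTasks.add(taskId);
    }

    // The tracker reports its free slots; the reply carries the tasks to launch.
    List<String> heartbeat(String trackerName, int freeSlots) {
        List<String> toLaunch = new ArrayList<>();
        while (toLaunch.size() < freeSlots && !pendingTasks.isEmpty()) {
            toLaunch.add(pendingTasks.poll());
        }
        return toLaunch;
    }
}

class HeartbeatSketch {
    public static void main(String[] args) {
        ToyJobTracker jt = new ToyJobTracker();
        jt.submit("map_0");
        jt.submit("map_1");
        jt.submit("reduce_0");
        // The tracker initiates every exchange; the JT never calls the tracker.
        for (int beat = 0; beat < 3; beat++) {
            List<String> tasks = jt.heartbeat("tracker_a", 2);
            System.out.println("heartbeat " + beat + " -> launch " + tasks);
        }
    }
}
```

With two free slots per heartbeat, the three queued tasks are handed out over the first two replies and the third reply is empty.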

- Which port does the TaskTracker use to send status about the task that it's executing to the JobTracker? Is it port 50030?

The TT uses the JT's RPC port (which is *not* 50030 by default), configured by mapred.job.tracker.
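
As a rough illustration, mapred.job.tracker holds a host:port pair, and that port is the RPC port the TT talks to (50030 is the default port of the JT's web UI). The sketch below only splits such a value in plain Java; the JobTrackerAddress class and the value jt.example.com:9001 are made up for the example, and this is not how Hadoop itself reads its configuration.

```java
import java.net.InetSocketAddress;

class JobTrackerAddress {
    // Splits a "host:port" string such as a mapred.job.tracker value.
    static InetSocketAddress parse(String value) {
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + value);
        }
        String host = value.substring(0, colon);
        int port = Integer.parseInt(value.substring(colon + 1));
        // createUnresolved avoids a DNS lookup for the made-up host name.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // Hypothetical value; use whatever your mapred-site.xml actually sets.
        InetSocketAddress rpc = parse("jt.example.com:9001");
        System.out.println("JT RPC host=" + rpc.getHostString() + " port=" + rpc.getPort());
    }
}
```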

- The reduce task must copy the map outputs in the shuffle phase. Which class contains the code where the reduce fetches the map output? Is this part of the code executed by the TaskTracker process?

The reduce task itself (in a separate JVM from the TT) fetches map outputs; look at o.a.h.mapred.ReduceTask.ReduceCopier.fetchOutputs().

- Is the location of the map output that the reduce task should use sent by the JobTracker? If so, this means that the JobTracker was informed by the TaskTracker where a map ran, right?

The JT knows where each successful map task was scheduled; the reduce task gets this information via TaskCompletionEvents (see ReduceTask.ReduceCopier.GetMapEventsThread).
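
The idea of learning map locations from completion events can be sketched with a toy model. Everything here is an assumption for illustration: ToyMapEvent, ToyEventLog, ToyReducer, the batch size of 100, and the host names are invented and are not Hadoop's TaskCompletionEvent API. It shows a reducer that remembers how many events it has seen and asks only for newer ones on each poll.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins; the field names and the polling method are hypothetical.
class ToyMapEvent {
    final String mapAttemptId;
    final String trackerHttp; // where the finished map's output can be fetched

    ToyMapEvent(String mapAttemptId, String trackerHttp) {
        this.mapAttemptId = mapAttemptId;
        this.trackerHttp = trackerHttp;
    }
}

class ToyEventLog {
    private final List<ToyMapEvent> events = new ArrayList<>();

    void mapSucceeded(String mapAttemptId, String trackerHttp) {
        events.add(new ToyMapEvent(mapAttemptId, trackerHttp));
    }

    // Returns at most max events, starting at index fromIndex.
    List<ToyMapEvent> eventsSince(int fromIndex, int max) {
        int end = Math.min(events.size(), fromIndex + max);
        if (fromIndex >= end) {
            return new ArrayList<>();
        }
        return new ArrayList<>(events.subList(fromIndex, end));
    }
}

class ToyReducer {
    private int nextEvent = 0;
    final List<String> knownLocations = new ArrayList<>();

    // One polling round: ask only for events not seen yet.
    int poll(ToyEventLog log) {
        List<ToyMapEvent> fresh = log.eventsSince(nextEvent, 100);
        for (ToyMapEvent e : fresh) {
            knownLocations.add(e.trackerHttp + " " + e.mapAttemptId);
        }
        nextEvent += fresh.size();
        return fresh.size();
    }
}

class MapEventsSketch {
    public static void main(String[] args) {
        ToyEventLog log = new ToyEventLog();
        ToyReducer reducer = new ToyReducer();
        log.mapSucceeded("map_0", "http://host-a:50060");
        System.out.println("new events: " + reducer.poll(log));
        log.mapSucceeded("map_1", "http://host-b:50060");
        log.mapSucceeded("map_2", "http://host-a:50060");
        System.out.println("new events: " + reducer.poll(log));
        System.out.println(reducer.knownLocations);
    }
}
```

Once a location is known, the reducer can fetch that map's output from the listed tracker; the fetch itself is left out of this sketch.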

- Is the class org.apache.hadoop.mapred.ReduceTask used? If so, which process uses this class? Is it the TaskTracker process?


That code runs in the reduce task's child JVM, not in the TaskTracker process.

Arun
