Hello,
I recall asking this question before, but this is in addition to what I've asked.
        Firstly, to recap my question and Arun's specific response:

--      On May 20, 2008, at 9:03 AM, Saptarshi Guha wrote:
-- Hello,
-- Does the "Data-local map tasks" counter mean the number of tasks that had their input data already present on the machine they are running on?
--      i.e. there wasn't a need to ship the data to them.

        Response from Arun
-- Yes. Your understanding is correct. More specifically, it means that the map-task got scheduled on a machine on which one of the replicas of its input-split-block was present and was served by the datanode running on that machine. *smile* Arun


Now, is Hadoop designed to schedule a map task on a machine which has one of the replicas of its input split block? Failing that, does it then assign the map task to a machine close to one that contains a replica of its input split block?
        Are there any performance metrics for this?

        Many thanks
        Saptarshi


Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
