Bejoy,

On 16-Jan-2012, at 6:46 PM, Bejoy Ks wrote:

> A quick question. I have quite a few map reduce jobs running on my
> cluster. One job's input itself has a large number of files, I'd like to know
> which split was processed by each map task without doing any custom logging
> (for successful, failed & killed tasks). I tried digging into the job
> tracker web UI but I just got a pointer as input split location which
> specifies the nodes in which it is located, but what I'm looking for is the
> file name and which split of that file.

Initially the status (via reporter) of a task is set to the FileSplit's path
plus offset and length, but that's all.

> Where can I find this information?

Unfortunately, none of this is logged by default. Please file a JIRA to have
it added, or to discuss how to add it (and do follow up on this thread with
the JIRA ID).

> Is it available or can I make it available in jobdetails.jsp?

No, but you can write a short utility program that emulates the splitter and
prints the split-to-file mapping.

> Do I need to enable some configuration parameter to display the same?

No, as far as I know there is none.

> Is it possible only by custom logging and doesn't the hadoop framework
> provide the same?

The framework does not provide this, so custom logging is the easiest way, if
that is possible for you.

-- 
Harsh J
Customer Ops. Engineer, Cloudera
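
A rough sketch of the "utility that emulates the splitter" idea mentioned
above, in plain Java with no Hadoop jars on the classpath. It assumes
splittable (uncompressed) input files and the new-API FileInputFormat
arithmetic, i.e. splitSize = max(minSize, min(maxSize, blockSize)) with a 1.1
slop factor on the last split. The old mapred API derives its size from a goal
size instead, so treat this as an approximation. The names SplitEmulator and
Range, and the file name and sizes in main, are made up for illustration. A
real utility would more likely call the job's own InputFormat getSplits() so
the output matches the job exactly.

```java
import java.util.ArrayList;
import java.util.List;

class SplitEmulator {
    // Slack factor used by FileInputFormat: the final split may be up to
    // 10% larger than splitSize before it is cut in two.
    static final double SPLIT_SLOP = 1.1;

    /** Byte range of one split: what a FileSplit carries besides the path. */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "offset=" + start + " length=" + length;
        }
    }

    /** Split size per the new-API formula: max(minSize, min(maxSize, blockSize)). */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Carves a splittable file of the given length into ranges of splitSize bytes. */
    static List<Range> splitsFor(long fileLength, long splitSize) {
        List<Range> ranges = new ArrayList<>();
        long remaining = fileLength;
        // Emit full-size splits while more than 1.1 x splitSize is left.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            ranges.add(new Range(fileLength - remaining, splitSize));
            remaining -= splitSize;
        }
        // Whatever is left (possibly slightly larger than splitSize) is the last split.
        if (remaining != 0) {
            ranges.add(new Range(fileLength - remaining, remaining));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Hypothetical input: file name, length and block size are made up.
        String file = "/user/example/input/part-00000";
        long fileLength = 300L * 1024 * 1024;
        long blockSize = 128L * 1024 * 1024;
        long splitSize = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);

        int i = 0;
        for (Range r : splitsFor(fileLength, splitSize)) {
            System.out.println("split " + (i++) + ": " + file + " " + r);
        }
    }
}
```

With these numbers the 300 MB file comes out as three ranges (two of 128 MB
and a 44 MB tail), whereas a 140 MB file would stay a single split because of
the slop factor. Note that this only gives you file/offset/length per split.
Which map task ID ends up with which split is decided at job submission, so
verify that part against your own job rather than relying on the listing
order.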