Bejoy,

On 16-Jan-2012, at 6:46 PM, Bejoy Ks wrote:
>       A quick question. I have quite a few map reduce jobs running on my 
> cluster. One job's input itself has a large number of files, I'd like to know 
> which split was processed by each map task without doing any custom logging 
> (for successful, failed & killed tasks). I tried digging into the job 
> tracker web UI but I just got a pointer as input split location which 
> specifies the nodes in which it is located, but what I'm looking for is the 
> file name  and which split of that file.

The initial status of a task (set via the reporter) is the FileSplit's path 
plus its offset and length, but that is the only place this shows up.

> Where can I find this information ? 

Unfortunately, none of this is logged by default. Please file a JIRA to have it 
added, or to discuss how to add it, and follow up on this thread with the ID.

> Is it available or can I make it available in jobdetails.jsp? 

No, but you can write a short utility program that emulates the splitter and 
prints the mapping with that.
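
A minimal sketch of what such a utility could look like, in plain Java with no 
Hadoop dependency. It assumes the input is a splittable file and that splits 
are cut the way I understand FileInputFormat to do it: split size is 
max(minSize, min(maxSize, blockSize)), with the last split allowed to run up 
to 10% over. The class name, method name, path and sizes are all made up for 
illustration, and the sketch does not reproduce the order in which the 
framework assigns splits to map task IDs, so treat the printed index as a 
split number only.

```java
import java.util.ArrayList;
import java.util.List;

class SplitEmulator {
    // Assumed slack factor: the final split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    /** Returns {offset, length} pairs for a single splittable file. */
    static List<long[]> computeSplits(long fileLength, long blockSize,
                                      long minSize, long maxSize) {
        long splitSize = Math.max(minSize, Math.min(maxSize, blockSize));
        List<long[]> splits = new ArrayList<>();
        long remaining = fileLength;
        // Cut full-size splits while more than SPLIT_SLOP * splitSize is left.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        // Whatever is left becomes the last split; an empty file yields none here.
        if (remaining != 0) {
            splits.add(new long[] {fileLength - remaining, remaining});
        }
        return splits;
    }

    public static void main(String[] args) {
        String path = "/data/input/part-00000";  // hypothetical input file
        long fileLength = 300L * 1024 * 1024;    // hypothetical 300 MB file
        long blockSize = 128L * 1024 * 1024;     // hypothetical 128 MB block size
        int i = 0;
        for (long[] s : computeSplits(fileLength, blockSize, 1L, Long.MAX_VALUE)) {
            System.out.printf("split %d -> %s:%d+%d%n", i++, path, s[0], s[1]);
        }
    }
}
```

A real version would read the file lengths and block sizes from the 
filesystem and the min/max split sizes from the job configuration instead of 
hard-coding them.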

> Do I need to enable some configuration parameter to display the same?

No, as far as I know there is none.

> Is it possible only by custom logging? Doesn't the Hadoop framework provide 
> the same?

The framework does not provide this, so custom logging is the easiest way, if 
that is an option for you.

--
Harsh J
Customer Ops. Engineer, Cloudera
