On Thu, 07 Aug 2008 19:42:05 +0200, "Leon Mergen" wrote: 
>Hello John,
>
>On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann <[EMAIL PROTECTED]> wrote:
>
>>
>> I have a large Hadoop streaming job that generally works fine,
>> but a few (2-4) of the ~3000 maps and reduces have problems.
>> To make matters worse, the problems are system-dependent (we run on a
>> cluster with machines of slightly different OS versions).
>> I'd of course like to debug these problems, but they are embedded in a
>> large job.
>>
>> Is there a way to extract the input given to a reducer from a job, given
>> the task identity?  (This would also be helpful for mappers.)
>
>
>I believe you should set "keep.failed.task.files" to true -- this way, given
>a task id, you can see what input files it has in
>~/taskTracker/${taskid}/work (source:
>http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner
>)
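
If I read that right, for a streaming job the property would be passed on the
command line.  Something like this untested sketch (the jar path, input/output
paths, and mapper/reducer names below are just placeholders, not our real job):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.17.0-streaming.jar \
        -jobconf keep.failed.task.files=true \
        -input in -output out \
        -mapper mymap.pl -reducer myreduce.pl
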
>
>On top of that, you can always use the debugging facilities:
>
>http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#Debugging
>
>"When map/reduce task fails, user can run script for doing post-processing
>on task logs i.e task's stdout, stderr, syslog and jobconf. The stdout and
>stderr of the user-provided debug script are printed on the diagnostics. "
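
In case it helps others on the list: my (unverified) understanding is that a
streaming job can attach such a script with the -mapdebug/-reducedebug options,
and that the script is called with the task's stdout, stderr, syslog and
jobconf files as arguments.  A rough, untested sketch with a placeholder
script name and placeholder job arguments:

    #!/bin/sh
    # tail-logs.sh -- assumed argument order: stdout stderr syslog jobconf
    echo "=== last lines of stderr ==="; tail -50 "$2"
    echo "=== last lines of syslog ==="; tail -50 "$3"

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.17.0-streaming.jar \
        -file tail-logs.sh -mapdebug tail-logs.sh -reducedebug tail-logs.sh \
        -input in -output out -mapper mymap.pl -reducer myreduce.pl
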
>
>I hope this helps.

Thanks.

It looks like IsolationRunner is what I'm asking for.  I'll try it out.
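
For the archives, the recipe I plan to try, paraphrased from the tutorial (the
local mapred dir below is a placeholder; ${taskid} is the failed task's id):

    # re-run the job with keep.failed.task.files=true, then on the node
    # where the task failed:
    cd <local-mapred-dir>/taskTracker/${taskid}/work
    $HADOOP_HOME/bin/hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml

which, if I understand it, re-runs just that task in a single JVM over the same
input, so a debugger can be attached.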

I was aware of the logs, but unfortunately my problem tasks either hang on
their inputs or don't log anything meaningful.


Separately, I found the output from the map stage.
(In our config it is in:
.../hadoop-hadoop/mapred/local/taskTracker/jobcache/job_200808051739_0005/attempt_200808051739_0005_r_000009_0/output/

which is a bit different from taskTracker/${taskid}/work.  There's a
work dir parallel to output/, but it's empty.
)

Hopefully IsolationRunner will deal with this layout.

   -John
