Hello John,

On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann <[EMAIL PROTECTED]> wrote:

>
> I have a large Hadoop streaming job that generally works fine,
> but a few (2-4) of the ~3000 maps and reduces have problems.
> To make matters worse, the problems are system-dependent (we run a
> cluster with machines of slightly different OS versions).
> I'd of course like to debug these problems, but they are embedded in a
> large job.
>
> Is there a way to extract the input given to a reducer from a job, given
> the task identity?  (This would also be helpful for mappers.)


I believe you should set "keep.failed.task.files" to true -- this way, given
a task id, you can see what input files the task had under
${mapred.local.dir}/taskTracker/${taskid}/work on the node that ran it
(source:
http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner
)
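
For a streaming job, something along these lines should do it (an untested
sketch -- the -jobconf flag and property name are what I recall from 0.17, and
the input/output paths and mapper/reducer names are just placeholders):

  $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
      -input /user/you/input \
      -output /user/you/output \
      -mapper my_mapper.pl \
      -reducer my_reducer.pl \
      -jobconf keep.failed.task.files=true

  # Then, on the node where the task failed, the kept files are under
  # ${mapred.local.dir}/taskTracker/${taskid}/work, and (per the tutorial)
  # you can re-run the task in isolation from that directory:
  cd <mapred.local.dir>/taskTracker/${taskid}/work
  $HADOOP_HOME/bin/hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml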

On top of that, you can always use the debugging facilities:

http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#Debugging

"When map/reduce task fails, user can run script for doing post-processing
on task logs i.e task's stdout, stderr, syslog and jobconf. The stdout and
stderr of the user-provided debug script are printed on the diagnostics."
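
If you go the debug-script route with streaming, I believe the -mapdebug and
-reducedebug options (plus -file to ship the script with the job) are how you
wire it up -- roughly like this, where debug_dump.sh is a hypothetical script
of your own:

  $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
      -input /user/you/input \
      -output /user/you/output \
      -mapper my_mapper.pl \
      -reducer my_reducer.pl \
      -file debug_dump.sh \
      -mapdebug debug_dump.sh \
      -reducedebug debug_dump.sh

  #!/bin/sh
  # debug_dump.sh -- run by Hadoop when a task fails; per the tutorial the
  # arguments are the task's stdout, stderr, syslog and jobconf files.
  echo "=== last lines of task stderr ==="
  tail -50 "$2"
  echo "=== task jobconf ==="
  cat "$4"

Whatever the script prints goes to the task diagnostics, so you can see it
from the job tracker web UI without logging into the node.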

I hope this helps.

Regards,

Leon Mergen
