Hello John,

On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann <[EMAIL PROTECTED]> wrote:
> I have a large Hadoop streaming job that generally works fine,
> but a few (2-4) of the ~3000 maps and reduces have problems.
> To make matters worse, the problems are system-dependent (we run on a
> cluster with machines of slightly different OS versions).
> I'd of course like to debug these problems, but they are embedded in a
> large job.
>
> Is there a way to extract the input given to a reducer from a job, given
> the task identity? (This would also be helpful for mappers.)

I believe you should set "keep.failed.task.files" to true -- this way,
given a task id, you can see what input files it has in
${mapred.local.dir}/taskTracker/${taskid}/work on the node where the task
ran (a quick command-line sketch is in the P.S. below).

(source: http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner )

On top of that, you can always use the debugging facilities (again, see
the P.S. for a sketch):

http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#Debugging

"When map/reduce task fails, user can run script for doing post-processing
on task logs i.e task's stdout, stderr, syslog and jobconf. The stdout and
stderr of the user-provided debug script are printed on the diagnostics."

I hope this helps.

Regards,

Leon Mergen
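P.S. For a streaming job you can set that property straight from the
command line with -jobconf. A rough sketch -- the jar version and the
input/output paths and mapper/reducer scripts below are placeholders for
illustration, not from your actual job:

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.17.0-streaming.jar \
        -jobconf keep.failed.task.files=true \
        -input /user/you/input -output /user/you/output \
        -mapper ./map.py -reducer ./reduce.py \
        -file map.py -file reduce.py

With that property set, a failed task's files are kept on the tasktracker
node instead of being cleaned up, so you can inspect (or re-run against)
the exact input it saw.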
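For the debug script route, the script is handed the task's stdout,
stderr, syslog and jobconf files as its arguments, in that order. Here is
a minimal, untested sketch of such a script, plus how I believe it is
wired up for a streaming job (-mapdebug/-reducedebug, shipping the script
with -file):

    #!/bin/sh
    # debug.sh -- invoked on the node where the task failed.
    # $1 = task stdout, $2 = stderr, $3 = syslog, $4 = jobconf
    echo "=== last 50 lines of stderr ==="
    tail -50 "$2"
    echo "=== last 50 lines of syslog ==="
    tail -50 "$3"

and then, on the job submission:

    -mapdebug ./debug.sh -reducedebug ./debug.sh -file debug.sh

Whatever the script prints ends up in the task's diagnostics on the web
UI, as the tutorial describes.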