If you want to step through a full MapReduce job, the easiest way is to run it with the local job runner in your IDE. The local job runner runs the MR job in a single thread, which makes it very easy to debug. Use the local file system and a small amount of data for this type of testing / debugging. Note that the local job runner runs map tasks, sort and shuffle, and reducers sequentially with no parallelism.

Set the following properties to enable the local job runner and local file system:

  mapred.job.tracker = local
  fs.default.name = file:///

Attempting to attach a debugger to a real task tracker is problematic because user code is run in separate jvms, etc. It's almost never worth it. Most debugging (with a real debugger) is better done using MRUnit and the local job runner.

Hope this helps and good luck.

On Tue, Apr 27, 2010 at 7:27 AM, psdc1978 <[email protected]> wrote:
> Hi,
>
> The reduce tasks are threads that are launched by the Reducer. The print
> below shows the stacktrace of one reduce task.
>
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchHashesOutputs(ReduceTask.java:2582)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
> at org.apache.hadoop.mapred.Child.main(Child.java:194)
>
> I would like to debug this thread in a IDE but I don't know how to do it.
> Should I define properties to do this? Is there a way to do it?
>
> Thanks
>
> --
> PSC
>

--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
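
For reference, the two properties named in the reply (mapred.job.tracker and fs.default.name) could be collected in a small Hadoop configuration file along these lines. This is only a sketch: the file name local-debug.xml is a hypothetical choice, not something from the original message, and the layout assumes the standard Hadoop `<configuration>` / `<property>` XML format.

```xml
<?xml version="1.0"?>
<!-- local-debug.xml: hypothetical file name, used here only for illustration -->
<configuration>
  <!-- Run the job in-process with the local job runner instead of a real JobTracker -->
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
  <!-- Read input and write output on the local file system instead of HDFS -->
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```

If the job driver goes through ToolRunner / GenericOptionsParser, a file like this can typically be supplied with the generic -conf option. Otherwise the same two values can be set directly on the job's configuration object before the job is submitted from the IDE.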
