Hi,

I tried to run one of my map/reduce jobs on a cluster (Hadoop 0.17.0) with 10 reducers. Nine of them returned quickly (within a few seconds), but one has been running for several hours and still shows no sign of completion. Do you know how I can debug it or find out what is going on with this reducer?
Also, for completed reducers, I can see how many bytes each one read. Is there a way to get this information for a running reducer?

FYI, the following log messages are from the tasktracker:

2008-07-28 21:42:31,849 INFO org.apache.hadoop.mapred.ReduceTask: task_200807251807_0057_r_000001_1 Got 0 known map output location(s); scheduling...
2008-07-28 21:42:31,849 INFO org.apache.hadoop.mapred.ReduceTask: task_200807251807_0057_r_000001_1 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
2008-07-28 21:42:31,849 INFO org.apache.hadoop.mapred.ReduceTask: task_200807251807_0057_r_000001_1 Copying of all map outputs complete. Initiating the last merge on the remaining files in ramfs://mapoutput24152206
2008-07-28 21:42:34,917 INFO org.apache.hadoop.mapred.ReduceTask: task_200807251807_0057_r_000001_1 Merge of the 130 files in InMemoryFileSystem complete. Local file is /disk1/middleware/hadoop-chliu2/mapred/local/taskTracker/jobcache/job_200807251807_0057/task_200807251807_0057_r_000001_1/output/map_29.out

-- tp