On Sep 25, 2008, at 2:26 PM, Joe Shaw wrote:

Hi,

I'm trying to build an index using the "index" contrib in Hadoop
0.18.0, but the reduce tasks are consistently failing.


What did the logs for the task-attempt 'attempt_200809180916_0027_r_000007_2' look like? Did the TIP/Job succeed?

Arun

In the output from the "hadoop jar" command, I see messages like this:

08/09/25 14:12:11 INFO mapred.JobClient:  map 27% reduce 4%
08/09/25 14:12:23 INFO mapred.JobClient: Task Id : attempt_200809180916_0027_r_000007_2, Status : FAILED
java.io.IOException: attempt_200809180916_0027_r_000007_2The reduce copier failed
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)

and the job eventually fails.

The output from "hadoop job -history" gives me:

Task Summary
============================
Kind    Total   Successful      Failed  Killed  StartTime               FinishTime

Map     57      57              0       0       25-Sep-2008 14:03:07    25-Sep-2008 14:13:17 (10mins, 9sec)
Reduce  4       0               4       0       25-Sep-2008 14:03:14    25-Sep-2008 14:13:21 (10mins, 7sec)
============================

and

FAILED REDUCE task list for job_200809180916_0027
TaskId          StartTime       FinishTime      Error
====================================================
task_200809180916_0027_r_000007 25-Sep-2008 14:03:14    25-Sep-2008 14:13:21 (10mins, 7sec)

Grepping in the logs for that task, I see this consistently on the TaskTrackers:

hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:08,685 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_000007_1 0.016147636% reduce > copy (14 of 289 at 8.37 MB/s) >
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:11,904 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_000007_1 0.018454442% reduce > copy (16 of 289 at 7.85 MB/s) >
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:17,337 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200809180916_0027_r_000007_1 done; removing files.

As you can see, the reduce task apparently fails while copying the map
output, but the logs give no indication why.  The JobTracker logs
contain no useful information either.

Anybody have an idea what's going on, or how I might go about debugging this?
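
For reference, here is a rough sketch of how the copy progress for one
attempt could be pulled out of TaskTracker logs like the ones above. The
regular expression is an assumption inferred only from the log lines
quoted in this message, not from the Hadoop source, and the function name
copy_progress is hypothetical.

```python
import re

# Pattern inferred from the quoted TaskTracker lines (an assumption, not
# taken from Hadoop itself): "<attempt> <pct>% reduce > copy (N of M at R MB/s)"
COPY_RE = re.compile(
    r"(?P<attempt>attempt_\d+_\d+_r_\d+_\d+)\s+"
    r"(?P<progress>[\d.]+)%\s+reduce\s+>\s+copy\s+"
    r"\((?P<done>\d+)\s+of\s+(?P<total>\d+)\s+at\s+(?P<rate>[\d.]+)\s+MB/s\)"
)


def copy_progress(log_text, attempt_id):
    """Return (copied, total, MB/s) tuples for one reduce attempt."""
    # Collapse all whitespace so entries that were wrapped across lines
    # (for example by a mail client) still match.
    flat = " ".join(log_text.split())
    return [
        (int(m["done"]), int(m["total"]), float(m["rate"]))
        for m in COPY_RE.finditer(flat)
        if m["attempt"] == attempt_id
    ]
```

Run against the two progress lines above, this shows the attempt stopping
at 16 of 289 map outputs copied, which at least narrows down when the
copier gave up.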

Thanks,
Joe
