Re: Adding $CLASSPATH to Map/Reduce tasks
Hi,

On Fri, Sep 26, 2008 at 10:50 AM, Samuel Guo [EMAIL PROTECTED] wrote:

> Maybe you can use:
>
>   bin/hadoop jar -libjars ${your-depends-jars} your.mapred.jar args
>
> See details:
> http://hadoop.apache.org/core/docs/r0.18.1/api/org/apache/hadoop/mapred/JobShell.html

Indeed, I was having the same issue trying to get a Lucene jar file into a
running task. Despite what the docs say, it works with the "jar" option to
the hadoop command. (The docs I read said it only worked with "job" and a
couple of other commands; unfortunately I don't have a link to that page at
the moment.)

Joe
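For concreteness, here is a minimal sketch of the -libjars invocation Samuel describes. The jar names and the job jar are hypothetical, and the final echo just prints the command rather than running it (no cluster is assumed). One gotcha worth spelling out: -libjars takes a comma-separated list, unlike the colon-separated $CLASSPATH.

```shell
# Hypothetical dependency jars, created here only so the example is
# self-contained; on a real project these would already exist.
mkdir -p lib
touch lib/commons-lang-2.4.jar lib/lucene-core-2.3.2.jar

# Build the comma-separated list -libjars expects (not colon-separated
# like $CLASSPATH).
LIBJARS=$(echo lib/*.jar | tr ' ' ',')

# Print the command we would run against a real cluster:
echo "bin/hadoop jar -libjars $LIBJARS your.mapred.jar <args>"
```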
The reduce copier failed
Hi,

I'm trying to build an index using the index contrib in Hadoop 0.18.0, but
the reduce tasks are consistently failing. In the output from the "hadoop
jar" command, I see messages like this:

08/09/25 14:12:11 INFO mapred.JobClient:  map 27% reduce 4%
08/09/25 14:12:23 INFO mapred.JobClient: Task Id : attempt_200809180916_0027_r_07_2, Status : FAILED
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce copier failed
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)

and eventually the job fails. The output from "hadoop job -history" gives me:

Task Summary
Kind    Total  Successful  Failed  Killed  StartTime             FinishTime
Map     57     57          0       0       25-Sep-2008 14:03:07  25-Sep-2008 14:13:17 (10mins, 9sec)
Reduce  4      0           4       0       25-Sep-2008 14:03:14  25-Sep-2008 14:13:21 (10mins, 7sec)

and

FAILED REDUCE task list for job_200809180916_0027
TaskId                       StartTime             FinishTime                           Error
task_200809180916_0027_r_07  25-Sep-2008 14:03:14  25-Sep-2008 14:13:21 (10mins, 7sec)

Grepping in the logs for that task, I see this consistently on the TaskTrackers:

hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:08,685 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_07_1 0.016147636% reduce copy (14 of 289 at 8.37 MB/s)
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:11,904 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_07_1 0.018454442% reduce copy (16 of 289 at 7.85 MB/s)
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:17,337 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200809180916_0027_r_07_1 done; removing files.

As you can see, it apparently is unable to copy the data, but it gives me
absolutely no idea why. The JobTracker logs also give me no useful
information. Does anybody have an idea what's going on, or how I might go
about debugging this?

Thanks,
Joe
Re: The reduce copier failed
Hi,

On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:

> What did the logs for the task-attempt
> 'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job succeed?

You mean inside userlogs/attempt_blah_blah/syslog? I didn't know about this
log file before, thanks! The last several lines of output are:

2008-09-25 17:12:18,250 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2: Got 2 new map-outputs number of known map outputs is 21
2008-09-25 17:12:18,251 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: segment _bfu exists in external directory yet the MergeScheduler executed the merge in a separate thread
        at org.apache.lucene.index.IndexWriter.copyExternalSegments(IndexWriter.java:2362)
        at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:2307)
        at org.apache.hadoop.contrib.index.mapred.IntermediateForm.process(IntermediateForm.java:135)
        at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:56)
        at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:38)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2160)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2120)
        ... 1 more
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Read 14523136 bytes from map-output for attempt_200809180916_0027_m_16_0
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200809180916_0027_m_16_0 - (41, 10651153) from ars1dev6
2008-09-25 17:12:19,110 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 14506735 bytes (14506735 raw bytes) into RAM from attempt_200809180916_0027_m_09_0
2008-09-25 17:12:19,226 INFO org.apache.hadoop.mapred.ReduceTask: Closed ram manager
2008-09-25 17:12:19,228 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce copier failed
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)

Thanks,
Joe
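For anyone else hunting for these files: the per-attempt logs referred to above live under the tasktracker's local log directory, one subdirectory per task attempt. A small sketch of that layout, using the attempt ID from this thread (the directory structure is assumed from 0.18-era defaults and simulated here so the example is self-contained; on a real node you would just ls the existing directory):

```shell
# Each task attempt gets its own directory under <log dir>/userlogs,
# holding that task's stdout, stderr, and syslog files. Simulated layout:
ATTEMPT=attempt_200809180916_0027_r_07_2
mkdir -p logs/userlogs/$ATTEMPT
touch logs/userlogs/$ATTEMPT/stdout \
      logs/userlogs/$ATTEMPT/stderr \
      logs/userlogs/$ATTEMPT/syslog

# The syslog file is where the exception above would appear:
ls logs/userlogs/$ATTEMPT
```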
Re: The reduce copier failed
Hi again,

Ugh, sorry about the butchered output.

On Thu, Sep 25, 2008 at 5:42 PM, Joe Shaw [EMAIL PROTECTED] wrote:

> Hi,
>
> On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:
>> What did the logs for the task-attempt
>> 'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job succeed?
>
> You mean inside userlogs/attempt_blah_blah/syslog? I didn't know about
> this log file before, thanks! The last several lines of output are:
> ...

...now attached as a text file to the email.

Joe

[attached log excerpt]

2008-09-25 17:12:18,223 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2: Got 1 new map-outputs number of known map outputs is 19
2008-09-25 17:12:18,250 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2: Got 2 new map-outputs number of known map outputs is 21
2008-09-25 17:12:18,251 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: segment _bfu exists in external directory yet the MergeScheduler executed the merge in a separate thread
        at org.apache.lucene.index.IndexWriter.copyExternalSegments(IndexWriter.java:2362)
        at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:2307)
        at org.apache.hadoop.contrib.index.mapred.IntermediateForm.process(IntermediateForm.java:135)
        at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:56)
        at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:38)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2160)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2120)
        ... 1 more
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Read 14523136 bytes from map-output for attempt_200809180916_0027_m_16_0
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200809180916_0027_m_16_0 - (41, 10651153) from ars1dev6
2008-09-25 17:12:19,110 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 14506735 bytes (14506735 raw bytes) into RAM from attempt_200809180916_0027_m_09_0
2008-09-25 17:12:19,226 INFO org.apache.hadoop.mapred.ReduceTask: Closed ram manager
2008-09-25 17:12:19,228 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce copier failed
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)