Re: Adding $CLASSPATH to Map/Reduce tasks

2008-09-26 Thread Joe Shaw
Hi,

On Fri, Sep 26, 2008 at 10:50 AM, Samuel Guo [EMAIL PROTECTED] wrote:
 maybe you can use
 bin/hadoop jar -libjars ${your-depends-jars} your.mapred.jar args

 see details:
 http://hadoop.apache.org/core/docs/r0.18.1/api/org/apache/hadoop/mapred/JobShell.html

Indeed, I was having the same issue trying to get a Lucene jar file
into a running task.  Despite what the docs say, it works with the
jar option to the hadoop command.  (The docs I read said it only
worked with job and a couple other commands; unfortunately I don't
have a link to that page at the moment.)

Joe
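
For anyone scripting job submission, the invocation quoted above can be assembled programmatically. This is only a sketch, assuming that -libjars takes a comma-separated list of jar paths and that the arguments go in the order shown in the quoted command. The helper name libjars_command and the jar names in the usage note are hypothetical.

```python
def libjars_command(job_jar, dependency_jars, job_args=(), hadoop="bin/hadoop"):
    """Build an argv list for `hadoop jar -libjars ...`.

    Follows the argument order of the command quoted in this thread.
    Assumption: -libjars expects one comma-separated list of jar paths.
    """
    argv = [hadoop, "jar"]
    if dependency_jars:
        # Join all dependency jars into a single comma-separated argument.
        argv += ["-libjars", ",".join(dependency_jars)]
    argv.append(job_jar)
    argv.extend(job_args)
    return argv
```

For example, libjars_command("your.mapred.jar", ["lucene-core.jar"], ["in", "out"]) yields a list that can be passed to subprocess.run.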


The reduce copier failed

2008-09-25 Thread Joe Shaw
Hi,

I'm trying to build an index using the index contrib in Hadoop
0.18.0, but the reduce tasks are consistently failing.

In the output from the hadoop jar command, I see messages like this:

08/09/25 14:12:11 INFO mapred.JobClient:  map 27% reduce 4%
08/09/25 14:12:23 INFO mapred.JobClient: Task Id : attempt_200809180916_0027_r_07_2, Status : FAILED
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)

and the job eventually fails.

The output from hadoop job -history gives me:

Task Summary

Kind    Total  Successful  Failed  Killed  StartTime             FinishTime

Map     57     57          0       0       25-Sep-2008 14:03:07  25-Sep-2008 14:13:17 (10mins, 9sec)
Reduce  4      0           4       0       25-Sep-2008 14:03:14  25-Sep-2008 14:13:21 (10mins, 7sec)


and

FAILED REDUCE task list for job_200809180916_0027
TaskId                       StartTime             FinishTime                            Error

task_200809180916_0027_r_07  25-Sep-2008 14:03:14  25-Sep-2008 14:13:21 (10mins, 7sec)

Grepping in the logs for that task, I see this consistently on the TaskTrackers:

hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:08,685 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_07_1 0.016147636% reduce  copy (14 of 289 at 8.37 MB/s)
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:11,904 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200809180916_0027_r_07_1 0.018454442% reduce  copy (16 of 289 at 7.85 MB/s)
hadoop-jshaw-tasktracker-ars1dev3.log:2008-09-25 14:09:17,337 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200809180916_0027_r_07_1 done; removing files.

As you can see, it apparently is unable to copy the data, but it gives
me absolutely no idea why.  The JobTracker logs also give me no useful
information.

Anybody have an idea what's going on, or how I might go about debugging this?

Thanks,
Joe


Re: The reduce copier failed

2008-09-25 Thread Joe Shaw
Hi,

On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:
 What did the logs for the task-attempt
 'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job succeed?

You mean inside userlogs/attempt_blah_blah/syslog?  I didn't know
about this log file before, thanks!

The last several lines of output are:

2008-09-25 17:12:18,250 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200809180916_0027_r_07_2: Got 2 new map-outputs  number
of known map outputs is 21
2008-09-25 17:12:18,251 WARN org.apache.hadoop.mapred.ReduceTask:
attempt_200809180916_0027_r_07_2 Merge of the inmemory files threw
an exception: java.io.IOException: Intermedate merge failed
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: segment
_bfu exists in external directory yet the MergeScheduler executed the
merge in a separate thread
at 
org.apache.lucene.index.IndexWriter.copyExternalSegments(IndexWriter.java:2362)
at 
org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:2307)
at 
org.apache.hadoop.contrib.index.mapred.IntermediateForm.process(IntermediateForm.java:135)
at 
org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:56)
at 
org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:38)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2160)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2120)
... 1 more

2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Read
14523136 bytes from map-output for
attempt_200809180916_0027_m_16_0
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200809180916_0027_m_16_0 - (41, 10651153) from
ars1dev6
2008-09-25 17:12:19,110 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 14506735 bytes (14506735 raw bytes) into RAM from
attempt_200809180916_0027_m_09_0
2008-09-25 17:12:19,226 INFO org.apache.hadoop.mapred.ReduceTask:
Closed ram manager
2008-09-25 17:12:19,228 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce
copier failed
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)

Thanks,
Joe


Re: The reduce copier failed

2008-09-25 Thread Joe Shaw
Hi again,

Ugh, sorry about the butchered output.

On Thu, Sep 25, 2008 at 5:42 PM, Joe Shaw [EMAIL PROTECTED] wrote:
 Hi,

 On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:
 What did the logs for the task-attempt
 'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job succeed?

 You mean inside userlogs/attempt_blah_blah/syslog?  I didn't know
 about this log file before, thanks!

 The last several lines of output are:

... now attached as a text file to the email.

Joe
2008-09-25 17:12:18,223 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2: Got 1 new map-outputs  number of known map outputs is 19
2008-09-25 17:12:18,250 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2: Got 2 new map-outputs  number of known map outputs is 21
2008-09-25 17:12:18,251 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200809180916_0027_r_07_2 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2133)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2064)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: segment _bfu exists in external directory yet the MergeScheduler executed the merge in a separate thread
    at org.apache.lucene.index.IndexWriter.copyExternalSegments(IndexWriter.java:2362)
    at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:2307)
    at org.apache.hadoop.contrib.index.mapred.IntermediateForm.process(IntermediateForm.java:135)
    at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:56)
    at org.apache.hadoop.contrib.index.mapred.IndexUpdateCombiner.reduce(IndexUpdateCombiner.java:38)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2160)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2120)
    ... 1 more

2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Read 14523136 bytes from map-output for attempt_200809180916_0027_m_16_0
2008-09-25 17:12:19,087 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200809180916_0027_m_16_0 - (41, 10651153) from ars1dev6
2008-09-25 17:12:19,110 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 14506735 bytes (14506735 raw bytes) into RAM from attempt_200809180916_0027_m_09_0
2008-09-25 17:12:19,226 INFO org.apache.hadoop.mapred.ReduceTask: Closed ram manager
2008-09-25 17:12:19,228 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: attempt_200809180916_0027_r_07_2The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
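
In a log like the one above, the useful line is the "Caused by:" entry under the WARN record rather than the final "reduce copier failed" message. The following is a rough sketch of pulling those lines out of a per-attempt syslog file. It assumes the file has unwrapped log4j-style records that start with a "YYYY-MM-DD HH:MM:SS,mmm LEVEL" prefix, as in the output above. The function name find_root_causes is hypothetical and not part of Hadoop.

```python
import re

# Header that starts each log record, e.g.
# "2008-09-25 17:12:18,251 WARN org.apache.hadoop.mapred.ReduceTask: ..."
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+) ")

def find_root_causes(lines):
    """Return (header, cause) pairs for WARN/ERROR/FATAL records.

    `cause` is the last "Caused by:" line that follows the record header,
    or None if the record has no chained exception.
    """
    records = []  # each entry: [level, header line, last "Caused by:" line]
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD_START.match(line)
        if match:
            # A timestamped line opens a new record.
            records.append([match.group(1), line, None])
        elif records and line.startswith("Caused by:"):
            # Later "Caused by:" lines overwrite earlier ones, so the
            # innermost cause of the current record is kept.
            records[-1][2] = line
    return [(header, cause)
            for level, header, cause in records
            if level in ("WARN", "ERROR", "FATAL")]
```

It can be run over an open file object, for example find_root_causes(open(path)) with path pointing at the syslog under the attempt's userlogs directory.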