Hi,

I am running a MapReduce (streaming) job on my Hadoop cluster.

The job processes 10 gigabytes of data, and one tiny failed task crashes the
whole operation.
It gets up to 98% complete, and throwing away all the finished output seems
like an awful waste.
I'd like to keep the finished output and rerun only the failed tasks (the
remaining 2%).

Is there any way to figure out which input splits (and their ranges) failed?
I went to "localhost:50030" to see if I could find any useful information, but
I must be looking in the wrong places.

Could somebody help me with this problem?


Below is the log of a failed task. Is there any information in it I can use?

*syslog logs*

Records R/W=41707/41639
2010-06-30 07:35:30,530 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=41776/41726
2010-06-30 07:35:40,554 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=41865/41804
2010-06-30 07:35:50,559 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=41970/41932
2010-06-30 07:36:00,637 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42073/42065
2010-06-30 07:36:10,772 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42258/42196
2010-06-30 07:36:20,785 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42318/42274
2010-06-30 07:36:30,985 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42378/42351
2010-06-30 07:36:41,005 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42442/42419
2010-06-30 07:36:51,149 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42499/42484
2010-06-30 07:37:01,235 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42559/42547
2010-06-30 07:37:11,242 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42626/42611
2010-06-30 07:37:21,485 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42769/42704
2010-06-30 07:37:31,617 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42845/42782
2010-06-30 07:37:41,725 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42915/42875
2010-06-30 07:37:51,733 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=42986/42949
2010-06-30 07:38:01,795 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=43070/43051
2010-06-30 07:38:11,849 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=43138/43136
2010-06-30 07:38:22,398 INFO org.apache.hadoop.streaming.PipeMapRed:
Records R/W=43258/43200
2010-06-30 07:38:31,642 INFO org.apache.hadoop.streaming.PipeMapRed:
MRErrorThread done
2010-06-30 07:38:31,643 INFO org.apache.hadoop.streaming.PipeMapRed:
MROutputThread done
2010-06-30 07:38:31,765 INFO org.apache.hadoop.streaming.PipeMapRed: log:null
R/W/S=43335/43271/0 in:7=43335/5885 [rec/s] out:7=43271/5885 [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
HOST=null
USER=hadoop
HADOOP_USER=null
last Hadoop input: |null|
last tool output: |[...@d22860|
Date: Wed Jun 30 07:38:31 KST 2010
java.io.IOException: Broken pipe
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.streaming.PipeMapRed.write(PipeMapRed.java:635)
        at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:105)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


2010-06-30 07:38:31,766 INFO org.apache.hadoop.streaming.PipeMapRed:
PipeMapRed failed!
2010-06-30 07:38:31,766 INFO org.apache.hadoop.streaming.PipeMapRed:
PipeMapRed failed!
2010-06-30 07:38:32,028 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 139
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
2010-06-30 07:38:32,029 INFO org.apache.hadoop.mapred.TaskRunner:
Runnning cleanup for the task
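
In case it helps anyone reading the log above, here is a minimal sketch of how
I am trying to interpret it. It assumes the common Unix convention that an exit
code above 128 means "killed by signal (code - 128)", so 139 would correspond
to signal 11; I have not confirmed that this is what happened in my mapper. It
also pulls the last "R/W" counters out of the log text as a rough hint of how
many records the task had read before it died. The helper names
(decode_exit_code, last_record_counts) are my own and not part of Hadoop.

```python
import re
import signal

# Matches both "R/W=41707/41639" and "R/W/S=43335/43271/0" as they appear
# in the PipeMapRed log lines above.
RW_PATTERN = re.compile(r"R/W(?:/S)?=(\d+)/(\d+)")


def decode_exit_code(code):
    """Return a signal name if `code` looks like 128 + signal number.

    Assumption: the subprocess exit code follows the usual Unix/shell
    convention. Returns None when the code does not map to a known signal.
    """
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        return None


def last_record_counts(log_text):
    """Return (records_read, records_written) from the last R/W entry, or None."""
    matches = RW_PATTERN.findall(log_text)
    if not matches:
        return None
    read, written = matches[-1]
    return int(read), int(written)


if __name__ == "__main__":
    sample = (
        "Records R/W=43258/43200\n"
        "R/W/S=43335/43271/0 in:7=43335/5885 [rec/s] out:7=43271/5885 [rec/s]\n"
    )
    print(decode_exit_code(139))
    print(last_record_counts(sample))
```

If the assumption holds, the mapper subprocess itself crashed somewhere around
the 43335th input record of that split, which might narrow down the bad input.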
