Quick FYI: I've run the same job twice more without seeing the error.
/ Per
On Wed, Oct 1, 2008 at 11:07 AM, Per Jacobsson <[EMAIL PROTECTED]> wrote:
> Hi everyone,
> (apologies if this gets posted on the list twice for some reason, my first
> attempt was denied as "suspected spam")
>
> I ran a job last night with Hadoop 0.18.0 on EC2, using the standard small
> AMI. The job was producing gzipped output; otherwise I haven't changed the
> configuration.
>
> The final reduce steps failed with this error that I haven't seen before:
>
> 2008-10-01 05:02:39,810 WARN org.apache.hadoop.mapred.ReduceTask:
> attempt_200809301822_0005_r_000001_0 Merging of the local FS files threw an
> exception: java.io.IOException: java.io.IOException: Rec# 289050: Negative
> value-length: -96
>     at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:331)
>     at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:134)
>     at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:225)
>     at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:242)
>     at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:83)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2021)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2025)
>
> 2008-10-01 05:02:44,131 WARN org.apache.hadoop.mapred.TaskTracker: Error
> running child
> java.io.IOException: attempt_200809301822_0005_r_000001_0The reduce copier
> failed
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
>
> When I try to download the data from HDFS I get a "Found checksum error"
> warning message.
>
> Any ideas what could be the cause? Would upgrading to 0.18.1 solve it?
> Thanks,
> / Per