[ http://issues.apache.org/jira/browse/HADOOP-573?page=all ]
Doug Cutting updated HADOOP-573:
--------------------------------

    Component/s: mapred
    Description: Many reduce tasks got killed due to a checksum error. The strange thing is that the file was generated by the sort function, and was on a local disk. Here is the stack:

Checksum error: ../task_0011_r_000140_0/all.2.1 at 5342920704
        at org.apache.hadoop.fs.FSDataInputStream$Checker.verifySum(FSDataInputStream.java:134)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:110)
        at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        at java.io.DataInputStream.readFully(DataInputStream.java:176)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
        at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1061)
        at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1126)
        at org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:1354)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeStream.next(SequenceFile.java:1880)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:1938)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:1802)
        at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:1749)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:1494)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:240)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)

  was:
Many reduce tasks got killed due to a checksum error. The strange thing is that the file was generated by the sort function, and was on a local disk.
Here is the stack:

Checksum error: ../task_0011_r_000140_0/all.2.1 at 5342920704
        at org.apache.hadoop.fs.FSDataInputStream$Checker.verifySum(FSDataInputStream.java:134)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:110)
        at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        at java.io.DataInputStream.readFully(DataInputStream.java:176)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
        at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1061)
        at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1126)
        at org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:1354)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeStream.next(SequenceFile.java:1880)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:1938)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:1802)
        at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:1749)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:1494)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:240)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)


> Checksum error during sorting in reducer
> ----------------------------------------
>
>                 Key: HADOOP-573
>                 URL: http://issues.apache.org/jira/browse/HADOOP-573
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>
> Many reduce tasks got killed due to a checksum error. The strange thing is that
> the file was generated by the sort function, and was on a local disk.
> Here is the stack:
> Checksum error: ../task_0011_r_000140_0/all.2.1 at 5342920704
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.verifySum(FSDataInputStream.java:134)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:110)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         at java.io.DataInputStream.readFully(DataInputStream.java:176)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
>         at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1061)
>         at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1126)
>         at org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:1354)
>         at org.apache.hadoop.io.SequenceFile$Sorter$MergeStream.next(SequenceFile.java:1880)
>         at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:1938)
>         at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:1802)
>         at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:1749)
>         at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:1494)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:240)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)
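For context on where the failure is detected: the verifySum frame at the top of the stack is the point where a checksum computed over freshly read bytes disagrees with the checksum stored alongside the data. The sketch below is a minimal, self-contained illustration of that kind of chunked CRC verification, not Hadoop's actual implementation; the class name ChecksumReadSketch, the 512-byte chunk size, and the side file holding one CRC32 int per chunk are all assumptions made for illustration.

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

/*
 * Hypothetical sketch only: the class name, the 512-byte chunk size, and
 * the side file holding one CRC32 int per chunk are assumptions for
 * illustration, not Hadoop's actual checksum layout.
 */
public class ChecksumReadSketch {

    private static final int BYTES_PER_SUM = 512; // assumed chunk size

    // Read the data file chunk by chunk, comparing each chunk's CRC32
    // against the value stored in the side file. A mismatch is reported
    // with its byte offset, mirroring the "Checksum error: <file> at
    // <offset>" message quoted above.
    public static void verifiedRead(File data, File crcSideFile) throws IOException {
        try (DataInputStream in = new DataInputStream(
                     new BufferedInputStream(new FileInputStream(data)));
             DataInputStream sums = new DataInputStream(
                     new BufferedInputStream(new FileInputStream(crcSideFile)))) {
            byte[] chunk = new byte[BYTES_PER_SUM];
            long offset = 0;
            int n;
            while ((n = readFullChunk(in, chunk)) > 0) {
                CRC32 crc = new CRC32();
                crc.update(chunk, 0, n);
                int stored = sums.readInt(); // one stored sum per chunk (assumed)
                if ((int) crc.getValue() != stored) {
                    throw new IOException(
                            "Checksum error: " + data + " at " + offset);
                }
                offset += n;
            }
        }
    }

    // Fill the buffer as far as the stream allows, so chunk boundaries
    // stay aligned with the stored checksums even across short reads.
    private static int readFullChunk(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break; // end of stream; a final partial chunk is allowed
            }
            total += r;
        }
        return total;
    }
}

Note that a verifier like this only detects corruption; it cannot say whether the bad bytes came from the disk, from memory, or from the writer side, which is what makes a checksum failure on locally written sort output puzzling.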