[ http://issues.apache.org/jira/browse/HADOOP-502?page=comments#action_12432651 ]

Doug Cutting commented on HADOOP-502:
-------------------------------------
To be clear, currently we ignore errors while processing checksums (checksum file not found, too short, timeouts while reading, etc.) so that the checksum system only throws user-visible exceptions when data is known to be corrupt. You're proposing we change this so that, if the checksum file is there, then we may throw user-visible exceptions for errors processing the checksum data (like an unexpected EOF). Is that right, or are you proposing something else?

> Summer buffer overflow exception
> --------------------------------
>
>              Key: HADOOP-502
>              URL: http://issues.apache.org/jira/browse/HADOOP-502
>          Project: Hadoop
>       Issue Type: Bug
>       Components: fs
> Affects Versions: 0.5.0
>         Reporter: Owen O'Malley
>      Assigned To: Owen O'Malley
>          Fix For: 0.6.0
>
>
> The extended error message with the offending values finally paid off, and I
> was able to get the values that were causing the Summer buffer overflow
> exception.
>
> java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880, bytesPerSum=1, inSum=512
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         at java.io.DataInputStream.read(DataInputStream.java:80)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>         at java.util.zip.CRC32.update(CRC32.java:43)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
>         ... 9 more
>
> Tracking through the code, what happens is that inside
> FSDataInputStream.Checker.read(), the verifySum call gets an EOFException and
> turns off the summing. Among other things, this sets bytesPerSum to 1.
> Unfortunately, that leads to the ArrayIndexOutOfBoundsException.
>
> I think the problem is that the original EOF exception was logged and
> ignored. I propose that we allow the original EOF to propagate back to the
> caller. (So that file-not-found will still disable the checksum checking, but
> we will detect truncated checksum files.)

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
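The proposed behavior can be illustrated with a minimal sketch. The class and method names below (`ChecksumReader`, `readStoredChecksum`) are hypothetical and do not match Hadoop's actual `Checker` internals; the point is only that a truncated checksum file should surface the `EOFException` to the caller instead of being swallowed and converted into `bytesPerSum=1`:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical stand-in for the checksum-reading side of FSDataInputStream.Checker.
class ChecksumReader {
    private final DataInputStream sums;

    ChecksumReader(byte[] checksumFileBytes) {
        this.sums = new DataInputStream(new ByteArrayInputStream(checksumFileBytes));
    }

    // Current behavior catches EOFException here, logs it, and disables summing
    // (setting bytesPerSum to 1), which later causes the
    // ArrayIndexOutOfBoundsException. The proposed fix simply declares the
    // exception and lets it propagate to the caller.
    int readStoredChecksum() throws IOException {
        return sums.readInt(); // throws EOFException on a truncated checksum file
    }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        // Simulate a truncated checksum file: 2 bytes where 4 are needed.
        ChecksumReader reader = new ChecksumReader(new byte[] {0, 0});
        try {
            reader.readStoredChecksum();
            System.out.println("checksum read ok");
        } catch (EOFException e) {
            // With the proposed change, the caller sees the truncation directly.
            System.out.println("EOFException propagated");
        }
    }
}
```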