[ http://issues.apache.org/jira/browse/HADOOP-502?page=comments#action_12432651 ]

Doug Cutting commented on HADOOP-502:
-------------------------------------

To be clear, currently we ignore errors processing checksums (checksum file not 
found, too short, timeouts while reading, etc.) so that the checksum system 
only throws user-visible exceptions when data is known to be corrupt.  You're 
proposing we change this so that, if the checksum file is there, we may 
throw user-visible exceptions for errors processing the checksum data (like 
an unexpected EOF). Is that right, or are you proposing something else?
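
For concreteness, a minimal Java sketch of the two policies; verifyChunk(), 
verifySum(), and stopSumming() are hypothetical names, not the actual 
FSDataInputStream internals. The active code shows the proposed behavior; the 
commented-out call shows the current one.

    import java.io.FileNotFoundException;
    import java.io.IOException;

    // Minimal sketch of the two error-handling policies under discussion.
    // verifySum() and stopSumming() are hypothetical stand-ins, not the
    // actual FSDataInputStream.Checker members.
    abstract class ChecksumPolicySketch {
        abstract void verifySum(byte[] b, int off, int len) throws IOException;
        abstract void stopSumming();

        void verifyChunk(byte[] b, int off, int len) throws IOException {
            try {
                verifySum(b, off, len);      // read and compare checksum data
            } catch (FileNotFoundException e) {
                stopSumming();               // no checksum file: quietly disable
            } catch (IOException e) {
                // Current behavior: log and disable checking on any error
                // processing checksum data (short file, timeout, etc.):
                //   stopSumming();
                // Proposed behavior: the checksum file exists, so surface
                // the error (e.g. unexpected EOF) to the caller:
                throw e;
            }
        }
    }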

> Summer buffer overflow exception
> --------------------------------
>
>                 Key: HADOOP-502
>                 URL: http://issues.apache.org/jira/browse/HADOOP-502
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.5.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.6.0
>
>
> The extended error message with the offending values finally paid off, and I 
> was able to capture the values that were causing the Summer buffer overflow 
> exception:
> java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880, bytesPerSum=1, inSum=512
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         at java.io.DataInputStream.read(DataInputStream.java:80)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>         at java.util.zip.CRC32.update(CRC32.java:43)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
>         ... 9 more
> Tracing through the code: inside FSDataInputStream.Checker.read(), verifySum 
> gets an EOFException and turns off the summing. Among other things, this sets 
> bytesPerSum to 1. Unfortunately, that leads to the 
> ArrayIndexOutOfBoundsException.
> I think the problem is that the original EOFException was logged and ignored. 
> I propose that we let the original EOF propagate back to the caller. (File 
> not found will still disable checksum checking, but we will detect truncated 
> checksum files.)
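
To make the failure mode concrete, here is a small, self-contained Java 
sketch that reproduces the ArrayIndexOutOfBoundsException using the values 
from the error message above. The field names and the length arithmetic are 
guesses at the Checker state, not the actual code.

    import java.util.zip.CRC32;

    // Illustrative only: reproduces the reported ArrayIndexOutOfBoundsException
    // from the values in the error message. The field names and the length
    // arithmetic are hypothetical, not the exact Checker.read() code.
    public class SummerOverflowSketch {
        public static void main(String[] args) {
            CRC32 sum = new CRC32();
            byte[] b = new byte[4096];   // b.len=4096 from the error message
            int off = 0;                 // off=0
            int inSum = 512;             // bytes already summed in this chunk
            int bytesPerSum = 1;         // clobbered by the EOF handler

            // The checker computes how many bytes remain in the current
            // checksum chunk; once bytesPerSum is 1 this goes negative:
            int remaining = bytesPerSum - inSum;   // 1 - 512 = -511

            // CRC32.update() throws ArrayIndexOutOfBoundsException for a
            // negative length: the CRC32.update frame in the trace above.
            sum.update(b, off, remaining);
        }
    }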
