[ https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HADOOP-3205:
--------------------------------

    Attachment: hadoop-3205.txt

New version of the patch. This addresses Eli's review comments and adds some extra tests (one verifying that a truncated checksum file throws ChecksumException, another covering odd-sized read buffers on a file with a few chunks). I also tidied up some of the comments to make it clearer to implementors what's going on.

Just to be doubly sure, I reran all the benchmarks overnight and confirmed that reading 32 chunks at once gives all the performance benefit of a larger value while using less memory. I also reran the HDFS-755 tests against this build with assertions enabled, and everything looked good (plenty of assertion failures, but none in the new code!).

> Read multiple chunks directly from FSInputChecker subclass into user buffers
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-3205
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3205
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Raghu Angadi
>            Assignee: Todd Lipcon
>         Attachments: hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt
>
>
> Implementations of FSInputChecker and FSOutputSummer, such as DFS, do not have access to the full user buffer. At any time DFS can access only up to 512 bytes, even though the user usually reads with a much larger buffer (often controlled by io.file.buffer.size). This forces an implementation to double-buffer data if it wants to read or write larger chunks from the underlying storage (see the sketch below for the kind of in-place, multi-chunk verification this avoids double buffering for).
> We could separate the FSInputChecker and FSOutputSummer changes into two separate jiras.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
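For readers skimming the patch, here is a minimal standalone sketch of the idea behind the change: verify a run of 512-byte checksum chunks directly in the caller's buffer, rather than staging each chunk through an internal 512-byte buffer and copying it out. This is not the actual FSInputChecker API from the patch; the class name, method names, and the 4-bytes-per-chunk big-endian CRC32 layout are assumptions for illustration only.

{code:java}
import java.io.IOException;
import java.util.zip.CRC32;

/**
 * Standalone sketch (hypothetical, not the Hadoop API): verifies several
 * 512-byte chunks in place against their CRC32 sums, touching the caller's
 * buffer directly instead of an intermediate per-chunk buffer.
 */
public class MultiChunkVerifier {
    static final int BYTES_PER_CHUNK = 512; // default io.bytes.per.checksum
    static final int CHUNKS_PER_READ = 32;  // batch size the benchmarks settled on

    /**
     * Verifies data[off..off+len) against the 4-byte big-endian CRC32 sums
     * starting at sums[sumsOff]. Callers would pass at most
     * CHUNKS_PER_READ * BYTES_PER_CHUNK bytes per call. Throws IOException
     * (standing in for ChecksumException) on mismatch.
     */
    static void verifyChunks(byte[] data, int off, int len, byte[] sums, int sumsOff)
            throws IOException {
        CRC32 crc = new CRC32();
        int chunk = 0;
        for (int pos = 0; pos < len; pos += BYTES_PER_CHUNK, chunk++) {
            int chunkLen = Math.min(BYTES_PER_CHUNK, len - pos); // last chunk may be short
            crc.reset();
            crc.update(data, off + pos, chunkLen);
            int expected = readIntBE(sums, sumsOff + 4 * chunk);
            if ((int) crc.getValue() != expected) {
                throw new IOException("checksum error at chunk " + chunk);
            }
        }
    }

    /** Reads a big-endian 32-bit int from b at off. */
    static int readIntBE(byte[] b, int off) {
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    /** Tiny self-check: build data plus matching sums, then verify in place. */
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[3 * BYTES_PER_CHUNK + 100]; // a few chunks plus a partial one
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        int nChunks = (data.length + BYTES_PER_CHUNK - 1) / BYTES_PER_CHUNK;
        byte[] sums = new byte[4 * nChunks];
        CRC32 crc = new CRC32();
        for (int c = 0; c < nChunks; c++) {
            int pos = c * BYTES_PER_CHUNK;
            crc.reset();
            crc.update(data, pos, Math.min(BYTES_PER_CHUNK, data.length - pos));
            int v = (int) crc.getValue();
            sums[4 * c]     = (byte) (v >>> 24);
            sums[4 * c + 1] = (byte) (v >>> 16);
            sums[4 * c + 2] = (byte) (v >>> 8);
            sums[4 * c + 3] = (byte)  v;
        }

        verifyChunks(data, 0, data.length, sums, 0); // no exception: sums match
        System.out.println("verified " + nChunks + " chunks in place");
    }
}
{code}

The point of the batching is that the checker walks the user's buffer once per call instead of copying every 512-byte chunk through an intermediate buffer; per the overnight benchmarks in the comment above, 32 chunks per call captures the full benefit of larger batches while using less memory.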