[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171970#comment-13171970 ]

M. C. Srivas commented on HDFS-2699:
------------------------------------

Couple of observations:

a. If you want to eventually support random IO, then a chunk size of 4096 bytes 
is too large for the CRC, as even a small write will cause a read-modify-write 
cycle on the entire 4K chunk. A chunk size of 512 bytes reduces this overhead.
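
To make the overhead concrete, here is a minimal sketch (a hypothetical helper, 
not HDFS code) of how many bytes a small random write must re-read and re-write 
once each CRC covers a whole chunk:

    public class RmwCost {
        // Bytes touched when updating [off, off+len): every chunk that
        // overlaps the range must be fully re-read and re-written so its
        // CRC can be recomputed.
        static long rmwBytes(long off, long len, long bytesPerChecksum) {
            long firstChunk = off / bytesPerChecksum;
            long lastChunk = (off + len - 1) / bytesPerChecksum;
            return (lastChunk - firstChunk + 1) * bytesPerChecksum;
        }

        public static void main(String[] args) {
            // A 100-byte random write at offset 50:
            System.out.println(rmwBytes(50, 100, 4096)); // 4096 (whole 4K chunk)
            System.out.println(rmwBytes(50, 100, 512));  // 512
        }
    }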

b. Can the value of the variable "io.bytes.per.checksum" be transferred from 
the *-site.xml file into the file properties at the NN at the time of file 
creation? If someone messes around with it, old files will still work as before.
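
As a sketch of that idea (persistWithFile() is hypothetical; the real change 
would write the value into the file's metadata at the NameNode), the chunk size 
would be read from the configuration exactly once, at create time:

    import org.apache.hadoop.conf.Configuration;

    public class PinChecksumSize {
        static void onCreate(Configuration conf) {
            // 512 is the Hadoop default for io.bytes.per.checksum.
            int bytesPerChecksum = conf.getInt("io.bytes.per.checksum", 512);
            // Pin the value with the file so later *-site.xml edits cannot
            // invalidate files written with the old chunk size.
            persistWithFile(bytesPerChecksum);
        }

        static void persistWithFile(int bytesPerChecksum) {
            /* hypothetical NN-side metadata write */
        }
    }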
                
> Store data and checksums together in block file
> -----------------------------------------------
>
>                 Key: HDFS-2699
>                 URL: https://issues.apache.org/jira/browse/HDFS-2699
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current implementation of HDFS stores the data in one block file and the 
> metadata (checksum) in another block file. This means that every read from 
> HDFS actually consumes two disk iops, one to the data file and one to the 
> checksum file. This is a major problem for scaling HBase, because HBase is 
> usually bottlenecked on the number of random disk iops that the 
> storage-hardware offers.
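
For illustration only, a minimal sketch of the interleaved layout the issue 
proposes, assuming, hypothetically, 512-byte data chunks each followed inline 
by a 4-byte CRC32, so a single seek brings in both data and checksum:

    public class InlineLayout {
        static final int CHUNK = 512; // io.bytes.per.checksum
        static final int CRC = 4;     // width of a CRC32 in bytes

        // On-disk layout: [chunk 0][crc 0][chunk 1][crc 1]...
        // On-disk offset of logical byte 'off' within the block file.
        static long dataOffset(long off) {
            return (off / CHUNK) * (CHUNK + CRC) + (off % CHUNK);
        }

        // Offset of the CRC guarding that byte; it sits right after its
        // chunk, so the same sequential read picks it up.
        static long crcOffset(long off) {
            return (off / CHUNK) * (CHUNK + CRC) + CHUNK;
        }
    }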
