[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172059#comment-13172059 ]
M. C. Srivas commented on HDFS-2699:
------------------------------------

@Todd: no one is arguing that putting the CRC inline isn't beneficial wrt seek time. But recalculating the CRC is substantially slower with a 4K chunk than with a 512-byte chunk: on an append, the trailing partial chunk that must be re-checksummed averages half a chunk, i.e. 2K vs 256 bytes, an 8x factor. Imagine appending continuously to the HBase WAL with the 128-byte records that you mentioned in another thread ... the CPU burn will be much worse with 4K CRC chunks. (A rough simulation of this cost is sketched below.)

Secondly, disk manufacturers guarantee write atomicity only at 512-byte sector granularity. Linux gives almost no atomicity guarantee for a 4K block write. On a crash, unless you are running some sort of RAID or a data journal, there is a real chance that the in-flight 4K block ends up torn, i.e. only partially written, and therefore corrupted.

> Store data and checksums together in block file
> -----------------------------------------------
>
>                 Key: HDFS-2699
>                 URL: https://issues.apache.org/jira/browse/HDFS-2699
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the data file and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
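To make the append-cost argument concrete, here is a minimal, self-contained Java sketch (not HDFS code): it assumes a WAL-style workload with one flush per 128-byte append, where each flush must re-checksum the trailing partial chunk, and it counts the bytes re-CRC'd for 512-byte vs 4K chunks. The class name CrcAppendCost and the flush-per-record model are illustrative assumptions, not anything taken from HDFS.

{code:java}
import java.util.zip.CRC32;

public class CrcAppendCost {

    /**
     * Simulates appending fixed-size records with a flush after each one.
     * Each flush re-checksums the trailing partial chunk (or the whole
     * chunk when the append lands exactly on a chunk boundary).
     * Assumes recordSize <= chunkSize. Returns total bytes re-CRC'd.
     */
    static long reCrcBytes(int recordSize, int chunkSize, int numRecords) {
        CRC32 crc = new CRC32();            // stand-in for the real checksum work
        byte[] chunk = new byte[chunkSize]; // dummy chunk contents
        long written = 0;
        long total = 0;
        for (int i = 0; i < numRecords; i++) {
            written += recordSize;
            long fill = written % chunkSize;              // bytes in the last chunk
            long reread = (fill == 0) ? chunkSize : fill; // bytes to re-checksum
            crc.reset();
            crc.update(chunk, 0, (int) reread);           // perform the CRC work
            total += reread;
        }
        return total;
    }

    public static void main(String[] args) {
        final int records = 100_000;
        long small = reCrcBytes(128, 512, records);
        long large = reCrcBytes(128, 4096, records);
        System.out.printf("512B chunks: %,d bytes re-CRC'd%n", small);
        System.out.printf("4KB  chunks: %,d bytes re-CRC'd (%.1fx)%n",
                large, (double) large / small);
    }
}
{code}

Under this model the 4K configuration re-checksums roughly 6-7x more bytes per flush for 128-byte records; with fewer flushes per chunk the ratio approaches the ~8x half-chunk average cited above.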