[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anoop Sam John updated HBASE-5074:
----------------------------------
    Release Note: 
Adds hbase.regionserver.checksum.verify. If hbase.regionserver.checksum.verify is set to true, then HBase will read data and then verify checksums itself. Checksum verification inside HDFS will be switched off. If the HBase checksum verification fails, then it will switch back to using HDFS checksums for verifying data that is being read from storage.

Also adds hbase.hstore.bytes.per.checksum -- number of bytes in a newly created checksum chunk -- and hbase.hstore.checksum.algorithm, name of an algorithm that is used to compute checksums.

You will currently only see benefit if you have the local read short-circuit enabled -- see http://hbase.apache.org/book.html#perf.hdfs.configs -- while HDFS-3429 goes unfixed.

  was:
Adds hbase.regionserver.checksum.verify. If hbase.regionserver.checksum.verify is set to true, then HBase will read data and then verify checksums itself. Checksum verification inside HDFS will be switched off. If the HBase checksum verification fails, then it will switch back to using HDFS checksums for verifying data that is being read from storage.

Also adds hbase.hstore.bytes.per.checksum -- number of bytes in a newly created checksum chunk -- and hbase.hstore.bytes.per.checksum, name of an algorithm that is used to compute checksums.

You will currently only see benefit if you have the local read short-circuit enabled -- see http://hbase.apache.org/book.html#perf.hdfs.configs -- while HDFS-3429 goes unfixed.

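To illustrate the behaviour the release note describes, here is a minimal sketch of per-chunk checksums stored alongside the data, with a fallback when verification fails. It is not HBase's actual on-disk block format or code: the function names, the tiny chunk size, the trailing-checksum layout, and the `fallback_read` callback (standing in for a re-read with HDFS checksums enabled) are all assumptions made for illustration.

```python
import zlib

# Hypothetical, deliberately tiny chunk size; plays the role of
# hbase.hstore.bytes.per.checksum in this sketch.
BYTES_PER_CHECKSUM = 16


def append_checksums(data: bytes, bytes_per_checksum: int = BYTES_PER_CHECKSUM) -> bytes:
    """Return data followed by one 4-byte CRC32 per chunk of the data."""
    sums = b"".join(
        zlib.crc32(data[i:i + bytes_per_checksum]).to_bytes(4, "big")
        for i in range(0, len(data), bytes_per_checksum)
    )
    return data + sums


def verify(block: bytes, data_len: int, bytes_per_checksum: int = BYTES_PER_CHECKSUM) -> bool:
    """Recompute the per-chunk checksums and compare with the stored ones."""
    data, stored = block[:data_len], block[data_len:]
    expected = append_checksums(data, bytes_per_checksum)[data_len:]
    return stored == expected


def read_block(block: bytes, data_len: int, fallback_read,
               bytes_per_checksum: int = BYTES_PER_CHECKSUM):
    """Verify in the application first; on failure, use the fallback path.

    fallback_read is a hypothetical callable standing in for a second read
    with filesystem-level (HDFS) checksum verification switched back on.
    """
    if verify(block, data_len, bytes_per_checksum):
        return block[:data_len], "hbase-checksum"
    return fallback_read(), "hdfs-checksum"
```

Because the checksums travel in the same buffer as the data, a single read returns everything needed for verification; the fallback path is only taken when the inline check fails.
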
> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.94.0, 0.95.0
>
>         Attachments: 5074-0.94.txt, ASF.LICENSE.NOT.GRANTED--D1521.10.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.10.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.11.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.11.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.12.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.12.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.13.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.13.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.14.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.14.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.2.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.2.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.3.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.3.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.4.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.4.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.5.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.5.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.6.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.6.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.7.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.7.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.8.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.8.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.9.patch, 
> ASF.LICENSE.NOT.GRANTED--D1521.9.patch, D1521.10.patch, D1521.10.patch, 
> D1521.10.patch
>
>
> The current implementation of HDFS stores the data in one block file and the 
> metadata (checksum) in another block file. This means that every read into the 
> HBase block cache actually consumes two disk iops, one to the datafile and 
> one to the checksum file. This is a major problem for scaling HBase, because 
> HBase is usually bottlenecked on the number of random disk iops that the 
> storage-hardware offers.

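The cost described in the issue can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic only: the 100 random iops per disk is a hypothetical figure chosen for illustration, not a measurement, and the function name is made up.

```python
def block_reads_per_second(disk_iops: int, iops_per_read: int) -> int:
    """Upper bound on uncached block reads a disk can serve per second."""
    return disk_iops // iops_per_read


# Hypothetical disk offering 100 random iops.
separate_checksum_file = block_reads_per_second(100, 2)  # data file + checksum file
inline_checksums = block_reads_per_second(100, 1)        # checksum read with the data
```

Under these assumptions, reading the checksum together with the data doubles the number of block reads the same disk can serve.
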
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira