[ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213884#comment-13213884
 ] 

Phabricator commented on HBASE-5074:
------------------------------------

stack has commented on the revision "[jira] [HBASE-5074] Support checksums in 
HBase block cache".

  Answering Dhruba.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 Seems like we 
could have better names for these methods, ones that give more of a clue as to 
what they are about.  getBackingFS, getNoChecksumFS?

  Maybe you are keeping them generic like this because you will be back in this 
area again soon doing another beautiful speedup on top of this checksumming fix 
(When are we going to do read-ahead?  Would that speed up scanning?)
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:44 
ok. np.
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:49 
Ok.  So, two readers.  Our file count is going to go up?  We should release 
note this as a side effect of enabling this feature (previously you may have 
been well below the xceivers limit but now you could go over the top?)  I 
didn't notice this was going on.  Need to foreground it I'd say.
  src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 I 
figured.  It's fine as is.

REVISION DETAIL
  https://reviews.facebook.net/D1521

                
> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, 
> D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, 
> D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, 
> D1521.7.patch
>
>
> The current implementation of HDFS stores the data in one block file and the 
> metadata (checksum) in another block file. This means that every read into the 
> HBase block cache actually consumes two disk iops, one to the datafile and 
> one to the checksum file. This is a major problem for scaling HBase, because 
> HBase is usually bottlenecked on the number of random disk iops that the 
> storage-hardware offers.
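
To illustrate the idea in the description above, here is a minimal sketch of 
keeping checksums inline with the data, so that a single read returns both the 
bytes and what is needed to verify them.  This is not the code in D1521: the 
class name InlineChecksumSketch, the encode/verify methods, the trailing 
checksum layout, and the choice of CRC32 are all assumptions made only for 
illustration.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical illustration only; not the HBASE-5074 patch. */
public class InlineChecksumSketch {

    /**
     * Returns the data followed by one 4-byte CRC32 for every chunk of
     * bytesPerChecksum bytes (the last chunk may be shorter).
     */
    static byte[] encode(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        ByteBuffer out = ByteBuffer.allocate(data.length + 4 * chunks);
        out.put(data);
        for (int off = 0; off < data.length; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            out.putInt((int) crc.getValue());
        }
        return out.array();
    }

    /**
     * Recomputes each chunk's CRC32 over the first dataLength bytes of the
     * block and compares it with the checksum stored after the data.
     */
    static boolean verify(byte[] block, int dataLength, int bytesPerChecksum) {
        ByteBuffer sums = ByteBuffer.wrap(block, dataLength, block.length - dataLength);
        for (int off = 0; off < dataLength; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, dataLength - off);
            CRC32 crc = new CRC32();
            crc.update(block, off, len);
            if (sums.remaining() < 4 || sums.getInt() != (int) crc.getValue()) {
                return false; // missing or mismatching checksum
            }
        }
        return true;
    }
}
```

With a layout like this, the reader fetches one contiguous byte range and 
verifies it in memory, rather than issuing a second disk read against a 
separate checksum file.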

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
