[ https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448432#comment-16448432 ]
Andrew Purtell commented on HBASE-6572:
---------------------------------------

Yes I think it is time to write it up, since it's going to be in 2.0.0. Please be sure to mention it is an experimental feature.

> Tiered HFile storage
> --------------------
>
>                 Key: HBASE-6572
>                 URL: https://issues.apache.org/jira/browse/HBASE-6572
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Purtell
>            Priority: Major
>
> Consider how we might enable tiered HFile storage. If HDFS has the
> capability, we could create certain files on solid state devices where they
> might be frequently accessed, especially for random reads; and others (and by
> default) on spinning media as before. We could support the move of frequently
> read HFiles from spinning media to solid state. We already have CF statistics
> for this, would only need to add requisite admin interface; could even
> consider an autotiering option.
> Dhruba Borthakur did some early work in this area and wrote up his findings:
> http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html .
> It is important to note the findings but I suggest most of the
> recommendations are out of scope of this JIRA. This JIRA seeks to find an
> initial use case that produces a reasonable benefit, and serves as a testbed
> for further improvements. If I may paraphrase Dhruba's findings (any
> misstatements and errors are mine): First, the DFSClient code paths introduce
> significant latency, so the HDFS client (and presumably the DataNode, as the
> next bottleneck) will need significant work to knock that down. Need to
> investigate optimized (perhaps read-only) DFS clients, server side read and
> caching strategies. Second, RegionServers are heavily threaded and this
> imposes a lot of monitor contention and context switching cost. Need to
> investigate reducing the number of threads in a RegionServer, nonblocking IO
> and RPC.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)