Vladimir,

Thanks for the insights into the upcoming caching features. Looks very interesting.
- Ramu

On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <vrodio...@carrieriq.com> wrote:

> Ramu,
>
> If your working set of data fits into 192GB, you may get an additional
> boost by utilizing the OS page cache, or wait for the 0.98 release, which
> introduces a new bucket cache implementation (a port of the Facebook L2
> cache). You can try the vanilla bucket cache in 0.96 (not released yet,
> but due soon). Both caches store data off-heap, but the Facebook version
> can store encoded and compressed data, while the vanilla bucket cache
> cannot. There are some options for utilizing the available RAM
> efficiently (at least in upcoming HBase releases).
> If your data set does not fit in RAM, then your only hope is your 24 SAS
> drives, and performance will depend on your RAID settings, disk I/O
> performance, and HDFS configuration (I think the latest Hadoop is
> preferable here).
>
> The OS page cache is the most vulnerable and volatile: it cannot be
> controlled and can easily be polluted either by other processes or by
> HBase itself (a long scan).
> With the block cache you have more control, but the first truly usable
> *official* implementation is going to be part of the 0.98 release.
>
> As far as I understand, your use case would definitely be covered by
> something similar to the BigTable ScanCache (RowCache), but there is no
> such cache in HBase yet.
> One major advantage of a RowCache over the BlockCache (apart from being
> much more efficient in RAM usage) is resilience to Region compactions.
> Each minor Region compaction partially invalidates a Region's data in the
> BlockCache, and a major compaction invalidates that Region's data
> completely. This would not be the case with a RowCache (were it to be
> implemented).
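For reference, the vanilla bucket cache Vladimir mentions is enabled through hbase-site.xml. A minimal sketch, assuming the 0.96-era property names and an illustrative 4 GB sizing (verify both against the docs for your release):

```xml
<!-- hbase-site.xml: sketch only; check property names for your release. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>  <!-- or file:/path for an SSD-backed cache -->
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>4096</value>     <!-- off-heap cache size in MB (assumed sizing) -->
</property>
```

The region server JVM also needs enough direct memory to back the off-heap cache (e.g. `-XX:MaxDirectMemorySize` in hbase-env.sh). Separately, for the long-scan pollution problem mentioned above, `Scan.setCacheBlocks(false)` keeps a scan's blocks out of the block cache.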
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
>
> ________________________________________
> From: Ramu M S [ramu.ma...@gmail.com]
> Sent: Monday, October 07, 2013 5:25 PM
> To: user@hbase.apache.org
> Subject: Re: HBase Random Read latency > 100ms
>
> Vladimir,
>
> Yes, I am fully aware of the HDD limitations and the wrong configuration
> with respect to RAID. Unfortunately, the hardware is leased from others
> for this work, and I wasn't consulted on the h/w specification for the
> tests that I am doing now. Even the RAID cannot be turned off or set to
> RAID-0.
>
> The production system is sized according to the Hadoop needs (100 nodes
> with 16-core CPUs, 192 GB RAM, and 24 x 600GB SAS drives; RAID cannot be
> completely turned off, so we are creating 1 virtual disk containing only
> 1 physical disk, with the VD RAID level set to RAID-0). These systems are
> still not available. If you have any suggestions on the production setup,
> I will be glad to hear them.
>
> Also, as pointed out earlier, we are planning to use HBase as an
> in-memory KV store to access the latest data. That's why the RAM in this
> configuration was considered huge. But it looks like we would run into
> more problems than gains from this.
>
> Keeping that aside, I was trying to get the maximum out of the current
> cluster. Or, as you said, is 500-1000 OPS the maximum I could get out of
> this setup?
>
> Regards,
> Ramu
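Vladimir's compaction argument can be sketched as a toy illustration (plain Java, not HBase code; all class and key names here are hypothetical). A block cache is keyed by (HFile, offset), so rewriting HFiles during compaction stales its entries, while a cache keyed by row would be unaffected because a compaction does not change the rows' contents:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of why a row-keyed cache survives the compactions that
// invalidate a block cache keyed by (file, offset). Not HBase code.
public class CacheInvalidationDemo {

    // Block cache: keyed by "hfile#offset"; entries from a compacted-away
    // HFile become stale and must be evicted.
    static Map<String, byte[]> blockCache = new ConcurrentHashMap<>();

    // Hypothetical RowCache: keyed by row key; a compaction merges files
    // but leaves row contents unchanged, so nothing needs eviction.
    static Map<String, byte[]> rowCache = new ConcurrentHashMap<>();

    static void compactRegion(String oldHFile) {
        // Compaction rewrote oldHFile into a new file: drop every
        // block-cache entry that came from the old file.
        blockCache.keySet().removeIf(k -> k.startsWith(oldHFile + "#"));
        // rowCache is untouched.
    }

    public static void main(String[] args) {
        byte[] v = "value-for-row1".getBytes();
        blockCache.put("hfile-001#4096", v);
        rowCache.put("row1", v);

        compactRegion("hfile-001");

        System.out.println("block for row1 still cached: "
                + blockCache.containsKey("hfile-001#4096")); // false
        System.out.println("row1 still cached in RowCache: "
                + rowCache.containsKey("row1"));             // true
    }
}
```

This is only the invalidation half of the story; the RAM-efficiency advantage Vladimir mentions comes from caching just the row's cells rather than whole 64 KB blocks.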