Hi All,

I just ran with only 8 parallel clients.

With SCR enabled:  average latency is 80 ms, IO wait is around 8%.
With SCR disabled: average latency is 40 ms, IO wait is around 2%.

I always thought that enabling SCR allows a client co-located with the
DataNode to read HDFS file blocks directly, giving a performance boost to
distributed clients that are aware of locality. Is my understanding wrong,
or does it just not apply to my scenario?
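(For reference, this is roughly how I toggle SCR on the client side - a
minimal sketch assuming the Hadoop 2.x property names; the domain socket
path is an illustrative value that has to match the DataNode setting:)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ScrConfig {
        public static Configuration create(boolean shortCircuit) {
            Configuration conf = HBaseConfiguration.create();
            // Let the DFS client bypass the DataNode and read local block
            // files directly when client and data are co-located.
            conf.setBoolean("dfs.client.read.shortcircuit", shortCircuit);
            // UNIX domain socket used to pass file descriptors between
            // the DataNode and the client (illustrative path).
            conf.set("dfs.domain.socket.path", "/var/run/hadoop-hdfs/dn._PORT");
            return conf;
        }
    }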
Meanwhile I will try setting the parameter suggested by Lars and post the
results.

Thanks,
Ramu

On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <[email protected]> wrote:

> Good call.
> You could try enabling hbase.regionserver.checksum.verify, which will
> cause HBase to do its own checksums rather than relying on HDFS (and
> which saves one IO per block get).
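>
> (For reference, a minimal sketch of setting that programmatically - in a
> real deployment it would normally go into hbase-site.xml on each
> RegionServer; the wrapper class is just for illustration:)
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>
>     public class ChecksumConfig {
>         public static Configuration create() {
>             Configuration conf = HBaseConfiguration.create();
>             // Let HBase verify HFile block checksums itself, so the
>             // extra read of the HDFS-side .crc metadata can be skipped.
>             conf.setBoolean("hbase.regionserver.checksum.verify", true);
>             return conf;
>         }
>     }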
>
> I do think you can expect the index blocks to be cached at all times.
>
> -- Lars
> ________________________________
> From: Vladimir Rodionov <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Tuesday, October 8, 2013 8:44 PM
> Subject: RE: HBase Random Read latency > 100ms
>
> Upd.
>
> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file IOs
> (data + .crc) in the worst case. I think that if the Bloom filter is
> enabled, it is going to be 6 file IOs in the worst case (large data set).
> Therefore you will have not 5 IO requests in the queue but up to 20-30
> (5 Gets x 4-6 file IOs each).
> This definitely explains the >100 ms average latency.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
>
> From: Vladimir Rodionov
> Sent: Tuesday, October 08, 2013 7:24 PM
> To: [email protected]
> Subject: RE: HBase Random Read latency > 100ms
>
> Ramu,
>
> You have 8 server boxes and 10 clients, with 40 requests in parallel -
> that is roughly 5 per RS/DN.
>
> So you have about 5 random reads queued against your single RAID1 volume.
> With an average read latency of 10 ms, a new request waits behind two or
> three others on average before its own read, so 5 requests in the queue
> give us roughly 30 ms. Add some HDFS + HBase overhead and your issue is
> probably explained.
>
> Your bottleneck is your disk system, I think. When you serve most
> requests from disk, as in your large data set scenario, make sure you
> have an adequate disk subsystem and that it is configured properly. The
> block cache and the OS page cache cannot help you in this case, because
> the working data set is larger than both caches.
>
> The good performance numbers in the small data set scenario are explained
> by the fact that the data fits into the OS page cache and the block
> cache - you do not read data from disk even if you disable the block
> cache.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: Ramu M S [[email protected]]
> Sent: Tuesday, October 08, 2013 6:00 PM
> To: [email protected]
> Subject: Re: HBase Random Read latency > 100ms
>
> Hi All,
>
> After a few suggestions from the earlier mails I changed the following:
>
> 1. Heap size to 16 GB
> 2. Block size to 16 KB
> 3. HFile size to 8 GB (the table now has 256 regions, 32 per server)
> 4. Data locality index is 100 in all RS
>
> I have clients running on 10 machines, each with 4 threads, so 40 in
> total. This is the same in all tests.
>
> Result:
> 1. Average latency is still >100 ms.
> 2. Heap occupancy is around 2-2.5 GB in all RS.
>
> A few more tests carried out yesterday:
>
> TEST 1: Small data set (100 million records, each 724 bytes)
> ===========================================
> Configuration:
> 1. Heap size 1 GB
> 2. Block size 16 KB
> 3. HFile size 1 GB (the table now has 128 regions, 16 per server)
> 4. Data locality index is 100 in all RS
>
> I disabled the block cache on the table, to make sure I read everything
> from disk most of the time, roughly as sketched below.
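>
> (A minimal sketch of that schema change, assuming the 0.94-era admin API;
> the table name "usertable" and family name "cf" are placeholders:)
>
>     import java.io.IOException;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HColumnDescriptor;
>     import org.apache.hadoop.hbase.client.HBaseAdmin;
>
>     public class DisableBlockCache {
>         public static void apply(Configuration conf) throws IOException {
>             HBaseAdmin admin = new HBaseAdmin(conf);
>             HColumnDescriptor family = new HColumnDescriptor("cf");
>             family.setBlockCacheEnabled(false); // force reads to disk
>             family.setBlocksize(16 * 1024);     // 16 KB blocks, as above
>             admin.disableTable("usertable");    // schema changes need
>             admin.modifyColumn("usertable", family); // the table offline
>             admin.enableTable("usertable");
>             admin.close();
>         }
>     }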
>
> Result:
> 1. Average latency is 8 ms and throughput went up to 6K/sec per RS.
> 2. With the block cache enabled again, I got an average latency of
> around 2 ms and a throughput of 10K/sec per RS.
> Heap occupancy around 650 MB.
> 3. With the heap increased to 16 GB and the block cache still enabled, I
> got an average latency of around 1 ms and a throughput of 20K/sec per RS.
> Heap occupancy around 2-2.5 GB in all RS.
>
> TEST 2: Large data set (1.8 billion records, each 724 bytes)
> ==================================================
> Configuration:
> 1. Heap size 1 GB
> 2. Block size 16 KB
> 3. HFile size 1 GB (the table now has 2048 regions, 256 per server)
> 4. Data locality index is 100 in all RS
>
> Result:
> 1. Average latency is >500 ms to start with and gradually decreases, but
> even after around 100 million reads it is still >100 ms.
> 2. Block cache TRUE/FALSE does not make any difference here. Even the
> heap size (1 GB / 16 GB) does not make any difference.
> 3. Heap occupancy is around 2-2.5 GB under the 16 GB heap and around
> 650 MB under the 1 GB heap.
>
> GC time in all of the scenarios is around 2 ms/second, as shown in
> Cloudera Manager.
>
> Reading most of the items from disk in the small data set scenario still
> gives good results and very low latencies.
>
> The number of regions per RS and the HFile size make a huge difference
> in my cluster. If I keep 100 regions per RS as the maximum (most of the
> discussions suggest this), should I restrict the HFile size to 1 GB,
> thus reducing the storage capacity from 700 GB to 100 GB per RS?
>
> Please advise.
>
> Thanks,
> Ramu
>
>
> On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
> <[email protected]> wrote:
>
> > What are your current heap and block cache sizes?
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: [email protected]
> >
> > ________________________________________
> > From: Ramu M S [[email protected]]
> > Sent: Monday, October 07, 2013 10:55 PM
> > To: [email protected]
> > Subject: Re: HBase Random Read latency > 100ms
> >
> > Hi All,
> >
> > Average latency is still around 80 ms.
> > I have done the following:
> >
> > 1. Enabled Snappy compression
> > 2. Reduced the HFile size to 8 GB
> >
> > Should I attribute these results to a bad disk configuration, or is
> > there anything else to investigate?
> >
> > - Ramu
> >
> >
> > On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <[email protected]> wrote:
> >
> > > Vladimir,
> > >
> > > Thanks for the insights into the future caching features. Looks very
> > > interesting.
> > >
> > > - Ramu
> > >
> > >
> > > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
> > > [email protected]> wrote:
> > >
> > >> Ramu,
> > >>
> > >> If your working set of data fits into 192 GB, you may get an
> > >> additional boost by utilizing the OS page cache, or wait until the
> > >> 0.98 release, which introduces a new bucket cache implementation (a
> > >> port of the Facebook L2 cache). You can try the vanilla bucket cache
> > >> in 0.96 (not released yet, but due soon). Both caches store data
> > >> off-heap, but the Facebook version can store encoded and compressed
> > >> data while the vanilla bucket cache cannot. So there are some options
> > >> for utilizing the available RAM efficiently, at least in the upcoming
> > >> HBase releases.
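> > >>
> > >> (For illustration, a minimal sketch of wiring up an off-heap bucket
> > >> cache, assuming the property names carried by the bucket cache patch
> > >> ("hbase.bucketcache.ioengine", "hbase.bucketcache.size"); the cache
> > >> sizes are placeholder values:)
> > >>
> > >>     import org.apache.hadoop.conf.Configuration;
> > >>     import org.apache.hadoop.hbase.HBaseConfiguration;
> > >>
> > >>     public class BucketCacheConfig {
> > >>         public static Configuration create() {
> > >>             Configuration conf = HBaseConfiguration.create();
> > >>             // Modest on-heap LRU cache for index and bloom blocks.
> > >>             conf.setFloat("hfile.block.cache.size", 0.25f);
> > >>             // Off-heap bucket cache for data blocks (placeholder:
> > >>             // 4 GB, expressed in MB).
> > >>             conf.set("hbase.bucketcache.ioengine", "offheap");
> > >>             conf.setInt("hbase.bucketcache.size", 4096);
> > >>             return conf;
> > >>         }
> > >>     }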
> > >>
> > >> If your data set does not fit in RAM, then your only hope is your 24
> > >> SAS drives, and the outcome depends on your RAID settings, disk IO
> > >> performance, and HDFS configuration (I think the latest Hadoop is
> > >> preferable here).
> > >>
> > >> The OS page cache is the most vulnerable and volatile: it cannot be
> > >> controlled and can easily be polluted either by other processes or by
> > >> HBase itself (a long scan). With the block cache you have more
> > >> control, but the first truly usable *official* implementation is
> > >> going to be part of the 0.98 release.
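> > >>
> > >> (One knob that does exist today, at the block cache level rather than
> > >> the OS page cache level: a long scan can opt out of caching the
> > >> blocks it reads, so it does not evict the hot set. A minimal sketch;
> > >> the table name "usertable" is a placeholder:)
> > >>
> > >>     import java.io.IOException;
> > >>     import org.apache.hadoop.conf.Configuration;
> > >>     import org.apache.hadoop.hbase.client.HTable;
> > >>     import org.apache.hadoop.hbase.client.Result;
> > >>     import org.apache.hadoop.hbase.client.ResultScanner;
> > >>     import org.apache.hadoop.hbase.client.Scan;
> > >>
> > >>     public class PoliteScan {
> > >>         public static void run(Configuration conf) throws IOException {
> > >>             HTable table = new HTable(conf, "usertable");
> > >>             Scan scan = new Scan();
> > >>             // Do not let this long scan evict hot blocks.
> > >>             scan.setCacheBlocks(false);
> > >>             ResultScanner scanner = table.getScanner(scan);
> > >>             for (Result r : scanner) {
> > >>                 // process r ...
> > >>             }
> > >>             scanner.close();
> > >>             table.close();
> > >>         }
> > >>     }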
> > >>
> > >> As far as I understand, your use case would definitely be covered by
> > >> something similar to the BigTable ScanCache (RowCache), but there is
> > >> no such cache in HBase yet.
> > >> One major advantage of a RowCache over the BlockCache (apart from
> > >> being much more efficient in RAM usage) is its resilience to region
> > >> compactions. Each minor compaction partially invalidates a region's
> > >> data in the BlockCache, and a major compaction invalidates it
> > >> completely. This would not be the case with a RowCache, were it
> > >> implemented.
> > >>
> > >> Best regards,
> > >> Vladimir Rodionov
> > >> Principal Platform Engineer
> > >> Carrier IQ, www.carrieriq.com
> > >> e-mail: [email protected]
> > >>
> > >> ________________________________________
> > >> From: Ramu M S [[email protected]]
> > >> Sent: Monday, October 07, 2013 5:25 PM
> > >> To: [email protected]
> > >> Subject: Re: HBase Random Read latency > 100ms
> > >>
> > >> Vladimir,
> > >>
> > >> Yes, I am fully aware of the HDD limitation and the wrong
> > >> configuration with respect to RAID. Unfortunately, the hardware is
> > >> leased from others for this work and I wasn't consulted on the h/w
> > >> specification for the tests that I am doing now. The RAID cannot even
> > >> be turned off or set to RAID-0.
> > >>
> > >> The production system is specified according to the Hadoop needs
> > >> (100 nodes with 16-core CPUs, 192 GB RAM, 24 x 600 GB SAS drives;
> > >> RAID cannot be completely turned off, so we are creating one virtual
> > >> disk containing only one physical disk, with the VD RAID level set to
> > >> RAID-0). These systems are still not available. If you have any
> > >> suggestions on the production setup, I will be glad to hear them.
> > >>
> > >> Also, as pointed out earlier, we are planning to use HBase as an
> > >> in-memory KV store as well, to access the latest data; that's why the
> > >> RAM was considered huge in this configuration. But it looks like we
> > >> would run into more problems than gains from this.
> > >>
> > >> Keeping that aside, I was trying to get the maximum out of the
> > >> current cluster. Or, as you said, is 500-1000 OPS the maximum I could
> > >> get out of this setup?
> > >>
> > >> Regards,
> > >> Ramu