Lars,

After changing the BLOCKSIZE to 16 KB, the latency has come down a little;
the average is now around 75 ms. Overall throughput (I am using 40 clients
to fetch records) is around 1K OPS.
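
For reference, the change amounted to something like this in the shell (a
sketch, not the exact session; on 0.94 I believe the table has to be
disabled first unless online schema updates are enabled):

  disable 'usertable'
  alter 'usertable', {NAME => 'cf', BLOCKSIZE => '16384'}
  enable 'usertable'
  major_compact 'usertable'
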
After compaction, hdfsBlocksLocalityIndex is 91, 88, 78, 90, 99, 82, 94, 97
across my 8 RS respectively.

Thanks,
Ramu

On Mon, Oct 7, 2013 at 3:51 PM, Ramu M S <[email protected]> wrote:

> Thanks Lars.
>
> I have changed the BLOCKSIZE to 16 KB and triggered a major compaction. I
> will report my results once it is done.
>
> - Ramu
>
> On Mon, Oct 7, 2013 at 3:21 PM, lars hofhansl <[email protected]> wrote:
>
>> First off: a 128 GB heap per RegionServer. Wow. I'd be interested to
>> hear your experience with such a large heap for your RS. It's definitely
>> big enough.
>>
>> It's interesting that 100 GB fits into the aggregate cache (of 8x32 GB)
>> while 1.8 TB does not.
>> Looks like ~70% of the read requests would need to bring in a 64 KB
>> block in order to read 724 bytes.
>>
>> Should that take 100 ms? No. Something's still amiss.
>>
>> Smaller blocks might help (you'd need to bring in 4, 8, or maybe 16 KB
>> to read the small row). You would need to issue a major compaction for
>> that to take effect.
>> Maybe try 16 KB blocks. If that speeds up your random gets, we know
>> where to look next... at the disk IO.
>>
>> -- Lars
>>
>> ________________________________
>> From: Ramu M S <[email protected]>
>> To: [email protected]; lars hofhansl <[email protected]>
>> Sent: Sunday, October 6, 2013 11:05 PM
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Lars,
>>
>> In one of your old posts, you mentioned that lowering the BLOCKSIZE is
>> good for random reads (of course with an increased size for block
>> indexes).
>>
>> The post is at http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
>>
>> Will that help in my tests? Should I give it a try? If I alter my table,
>> should I trigger a major compaction again for this to take effect?
>>
>> Thanks,
>> Ramu
>>
>> On Mon, Oct 7, 2013 at 2:44 PM, Ramu M S <[email protected]> wrote:
>>
>> > Sorry, the BLOCKSIZE was wrong in my earlier post; it is the default
>> > 64 KB.
>> >
>> > {NAME => 'usertable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING
>> > => 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS
>> > => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL =>
>> > '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536',
>> > IN_MEMORY => 'false', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}
>> >
>> > Thanks,
>> > Ramu
>> >
>> > On Mon, Oct 7, 2013 at 2:42 PM, Ramu M S <[email protected]> wrote:
>> >
>> >> Lars,
>> >>
>> >> - Yes, short-circuit reading is enabled on both HDFS and HBase.
>> >> - I issued a major compaction after the table was loaded.
>> >> - Region Servers have the max heap set to 128 GB. Block cache size is
>> >> 0.25 of the heap (so 32 GB for each Region Server). Do we need even
>> >> more?
>> >> - Decreasing the HFile size (the default is 1 GB)? Or should I leave
>> >> it at the default?
>> >> - Keys are Zipfian-distributed (by YCSB).
>> >>
>> >> Bharath,
>> >>
>> >> Bloom filters are enabled. Here are my table details:
>> >> {NAME => 'usertable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING
>> >> => 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0',
>> >> VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL =>
>> >> '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '16384',
>> >> IN_MEMORY => 'false', ENCODE_ON_DISK => 'true', BLOCKCACHE =>
>> >> 'true'}]}
>> >>
>> >> When the data size is around 100 GB (100 million records), the
>> >> latency is very good. I am getting a throughput of around 300K OPS.
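>> >>
>> >> A rough back-of-envelope on the cache math (my own sketch, so treat
>> >> the numbers as approximate; the blocks were still the default 64 KB
>> >> at this point):
>> >>
>> >>   aggregate block cache = 8 RS x 32 GB = 256 GB
>> >>   100 GB of data  -> fits entirely in the aggregate cache
>> >>   1.8 TB of data  -> at most ~256 GB / 1843 GB = ~14% cached at once
>> >>   one 724-byte row -> one 64 KB block, i.e. 65536 / 724 = ~90x read
>> >>   amplification on every cache miss
>> >>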
>> >> In both cases (100 GB and 1.8 TB), Ganglia stats show that disk
>> >> reads are around 50-60 MB/s throughout the read cycle.
>> >>
>> >> Thanks,
>> >> Ramu
>> >>
>> >> On Mon, Oct 7, 2013 at 2:21 PM, lars hofhansl <[email protected]>
>> >> wrote:
>> >>
>> >>> Have you enabled short circuit reading? See here:
>> >>> http://hbase.apache.org/book/perf.hdfs.html
>> >>>
>> >>> How's your data locality (shown on the RegionServer UI page)?
>> >>>
>> >>> How much memory are you giving your RegionServers?
>> >>> If your reads are truly random and the data set does not fit into
>> >>> the aggregate cache, you'll be dominated by the disk and network.
>> >>> Each read would need to bring in a 64 KB (default) HFile block. If
>> >>> short circuit reading is not enabled you'll get two or three context
>> >>> switches.
>> >>>
>> >>> So I would try:
>> >>> 1. Enable short circuit reading
>> >>> 2. Increase the block cache size per RegionServer
>> >>> 3. Decrease the HFile block size
>> >>> 4. Make sure your data is local (if it is not, issue a major
>> >>> compaction).
>> >>>
>> >>> -- Lars
>> >>>
>> >>> ________________________________
>> >>> From: Ramu M S <[email protected]>
>> >>> To: [email protected]
>> >>> Sent: Sunday, October 6, 2013 10:01 PM
>> >>> Subject: HBase Random Read latency > 100ms
>> >>>
>> >>> Hi All,
>> >>>
>> >>> My HBase cluster has 8 Region Servers (CDH 4.4.0, HBase 0.94.6).
>> >>>
>> >>> Each Region Server has the following configuration:
>> >>> 16-core CPU, 192 GB RAM, 800 GB SATA (7200 RPM) disk.
>> >>> (Unfortunately configured with RAID 1; I can't change this as the
>> >>> machines are leased temporarily for a month.)
>> >>>
>> >>> I am running YCSB benchmark tests on HBase and am currently
>> >>> inserting around 1.8 billion records.
>> >>> (1 key + 7 fields of 100 bytes = 724 bytes per record)
>> >>>
>> >>> Currently I am getting a write throughput of around 100K OPS, but
>> >>> random reads are very slow; all Gets have a latency of 100 ms or
>> >>> more.
>> >>>
>> >>> I have changed the following default configuration:
>> >>> 1. HFile size: 16 GB
>> >>> 2. HDFS block size: 512 MB
>> >>>
>> >>> Total data size is around 1.8 TB (excluding the replicas).
>> >>> My table is split into 128 regions (no pre-splitting used; it
>> >>> started with 1 and grew to 128 over the insertion period).
>> >>>
>> >>> Taking some inputs from earlier discussions, I have made the
>> >>> following changes to disable Nagle (in both the client and server
>> >>> hbase-site.xml, hdfs-site.xml):
>> >>>
>> >>> <property>
>> >>>   <name>hbase.ipc.client.tcpnodelay</name>
>> >>>   <value>true</value>
>> >>> </property>
>> >>>
>> >>> <property>
>> >>>   <name>ipc.server.tcpnodelay</name>
>> >>>   <value>true</value>
>> >>> </property>
>> >>>
>> >>> Ganglia stats show large CPU IO wait (>30% during reads).
>> >>>
>> >>> I agree that the disk configuration is not ideal for a Hadoop
>> >>> cluster, but as mentioned earlier it can't be changed for now.
>> >>> I feel the latency is way beyond any results reported so far.
>> >>>
>> >>> Any pointers on what could be wrong?
>> >>>
>> >>> Thanks,
>> >>> Ramu
>> >>
>> >
>>
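
P.S. If it helps anyone reproduce the setup: on CDH 4, short-circuit
reading is typically enabled with hdfs-site.xml keys along these lines (a
sketch, not my exact config; the socket path is just an example, and the
property set varies with the HDFS version, so check the HBase book link
Lars posted):

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>

<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>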
