Lars,

In one of your old posts you mentioned that lowering the BLOCKSIZE is good for random reads (of course with an increased size for the block indexes).
The post is at http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow

Will that help in my tests? Should I give it a try? If I alter my table, should I trigger a major compaction again for the change to take effect?
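For reference, this is roughly what I would run from the HBase shell if I try it (just a sketch; 'usertable' and 'cf' are from my schema below, 16 KB is only the value I was considering, and I'd disable/enable the table around the alter to be safe on 0.94):

  disable 'usertable'
  alter 'usertable', {NAME => 'cf', BLOCKSIZE => '16384'}
  enable 'usertable'
  major_compact 'usertable'   # rewrite existing HFiles so the new block size takes effect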
Thanks,
Ramu

On Mon, Oct 7, 2013 at 2:44 PM, Ramu M S <ramu.ma...@gmail.com> wrote:
> Sorry, the BLOCKSIZE was wrong in my earlier post; it is the default 64 KB.
>
> {NAME => 'usertable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING =>
> 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS => '1',
> COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647',
> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}
>
> Thanks,
> Ramu
>
>
> On Mon, Oct 7, 2013 at 2:42 PM, Ramu M S <ramu.ma...@gmail.com> wrote:
>
>> Lars,
>>
>> - Yes, short-circuit reading is enabled on both HDFS and HBase.
>> - I issued a major compaction after the table was loaded.
>> - Region Servers have the max heap set to 128 GB. The block cache size is
>> 0.25 of the heap (so 32 GB for each Region Server). Do we need even more?
>> - Decrease the HFile size (the default is 1 GB)? Should I leave it at the
>> default?
>> - Keys are Zipfian distributed (by YCSB).
>>
>> Bharath,
>>
>> Bloom filters are enabled. Here are my table details:
>> {NAME => 'usertable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING =>
>> 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS => '1',
>> COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647',
>> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '16384', IN_MEMORY => 'false',
>> ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}
>>
>> When the data size is around 100 GB (100 million records), the latency is
>> very good; I am getting a throughput of around 300K OPS.
>> In both cases (100 GB and 1.8 TB), Ganglia stats show that disk reads are
>> around 50-60 MB/s throughout the read cycle.
>>
>> Thanks,
>> Ramu
>>
>>
>> On Mon, Oct 7, 2013 at 2:21 PM, lars hofhansl <la...@apache.org> wrote:
>>
>>> Have you enabled short circuit reading? See here:
>>> http://hbase.apache.org/book/perf.hdfs.html
>>>
>>> How is your data locality (shown on the RegionServer UI page)?
>>>
>>> How much memory are you giving your RegionServers?
>>> If your reads are truly random and the data set does not fit into the
>>> aggregate cache, you'll be dominated by the disk and network.
>>> Each read would need to bring in a 64 KB (default) HFile block. If short
>>> circuit reading is not enabled, you'll get two or three context switches.
>>>
>>> So I would try:
>>> 1. Enable short circuit reading.
>>> 2. Increase the block cache size per RegionServer.
>>> 3. Decrease the HFile block size.
>>> 4. Make sure your data is local (if it is not, issue a major compaction).
>>>
>>> -- Lars
>>>
>>>
>>> ________________________________
>>> From: Ramu M S <ramu.ma...@gmail.com>
>>> To: user@hbase.apache.org
>>> Sent: Sunday, October 6, 2013 10:01 PM
>>> Subject: HBase Random Read latency > 100ms
>>>
>>>
>>> Hi All,
>>>
>>> My HBase cluster has 8 Region Servers (CDH 4.4.0, HBase 0.94.6).
>>>
>>> Each Region Server has the following configuration:
>>> 16-core CPU, 192 GB RAM, 800 GB SATA (7200 RPM) disk
>>> (unfortunately configured with RAID 1; I can't change this, as the
>>> machines are leased temporarily for a month).
>>>
>>> I am running YCSB benchmark tests on HBase and am currently inserting
>>> around 1.8 billion records.
>>> (1 key + 7 fields of 100 bytes = 724 bytes per record)
>>>
>>> Currently I am getting a write throughput of around 100K OPS, but random
>>> reads are very slow; all gets have a latency of 100 ms or more.
>>>
>>> I have changed the following default configuration:
>>> 1. HFile size: 16 GB
>>> 2. HDFS block size: 512 MB
>>>
>>> Total data size is around 1.8 TB (excluding the replicas).
>>> My table is split into 128 regions (no pre-splitting used; it started
>>> with 1 and grew to 128 over the insertion time).
>>>
>>> Taking some inputs from earlier discussions, I have made the following
>>> changes to disable Nagle (in both the client and server hbase-site.xml
>>> and hdfs-site.xml):
>>>
>>> <property>
>>>   <name>hbase.ipc.client.tcpnodelay</name>
>>>   <value>true</value>
>>> </property>
>>>
>>> <property>
>>>   <name>ipc.server.tcpnodelay</name>
>>>   <value>true</value>
>>> </property>
>>>
>>> Ganglia stats show large CPU I/O wait (>30% during reads).
>>>
>>> I agree that the disk configuration is not ideal for a Hadoop cluster,
>>> but as mentioned earlier it can't be changed for now.
>>> I feel the latency is way beyond any results reported so far.
>>>
>>> Any pointers on what could be wrong?
>>>
>>> Thanks,
>>> Ramu