How about GC activity? ApplicationStoppedTime? Do you track those? Is the issue reproducible, or did you hit it the first time?
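For reference, ApplicationStoppedTime can be captured by turning on GC logging for the RegionServer JVM. A minimal sketch for hbase-env.sh, assuming the JDK 7/8 HotSpot flags that were current for this HBase version; the log path is an assumption, adjust it to your layout:

```shell
# hbase-env.sh (sketch): enable GC and safepoint-pause logging on the
# RegionServer so ApplicationStoppedTime spikes can be correlated with
# the observed write-latency spikes. The log path is a placeholder.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:/var/log/hbase/gc-regionserver.log"
```

Long entries under `Total time for which application threads were stopped` in that log are the pauses to line up against the 200-500ms latency spikes.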
Start with the RS logs and try to find anything suspicious in the period of very high latency. 1.5 sec HBase write latency does not look right.

-Vlad

On Tue, Aug 11, 2015 at 2:08 PM, Serega Sheypak <serega.shey...@gmail.com> wrote:

> Hi Vladimir!
>
> Here are the graphs. Servlet (3 tomcats on 3 different hosts write to HBase):
> http://www.bigdatapath.com/wp-content/uploads/2015/08/01_apps1.png
> See how the response time jumps. I can't explain it. The write load is
> really, really low.
>
> All RS have an even load. I see request metrics in the HBase master web UI.
> Tables are pre-split. I have 10 RS and tables pre-split into 50 regions.
>
> > 1. How large is your single write?
> 1-2KB
>
> > 2. Do you see any RegionTooBusyException in the client log files?
> No HBase-related exceptions.
>
> > 3. How large is your table (# of regions, # of column families)?
> 1 column family, table_01 150GB, table_02 130GB
>
> I have two "major tables"; here are the stats for them:
> http://www.bigdatapath.com/wp-content/uploads/2015/08/table_02.png
> http://www.bigdatapath.com/wp-content/uploads/2015/08/table_01.png
>
> > 4. RS memory related config: Max heap
> > 5. memstore size (if not default - 0.4)
> hbase.regionserver.global.memstore.upperLimit=0.4
> hbase.regionserver.global.memstore.lowerLimit=0.38
> RS heapsize=8GB
>
> > *Do you see any region splits?*
> No, that has never happened since the tables are pre-split.
>
> 2015-08-11 18:54 GMT+02:00 Vladimir Rodionov <vladrodio...@gmail.com>:
>
> > *Common questions:*
> >
> > 1. How large is your single write?
> > 2. Do you see any RegionTooBusyException in the client log files?
> > 3. How large is your table (# of regions, # of column families)?
> > 4. RS memory related config: Max heap
> > 5. memstore size (if not default - 0.4)
> >
> > Memstore flush:
> >
> > hbase.hregion.memstore.flush.size = 256M
> > hbase.hregion.memstore.block.multiplier = N (do not block writes): N * 256M
> > MUST be greater than the overall memstore size (HBASE_HEAPSIZE *
> > hbase.regionserver.global.memstore.size).
> >
> > WAL files:
> >
> > Set the HDFS block size to 256MB. hbase.regionserver.hlog.blocksize = 0.95 *
> > HDFS block size (256MB * 0.95). Keep hbase.regionserver.hlog.blocksize *
> > hbase.regionserver.maxlogs just a bit above
> > hbase.regionserver.global.memstore.lowerLimit (0.35-0.45) * HBASE_HEAPSIZE
> > to avoid premature memstore flushing.
> >
> > *Do you see any region splits?*
> >
> > A region split blocks writes. Try to pre-split the table and avoid
> > splitting after that. Disable splitting completely:
> >
> > hbase.regionserver.region.split.policy =
> > org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy
> >
> > -Vlad
> >
> > On Tue, Aug 11, 2015 at 3:22 AM, Serega Sheypak <serega.shey...@gmail.com>
> > wrote:
> >
> > > Hi, we are using version 1.0.0+cdh5.4.4+160.
> > > We have a heavy write load, ~10K writes per second.
> > > We have 10 nodes with 7 disks each. I read some perf notes; they state
> > > that HBase can handle 1K writes per second per node without any problems.
> > >
> > > I see some spikes on "writers". Write operation timing "jumps" from
> > > 40-50ms to 200-500ms. Probably I hit the memstore limit: the RegionServer
> > > starts to flush the memstore and stops accepting updates.
> > >
> > > I have several questions:
> > > 1. Can a node with 4 CPUs (8 with hyperthreading) + 7 HDDs absorb 1K
> > > writes per second?
> > > 2. What is the right way to fight blocked writes?
> > > 2.1. What I did:
> > > hbase.hregion.memstore.flush.size to 256M to produce larger HFiles when
> > > flushing the memstore;
> > > hbase.hregion.memstore.block.multiplier to 4, since I have only one
> > > write-intensive table.
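Plugging the numbers reported in this thread (8GB heap, upperLimit 0.4, lowerLimit 0.38, 256M flush size, multiplier 4) into Vlad's two sizing rules gives a quick back-of-the-envelope check. This is a sketch only; the class and variable names are illustrative, and the rule interpretations are as stated in the message above:

```java
// Back-of-the-envelope check of the memstore/WAL sizing rules above,
// using the numbers reported in this thread. Names are illustrative.
public class HBaseWriteSizing {
    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        final long GB = 1024L * MB;

        long heap = 8 * GB;            // RS heapsize=8GB
        double upperLimit = 0.4;       // global.memstore.upperLimit
        double lowerLimit = 0.38;      // global.memstore.lowerLimit
        long flushSize = 256 * MB;     // hbase.hregion.memstore.flush.size
        int blockMultiplier = 4;       // hbase.hregion.memstore.block.multiplier

        // Global budget: writes block when all memstores together
        // exceed upperLimit * heap.
        long globalMemstore = (long) (upperLimit * heap);           // ~3.2 GB

        // Vlad's rule: blockMultiplier * flushSize MUST exceed the global
        // memstore size, so region-level blocking never fires first.
        long perRegionBlockAt = (long) blockMultiplier * flushSize; // 1 GB
        System.out.println("global memstore:  " + globalMemstore / MB + " MB");
        System.out.println("region block at:  " + perRegionBlockAt / MB + " MB");
        System.out.println("multiplier rule satisfied: "
                + (perRegionBlockAt > globalMemstore));

        // WAL rule: hlog.blocksize * maxlogs should sit just above
        // lowerLimit * heap, so log rolling does not force early flushes.
        long hlogBlockSize = (long) (0.95 * 256 * MB);              // ~243 MB
        int maxlogs = (int) Math.ceil(lowerLimit * heap / (double) hlogBlockSize);
        System.out.println("suggested hbase.regionserver.maxlogs >= " + maxlogs);
    }
}
```

With the multiplier of 4 actually used in this thread, the per-region block point (1 GB) is well below the ~3.2 GB global memstore, so by Vlad's rule a single hot region can still block writes; the multiplier would need to be around 13 or more to satisfy it.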
> > > Let it grow.
> > > hbase.regionserver.optionallogflushinterval to 10s; I CAN lose some
> > > data, no problem here. The idea is to reduce the I/O pressure on the
> > > disks.
> > > ===
> > > I am not sure if I can correctly play with these parameters:
> > > hbase.hstore.blockingStoreFiles=10
> > > hbase.hstore.compactionThreshold=3
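Gathered in one place, the settings discussed in this thread would look like the following in hbase-site.xml. This is a sketch of the values quoted above, not a tested recommendation:

```xml
<!-- Sketch: hbase-site.xml settings mentioned in this thread. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value> <!-- 256M: larger HFiles per flush -->
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value> <!-- per-region block point = 4 * 256M -->
</property>
<property>
  <name>hbase.regionserver.optionallogflushinterval</name>
  <value>10000</value> <!-- 10s, in ms; only safe if some data loss is acceptable -->
</property>
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
  <!-- disable splits entirely for pre-split tables, per Vlad's advice -->
</property>
```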