Hi all,

I modified the configuration, setting hbase.hstore.flusher.count to 20 (the default value is 2), and ran YCSB to load data with 32 threads. It works fine.
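
For anyone following along, the numbers in the quoted exception match the "128M*4" blocking size mentioned in Qiang's reply. Below is a minimal Python sketch of that arithmetic. It assumes hbase.hregion.memstore.flush.size is 128 MB and hbase.hregion.memstore.block.multiplier is 4; both values are taken from this thread, not read from the attached configuration file, and the function names are made up for illustration.

```python
# Assumed values, taken from the "128M*4" figure in the quoted thread.
FLUSH_SIZE_BYTES = 128 * 1024 * 1024  # hbase.hregion.memstore.flush.size
BLOCK_MULTIPLIER = 4                  # hbase.hregion.memstore.block.multiplier


def blocking_memstore_size(flush_size=FLUSH_SIZE_BYTES, multiplier=BLOCK_MULTIPLIER):
    """Per-region memstore size above which writes are rejected."""
    return flush_size * multiplier


def is_above_limit(memstore_size, flush_size=FLUSH_SIZE_BYTES, multiplier=BLOCK_MULTIPLIER):
    """Hypothetical helper: True when a region would report 'Above memstore limit'."""
    return memstore_size > blocking_memstore_size(flush_size, multiplier)


if __name__ == "__main__":
    reported = 536897120  # memstoreSize from the RegionTooBusyException
    limit = blocking_memstore_size()
    print(limit)                     # 536870912, same as blockingMemStoreSize in the log
    print(reported - limit)          # 26208 bytes over the limit
    print(is_above_limit(reported))  # True
```

So the user9099 region was only about 26 KB over the limit when the write was rejected, which fits the picture of a memstore sitting at the threshold while it waits for a flush.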

On Nov 25, 2014, at 10:22, mail list <louis.hust...@gmail.com> wrote:

> hi, all
>
> I retest the YCSB load data, and here is a situation which may explain the
> load data blocked.
>
> I use too many threads to insert values, so the flush thread is not
> effectively to handle all memstore,
> and the user9099 memstore is queued at last, and waiting for flush too long
> which blocks the YCSB request.
>
> Is it possible?
>
> On Nov 21, 2014, at 13:33, Qiang Tian <tian...@gmail.com> wrote:
>
>> ---pc3 as RegionServer and datanode
>> you only have 1 RS(split to 100 region), HDFS replicate to 3? perhaps the
>> compaction cannot catch the flush speed, # of store files hit
>> "hbase.hstore.blockingStoreFiles"
>> and flush blocked sometime(90s by default). during this period, memstore
>> continues to grow and finally reached blocking size(128M*4) ...
>>
>> turning on debug can see more messages. ps "hbase.hregion.majorcompaction"=0
>> cannot turn off major compaction.
>>
>> On Fri, Nov 21, 2014 at 12:55 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Louis:
>>> See this thread:
>>> http://search-hadoop.com/m/DHED4XrURi2
>>>
>>> On Thu, Nov 20, 2014 at 7:33 PM, mail list <louis.hust...@gmail.com> wrote:
>>>
>>>> If set the target, YCSB will sleep to control the flow, so it looks like
>>>> the same as fewer threads.
>>>> But I want to know with heavy write, why some region which exceeds the
>>>> limit no flushed.
>>>> Maybe like you said, it wait in a flush queue, i will try to repeat the
>>>> scenario and lookup the flush queue.
>>>>
>>>> On Nov 21, 2014, at 1:44, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
>>>>
>>>>> Could you please rerun your tests with -target ?
>>>>> You should limit # of transactions per second and find the maximum your
>>>>> cluster can sustain.
>>>>>
>>>>> -Vladimir Rodionov
>>>>>
>>>>> On Wed, Nov 19, 2014 at 10:53 PM, mail list <louis.hust...@gmail.com> wrote:
>>>>>
>>>>>> hi all,
>>>>>>
>>>>>> I build an HBASE test environment, with three PC server, with CHD 5.1.0
>>>>>>
>>>>>> pc1 pc2 pc3
>>>>>>
>>>>>> pc1 and pc2 as HMASTER and hadoop namenode
>>>>>> pc3 as RegionServer and datanode
>>>>>>
>>>>>> Then I create user as following:
>>>>>>
>>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>>>>>>
>>>>>> Using YCSB for load data as following:
>>>>>>
>>>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
>>>>>>
>>>>>> But when after a while, the ycsb return with following error:
>>>>>>
>>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable, attempt=35/35 failed 715 ops, last exception:
>>>>>> org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
>>>>>> server=l-hbase10.dba.cn1.qunar.com,60020,1416451280772,
>>>>>> memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>>> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>>> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>>> at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>>> at java.lang.Thread.run(Thread.java:744)
>>>>>> on l-hbase10.dba.cn1.qunar.com,60020,1416451280772, tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>>>>
>>>>>> It seems the user9099 region is too busy, so I lookup the memstore metrics in web:
>>>>>>
>>>>>> As you see, the user9099 is bigger than other region, I think it is
>>>>>> flushing, but after a while, it does not change to a small size and YCSB
>>>>>> quit finally.
>>>>>>
>>>>>> The region server is configured as below:
>>>>>>
>>>>>> Summary: HP DL380p Gen8, 1 x Xeon E5-2630 v2 2.60GHz, 126GB / 128GB 1600MHz DDR3
>>>>>> System: HP ProLiant DL380p Gen8
>>>>>> Processors: 1 (of 2) x Xeon E5-2630 v2 2.60GHz 100MHz FSB (HT enabled, 6 cores, 24 threads)
>>>>>> Memory: 126GB / 128GB 1600MHz DDR3 == 16 x 8GB, 8 x empty
>>>>>> Disk: sda (scsi2): 450GB (1%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>>> Disk: sdb (scsi2): 27TB (0%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>>> Disk-Control: ata_piix0: Intel C600/X79 series chipset 4-Port SATA IDE Controller
>>>>>> Disk-Control: hpsa0: Hewlett-Packard Company Smart Array Gen8 Controllers
>>>>>> Disk-Control: shannon0: Device 1cb0:0275
>>>>>> Network: eth0 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fc, 1000Mb/s <full-duplex>
>>>>>> Network: eth1 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fd, no carrier
>>>>>> Network: eth2 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fe, no carrier
>>>>>> Network: eth3 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:ff, no carrier
>>>>>> OS: CentOS 6.4 (Final), Linux 2.6.32-358.23.2.el6.x86_64 x86_64, 64-bit
>>>>>> BIOS: HP P70 02/10/2014
>>>>>> Hostname: l-hbase10.dba.cn1.qunar.com
>>>>>>
>>>>>> And i attach the hbase configuration file.
>>>>>>
>>>>>> I am new to HBase, any idea will be appreciated!