Re: YCSB load data quit because hbase region too busy

mail list Mon, 24 Nov 2014 18:58:08 -0800

Hi, all

I modified the configuration, setting hbase.hstore.flusher.count to 20 (the
default value is 2), and ran YCSB to load the data with 32 threads.
It works fine now.
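
As a quick sanity check on the numbers in the exception quoted further down, here is a small Ruby sketch (Ruby because the HBase shell used for the create command is Ruby-based). It assumes the "128M*4" figure mentioned in the thread corresponds to the settings hbase.hregion.memstore.flush.size and hbase.hregion.memstore.block.multiplier. The actual values on this cluster were not posted, so that mapping is an assumption, and the method name is only illustrative.

```ruby
# Values quoted in the thread: a 128 MB flush size and a multiplier of 4.
# Mapping them to these two properties is an assumption about this cluster.
FLUSH_SIZE_BYTES = 128 * 1024 * 1024   # hbase.hregion.memstore.flush.size
BLOCK_MULTIPLIER = 4                   # hbase.hregion.memstore.block.multiplier

# Numbers copied from the RegionTooBusyException in the original report.
REPORTED_MEMSTORE_SIZE = 536_897_120
REPORTED_BLOCKING_SIZE = 536_870_912

# Memstore size at which a region starts rejecting writes.
def blocking_memstore_size(flush_size = FLUSH_SIZE_BYTES, multiplier = BLOCK_MULTIPLIER)
  flush_size * multiplier
end

limit = blocking_memstore_size
overshoot = REPORTED_MEMSTORE_SIZE - limit

puts "computed limit: #{limit} bytes (reported: #{REPORTED_BLOCKING_SIZE})"
puts "memstore exceeded the limit by #{overshoot} bytes"
```

If the assumption holds, the computed limit equals the reported blockingMemStoreSize, and the region was only slightly over it when the writes were rejected. That would fit a memstore that sat at the limit waiting for a flush rather than one that grew without bound.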


On Nov 25, 2014, at 10:22, mail list <louis.hust...@gmail.com> wrote:

> hi, all
> 
> I retested the YCSB data load, and here is a scenario that may explain why 
> the load was blocked.
> 
> I used too many threads to insert values, so the flush threads could not 
> keep up with all the memstores. The user9099 memstore was queued last and 
> waited too long for a flush, which blocked the YCSB requests.
> 
> Is that possible?
> 
> 
> On Nov 21, 2014, at 13:33, Qiang Tian <tian...@gmail.com> wrote:
> 
>> ---pc3 as RegionServer and datanode
>> You only have 1 RS (split into 100 regions), and HDFS replicates to 3? Perhaps
>> compaction cannot keep up with the flush rate, so the number of store files hits
>> "hbase.hstore.blockingStoreFiles"
>> and flushes are blocked for some time (90s by default). During this period, the
>> memstore continues to grow and finally reaches the blocking size (128M*4) ...
>> 
>> Turning on debug logging will show more messages. PS: "hbase.hregion.majorcompaction"=0
>> cannot turn off major compaction.
>> 
>> 
>> 
>> 
>> On Fri, Nov 21, 2014 at 12:55 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> 
>>> Louis:
>>> See this thread:
>>> http://search-hadoop.com/m/DHED4XrURi2
>>> 
>>> On Thu, Nov 20, 2014 at 7:33 PM, mail list <louis.hust...@gmail.com>
>>> wrote:
>>> 
>>>> If the target is set, YCSB will sleep to control the flow, so it looks
>>>> the same as using fewer threads.
>>>> But I want to know why, under heavy writes, a region that exceeds the
>>>> limit is not flushed.
>>>> Maybe, as you said, it is waiting in a flush queue. I will try to reproduce
>>>> the scenario and look at the flush queue.
>>>> 
>>>> On Nov 21, 2014, at 1:44, Vladimir Rodionov <vladrodio...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Could you please rerun your tests with -target ?
>>>>> You should limit # of transactions per second and find the maximum your
>>>>> cluster can sustain.
>>>>> 
>>>>> -Vladimir Rodionov
>>>>> 
>>>>> On Wed, Nov 19, 2014 at 10:53 PM, mail list <louis.hust...@gmail.com> wrote:
>>>>> 
>>>>>> hi all,
>>>>>> 
>>>>>> I built an HBase test environment with three PC servers running CDH 5.1.0.
>>>>>> 
>>>>>> pc1 pc2 pc3
>>>>>> 
>>>>>> pc1 and pc2 as HMASTER and hadoop namenode
>>>>>> pc3 as RegionServer and datanode
>>>>>> 
>>>>>> Then I created the usertable as follows:
>>>>>> 
>>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>>>>>> 
>>>>>> 
>>>>>> I used YCSB to load the data as follows:
>>>>>> 
>>>>>> ./bin/ycsb  load  hbase   -P workloads/workloadc  -p columnfamily=family  -p recordcount=1000000000   -p threadcount=32  -s  > result/workloadc
>>>>>> 
>>>>>> But after a while, YCSB exited with the following error:
>>>>>> 
>>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
>>>>>> attempt=35/35 failed 715 ops, last exception:
>>>>>> org.apache.hadoop.hbase.RegionTooBusyException:
>>>>>> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
>>>>>> server=l-hbase10.dba.cn1.qunar.com,60020,1416451280772,
>>>>>> memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>>>      at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>>>      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>>>      at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>>>      at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>>>      at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>>>      at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>>>      at java.lang.Thread.run(Thread.java:744)
>>>>>> on l-hbase10.dba.cn1.qunar.com,60020,1416451280772, tracking started Thu
>>>>>> Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>>>> 
>>>>>> 
>>>>>> It seems the user9099 region is too busy, so I looked up the memstore
>>>>>> metrics in the web UI:
>>>>>> 
>>>>>> As you can see, the user9099 memstore is bigger than those of the other
>>>>>> regions. I thought it was flushing, but after a while it had not shrunk,
>>>>>> and YCSB finally quit.
>>>>>> 
>>>>>> The region server is configured as below:
>>>>>> 
>>>>>> Summary: HP DL380p Gen8, 1 x Xeon E5-2630 v2 2.60GHz, 126GB / 128GB 1600MHz DDR3
>>>>>> System: HP ProLiant DL380p Gen8
>>>>>> Processors: 1 (of 2) x Xeon E5-2630 v2 2.60GHz 100MHz FSB (HT enabled, 6 cores, 24 threads)
>>>>>> Memory: 126GB / 128GB 1600MHz DDR3 == 16 x 8GB, 8 x empty
>>>>>> Disk: sda (scsi2): 450GB (1%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>>> Disk: sdb (scsi2): 27TB (0%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>>> Disk-Control: ata_piix0: Intel C600/X79 series chipset 4-Port SATA IDE Controller
>>>>>> Disk-Control: hpsa0: Hewlett-Packard Company Smart Array Gen8 Controllers
>>>>>> Disk-Control: shannon0: Device 1cb0:0275
>>>>>> Network: eth0 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fc, 1000Mb/s <full-duplex>
>>>>>> Network: eth1 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fd, no carrier
>>>>>> Network: eth2 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fe, no carrier
>>>>>> Network: eth3 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:ff, no carrier
>>>>>> OS: CentOS 6.4 (Final), Linux 2.6.32-358.23.2.el6.x86_64 x86_64, 64-bit
>>>>>> BIOS: HP P70 02/10/2014
>>>>>> Hostname: l-hbase10.dba.cn1.qunar.com
>>>>>> 
>>>>>> I have also attached the HBase configuration file.
>>>>>> 
>>>>>> 
>>>>>> I am new to HBase; any ideas will be appreciated!
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>
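
For readers who want to see which region boundaries the SPLITS expression in the original create command produces, it can be evaluated in plain Ruby outside the HBase shell. This is only a sketch: split_keys repeats the expression from the thread, while region_start_for is a hypothetical helper added here for illustration and is not part of HBase or YCSB. It relies on all boundaries having the same length, so string order matches numeric order.

```ruby
# Same expression as in the create command. Ruby integer division truncates,
# so every boundary is "user" followed by a whole four-digit number.
def split_keys
  (1..100).map { |i| "user#{1000 + i * (9999 - 1000) / 100}" }
end

# Hypothetical helper: start key of the region that would hold row_key,
# or nil when the key sorts before the first split point (first region).
def region_start_for(row_key, keys = split_keys)
  keys.select { |k| k <= row_key }.max
end

keys = split_keys
puts "split points: #{keys.length}"
puts "first: #{keys.first}, last: #{keys.last}"
puts "user9100 falls in the region starting at #{region_start_for('user9100')}"
```

The 100 split points yield 101 regions, and the boundary for i = 90 is user9099, which matches the region name in the exception. Under this layout that region would hold row keys from user9099 up to, but not including, user9189.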
