--- pc3 as RegionServer and datanode

You only have one RegionServer (split into 100 regions) while HDFS replicates to 3? Perhaps compaction cannot keep up with the flush rate: the number of store files hits
"hbase.hstore.blockingStoreFiles" and flushes get blocked for a while (90s by default). During that period the memstore
keeps growing and finally reaches the blocking size (128M * 4) ...
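For reference, these are the settings involved; a rough hbase-site.xml sketch with illustrative values only (the multiplier of 4 matches the blockingMemStoreSize=536870912 in the stack trace below; check the defaults shipped with your CDH/HBase version before changing anything):

  <!-- illustrative values only, not recommendations -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value>   <!-- 128 MB per-region flush threshold -->
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value>           <!-- 128 MB * 4 = 512 MB blocking size in the error -->
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>20</value>          <!-- raise if compaction cannot keep up with flushes -->
  </property>
  <property>
    <name>hbase.hstore.blockingWaitTime</name>
    <value>90000</value>       <!-- the 90s flush-block window mentioned above -->
  </property>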

Turning on DEBUG logging will show more detail. P.S. Setting "hbase.hregion.majorcompaction" to 0 does not completely turn off
major compactions; it only disables the periodic, time-based ones (a minor compaction that selects all the store files is still promoted to a major compaction).
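Also, as Vladimir suggests in the thread below, you can cap the offered load with YCSB's -target option (operations per second) to find what this single RegionServer can actually sustain. A sketch based on your original command, where the 20000 ops/sec target is only an example value:

  ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family \
      -p recordcount=1000000000 -p threadcount=32 -target 20000 -s > result/workloadc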




On Fri, Nov 21, 2014 at 12:55 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Louis:
> See this thread:
> http://search-hadoop.com/m/DHED4XrURi2
>
> On Thu, Nov 20, 2014 at 7:33 PM, mail list <louis.hust...@gmail.com>
> wrote:
>
> > If I set the target, YCSB will sleep to control the flow, so it looks the
> > same as using fewer threads.
> > But I want to know why, under heavy writes, some regions that exceed the
> > limit are not flushed.
> > Maybe, like you said, the flush is waiting in a flush queue; I will try to
> > reproduce the scenario and look at the flush queue.
> >
> > On Nov 21, 2014, at 1:44, Vladimir Rodionov <vladrodio...@gmail.com>
> > wrote:
> >
> > > Could you please rerun your tests with -target?
> > > You should limit the number of transactions per second and find the maximum your
> > > cluster can sustain.
> > >
> > > -Vladimir Rodionov
> > >
> > > On Wed, Nov 19, 2014 at 10:53 PM, mail list <louis.hust...@gmail.com>
> > wrote:
> > >
> > >> hi all,
> > >>
> > >> I built an HBase test environment with three PC servers, running CDH 5.1.0
> > >>
> > >> pc1 pc2 pc3
> > >>
> > >> pc1 and pc2 as HMASTER and hadoop namenode
> > >> pc3 as RegionServer and datanode
> > >>
> > >> Then I created the user table as follows:
> > >>
> > >> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
> > >>
> > >>
> > >> Then I used YCSB to load data as follows:
> > >>
> > >> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
> > >>
> > >>
> > >> But after a while, YCSB returned the following error:
> > >>
> > >> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable, attempt=35/35 failed 715 ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9., server=l-hbase10.dba.cn1.qunar.com,60020,1416451280772, memstoreSize=536897120, blockingMemStoreSize=536870912
> > >>        at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
> > >>        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
> > >>        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
> > >>        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
> > >>        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> > >>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
> > >>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
> > >>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
> > >>        at java.lang.Thread.run(Thread.java:744)
> > >> on l-hbase10.dba.cn1.qunar.com,60020,1416451280772, tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
> > >>
> > >>
> > >> It seems the user9099 region is too busy, so I looked up the memstore metrics in the web UI:
> > >>
> > >>
> > >>
> > >> As you can see, the user9099 region is bigger than the other regions. I thought it was flushing,
> > >> but after a while it did not shrink to a smaller size, and YCSB finally quit.
> > >>
> > >> The region server is configured as below:
> > >>
> > >> Summary: HP DL380p Gen8, 1 x Xeon E5-2630 v2 2.60GHz, 126GB / 128GB 1600MHz DDR3
> > >> System: HP ProLiant DL380p Gen8
> > >> Processors: 1 (of 2) x Xeon E5-2630 v2 2.60GHz 100MHz FSB (HT enabled, 6 cores, 24 threads)
> > >> Memory: 126GB / 128GB 1600MHz DDR3 == 16 x 8GB, 8 x empty
> > >> Disk: sda (scsi2): 450GB (1%) JBOD == 1 x HP-LOGICAL-VOLUME
> > >> Disk: sdb (scsi2): 27TB (0%) JBOD == 1 x HP-LOGICAL-VOLUME
> > >> Disk-Control: ata_piix0: Intel C600/X79 series chipset 4-Port SATA IDE Controller
> > >> Disk-Control: hpsa0: Hewlett-Packard Company Smart Array Gen8 Controllers
> > >> Disk-Control: shannon0: Device 1cb0:0275
> > >> Network: eth0 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fc, 1000Mb/s <full-duplex>
> > >> Network: eth1 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fd, no carrier
> > >> Network: eth2 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:fe, no carrier
> > >> Network: eth3 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe, 40:a8:f0:23:55:ff, no carrier
> > >> OS: CentOS 6.4 (Final), Linux 2.6.32-358.23.2.el6.x86_64 x86_64, 64-bit
> > >> BIOS: HP P70 02/10/2014
> > >> Hostname: l-hbase10.dba.cn1.qunar.com
> > >>
> > >> And I attached the HBase configuration file.
> > >>
> > >>
> > >> I am new to HBase; any ideas will be appreciated!
> > >>
> > >>
> > >>
> >
> >
>
