[ 
https://issues.apache.org/jira/browse/HBASE-19639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406740#comment-16406740
 ] 

Anoop Sam John commented on HBASE-19639:
----------------------------------------

For us the default global memstore size is 40% of heap and the block cache 
(BC) is 40%.  With a mixed read+write workload on these default configs, the 
working set of the server is always 80%+ of Xmx.  The default 
InitiatingHeapOccupancyPercent (IHOP) for G1GC is 45%.  Setting it much 
larger than that makes G1GC less meaningful IMHO, since G1GC is basically for 
predictable GC pauses.  With this setup we will get more GC pauses.  So 
tuning Xmx and IHOP is key for G1GC-based usage. 
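
To make the arithmetic above concrete, here is a minimal sketch that compares the memstore + block cache share of the heap with the G1 IHOP threshold. The class and method names are hypothetical, and the 24 GB Xmx is only an illustrative assumption, not a value confirmed for this cluster.

```java
// Hypothetical helper, not part of HBase: compares the steady-state
// heap share of memstore + block cache against the G1 IHOP threshold.
public class HeapBudget {

    // Share of Xmx (in percent) held by memstore plus block cache.
    static int workingSetPercent(int memstorePercent, int blockCachePercent) {
        return memstorePercent + blockCachePercent;
    }

    // True when the working set alone already sits above the IHOP threshold,
    // i.e. G1 would be starting concurrent marking cycles almost continuously.
    static boolean exceedsIhop(int workingSetPercent, int ihopPercent) {
        return workingSetPercent > ihopPercent;
    }

    public static void main(String[] args) {
        int memstorePercent = 40;    // default global memstore share
        int blockCachePercent = 40;  // default block cache share
        int ihopPercent = 45;        // G1 default InitiatingHeapOccupancyPercent
        long xmxBytes = 24L * 1024 * 1024 * 1024; // assumed Xmx, for illustration only

        int working = workingSetPercent(memstorePercent, blockCachePercent);
        long workingBytes = xmxBytes / 100 * working;
        long ihopBytes = xmxBytes / 100 * ihopPercent;

        System.out.println("working set percent = " + working);
        System.out.println("working set bytes   = " + workingBytes);
        System.out.println("IHOP threshold bytes = " + ihopBytes);
        System.out.println("above IHOP = " + exceedsIhop(working, ihopPercent));
    }
}
```

With the defaults the working set (80%) is well above the 45% threshold, which is why either the heap size or the IHOP value needs tuning for a R+W workload.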
 

> ITBLL can't go big because RegionTooBusyException... Above memstore limit
> -------------------------------------------------------------------------
>
>                 Key: HBASE-19639
>                 URL: https://issues.apache.org/jira/browse/HBASE-19639
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: hbase-stack-regionserver-ve0528.log.gz
>
>
> Running ITBLLs, the basic link generator keeps failing because I run into 
> exceptions like below:
> {code}
> 2017-12-26 19:23:45,284 INFO [main] 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator: 
> Persisting current.length=1000000, count=1000000, id=Job: 
> job_1513025868268_0062 Task: attempt_1513025868268_0062_m_000006_2, 
> current=\x8B\xDB25\xA7*\x9A\xF5\xDEx\x83\xDF\xDC?\x94\x92, i=1000000
> 2017-12-26 19:24:18,982 INFO [htable-pool3-t6] 
> org.apache.hadoop.hbase.client.AsyncRequestFutureImpl: #2, 
> table=IntegrationTestBigLinkedList, attempt=10/11 failed=524ops, last 
> exception: org.apache.hadoop.hbase.RegionTooBusyException: 
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=IntegrationTestBigLinkedList,q\xC7\x1Cq\xC7\x1Cq\xC0,1514342757438.71ef1fbab1576588955f45796e95c08b.,
>  server=ve0538.halxg.cloudera.com,16020,1514343549993, 
> memstoreSize=538084641, blockingMemStoreSize=536870912
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4178)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3799)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3739)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:975)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:894)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2587)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>  on ve0538.halxg.cloudera.com,16020,1514343549993, tracking started null, 
> retrying after=10050ms, replay=524ops
> 2017-12-26 19:24:29,061 INFO [htable-pool3-t6] 
> org.apache.hadoop.hbase.client.AsyncRequestFutureImpl: #2, 
> table=IntegrationTestBigLinkedList, attempt=11/11 failed=524ops, last 
> exception: org.apache.hadoop.hbase.RegionTooBusyException: 
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=IntegrationTestBigLinkedList,q\xC7\x1Cq\xC7\x1Cq\xC0,1514342757438.71ef1fbab1576588955f45796e95c08b.,
>  server=ve0538.halxg.cloudera.com,16020,1514343549993, 
> memstoreSize=538084641, blockingMemStoreSize=536870912
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4178)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3799)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3739)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:975)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:894)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2587)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>  on ve0538.halxg.cloudera.com,16020,1514343549993, tracking started null, 
> retrying after=10033ms, replay=524ops
> 2017-12-26 19:24:37,183 INFO [ReadOnlyZKClient] 
> org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient: 0x015051a0 no activities 
> for 60000 ms, close active connection. Will reconnect next time when there 
> are new requests.
> 2017-12-26 19:24:39,122 WARN [htable-pool3-t6] 
> org.apache.hadoop.hbase.client.AsyncRequestFutureImpl: #2, 
> table=IntegrationTestBigLinkedList, attempt=12/11 failed=524ops, last 
> exception: org.apache.hadoop.hbase.RegionTooBusyException: 
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=IntegrationTestBigLinkedList,q\xC7\x1Cq\xC7\x1Cq\xC0,1514342757438.71ef1fbab1576588955f45796e95c08b.,
>  server=ve0538.halxg.cloudera.com,16020,1514343549993, 
> memstoreSize=538084641, blockingMemStoreSize=536870912
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4178)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3799)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3739)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:975)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:894)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2587)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> ...
> {code}
> Fails task over and over. With server-killing monkeys.
> 24Gs which should be more than enough.
> Had just finished a big compaction.
> What's shutting us out? Why is it taking so long to flush? We seem stuck at 
> the limit, so the job fails.
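
As a rough illustration of the numbers in the log above: the blockingMemStoreSize of 536870912 bytes matches the default per-region flush size (hbase.hregion.memstore.flush.size, 128 MB) multiplied by the default block multiplier (hbase.hregion.memstore.block.multiplier, 4). The sketch below only reproduces that arithmetic; the class and method names are hypothetical, and it assumes the cluster ran with those defaults.

```java
// Hypothetical helper, not HBase code: reproduces the arithmetic behind
// the "Above memstore limit" check using the values seen in the log.
public class MemstoreBlockingLimit {

    // Per-region blocking limit = flush size * block multiplier.
    static long blockingMemStoreSize(long flushSizeBytes, long blockMultiplier) {
        return flushSizeBytes * blockMultiplier;
    }

    // Writes are rejected while the region memstore is above the limit.
    static boolean aboveLimit(long memstoreSize, long blockingSize) {
        return memstoreSize > blockingSize;
    }

    public static void main(String[] args) {
        long flushSize = 128L * 1024 * 1024; // assumed default flush size
        long multiplier = 4;                 // assumed default block multiplier
        long observed = 538084641L;          // memstoreSize from the log

        long limit = blockingMemStoreSize(flushSize, multiplier);
        System.out.println("blocking limit = " + limit);
        System.out.println("over by bytes  = " + (observed - limit));
        System.out.println("above limit    = " + aboveLimit(observed, limit));
    }
}
```

The region is only about 1.2 MB over the limit, so writes stay blocked until a flush brings the memstore back under it.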



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
