Hello HBase Community,
In our production environment we are seeing several exceptions such as:
2022-03-23 10:52:38,843 INFO [AsyncFSWAL-0-hdfs://hadoopcluster/hbase]
wal.AbstractFSWAL: Slow sync cost: 120 ms, current pipeline:
[DatanodeInfoWithStorage[10.211.3.11:50010,DS-b8181e87-2f63-47d5-a9f2-4d9ca8216d93,DISK],
DatanodeInfoWithStorage[10.211.3.12:50010,DS-f32aa630-e63c-4aee-a77b-a04128edee31,DISK]]
2022-03-23 10:54:15,631 WARN [hconnection-0x63191a3a-shared-pool6-t322]
client.AsyncRequestFutureImpl: b7b2a5f50bdaa3794d185ce.:n Above
parallelPutToStoreThreadLimit(10) at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
at
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) at
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
on acv-db16-hd.diennea.lan,16020,1648028001827, tracking started Wed Mar 23
10:54:12 CET 2022; NOT retrying, failed=6 -- final attempt!
2022-03-23 10:54:15,632 ERROR
[RpcServer.replication.FPBQ.Fifo.handler=2,queue=0,port=16020]
regionserver.ReplicationSink: Unable to accept edit
because:org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
Failed 6 actions: org.apache.hadoop.hbase.RegionTooBusyException:
StoreTooBusy,mn1_5276_huserlog,,1647637376109.27fd761a2b7b2a5f50bdaa3794d185ce.:n
Above parallelPutToStoreThreadLimit(10) at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
at
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) at
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318):
6 times, servers with issues: acv-db16-hd,16020,1648028001827 at
org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
at
org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1204)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:453)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:436) at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.batch(ReplicationSink.java:421)
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.replicateEntries(ReplicationSink.java:251)
at
org.apache.hadoop.hbase.replication.regionserver.Replication.replicateLogEntries(Replication.java:178)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.replicateWALEntry(RSRpcServices.java:2311)
at
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29752)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) at
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
What is the best way to handle this issue? I have seen suggestions online to
increase the property hbase.region.store.parallel.put.limit, but I cannot find
any reference to it in the HBase documentation. Is this property still valid?
Can it be set in hdfs-site.xml?
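
To make the question concrete, the override I have in mind would look roughly
like the fragment below. The value 20 is only an illustrative assumption (the
log above shows the current limit of 10), and which configuration file it
belongs in is part of what I am asking.

```xml
<!-- Sketch only: the value 20 is an arbitrary example, not a recommendation. -->
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <value>20</value>
</property>
```
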
Thanks,
Hamado Dene