Hello,

Unfortunately I don't have good guidance on what to tune this to. What I can say, though, is that this feature will be disabled by default starting with version 2.5.0. Part of the reason for that is we determined it is too aggressive, but we didn't yet have good guidance on a better default.
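For reference, disabling it amounts to a small addition to hbase-site.xml. A minimal fragment (the property name `hbase.region.store.parallel.put.limit` is the one discussed in this thread; setting it to 0 turns the limit off):

```xml
<!-- hbase-site.xml: a value of 0 disables the per-store
     parallel put limit (the StoreTooBusy check). -->
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <value>0</value>
</property>
```

A rolling restart of the region servers is needed for the change to take effect.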
So I would recommend disabling this feature by setting hbase.region.store.parallel.put.limit to 0 (zero) in your hbase-site.xml. The idea behind the feature is good, so if you'd prefer to leave it enabled, I'd recommend doing some load testing based on your use case and hardware to determine a value that works for you. The general idea is that it tries to avoid painful write contention by limiting the number of parallel write operations to a single region at a time, but how many parallel writers you can withstand will be hardware dependent.

On Wed, Mar 23, 2022 at 6:02 AM Hamado Dene <hamadod...@yahoo.com.invalid> wrote:

> Hello Hbase Community,
>
> On our production environment we are experiencing several exceptions such as:
>
> 2022-03-23 10:52:38,843 INFO [AsyncFSWAL-0-hdfs://hadoopcluster/hbase] wal.AbstractFSWAL: Slow sync cost: 120 ms, current pipeline: [DatanodeInfoWithStorage[10.211.3.11:50010,DS-b8181e87-2f63-47d5-a9f2-4d9ca8216d93,DISK], DatanodeInfoWithStorage[10.211.3.12:50010,DS-f32aa630-e63c-4aee-a77b-a04128edee31,DISK]]
> 2022-03-23 10:54:15,631 WARN [hconnection-0x63191a3a-shared-pool6-t322] client.AsyncRequestFutureImpl: b7b2a5f50bdaa3794d185ce.:n Above parallelPutToStoreThreadLimit(10)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> on acv-db16-hd.diennea.lan,16020,1648028001827, tracking started Wed Mar 23 10:54:12 CET 2022; NOT retrying, failed=6 -- final attempt!
> 2022-03-23 10:54:15,632 ERROR [RpcServer.replication.FPBQ.Fifo.handler=2,queue=0,port=16020] regionserver.ReplicationSink: Unable to accept edit because: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 6 actions: org.apache.hadoop.hbase.RegionTooBusyException: StoreTooBusy,mn1_5276_huserlog,,1647637376109.27fd761a2b7b2a5f50bdaa3794d185ce.:n Above parallelPutToStoreThreadLimit(10)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1083)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:986)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:951)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2783)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42290)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> : 6 times, servers with issues: acv-db16-hd,16020,1648028001827
>     at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
>     at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1204)
>     at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:453)
>     at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:436)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.batch(ReplicationSink.java:421)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.replicateEntries(ReplicationSink.java:251)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.replicateLogEntries(Replication.java:178)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.replicateWALEntry(RSRpcServices.java:2311)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29752)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
>
> What is the best way to manage this issue? I saw on the net the possibility of increasing the property hbase.region.store.parallel.put.limit, but in the hbase documentation I don't find any reference to it. Is this property still valid? Can it be set at the level of hdfs-site.xml?
>
> Thanks,
>
> Hamado Dene