The client RPC properties are not hardcoded: the pause is set by
"hbase.client.pause" and the number of retries by
"hbase.client.retries.number". Since the region being called has been split
and will never come online again, you could lower these properties in the
code that calls your coprocessor so that it "fails fast", then skip that
region when a "NotServingRegionException" is wrapped in the response.
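
As a rough sketch of the "skip the region" part, something like the helper
below could be used when a coprocessor call fails. The class and method names
(RegionErrors, mentions, shouldSkipRegion) are made up for illustration and are
not part of HBase. It matches on the exception name in both the class name and
the message, because I am assuming the "NotServingRegionException" may show up
either as a chained cause or only as text inside the retries-exhausted message.
The two properties themselves would be lowered on the client Configuration
before the connection is created, e.g.
conf.setInt("hbase.client.retries.number", 3), where 3 is only an example value.

```java
// Hypothetical helper, not part of HBase: decides whether a failed call
// should be skipped because the target region is no longer being served.
final class RegionErrors {
    private RegionErrors() {}

    // Walks the cause chain (bounded, in case of a cyclic chain) and reports
    // whether any throwable's class name or message mentions exceptionName.
    static boolean mentions(Throwable error, String exceptionName) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t.getClass().getName().endsWith(exceptionName)) {
                return true;
            }
            String msg = t.getMessage();
            if (msg != null && msg.contains(exceptionName)) {
                return true;
            }
        }
        return false;
    }

    // Assumption: a region that reports NotServingRegionException after a
    // split can be skipped and the call re-issued against the new regions.
    static boolean shouldSkipRegion(Throwable error) {
        return mentions(error, "NotServingRegionException");
    }
}
```

The caller would catch the Throwable from the coprocessor call, check
RegionErrors.shouldSkipRegion(e), and rethrow anything else unchanged.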

On Tue, 19 Feb 2019 at 15:57, Ben Watson <b...@bwinsights.co.uk>
wrote:

> Hello,
>
> I’m running HBase 1.4.4. I’ve got a simple endpoint coprocessor that sums
> records when called. Whenever a split occurs, it fails when called,
> throwing a RegionNotFoundException. The error manifests itself by spending
> 10 minutes retrying the connection 35 times:
>
> 2019-02-19 09:42:34 INFO  o.a.h.h.c.RpcRetryingCaller
> [hconnection-0x100f9a76-shared--pool3-t215]: Call exception, tries=25,
> retries=35, started=331810 ms ago, cancelled=false,
> msg=org.apache.hadoop.hbase.NotServingRegionException: Region
> coprocessor-test,1,1550568604433.63f03f2a494dc5756238ba08af437af6. is not
> online on <hostname>,16020,1550568101996
>
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3082)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1275)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2201)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
>
> row '1_pfx-cfb0e548-f399-4059-af80-54fe9b7a828f' on table
> 'coprocessor-test' at
>
> region=coprocessor-test,1_pfx-7b2b6071-7d2c-4282-9645-31ca027327dc6549,1550568988094.f6cc0c6245702c544fb7fe65c1e3299b.,
> hostname=<hostname>l,16020,1550568101996, seqNum=630
>
> before eventually failing:
>
> Tue Feb 19 09:37:02 UTC 2019,
> RpcRetryingCaller{globalStartTime=1550569022304, pause=100, retries=35},
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Region
> coprocessor-test,9,1550568604433.2d98945e85cca401a2c5d8bd777a0451. is not
> online on <hostname>,16020,1550568099593
>
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3082)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1275)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2201)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
>
> If I then re-run the coprocessor, it works without any issues. So, I need a
> way to quickly catch this error and manually retry it until it works. I
> can't see a way to change any useful parameter – the 35 retries and the
> time between retries seem to be hardcoded.
>
> Can anyone suggest how I can go about solving this?
>
> Regards,
>
> Ben
>
