huaxiang sun created HBASE-16345: ------------------------------------ Summary: RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions Key: HBASE-16345 URL: https://issues.apache.org/jira/browse/HBASE-16345 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun
We run into the following exception during read replica testing. {code} Caused by: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server not running at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326) at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:168) at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:100) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) ... 4 more Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.regionserver.RegionServerStoppedException): org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server not running at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200) at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216) at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865) at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:162) ... 6 more {code} Checking the code, we need to catch a few exceptions from the primary region server and continue with replicas. https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L211 -- This message was sent by Atlassian JIRA (v6.3.4#6332)