[ https://issues.apache.org/jira/browse/HBASE-20908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551539#comment-16551539 ]
Hudson commented on HBASE-20908:
--------------------------------

Results for branch branch-2
    [build #1007 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1007/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1007//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1007//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1007//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Infinite loop on regionserver if region replica are reduced
> ------------------------------------------------------------
>
>                 Key: HBASE-20908
>                 URL: https://issues.apache.org/jira/browse/HBASE-20908
>             Project: HBase
>          Issue Type: Bug
>          Components: read replicas
>   Affects Versions: 1.2.0, 2.0.0
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>         Attachments: 20908_v3.patch, HBASE-20908.patch, HBASE-20908_v1.patch,
> HBASE-20908_v3-branch-1.patch, HBASE-20908_v3.patch
>
>
> Steps to reproduce
> {code}
> hbase(main):003:0> create 'myTable','cf',{REGION_REPLICATION=>3}
> hbase(main):003:0> put 'myTable','r1','cf:col1','1'
> 0 row(s) in 0.1230 seconds
> hbase(main):004:0> disable 'myTable'
> 0 row(s) in 2.3040 seconds
> hbase(main):005:0> alter 'myTable',{REGION_REPLICATION=>1}
> Updating all regions with the new schema...
> 1/1 regions updated.
> Done.
> 0 row(s) in 11.9550 seconds
> hbase(main):006:0> enable 'myTable'
> 0 row(s) in 1.2620 seconds
> hbase(main):007:0> put 'myTable1','r2','cf:col1','1'
> 0 row(s) in 0.0060 seconds
> {code}
> This is a request to a replica region that is no longer present in meta but
> is still in the cache. The server responds that it is not serving this region.
> {code}
> com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region d997d9b47a106216b9b117617ec09015 is not online on 10.22.9.76,16020,1531341039091
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3124)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3106)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1714)
>     at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22773)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {code}
> Eventually, once the cache is updated from meta, we get into an infinite
> loop, because this event can never be replicated: the location of the
> replica will not appear again.
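The failure mode described above can be sketched with a small toy model. This is not HBase code and not the attached patch: `ReplicaReplaySketch`, `locateInMeta`, `replay`, and the `skipRemovedReplicas` flag are all hypothetical names, and meta is modelled as a plain list with one server name per replica. The sketch only illustrates why retrying cannot succeed once a replica has been removed from meta, and how a guard that drops edits for such replicas ends the retries.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of replaying an edit to a region replica. Not HBase code. */
final class ReplicaReplaySketch {

    /** What happened to one edit destined for one replica. */
    enum Outcome { REPLAYED, SKIPPED_REPLICA_REMOVED, GAVE_UP }

    /**
     * Hypothetical stand-in for a meta lookup. metaLocations holds one server
     * name per replica currently listed in meta; a replica id outside that
     * list has no location, which is modelled as null.
     */
    static String locateInMeta(List<String> metaLocations, int replicaId) {
        boolean listed = replicaId >= 0 && replicaId < metaLocations.size();
        return listed ? metaLocations.get(replicaId) : null;
    }

    /**
     * Tries to replay one edit to one replica, looking up its location on
     * every attempt. The attempt bound exists only so the sketch terminates;
     * without a guard, every attempt for a removed replica fails the same way.
     */
    static Outcome replay(List<String> metaLocations, int replicaId,
                          int maxAttempts, boolean skipRemovedReplicas) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String location = locateInMeta(metaLocations, replicaId);
            if (location != null) {
                return Outcome.REPLAYED;
            }
            if (skipRemovedReplicas) {
                // Assumption: in this model a missing location always means
                // the replica was removed, so the edit is dropped.
                return Outcome.SKIPPED_REPLICA_REMOVED;
            }
            // Otherwise fall through and retry the same lookup.
        }
        return Outcome.GAVE_UP;
    }

    public static void main(String[] args) {
        // After REGION_REPLICATION is reduced from 3 to 1, meta lists only replica 0.
        List<String> meta = Arrays.asList("10.22.9.76,16020,1531341039091");
        System.out.println(replay(meta, 2, 5, false)); // GAVE_UP
        System.out.println(replay(meta, 2, 5, true));  // SKIPPED_REPLICA_REMOVED
        System.out.println(replay(meta, 0, 5, true));  // REPLAYED
    }
}
```

A real endpoint would also have to tell a removed replica apart from a location that is only temporarily unavailable, which this model does not attempt.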
> {code}
> java.net.SocketTimeoutException: callTimeout=1200000, callDuration=2181316: Can't get the location null
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:170)
>     at org.apache.hadoop.hbase.replication.regionserver.RegionReplicaReplicationEndpoint$RetryingRpcCallable.call(RegionReplicaReplicationEndpoint.java:606)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>     at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:178)
>     at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:105)
>     at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:89)
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>     ... 5 more
> Caused by: java.io.IOException: HRegionInfo was null in myTable, row=keyvalues={myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:regioninfo/1531262022425/Put/vlen=41/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:seqnumDuringOpen/1531341209944/Put/vlen=8/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:server/1531341209944/Put/vlen=16/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:serverstartcode/1531341209944/Put/vlen=8/seqid=0}
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1289)
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1179)
>     at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:170)
>     ... 8 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)