[ 
https://issues.apache.org/jira/browse/HBASE-10852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949949#comment-13949949
 ] 

Nick Dimiduk commented on HBASE-10852:
--------------------------------------

Simple enough, +1. Are 2 retries enough to weather a region in transition (RIT)? Maybe make it 5?

> TestDistributedLogSplitting#testDisallowWritesInRecovering occasionally fails
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-10852
>                 URL: https://issues.apache.org/jira/browse/HBASE-10852
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Minor
>             Fix For: 0.99.0
>
>         Attachments: 10852-v1.txt
>
>
> Here was the failure:
> {code}
> java.lang.AssertionError: No RegionInRecoveryException. Following exceptions returned=[org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on c64-s12.cs1cloud.internal,52861,1395905929889
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2676)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4095)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2826)
>       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28857)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>       at java.lang.Thread.run(Thread.java:722)
> ]
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testDisallowWritesInRecovering(TestDistributedLogSplitting.java:924)
> {code}
> Here was the cause:
> {code}
> 2014-03-27 00:39:01,398 DEBUG [RS_OPEN_META-c64-s12:44281-0] handler.OpenRegionHandler(179): Opened hbase:meta,,1.1588230740 on c64-s12.cs1cloud.internal,44281,1395905929927
> 2014-03-27 00:39:01,405 DEBUG [Thread-2811-EventThread] zookeeper.ZooKeeperWatcher(310): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/region-in-transition/1588230740
> 2014-03-27 00:39:01,405 DEBUG [Thread-2811-EventThread] zookeeper.ZooKeeperWatcher(310): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/region-in-transition
> 2014-03-27 00:39:01,406 DEBUG [AM.ZK.Worker-pool1213-t19] zookeeper.ZKAssign(480): master:32796-0x1450278b68f01cc, quorum=localhost:50923, baseZNode=/hbase Deleted unassigned node 1588230740 in expected state RS_ZK_REGION_OPENED
> 2014-03-27 00:39:01,406 DEBUG [AM.ZK.Worker-pool1213-t19] master.AssignmentManager$4(1186): Znode hbase:meta,,1.1588230740 deleted, state: {1588230740 state=OPEN, ts=1395905941397, server=c64-s12.cs1cloud.internal,44281,1395905929927}
> 2014-03-27 00:39:01,406 INFO  [AM.ZK.Worker-pool1213-t19] master.RegionStates(413): Onlined 1588230740 on c64-s12.cs1cloud.internal,44281,1395905929927
> 2014-03-27 00:39:01,406 INFO  [AM.ZK.Worker-pool1213-t19] master.RegionStates(417): Offlined 1588230740 from c64-s12.cs1cloud.internal,52861,1395905929889
> 2014-03-27 00:39:01,547 WARN  [Thread-2811] client.ConnectionManager$HConnectionImplementation(1221): Encountered problems when prefetch hbase:meta table:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Thu Mar 27 00:39:01 PDT 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@23136717, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on c64-s12.cs1cloud.internal,52861,1395905929889
> {code}
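> hbase:meta was being moved, but the client did not retry (attempts=1), so the NotServingRegionException surfaced instead of the expected RegionInRecoveryException.
> Thanks to Jeff, who helped identify the issue.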
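
To illustrate why attempts=1 is fragile while a region is moving, here is a minimal, self-contained sketch of a bounded retry loop. It is not the attached patch and does not use the HBase client API; `RetrySketch`, `TransientRegionException`, and `callWithRetries` are hypothetical names chosen for this example, and the attempt count and pause are arbitrary.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Hypothetical stand-in for a transient "region is not online" error. */
    public static class TransientRegionException extends Exception {
        public TransientRegionException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs the call up to maxAttempts times, pausing between attempts.
     * Only the transient exception is retried; anything else propagates at once.
     */
    public static <T> T callWithRetries(Callable<T> call, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientRegionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (TransientRegionException e) {
                last = e;
                // Give the region time to come online before trying again.
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        // Every attempt hit the transient error; surface the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated lookup that fails twice while the region is moving.
        Callable<String> flaky = () -> {
            if (++calls[0] < 3) {
                throw new TransientRegionException("region is moving");
            }
            return "ok";
        };
        System.out.println(callWithRetries(flaky, 5, 10L));
    }
}
```

With `maxAttempts` set to 1, the simulated lookup above fails on the first transient error, which mirrors the attempts=1 case in the log. A larger budget lets the call ride out a short move.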



--
This message was sent by Atlassian JIRA
(v6.2#6252)
