[ https://issues.apache.org/jira/browse/HBASE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756811#comment-13756811 ]
Jimmy Xiang commented on HBASE-9387:
------------------------------------

+1 on v8 for commit if the test run is green.

One nit that can be fixed at commit: there is an abort method that takes just a reason.
Another nit: the warning message is shared by several different scenarios. If that is OK, it would be better to add a little extra info specific to each scenario.

> Region could get lost during assignment
> ---------------------------------------
>
> Key: HBASE-9387
> URL: https://issues.apache.org/jira/browse/HBASE-9387
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: 0.95.2
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Attachments: 9387-v1.txt, 9387-v3.txt, 9387-v4.2.txt, 9387-v4.3.txt,
> 9387-v4.4.txt, 9387-v4.txt, 9387-v5.txt, 9387-v6.txt, 9387-v7.txt,
> 9387-v8.txt, hbase-9387.patch,
> org.apache.hadoop.hbase.TestFullLogReconstruction-output.txt
>
>
> I observed test timeout running against hadoop 2.1.0 with distributed log replay turned on.
> Looks like region state for 1588230740 became inconsistent between master and the surviving region server:
> {code}
> 2013-08-29 22:15:34,180 INFO [AM.ZK.Worker-pool2-t4] master.RegionStates(299): Onlined 1588230740 on kiyo.gq1.ygridcore.net,57016,1377814510039
> ...
> 2013-08-29 22:15:34,587 DEBUG [Thread-221] client.HConnectionManager$HConnectionImplementation(1269): locateRegionInMeta parentTable=hbase:meta, metaLocation={region=hbase:meta,,1.1588230740, hostname=kiyo.gq1.ygridcore.net,57016,1377814510039, seqNum=0}, attempt=2 of 35 failed; retrying after sleep of 302 because: org.apache.hadoop.hbase.exceptions.RegionOpeningException: Region is being opened: 1588230740
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2574)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3949)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2733)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26965)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2063)
>   at org.apache.hadoop.hbase.ipc.RpcServer$CallRunner.run(RpcServer.java:1800)
>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:165)
>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:41)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira