[ https://issues.apache.org/jira/browse/HBASE-21017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592508#comment-16592508 ]
Duo Zhang commented on HBASE-21017:
-----------------------------------

OK, I think I have found the race here.

{noformat}
2018-08-24 12:03:19,255 INFO [RS-EventLoopGroup-8-9] ipc.ServerRpcConnection(556): Connection from 67.195.81.136:48580, version=3.0.0-SNAPSHOT, sasl=false, ugi=jenkins (auth:SIMPLE), service=ClientService
2018-08-24 12:03:19,297 INFO [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=45229] master.MasterRpcServices(579): Client=jenkins//67.195.81.136 assign testEnableTableWithNoRegionServers,,1535112163487.37ec5bc06522d2e4e51a73fb48d03962.
2018-08-24 12:03:19,521 DEBUG [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=45229] procedure2.ProcedureExecutor(1004): Stored pid=29, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=testEnableTableWithNoRegionServers, region=37ec5bc06522d2e4e51a73fb48d03962, ASSIGN
2018-08-24 12:03:19,521 INFO [PEWorker-8] procedure.MasterProcedureScheduler(689): pid=29, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=testEnableTableWithNoRegionServers, region=37ec5bc06522d2e4e51a73fb48d03962, ASSIGN checking lock on 37ec5bc06522d2e4e51a73fb48d03962
2018-08-24 12:03:19,572 DEBUG [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=45229] procedure.ProcedureSyncWait(188): waitFor pid=29
2018-08-24 12:03:19,655 INFO [PEWorker-8] assignment.TransitRegionStateProcedure(155): Setting lastHost as the region location asf916.gq1.ygridcore.net,34328,1535112151646
2018-08-24 12:03:19,655 INFO [PEWorker-8] assignment.TransitRegionStateProcedure(159): Starting pid=29, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=testEnableTableWithNoRegionServers, region=37ec5bc06522d2e4e51a73fb48d03962, ASSIGN; rit=OPEN, location=asf916.gq1.ygridcore.net,34328,1535112151646; forceNewPlan=false, retain=true
2018-08-24 12:03:19,826 INFO [PEWorker-9] assignment.RegionStateStore(199): pid=29 updating hbase:meta row=37ec5bc06522d2e4e51a73fb48d03962, regionState=OPENING, regionLocation=asf916.gq1.ygridcore.net,39326,1535112180502
java.lang.Exception
    at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:199)
    at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:138)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.transitStateAndUpdate(AssignmentManager.java:1423)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.regionOpening(AssignmentManager.java:1435)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.openRegion(TransitRegionStateProcedure.java:176)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.executeFromState(TransitRegionStateProcedure.java:311)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.executeFromState(TransitRegionStateProcedure.java:96)
    at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.execute(TransitRegionStateProcedure.java:283)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.execute(TransitRegionStateProcedure.java:96)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:873)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1577)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1365)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:77)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1877)
2018-08-24 12:03:19,832 INFO [PEWorker-9] procedure2.ProcedureExecutor(1612): Initialized subprocedures=[{pid=30, ppid=29, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}]
2018-08-24 12:03:19,932 INFO [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45229] assignment.RegionStateStore(199): pid=29 updating hbase:meta row=37ec5bc06522d2e4e51a73fb48d03962, regionState=OPEN, openSeqNum=8, regionLocation=asf916.gq1.ygridcore.net,39326,1535112180502
java.lang.Exception
    at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:199)
    at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:138)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.transitStateAndUpdate(AssignmentManager.java:1423)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.regionOpened(AssignmentManager.java:1471)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.reportTransitionOpened(TransitRegionStateProcedure.java:361)
    at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.reportTransition(TransitRegionStateProcedure.java:402)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportTransition(AssignmentManager.java:899)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1060)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:998)
    at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:483)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:15170)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
2018-08-24 12:03:20,145 WARN [RpcServer.priority.FPBQ.Fifo.handler=4,queue=0,port=39326] regionserver.RSRpcServices(2006): Received OPEN for the region:testEnableTableWithNoRegionServers,,1535112163487.37ec5bc06522d2e4e51a73fb48d03962., which is already online
{noformat}

1. We update the state to OPENING and schedule the OpenRegionProcedure.
2. A regionServerReport arrives and we change the state to OPEN.
3. The OpenRegionProcedure is scheduled and sends the open request to the RS.
4. The RS ignores the request because the region is already online.
5. Since the state has already been changed to OPEN, we will never wake the event.

> Revisit the expected states for open/close
> ------------------------------------------
>
> Key: HBASE-21017
> URL: https://issues.apache.org/jira/browse/HBASE-21017
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21017-debug.patch, HBASE-21017.patch
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
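
The five steps in the comment can be replayed as a small single-threaded model. This is only a sketch under stated assumptions: `OpenRaceSketch`, `replay`, `Result` and the `regionAlreadyOnline` flag are hypothetical names invented for illustration, not HBase classes or APIs, and the rule that the parent is woken only when an RS-reported open moves the state from OPENING to OPEN is a simplification of what the comment describes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the race in the comment; none of these names are HBase APIs. */
public class OpenRaceSketch {
    public enum State { OFFLINE, OPENING, OPEN }

    /** Outcome of one replay of the five steps. */
    public static final class Result {
        public final State masterState;
        public final boolean parentWoken;
        public final List<String> trace;

        Result(State masterState, boolean parentWoken, List<String> trace) {
            this.masterState = masterState;
            this.parentWoken = parentWoken;
            this.trace = trace;
        }
    }

    /**
     * Replays the steps in order. regionAlreadyOnline (an assumption of this
     * model) says whether the RS already hosts the region, so that a periodic
     * report can list it before the open request is delivered.
     */
    public static Result replay(boolean regionAlreadyOnline) {
        List<String> trace = new ArrayList<>();
        boolean parentWoken = false;

        // Step 1: master records OPENING and queues the open request.
        State masterState = State.OPENING;
        trace.add("master: state=OPENING, open request queued");

        // Step 2: a region server report lists the region; master flips to OPEN.
        if (regionAlreadyOnline) {
            masterState = State.OPEN;
            trace.add("master: state=OPEN via region server report");
        }

        // Steps 3 and 4: the request reaches the RS, which ignores it if the
        // region is already online and therefore reports no transition.
        trace.add("master: open request sent to RS");
        boolean rsReportsOpened = !regionAlreadyOnline;
        trace.add(rsReportsOpened
            ? "rs: region opened, transition reported"
            : "rs: region already online, request ignored");

        // Step 5: in this model only an RS-reported OPENING -> OPEN wakes the parent.
        if (rsReportsOpened && masterState == State.OPENING) {
            masterState = State.OPEN;
            parentWoken = true;
            trace.add("master: state=OPEN, parent woken");
        } else {
            trace.add("master: nothing wakes the parent");
        }
        return new Result(masterState, parentWoken, trace);
    }

    public static void main(String[] args) {
        for (boolean online : new boolean[] {false, true}) {
            Result r = replay(online);
            System.out.println("regionAlreadyOnline=" + online
                + " state=" + r.masterState + " parentWoken=" + r.parentWoken);
            for (String step : r.trace) {
                System.out.println("  " + step);
            }
        }
    }
}
```

In the model both runs end with the state OPEN, but only the run where the RS actually performs the open wakes the parent. The other run is the hang described in step 5.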