[ https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576082#comment-16576082 ]
Allan Yang commented on HBASE-20976:
------------------------------------

{code}
I think there may still be races? As if the previous SCP has also been done and removed from ProcedureExecutor, and then the UnassignProcedure tries to expire the server...
{code}
Maybe I deleted some procedure WALs, which caused this. But whatever, I think a double check won't hurt.

> SCP can be scheduled multiple times for the same RS
> ---------------------------------------------------
>
>                 Key: HBASE-20976
>                 URL: https://issues.apache.org/jira/browse/HBASE-20976
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 2.0.2
>
>         Attachments: HBASE-20976.branch-2.0.001.patch, HBASE-20976.branch-2.0.002.patch
>
>
> SCP can be scheduled multiple times for the same RS:
> 1. An RS crashed and an SCP was submitted for it
> 2. Before this SCP finished, the Master crashed
> 3. The new master scans the meta table and finds that some regions are still open on a dead server
> 4. The new master submits an SCP for the dead server again
> The two SCPs for the same RS can even execute concurrently without HBASE-20846…
> The patch provides a test case that reproduces this issue, together with a fix.
> Another case in which an SCP might be scheduled multiple times for the same RS (with HBASE-20708):
> 1. An RS crashed and an SCP was submitted for it
> 2. A new RS started on the same host, and the old RS's ServerName was removed from DeadServer.deadServers
> 3. After the SCP passed the Handle_RIT state, an UnassignProcedure needed to send a close-region operation to the crashed RS
> 4. The UnassignProcedure's dispatch failed with 'NoServerDispatchException'
> 5. Expiring the RS then began, but the RS was found to be neither online nor in the deadServers list, so an SCP was submitted for the same RS again

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
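
The "double check" idea mentioned in the comment can be illustrated with a minimal sketch. This is not the code from the attached patches and does not use HBase classes: `ScpGuardSketch`, `tryScheduleScp`, and the sample server name are hypothetical names chosen for illustration. The sketch only shows the check-before-submit pattern, assuming a single in-process registry of servers that already have a crash procedure.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ScpGuardSketch {
    // Hypothetical registry of servers that already have a crash procedure.
    // This stands in for whatever state a real master would consult.
    private final Set<String> serversWithScp = ConcurrentHashMap.newKeySet();

    /**
     * Returns true only for the first caller that asks to schedule a crash
     * procedure for the given server; later callers get false.
     */
    public boolean tryScheduleScp(String serverName) {
        // add() on a concurrent key set is atomic: it returns false
        // when the name is already present, so two racing callers
        // cannot both be told to submit.
        return serversWithScp.add(serverName);
    }

    public static void main(String[] args) {
        ScpGuardSketch guard = new ScpGuardSketch();
        String rs = "rs1.example.com,16020,1533000000000"; // made-up server name

        System.out.println(guard.tryScheduleScp(rs)); // first request
        System.out.println(guard.tryScheduleScp(rs)); // duplicate request
    }
}
```

An in-memory set like this would not survive the master restart in the first scenario, and it says nothing about when an entry may safely be removed, which is the race raised in the quoted comment. It is only meant to show why an atomic check at submission time prevents a second procedure for the same server.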