[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang updated HBASE-20846: ------------------------------ Attachment: HBASE-20846-v3.patch > Restore procedure locks when master restarts > -------------------------------------------- > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task > Affects Versions: 2.1.0 > Reporter: Allan Yang > Assignee: Duo Zhang > Priority: Major > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846-v3.patch, HBASE-20846.branch-2.0.002.patch, > HBASE-20846.branch-2.0.patch, HBASE-20846.patch > > > Found this one when investigating ModifyTableProcedure got stuck while there > was a MoveRegionProcedure going on after master restart. > Though this issue can be solved by HBASE-20752. But I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock. so,, when a UnassignProcedure was spwaned, it will not check the > table's shared lock since it is sure that its parent(MoveRegionProcedure) has > aquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But, it is not the case when Master was restarted. The child > procedure(UnassignProcedure) will be executed first after restart. Though it > has a parent(MoveRegionProcedure), but apprently the parent didn't hold the > table's lock. > So, since it began to execute without hold the table's shared lock. A > ModifyTableProcedure can aquire the table's exclusive lock and execute at the > same time. Which is not possible if the master was not restarted. > This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, > I wrote a simple UT to repo this case. > I think we don't have to check the parent for table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)