[ https://issues.apache.org/jira/browse/HBASE-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Stepachev updated HBASE-13061:
-------------------------------------
    Attachment: HBASE-13061.patch

> RegionStates can remove wrong region from server holdings
> ---------------------------------------------------------
>
>                 Key: HBASE-13061
>                 URL: https://issues.apache.org/jira/browse/HBASE-13061
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Andrey Stepachev
>            Assignee: Andrey Stepachev
>         Attachments: HBASE-13061.patch
>
>
> A test failed in HBASE-13017. It seems that with ZK nodes the ordering happened to be such that the test did not trigger the error, but with the new meta rows ordered differently the test became flaky.
> That ordering leads to an interesting sequence of offline/online regions and triggers a bug and an NPE in AM (seen in TestZKLessAMOnCluster).
> This can happen if a region was moved from RS1 to another region server, RS2, and RS2 then failed. The region remains in PENDING_OPEN. SSH will offline it from RS1 (without removing it from oldAssignments because of the disabled table). When AssignmentManager then assigns the region, it removes the region's oldAssignment from serverHoldings, and that happens to be our just-assigned RS1.
> A small excerpt from the log:
> {code}
> 2015-02-18 01:21:18,338 INFO [Thread-436] master.RegionStates(1109): Transition {b73fe9f1185361e846b0e1ceb7d6d64e state=PENDING_OPEN, ts=1424222478324, server=octobook.home,65370,1424222474885} to {b73fe9f1185361e846b0e1ceb7d6d64e state=OFFLINE, ts=1424222478338, server=octobook.home,65370,1424222474885}
> 2015-02-18 01:21:18,339 INFO [Thread-436] master.RegionStateStore(218): Updating row testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e. with state=OFFLINE
> 2015-02-18 01:21:18,340 DEBUG [Thread-436] master.RegionStates(591): Old server name for {ENCODED => b73fe9f1185361e846b0e1ceb7d6d64e, NAME => 'testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.', STARTKEY => 'I', ENDKEY => 'Q'} is null
> 2015-02-18 01:21:18,340 INFO [Thread-436] master.RegionStates(1109): Transition {b73fe9f1185361e846b0e1ceb7d6d64e state=OFFLINE, ts=1424222478338, server=octobook.home,65370,1424222474885} to {b73fe9f1185361e846b0e1ceb7d6d64e state=OPEN, ts=1424222478340, server=octobook.home,65359,1424222474743}
> 2015-02-18 01:21:18,341 INFO [Thread-436] master.RegionStateStore(218): Updating row testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e. with state=OPEN&sn=octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(457): Onlined b73fe9f1185361e846b0e1ceb7d6d64e on octobook.home,65359,1424222474743 {ENCODED => b73fe9f1185361e846b0e1ceb7d6d64e, NAME => 'testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.', STARTKEY => 'I', ENDKEY => 'Q'}
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(481): Adding b73fe9f1185361e846b0e1ceb7d6d64e to server octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 INFO [Thread-436] master.RegionStates(467): Offlined b73fe9f1185361e846b0e1ceb7d6d64e from octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(496): Removing b73fe9f1185361e846b0e1ceb7d6d64e from server octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,347 INFO [Thread-436] hbase.MetaTableAccessor(1437): Updated table testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState state to DISABLED in META
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
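
The sequence described in the report (online on a server, offline that leaves a stale oldAssignments entry, then online on the same server again) can be replayed with a toy model. This is only a sketch under stated assumptions: the class `ServerHoldingsSketch`, its methods, and the `guardSameServer` flag are hypothetical names for illustration and are not the HBase `RegionStates` code, and the guard shown is one possible way to avoid the problem, not necessarily what HBASE-13061.patch does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; not the real org.apache.hadoop.hbase.master.RegionStates.
class ServerHoldingsSketch {
    // server -> regions it currently holds (stand-in for serverHoldings)
    private final Map<String, Set<String>> serverHoldings = new HashMap<>();
    // region -> server it was last offlined from (stand-in for oldAssignments)
    private final Map<String, String> oldAssignments = new HashMap<>();
    // Assumption: when true, skip the "remove from old server" step if old == new.
    private final boolean guardSameServer;

    ServerHoldingsSketch(boolean guardSameServer) {
        this.guardSameServer = guardSameServer;
    }

    // Offline a region but keep the old assignment around (the stale entry).
    void regionOffline(String region, String server) {
        Set<String> held = serverHoldings.get(server);
        if (held != null) {
            held.remove(region);
        }
        oldAssignments.put(region, server);
    }

    // Online a region: add it to the new server, then drop it from the old one.
    void regionOnline(String region, String server) {
        serverHoldings.computeIfAbsent(server, s -> new HashSet<>()).add(region);
        String old = oldAssignments.remove(region);
        if (old != null && (!guardSameServer || !old.equals(server))) {
            Set<String> held = serverHoldings.get(old);
            if (held != null) {
                held.remove(region); // without the guard this can undo the add above
            }
        }
    }

    boolean holds(String server, String region) {
        Set<String> held = serverHoldings.get(server);
        return held != null && held.contains(region);
    }

    // Replays the sequence and reports whether RS1 still holds the region.
    static boolean replay(boolean guardSameServer) {
        ServerHoldingsSketch states = new ServerHoldingsSketch(guardSameServer);
        states.regionOnline("r1", "RS1");  // region initially served by RS1
        states.regionOffline("r1", "RS1"); // offlined, stale old assignment kept
        states.regionOnline("r1", "RS1");  // assigned back to RS1
        return states.holds("RS1", "r1");
    }

    public static void main(String[] args) {
        System.out.println("unguarded, RS1 holds r1: " + replay(false));
        System.out.println("guarded,   RS1 holds r1: " + replay(true));
    }
}
```

In the unguarded replay the region ends up missing from RS1's holdings even though it was just assigned there, which mirrors the "Adding ... to server" line immediately followed by "Removing ... from server" for the same server in the log.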