[ 
https://issues.apache.org/jira/browse/HBASE-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13061:
---------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0
                   1.0.1
                   2.0.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Test failure was not related.

Thanks for the patch, Andrey.

> RegionStates can remove wrong region from server holdings
> ---------------------------------------------------------
>
>                 Key: HBASE-13061
>                 URL: https://issues.apache.org/jira/browse/HBASE-13061
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Andrey Stepachev
>            Assignee: Andrey Stepachev
>             Fix For: 2.0.0, 1.0.1, 1.1.0
>
>         Attachments: HBASE-13061.patch
>
>
> A test failed in HBASE-13017. It seems that with ZK the nodes were ordered
> one way and the test did not trigger the error, but the new meta rows are
> ordered differently and the test became flaky.
> That leads to an interesting sequence of region offline/online events and
> triggers a bug and an NPE in the AM (seen in TestZKLessAMOnCluster).
> This can happen if a region was being moved from RS1 to another region
> server, RS2, and RS2 then failed. The region remains in PENDING_OPEN. SSH
> will offline it from RS1 (without removing it from oldAssignments, because
> of the disabled table). When the AssignmentManager then assigns the region,
> it removes the region's oldAssignment from serverHoldings, and that happens
> to be RS1, the server the region was just assigned to.
> A small excerpt of the logs follows. The most interesting lines are the
> last three: region b73fe9f1185361e846b0e1ceb7d6d64e is added to the server
> and immediately removed from it. Later that triggers an NPE in the disable
> table handler.
> {code}
> 2015-02-18 01:21:18,338 INFO  [Thread-436] master.RegionStates(1109): 
> Transition {b73fe9f1185361e846b0e1ceb7d6d64e state=PENDING_OPEN, 
> ts=1424222478324, server=octobook.home,65370,1424222474885} to 
> {b73fe9f1185361e846b0e1ceb7d6d64e state=OFFLINE, ts=1424222478338, 
> server=octobook.home,65370,1424222474885}
> 2015-02-18 01:21:18,339 INFO  [Thread-436] master.RegionStateStore(218): 
> Updating row 
> testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.
>  with state=OFFLINE
> 2015-02-18 01:21:18,340 DEBUG [Thread-436] master.RegionStates(591): Old 
> server name for {ENCODED => b73fe9f1185361e846b0e1ceb7d6d64e, NAME => 
> 'testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.',
>  STARTKEY => 'I', ENDKEY => 'Q'} is null
> 2015-02-18 01:21:18,340 INFO  [Thread-436] master.RegionStates(1109): 
> Transition {b73fe9f1185361e846b0e1ceb7d6d64e state=OFFLINE, ts=1424222478338, 
> server=octobook.home,65370,1424222474885} to 
> {b73fe9f1185361e846b0e1ceb7d6d64e state=OPEN, ts=1424222478340, 
> server=octobook.home,65359,1424222474743}
> 2015-02-18 01:21:18,341 INFO  [Thread-436] master.RegionStateStore(218): 
> Updating row 
> testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.
>  with state=OPEN&sn=octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(457): Onlined 
> b73fe9f1185361e846b0e1ceb7d6d64e on octobook.home,65359,1424222474743 
> {ENCODED => b73fe9f1185361e846b0e1ceb7d6d64e, NAME => 
> 'testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState,I,1424222477651.b73fe9f1185361e846b0e1ceb7d6d64e.',
>  STARTKEY => 'I', ENDKEY => 'Q'}
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(481): Adding  
> b73fe9f1185361e846b0e1ceb7d6d64e to server octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 INFO  [Thread-436] master.RegionStates(467): Offlined 
> b73fe9f1185361e846b0e1ceb7d6d64e from octobook.home,65359,1424222474743
> 2015-02-18 01:21:18,342 DEBUG [Thread-436] master.RegionStates(496): Removing 
> b73fe9f1185361e846b0e1ceb7d6d64e from server octobook.home,65359,1424222474743
> {code}
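
The sequence described above can be sketched in a few lines of Java. This is
only an illustration under stated assumptions, not the HBASE-13061 patch and
not the real RegionStates code: the class ServerHoldingsSketch, its methods,
and the boolean guard flag are all hypothetical. Only the names
serverHoldings and oldAssignments are taken from the description. The sketch
shows how a stale oldAssignments entry that points at the newly assigned
server causes the region to be added to that server's holdings and then
removed again, and how comparing the old and new server before removing
would avoid that.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the bookkeeping described in the issue.
// It is NOT the real org.apache.hadoop.hbase.master.RegionStates class.
public class ServerHoldingsSketch {
    // region -> server it was previously assigned to (may be stale)
    public final Map<String, String> oldAssignments = new HashMap<>();
    // server -> regions the master believes that server holds
    public final Map<String, Set<String>> serverHoldings = new HashMap<>();

    // Marks a region online on a server. With guard == false the old
    // assignment is always removed from serverHoldings, which reproduces
    // the reported symptom. With guard == true the removal is skipped when
    // the old server is the server the region was just assigned to
    // (an assumed fix, for illustration only).
    public void regionOnline(String region, String server, boolean guard) {
        serverHoldings.computeIfAbsent(server, k -> new HashSet<>()).add(region);
        String oldServer = oldAssignments.remove(region);
        if (oldServer != null && (!guard || !oldServer.equals(server))) {
            Set<String> held = serverHoldings.get(oldServer);
            if (held != null) {
                held.remove(region);
            }
        }
    }

    public boolean holds(String server, String region) {
        Set<String> held = serverHoldings.get(server);
        return held != null && held.contains(region);
    }

    public static void main(String[] args) {
        String region = "b73fe9f1185361e846b0e1ceb7d6d64e";

        // Stale entry left behind when the region was offlined from RS1.
        ServerHoldingsSketch unguarded = new ServerHoldingsSketch();
        unguarded.oldAssignments.put(region, "RS1");
        unguarded.regionOnline(region, "RS1", false);
        System.out.println("unguarded: " + unguarded.holds("RS1", region)); // prints "unguarded: false"

        ServerHoldingsSketch guarded = new ServerHoldingsSketch();
        guarded.oldAssignments.put(region, "RS1");
        guarded.regionOnline(region, "RS1", true);
        System.out.println("guarded: " + guarded.holds("RS1", region)); // prints "guarded: true"
    }
}
```

In the unguarded run the region ends up assigned to RS1 but missing from
RS1's holdings, which matches the "Adding ... to server" line followed
immediately by "Removing ... from server" in the log excerpt.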



