[ 
https://issues.apache.org/jira/browse/HBASE-17682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17682:
-----------------------------------
    Fix Version/s:     (was: 1.4.0)

> Region stuck in merging_new state indefinitely
> ----------------------------------------------
>
>                 Key: HBASE-17682
>                 URL: https://issues.apache.org/jira/browse/HBASE-17682
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.0
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>             Fix For: 2.0.0, 1.3.1, 1.2.5, 1.1.10
>
>         Attachments: HBASE-17682.branch-1.3.001.patch, 
> HBASE-17682.master.001.patch
>
>
> I ran into this issue while experimenting with a chaos monkey that performed 
> only splits, merges and kills. Regions ended up stuck in transition in the 
> merging_new state indefinitely. I think this happens when the rs is killed 
> during the merge but before the PONR (point of no return), in which case the 
> new region's state in the master is merging_new. When the rs dies at this 
> point, the master executes RegionStates.serverOffline() for the rs, which does:
> {code}
> for (RegionState state : regionsInTransition.values()) {
>   HRegionInfo hri = state.getRegion();
>   if (assignedRegions.contains(hri)) {
>     // Region is open on this region server, but in transition.
>     // This region must be moving away from this server, or splitting/merging.
>     // SSH will handle it, either skip assigning, or re-assign.
>     LOG.info("Transitioning " + state + " will be handled by ServerCrashProcedure for " + sn);
>   } else if (sn.equals(state.getServerName())) {
>     // Region is in transition on this region server, and this
>     // region is not open on this server. So the region must be
>     // moving to this server from another one (i.e. opening or
>     // pending open on this server, was open on another one.
>     // Offline state is also kind of pending open if the region is in
>     // transition. The region could be in failed_close state too if we have
>     // tried several times to open it while this region server is not reachable)
>     if (state.isPendingOpenOrOpening() || state.isFailedClose() || state.isOffline()) {
>       LOG.info("Found region in " + state +
>         " to be reassigned by ServerCrashProcedure for " + sn);
>       rits.add(hri);
>     } else if (state.isSplittingNew()) {
>       regionsToCleanIfNoMetaEntry.add(state.getRegion());
>     } else {
>       LOG.warn("THIS SHOULD NOT HAPPEN: unexpected " + state);
>     }
>   }
> }
> {code}
> We do not handle merging_new here and end up with "THIS SHOULD NOT HAPPEN: 
> unexpected ...". After this, the new region, which does not have any data, 
> stays stuck in transition, which leads to the balancer not running.
> I think we should handle merging_new the same way as splitting_new.
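
The suggestion at the end of the description can be illustrated with a small, self-contained model of the branch logic. This is only a sketch and not the attached patch: the class `MergingNewSketch`, the enums `State` and `Action`, and the method `classify` are hypothetical names that stand in for the RegionState checks quoted above, and the real fix may be structured differently.

```java
// Hypothetical, simplified model of the serverOffline() branch for a region that
// is in transition on the dead server but not open on it. None of these types
// are HBase classes; they only mirror the structure of the quoted code.
class MergingNewSketch {

    // Assumed subset of region states relevant to the quoted branch.
    enum State { PENDING_OPEN, OPENING, FAILED_CLOSE, OFFLINE, SPLITTING_NEW, MERGING_NEW, CLOSING }

    // What the master would do with the region in this sketch.
    enum Action { REASSIGN, CLEAN_IF_NO_META_ENTRY, UNEXPECTED }

    static Action classify(State state) {
        switch (state) {
            case PENDING_OPEN:
            case OPENING:
            case FAILED_CLOSE:
            case OFFLINE:
                // Corresponds to rits.add(hri) in the quoted code.
                return Action.REASSIGN;
            case SPLITTING_NEW:
            case MERGING_NEW:
                // Proposed behavior: treat merging_new like splitting_new, i.e. the
                // region is a candidate for cleanup if it has no meta entry.
                return Action.CLEAN_IF_NO_META_ENTRY;
            default:
                // Corresponds to the "THIS SHOULD NOT HAPPEN" warning.
                return Action.UNEXPECTED;
        }
    }

    public static void main(String[] args) {
        // Print the action chosen for every modeled state.
        for (State s : State.values()) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

In this model, removing the `MERGING_NEW` case label reproduces the reported behavior: the state falls through to the default branch and is only logged as unexpected.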



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
