[jira] [Updated] (HBASE-26864) SplitTableRegionProcedure, it calls openParentRegions() at a wrong state during rollback.

2022-03-22 Thread Huaxiang Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huaxiang Sun updated HBASE-26864:
-
Description: 
Changed the issue title and description for the scope of the work. 

There is a bug in how SplitTableRegionProcedure handles rollback.

[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java#L304]

[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java#L385]
{code:java}
// In the state machine:

        case SPLIT_TABLE_REGION_CLOSE_PARENT_REGION:
          addChildProcedure(createUnassignProcedures(env));
          // Comments from HX:
          // createUnassignProcedures() can throw an IOException. If this happens,
          // it won't reach state SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS and no parent
          // region is closed, as all created UnassignProcedures are rolled back. If it
          // rolls back with state SPLIT_TABLE_REGION_CLOSE_PARENT_REGION, there is no need
          // to call openParentRegion(); otherwise, it will result in an OpenRegionProcedure
          // for an already open region.
          setNextState(SplitTableRegionState.SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS);
          break;

// In the rollback:

        case SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS:
          // Doing nothing, in SPLIT_TABLE_REGION_CLOSE_PARENT_REGION,
          // we will bring parent region online
          break;
        case SPLIT_TABLE_REGION_CLOSE_PARENT_REGION:
          // Comments from HX:
          // openParentRegion() should not be called here, as explained above.
          openParentRegion(env);
          break;
{code}
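
The effect of the misplaced call can be sketched with a small, self-contained model. This is not HBase code: SplitRollbackSketch, its two-value State enum, and reopenCount are hypothetical names introduced for illustration. The "fixed" branch (reopening the parent only when rolling back SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS) is an assumed fix, not necessarily the committed patch. The model also assumes that rollback visits the executed states in reverse order, including the state that failed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of the split rollback; hypothetical names, not the HBase implementation. */
final class SplitRollbackSketch {
  /** Only the two states discussed in the issue are modeled. */
  enum State { CLOSE_PARENT_REGION, CHECK_CLOSED_REGIONS }

  /**
   * Counts how often the parent region would be reopened while rolling back.
   *
   * @param executed states that were entered, in execution order (the last one failed)
   * @param fixed    false models the current rollback, true models the assumed fix
   */
  static int reopenCount(List<State> executed, boolean fixed) {
    // Assumption: rollback walks the executed states in reverse order.
    List<State> reversed = new ArrayList<>(executed);
    Collections.reverse(reversed);

    int reopens = 0;
    for (State state : reversed) {
      switch (state) {
        case CHECK_CLOSED_REGIONS:
          // This state is only reached after the unassign procedures were added,
          // so in the assumed fix this is where the parent is reopened.
          if (fixed) {
            reopens++;
          }
          break;
        case CLOSE_PARENT_REGION:
          // Current behavior: always reopen here, even if the parent never closed.
          if (!fixed) {
            reopens++;
          }
          break;
      }
    }
    return reopens;
  }
}
```

In this model, a failure inside createUnassignProcedures() makes the current rollback reopen a parent that was never closed, while the assumed fix issues no reopen. A failure after the parent was closed yields exactly one reopen in both variants.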

  was:
Changed the issue title and description for the scope of the work. 

The reason 


> SplitTableRegionProcedure, it calls openParentRegions() at a wrong state 
> during rollback.
> -
>
> Key: HBASE-26864
> URL: https://issues.apache.org/jira/browse/HBASE-26864
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 2.4.10
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HBASE-26864) SplitTableRegionProcedure, it calls openParentRegions() at a wrong state during rollback.

2022-03-22 Thread Huaxiang Sun (Jira)



Huaxiang Sun updated HBASE-26864:
-
Description: 
Changed the issue title and description for the scope of the work. 

The reason 

  was:
For some upgrade cases, we found that the master issues a RegionOpen for an
already open region, and the Region Server simply logs
{code:java}
2022-03-17 22:16:55,595 WARN org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler: Received OPEN for foo,b2875fcb-7bc0-4fa9-a980-e902faf7f151,1631771037620.def199cc7208615b783b285f582ddfa4. which is already online
{code}
and does not ack or nack the master. The OpenRegionProcedure is stuck forever.

In this specific case, the Region Server needs to ack to the master that the
region is open.

The cause of the OpenRegion request for an already open region will be
followed up in a separate issue.
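
The acknowledgement described above could look roughly like the following sketch. It is a toy model, not the AssignRegionHandler code: OpenRequestSketch, Reply, and handleOpen are hypothetical names, and the real handler reports to the master through its own RPC path, which is not modeled here.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a region server's OPEN handling; hypothetical names, not HBase code. */
final class OpenRequestSketch {
  /** Possible replies to the master in this model. */
  enum Reply { OPENED, ALREADY_ONLINE_ACK }

  /** Encoded names of the regions currently online on this server. */
  private final Set<String> onlineRegions = new HashSet<>();

  /**
   * Handles an OPEN request. A duplicate request for an online region is
   * acknowledged instead of only being logged, so the caller is never left waiting.
   */
  Reply handleOpen(String encodedRegionName) {
    // Set.add returns false when the region is already tracked as online.
    if (!onlineRegions.add(encodedRegionName)) {
      return Reply.ALREADY_ONLINE_ACK;
    }
    return Reply.OPENED;
  }
}
```

The point of the sketch is that every OPEN request gets a reply, so a procedure on the master side that waits for the result can always make progress.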







[jira] [Updated] (HBASE-26864) SplitTableRegionProcedure, it calls openParentRegions() at a wrong state during rollback.

2022-03-22 Thread Huaxiang Sun (Jira)



Huaxiang Sun updated HBASE-26864:
-
Summary: SplitTableRegionProcedure, it calls openParentRegions() at a wrong 
state during rollback.  (was: Region Server does not send Ack back to master 
after receiving an OpenRegionReq for already opened regions, causing 
OpenRegionProcedure stay forever.)



