Apache9 commented on code in PR #4945:
URL: https://github.com/apache/hbase/pull/4945#discussion_r1082188117


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java:
##########
@@ -153,6 +153,25 @@ public class AssignmentManager {
   private static final int DEFAULT_RIT_STUCK_WARNING_THRESHOLD = 60 * 1000;
   public static final String UNEXPECTED_STATE_REGION = "Unexpected state for ";
 
+  public static final String FORCE_REGION_RETAINMENT = "hbase.master.scp.retain.assignment.force";
+
+  public static final boolean DEFAULT_FORCE_REGION_RETAINMENT = false;
+
+  /** The wait time in millis before checking again if the region's previous RS is back online */
+  public static final String FORCE_REGION_RETAINMENT_WAIT =

Review Comment:
   Better to call it a wait interval?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java:
##########
@@ -188,6 +196,27 @@ protected boolean waitInitialized(MasterProcedureEnv env) {
     return am.waitMetaLoaded(this) || am.waitMetaAssigned(this, getRegion());
   }
 
+  private void checkAndWaitForOriginalServer(MasterProcedureEnv env, ServerName lastHost)
+    throws ProcedureSuspendedException {
+    ServerManager serverManager = env.getMasterServices().getServerManager();
+    ServerName newNameForServer = serverManager.findServerWithSameHostnamePortWithLock(lastHost);
+    boolean isOnline = serverManager.createDestinationServersList().contains(newNameForServer);
+
+    if (!isOnline && retries < env.getAssignmentManager().getForceRegionRetainmentRetries()) {
+      retries++;
+      LOG.info("Suspending the TRSP PID={} because {} is true and previous host {} "
+        + "for region is not yet online.", this.getProcId(), FORCE_REGION_RETAINMENT, lastHost);
+      setTimeout(env.getAssignmentManager().getForceRegionRetainmentWait());
+      setState(ProcedureProtos.ProcedureState.WAITING_TIMEOUT);
+      throw new ProcedureSuspendedException();
+    }
+    LOG.info(

Review Comment:
   Need to reset retries to 0 here. Also, better to name it waitForOriginalServerRetries.
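
   A minimal sketch of the reset suggested above, written outside HBase so it stands alone. The class `WaitForOriginalServerSketch` and its methods are hypothetical names used only for illustration; they are not part of the PR. It assumes the counter should return to 0 whenever the wait ends, whether the original server came back or the retry budget ran out, so that a later wait starts with a full budget.

```java
/** Hypothetical stand-in for the retry bookkeeping in the procedure; not HBase code. */
final class WaitForOriginalServerSketch {
  private final int maxRetries;
  // Renamed from "retries", as suggested in the review comment.
  private int waitForOriginalServerRetries = 0;

  WaitForOriginalServerSketch(int maxRetries) {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must be >= 0");
    }
    this.maxRetries = maxRetries;
  }

  /**
   * Returns true if the caller should suspend and check again later,
   * false if it should go ahead with the assignment.
   */
  boolean shouldKeepWaiting(boolean originalServerOnline) {
    if (!originalServerOnline && waitForOriginalServerRetries < maxRetries) {
      waitForOriginalServerRetries++;
      return true;
    }
    // The wait is over (server is back, or the budget is exhausted):
    // reset so a later wait does not inherit a used-up counter.
    waitForOriginalServerRetries = 0;
    return false;
  }

  int currentRetries() {
    return waitForOriginalServerRetries;
  }
}
```

   Without the reset, a second wait on the same instance would see an exhausted counter and skip waiting entirely.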



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java:
##########
@@ -188,6 +196,27 @@ protected boolean waitInitialized(MasterProcedureEnv env) {
     return am.waitMetaLoaded(this) || am.waitMetaAssigned(this, getRegion());
   }
 
+  private void checkAndWaitForOriginalServer(MasterProcedureEnv env, ServerName lastHost)
+    throws ProcedureSuspendedException {
+    ServerManager serverManager = env.getMasterServices().getServerManager();
+    ServerName newNameForServer = serverManager.findServerWithSameHostnamePortWithLock(lastHost);
+    boolean isOnline = serverManager.createDestinationServersList().contains(newNameForServer);
+
+    if (!isOnline && retries < env.getAssignmentManager().getForceRegionRetainmentRetries()) {
+      retries++;
+      LOG.info("Suspending the TRSP PID={} because {} is true and previous host {} "
+        + "for region is not yet online.", this.getProcId(), FORCE_REGION_RETAINMENT, lastHost);
+      setTimeout(env.getAssignmentManager().getForceRegionRetainmentWait());

Review Comment:
   So we do not want to use exponential backoff here?
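
   For comparison with the fixed wait in the hunk above, here is a rough sketch of what a capped exponential backoff could look like. `BackoffSketch`, `backoffMillis`, and the base and cap values are assumptions made for illustration; the PR itself passes a constant from `getForceRegionRetainmentWait()` to `setTimeout`, and this sketch does not use any HBase class.

```java
/** Hypothetical helper showing capped exponential backoff; not HBase code. */
final class BackoffSketch {
  private BackoffSketch() {
  }

  /**
   * Returns baseMillis doubled once per attempt, capped at maxMillis.
   * Attempt 0 yields baseMillis itself.
   */
  static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
    if (baseMillis <= 0 || maxMillis < baseMillis || attempt < 0) {
      throw new IllegalArgumentException("invalid backoff parameters");
    }
    long delay = baseMillis;
    for (int i = 0; i < attempt && delay < maxMillis; i++) {
      // Doubling would exceed the cap (or overflow), so clamp instead.
      delay = delay > maxMillis / 2 ? maxMillis : delay * 2;
    }
    return Math.min(delay, maxMillis);
  }

  public static void main(String[] args) {
    // Print the delay for the first few attempts with a 1 s base and 60 s cap.
    for (int attempt = 0; attempt < 8; attempt++) {
      System.out.println("attempt " + attempt + ": " + backoffMillis(1000L, 60000L, attempt) + " ms");
    }
  }
}
```

   With a scheme like this, the timeout would be computed from the current retry count rather than read as a constant, which trades a longer worst-case wait for fewer wakeups while the original server is still down.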



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
