ndimiduk commented on code in PR #5681:
URL: https://github.com/apache/hbase/pull/5681#discussion_r1490980457


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java:
##########
@@ -227,16 +264,47 @@ ServerName regionServerStartup(RegionServerStartupRequest 
request, int versionNu
     final String hostname =
       request.hasUseThisHostnameInstead() ? 
request.getUseThisHostnameInstead() : isaHostName;
     ServerName sn = ServerName.valueOf(hostname, request.getPort(), 
request.getServerStartCode());
+
+    // Check if the HMaster is configured to reject decommissioned hosts or 
not.
+    // When configured to do so, all hosts of decommissioned RegionServers are 
prevented from
+    // reporting for duty; otherwise, we accept them unless they come back 
with an identical
+    // ServerName that we can find in the decommissioned hosts.
+    // See HBASE-28342 for details.
+    if (shouldHostBeRejected(hostname)) {
+      LOG.warn("Rejecting RegionServer {} from reporting for duty because 
Master is configured "
+        + "to reject decommissioned hosts and this host was marked as such in 
the past.", sn);
+
+      throw new HostIsConsideredDecommissionedException(
+        "Master is configured to reject decommissioned hosts");
+    }
+
     checkClockSkew(sn, request.getServerCurrentTime());
     checkIsDead(sn, "STARTUP");
     if (!checkAndRecordNewServer(sn, ServerMetricsBuilder.of(sn, 
versionNumber, version))) {
-      LOG.warn(
-        "THIS SHOULD NOT HAPPEN, RegionServerStartup" + " could not record the 
server: " + sn);
+      LOG.warn("THIS SHOULD NOT HAPPEN, RegionServerStartup could not record 
the server: {}", sn);
     }
     storage.started(sn);
     return sn;
   }
 
+  /**
+   * Checks if the server is configured to reject decommissioned hosts, if 
that's the case, it then
+   * checks if the host exists in the list of drained servers, and when it's 
there it returns true;
+   * otherwise returns false.
+   * @param hostname The hostname of the server
+   */
+  private boolean shouldHostBeRejected(String hostname) {

Review Comment:
   Why not follow the existing style? Wrap up this logic all inside of a 
`checkDecommissionedStatus(sn)` call, like we do for the other invariants.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java:
##########
@@ -227,16 +264,47 @@ ServerName regionServerStartup(RegionServerStartupRequest 
request, int versionNu
     final String hostname =
       request.hasUseThisHostnameInstead() ? 
request.getUseThisHostnameInstead() : isaHostName;
     ServerName sn = ServerName.valueOf(hostname, request.getPort(), 
request.getServerStartCode());
+
+    // Check if the HMaster is configured to reject decommissioned hosts or 
not.
+    // When configured to do so, all hosts of decommissioned RegionServers are 
prevented from
+    // reporting for duty; otherwise, we accept them unless they come back 
with an identical
+    // ServerName that we can find in the decommissioned hosts.
+    // See HBASE-28342 for details.
+    if (shouldHostBeRejected(hostname)) {
+      LOG.warn("Rejecting RegionServer {} from reporting for duty because 
Master is configured "
+        + "to reject decommissioned hosts and this host was marked as such in 
the past.", sn);
+
+      throw new HostIsConsideredDecommissionedException(
+        "Master is configured to reject decommissioned hosts");
+    }
+
     checkClockSkew(sn, request.getServerCurrentTime());
     checkIsDead(sn, "STARTUP");
     if (!checkAndRecordNewServer(sn, ServerMetricsBuilder.of(sn, 
versionNumber, version))) {
-      LOG.warn(
-        "THIS SHOULD NOT HAPPEN, RegionServerStartup" + " could not record the 
server: " + sn);
+      LOG.warn("THIS SHOULD NOT HAPPEN, RegionServerStartup could not record 
the server: {}", sn);
     }
     storage.started(sn);
     return sn;
   }
 
+  /**
+   * Checks if the server is configured to reject decommissioned hosts, if 
that's the case, it then
+   * checks if the host exists in the list of drained servers, and when it's 
there it returns true;
+   * otherwise returns false.
+   * @param hostname The hostname of the server
+   */
+  private boolean shouldHostBeRejected(String hostname) {
+    boolean hostWasDecommissioned = false;
+    for (ServerName sn : getDrainingServersList()) {

Review Comment:
   Skip the whole loop when `rejectDecommissionedHostsConfig == false`.



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerReportForDuty.java:
##########
@@ -241,6 +243,125 @@ public void run() {
     waitForClusterOnline(master);
   }
 
+  /**
+   * Tests that the RegionServer's reportForDuty gets rejected by the master 
when the master is
+   * configured to reject decommissioned hosts and when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsRejectedByMasterWhenConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to reject decommissioned hosts
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 true);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());
+    rs2 = cluster.addRegionServer();
+    rs2.start();
+    waitForSecondRsStarted();
+
+    // Assert that the number of decommissioned and live hosts didn't change 
and that the hostname
+    // of rs2 matches that of the decommissioned server
+    String rs2HostName = rs2.getRegionServer().getServerName().getHostname();
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, master.getMaster().getLiveRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs2HostName, decommissionedServer.getHostname());
+  }
+
+  /**
+   * Tests that the RegionServer's reportForDuty gets accepted by the master 
when the master is not
+   * configured to reject decommissioned hosts, even when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsAcceptedByMasterWhenNotConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to not reject decommissioned hosts (default behavior)
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 false);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);

Review Comment:
   Is this method effectively asynchronous? Do you need to wait for some 
side-effect before the test can proceed?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java:
##########
@@ -172,6 +174,9 @@ public class ServerManager {
   /** Listeners that are called on server events. */
   private List<ServerListener> listeners = new CopyOnWriteArrayList<>();
 
+  /** Configured value of HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY */
+  private boolean rejectDecommissionedHostsConfig;

Review Comment:
   Use the `volatile` keyword on this field and omit the `synchronized` block 
below.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HostIsConsideredDecommissionedException.java:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+@InterfaceAudience.Public

Review Comment:
   This is an interface between the master and region server. Thus I don't 
think that I should be IA.Public.



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerReportForDuty.java:
##########
@@ -241,6 +243,125 @@ public void run() {
     waitForClusterOnline(master);
   }
 
+  /**
+   * Tests that the RegionServer's reportForDuty gets rejected by the master 
when the master is
+   * configured to reject decommissioned hosts and when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsRejectedByMasterWhenConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to reject decommissioned hosts
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 true);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());
+    rs2 = cluster.addRegionServer();
+    rs2.start();
+    waitForSecondRsStarted();
+
+    // Assert that the number of decommissioned and live hosts didn't change 
and that the hostname
+    // of rs2 matches that of the decommissioned server
+    String rs2HostName = rs2.getRegionServer().getServerName().getHostname();
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, master.getMaster().getLiveRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs2HostName, decommissionedServer.getHostname());
+  }
+
+  /**
+   * Tests that the RegionServer's reportForDuty gets accepted by the master 
when the master is not
+   * configured to reject decommissioned hosts, even when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsAcceptedByMasterWhenNotConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to not reject decommissioned hosts (default behavior)
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 false);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());
+    rs2 = cluster.addRegionServer();
+    rs2.start();
+    waitForSecondRsStarted();
+    
master.getMaster().recommissionRegionServer(rs2.getRegionServer().getServerName(),

Review Comment:
   why recommission this new host? Did you mean to recommission r1s?



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerReportForDuty.java:
##########
@@ -241,6 +243,125 @@ public void run() {
     waitForClusterOnline(master);
   }
 
+  /**
+   * Tests that the RegionServer's reportForDuty gets rejected by the master 
when the master is
+   * configured to reject decommissioned hosts and when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsRejectedByMasterWhenConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to reject decommissioned hosts
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 true);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());
+    rs2 = cluster.addRegionServer();
+    rs2.start();
+    waitForSecondRsStarted();
+
+    // Assert that the number of decommissioned and live hosts didn't change 
and that the hostname
+    // of rs2 matches that of the decommissioned server
+    String rs2HostName = rs2.getRegionServer().getServerName().getHostname();
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, master.getMaster().getLiveRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs2HostName, decommissionedServer.getHostname());
+  }
+
+  /**
+   * Tests that the RegionServer's reportForDuty gets accepted by the master 
when the master is not
+   * configured to reject decommissioned hosts, even when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsAcceptedByMasterWhenNotConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to not reject decommissioned hosts (default behavior)
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 false);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());

Review Comment:
   this was already set, wasn't it?



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerReportForDuty.java:
##########
@@ -241,6 +243,125 @@ public void run() {
     waitForClusterOnline(master);
   }
 
+  /**
+   * Tests that the RegionServer's reportForDuty gets rejected by the master 
when the master is
+   * configured to reject decommissioned hosts and when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsRejectedByMasterWhenConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to reject decommissioned hosts
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 true);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);
+
+    assertEquals(0, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(0, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+
+    // Decommission the region server and tries to re-add it
+    List<ServerName> serversToDecommission = new ArrayList<ServerName>();
+    serversToDecommission.add(rs.getRegionServer().getServerName());
+    master.getMaster().decommissionRegionServers(serversToDecommission, true);
+
+    // Assert that the server is now decommissioned
+    ServerName decommissionedServer = 
master.getMaster().listDecommissionedRegionServers().get(0);
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs.getRegionServer().getServerName().toString(),
+      decommissionedServer.getServerName());
+
+    // Create a second region server
+    cluster.getConfiguration().set(HConstants.REGION_SERVER_IMPL, 
MyRegionServer.class.getName());
+    rs2 = cluster.addRegionServer();
+    rs2.start();
+    waitForSecondRsStarted();
+
+    // Assert that the number of decommissioned and live hosts didn't change 
and that the hostname
+    // of rs2 matches that of the decommissioned server
+    String rs2HostName = rs2.getRegionServer().getServerName().getHostname();
+    assertEquals(1, 
master.getMaster().listDecommissionedRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getDrainingServersList().size());
+    assertEquals(1, master.getMaster().getLiveRegionServers().size());
+    assertEquals(1, 
master.getMaster().getServerManager().getOnlineServers().size());
+    assertEquals(rs2HostName, decommissionedServer.getHostname());
+  }
+
+  /**
+   * Tests that the RegionServer's reportForDuty gets accepted by the master 
when the master is not
+   * configured to reject decommissioned hosts, even when there is a match for 
the joining
+   * RegionServer in the list of decommissioned servers. Test case for 
HBASE-28342.
+   */
+  @Test
+  public void 
testReportForDutyGetsAcceptedByMasterWhenNotConfiguredToRejectDecommissionedHosts()
+    throws Exception {
+    // Start a master and wait for it to become the active/primary master.
+    // Use a random unique port
+    cluster.getConfiguration().setInt(HConstants.MASTER_PORT, 
HBaseTestingUtil.randomFreePort());
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART,
 1);
+    
cluster.getConfiguration().setInt(ServerManager.WAIT_ON_REGIONSERVERS_MAXTOSTART,
 1);
+
+    // Set the cluster to not reject decommissioned hosts (default behavior)
+    
cluster.getConfiguration().setBoolean(HConstants.REJECT_DECOMMISSIONED_HOSTS_KEY,
 false);
+
+    master = cluster.addMaster();
+    rs = cluster.addRegionServer();
+    LOG.debug("Starting master: " + master.getMaster().getServerName());
+    master.start();
+    rs.start();
+
+    waitForClusterOnline(master);

Review Comment:
   Does this wait for the rs to join the cluster, or only wait for the master 
to come online? Or maybe a single rs being present a requirement for the master 
to be online?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java:
##########
@@ -227,16 +264,47 @@ ServerName regionServerStartup(RegionServerStartupRequest 
request, int versionNu
     final String hostname =
       request.hasUseThisHostnameInstead() ? 
request.getUseThisHostnameInstead() : isaHostName;
     ServerName sn = ServerName.valueOf(hostname, request.getPort(), 
request.getServerStartCode());
+
+    // Check if the HMaster is configured to reject decommissioned hosts or 
not.
+    // When configured to do so, all hosts of decommissioned RegionServers are 
prevented from
+    // reporting for duty; otherwise, we accept them unless they come back 
with an identical
+    // ServerName that we can find in the decommissioned hosts.
+    // See HBASE-28342 for details.
+    if (shouldHostBeRejected(hostname)) {
+      LOG.warn("Rejecting RegionServer {} from reporting for duty because 
Master is configured "

Review Comment:
   I don't think this is a warn level log, info is fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to