kfaraz commented on code in PR #17863:
URL: https://github.com/apache/druid/pull/17863#discussion_r2028449944


##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings from
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+
+    if (cloneServers.isEmpty()) {
+      // No servers to be cloned.
+      return params;
+    }
+
+    // Create a map of host to historical.
+    final Map<String, ServerHolder> historicalMap = params.getDruidCluster()
+                                                          .getHistoricals()
+                                                          .values()
+                                                          .stream()
+                                                          .flatMap(Collection::stream)
+                                                          .collect(Collectors.toMap(
+                                                              serverHolder -> serverHolder.getServer().getHost(),
+                                                              serverHolder -> serverHolder
+                                                          ));
+
+    for (Map.Entry<String, String> entry : cloneServers.entrySet()) {
+      log.debug("Handling cloning for mapping: [%s]", entry);
+
+      final String sourceHistoricalName = entry.getKey();
+      final ServerHolder sourceServer = historicalMap.get(sourceHistoricalName);
+
+      if (sourceServer == null) {
+        log.warn(
+            "Could not find source historical [%s]. Skipping over clone mapping [%s].",
+            sourceHistoricalName,
+            entry
+        );
+        continue;
+      }
+
+      final String targetHistoricalName = entry.getValue();
+      final ServerHolder targetServer = historicalMap.get(targetHistoricalName);
+
+      if (targetServer == null) {
+        log.warn(
+            "Could not find target historical [%s]. Skipping over clone mapping [%s].",
+            targetHistoricalName,
+            entry
+        );
+        continue;
+      }
+
+      // Load any segments missing in the clone target.
+      for (DataSegment segment : sourceServer.getProjectedSegments()) {
+        if (!targetServer.getProjectedSegments().contains(segment)) {

Review Comment:
   Either use `targetServer.isProjectedSegment()` here or assign 
`targetServer.getProjectedSegments()` to a set and use it in the next for loop 
as well.
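The hoisted-set version of that check could be sketched standalone like this (plain `String` segment IDs stand in for `DataSegment`, and `segmentsToLoad`/`segmentsToDrop` are hypothetical helper names, not Druid API): the caller materializes each projected set once and reuses it for both the load and the drop pass.

```java
import java.util.HashSet;
import java.util.Set;

public class CloneDiff
{
  // Hypothetical helper: segments projected on the source but missing from the
  // target, i.e. what the clone still needs to load.
  public static Set<String> segmentsToLoad(Set<String> sourceProjected, Set<String> targetProjected)
  {
    final Set<String> toLoad = new HashSet<>(sourceProjected);
    toLoad.removeAll(targetProjected);
    return toLoad;
  }

  // Hypothetical helper: segments projected on the target but not on the
  // source, i.e. what the clone should drop.
  public static Set<String> segmentsToDrop(Set<String> sourceProjected, Set<String> targetProjected)
  {
    final Set<String> toDrop = new HashSet<>(targetProjected);
    toDrop.removeAll(sourceProjected);
    return toDrop;
  }
}
```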



##########
server/src/main/java/org/apache/druid/server/coordinator/stats/Stats.java:
##########
@@ -65,6 +65,12 @@ public static class Segments
     // Values computed in a run
     public static final CoordinatorStat REPLICATION_THROTTLE_LIMIT
         = CoordinatorStat.toDebugOnly("replicationThrottleLimit");
+
+    // Cloned segments in a run
+    public static final CoordinatorStat CLONE_LOAD
+        = CoordinatorStat.toDebugAndEmit("cloneLoad", "segment/clone/assigned/count");
+    public static final CoordinatorStat CLONE_DROP

Review Comment:
   ```suggestion
       public static final CoordinatorStat DROPPED_FROM_CLONE
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorRuntimeParams.java:
##########
@@ -102,6 +105,11 @@ public StrategicSegmentAssigner getSegmentAssigner()
     return segmentAssigner;
   }
 
+  public SegmentLoadQueueManager getLoadQueueManager()

Review Comment:
   This should not be needed, please see the other comment.



##########
server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java:
##########
@@ -74,6 +74,8 @@ public class CoordinatorDynamicConfig
   private final Map<Dimension, String> validDebugDimensions;
 
   private final Set<String> turboLoadingNodes;
+  private final Set<String> unmanagedNodes;

Review Comment:
   Is this needed? Doesn't a server being in the value of the `cloneServers` 
automatically imply that the server is unmanaged?



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Set up three historicals in tier T1, size SIZE_1TB each
+    historicalT11 = createHistorical(1, Tier.T1, SIZE_1TB);
+    historicalT12 = createHistorical(2, Tier.T1, SIZE_1TB);
+    historicalT13 = createHistorical(3, Tier.T1, SIZE_1TB);
+  }
+
+  @Test
+  public void testSimpleCloning()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+    runCoordinatorCycle();
+
+    verifyValue(Metric.ASSIGNED_COUNT, 10L);

Review Comment:
   Maybe filter this metric by server T11 to ensure that all assignments were 
made to T11 and not the clone.
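The idea of filtering an emitted metric by a dimension can be sketched standalone (the `MetricRow` record and `sumWhere` helper are hypothetical, not the simulation framework's API):

```java
import java.util.List;
import java.util.Map;

public class MetricFilter
{
  // Hypothetical metric row, standing in for an emitted coordinator metric.
  public record MetricRow(String name, Map<String, String> dims, long value) {}

  // Sums a metric, keeping only rows whose given dimension matches, e.g.
  // narrowing segment/assigned/count down to a single server.
  public static long sumWhere(List<MetricRow> rows, String name, String dimKey, String dimValue)
  {
    return rows.stream()
               .filter(row -> row.name().equals(name) && dimValue.equals(row.dims().get(dimKey)))
               .mapToLong(MetricRow::value)
               .sum();
  }
}
```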



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Set up three historicals in tier T1, size SIZE_1TB each
+    historicalT11 = createHistorical(1, Tier.T1, SIZE_1TB);
+    historicalT12 = createHistorical(2, Tier.T1, SIZE_1TB);
+    historicalT13 = createHistorical(3, Tier.T1, SIZE_1TB);
+  }
+
+  @Test
+  public void testSimpleCloning()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+    runCoordinatorCycle();
+
+    verifyValue(Metric.ASSIGNED_COUNT, 10L);
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT11.getName(), "description", "LOAD: NORMAL"),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: NORMAL"),
+        10L
+    );
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+  }
+
+  @Test
+  public void testAddingNewHistorical()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    // Run 1: Current state is a historical and clone already in sync.
+    Segments.WIKI_10X1D.forEach(segment -> {
+      historicalT11.addDataSegment(segment);
+      historicalT12.addDataSegment(segment);
+    });
+
+    startSimulation(sim);
+
+    runCoordinatorCycle();
+
+    // Confirm number of segments.
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+
+    // Add a new historical.
+    final DruidServer newHistorical = createHistorical(3, Tier.T1, 10_000);
+    addServer(newHistorical);
+
+    // Run 2: Let the coordinator balance segments.
+    runCoordinatorCycle();
+
+    // Check that segments have been distributed to the new historical and have also been dropped by the clone
+    Assert.assertEquals(5, historicalT11.getTotalSegments());
+    Assert.assertEquals(5, historicalT12.getTotalSegments());
+    Assert.assertEquals(5, newHistorical.getTotalSegments());
+    verifyValue(
+        Stats.Segments.CLONE_DROP.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        5L
+    );
+  }
+
+  @Test
+  public void testCloningServerDisappearsAndRelaunched()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 2).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+
+    // Run 1: All segments are loaded.
+    runCoordinatorCycle();
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+
+    // Target server disappears, loses loaded segments.
+    removeServer(historicalT12);
+    Segments.WIKI_10X1D.forEach(segment -> historicalT12.removeDataSegment(segment.getId()));
+
+    // Run 2: No change in source historical.
+    runCoordinatorCycle();
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(0, historicalT12.getTotalSegments());
+
+    // Server readded
+    addServer(historicalT12);
+
+    // Run 3: Segments recloned.
+    runCoordinatorCycle();
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: NORMAL"),
+        10L
+    );
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+  }
+
+  @Test
+  public void testClonedServerDoesNotFollowReplicationLimit()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X100D)
+                             .withServers(historicalT11)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    Segments.WIKI_10X100D.forEach(segment -> historicalT11.addDataSegment(segment));
+    startSimulation(sim);
+
+    // Run 1: All segments are loaded on the source historical
+    runCoordinatorCycle();
+    Assert.assertEquals(1000, historicalT11.getTotalSegments());
+    Assert.assertEquals(0, historicalT12.getTotalSegments());
+
+    // Clone server now added.
+    addServer(historicalT12);
+
+    // Run 2: Assigns all segments to the cloned historical
+    runCoordinatorCycle();
+
+    Assert.assertEquals(1000, historicalT11.getTotalSegments());
+    Assert.assertEquals(1000, historicalT12.getTotalSegments());
+
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        1000L
+    );
+
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: NORMAL"),
+        1000L
+    );
+  }
+
+  @Test
+  public void testCloningHistoricalWithReplicationLimit()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12, historicalT13)
+                             .withRules(datasource, Load.on(Tier.T1, 2).forever())
+                             .withImmediateSegmentLoading(true)
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .withReplicationThrottleLimit(2)
+                                                         .withMaxSegmentsToMove(0)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+    Segments.WIKI_10X1D.forEach(historicalT13::addDataSegment);
+    startSimulation(sim);
+
+    // Check that only replication count segments are loaded each run and that the cloning server copies it.
+    while (historicalT11.getTotalSegments() < Segments.WIKI_10X1D.size()) {

Review Comment:
   Since we are doing things in a loop here, please add a timeout to the test 
and assert the number of loop iterations in the end.
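The bounded-loop pattern the reviewer asks for could look like this (a hypothetical standalone helper, not the simulation framework; a JUnit `timeout` on the test would additionally bound wall-clock time). It caps the iterations so a stalled simulation fails instead of hanging, and returns the count so the test can assert it against the throttle limit.

```java
import java.util.function.IntUnaryOperator;

public class BoundedLoop
{
  // Hypothetical helper: repeats `step` until `target` is reached, failing if
  // `maxIterations` is exceeded. Returns the number of iterations taken.
  public static int runUntil(int target, IntUnaryOperator step, int maxIterations)
  {
    int value = 0;
    int iterations = 0;
    while (value < target) {
      if (++iterations > maxIterations) {
        throw new AssertionError("exceeded " + maxIterations + " iterations");
      }
      value = step.applyAsInt(value);
    }
    return iterations;
  }
}
```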



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings from
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+
+    if (cloneServers.isEmpty()) {
+      // No servers to be cloned.
+      return params;
+    }
+
+    // Create a map of host to historical.
+    final Map<String, ServerHolder> historicalMap = params.getDruidCluster()
+                                                          .getHistoricals()
+                                                          .values()
+                                                          .stream()
+                                                          .flatMap(Collection::stream)
+                                                          .collect(Collectors.toMap(
+                                                              serverHolder -> serverHolder.getServer().getHost(),
+                                                              serverHolder -> serverHolder
+                                                          ));
+
+    for (Map.Entry<String, String> entry : cloneServers.entrySet()) {
+      log.debug("Handling cloning for mapping: [%s]", entry);
+
+      final String sourceHistoricalName = entry.getKey();
+      final ServerHolder sourceServer = historicalMap.get(sourceHistoricalName);
+
+      if (sourceServer == null) {
+        log.warn(
+            "Could not find source historical [%s]. Skipping over clone mapping [%s].",

Review Comment:
   ```suggestion
            "Could not process clone mapping[%s] as source historical[%s] does not exist.",
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/ServerHolder.java:
##########
@@ -264,11 +289,24 @@ public Map<DataSegment, SegmentAction> getQueuedSegments()
   }
 
   /**
-   * Segments that are expected to be loaded on this server once all the
+   * Counts for segments that are expected to be loaded on this server once all the
    * operations in progress have completed.
    */
-  public SegmentCountsPerInterval getProjectedSegments()
+  public SegmentCountsPerInterval getProjectedSegmentCounts()
+  {
+    return projectedSegmentCounts;
+  }
+
+  public Set<DataSegment> getProjectedSegments()

Review Comment:
   Please add a short javadoc:
   
   ```
   * Segments that are expected to be loaded on this server once all the
   * operations in progress have completed.
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.loading.SegmentLoadQueueManager;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings from
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+    final SegmentLoadQueueManager loadQueueManager = params.getLoadQueueManager();

Review Comment:
   Don't access the load queue manager here.
   Use the pattern followed by `UnloadUnusedSegments` i.e. pass the load queue 
manager into the constructor of this class.
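The constructor-injection pattern being pointed to can be sketched in isolation (the `LoadQueue` interface and `CloneDuty` class below are simplified stand-ins, not the real `SegmentLoadQueueManager` or duty): the duty receives its collaborator once at construction, the way `UnloadUnusedSegments` is wired, instead of pulling it off the runtime params inside `run()`.

```java
import java.util.List;

public class InjectionSketch
{
  // Stand-in for SegmentLoadQueueManager (hypothetical, heavily simplified).
  public interface LoadQueue
  {
    void loadSegment(String server, String segment);
  }

  // The duty takes its collaborator in the constructor, so run() only needs
  // the runtime params for cluster state, not for service lookup.
  public static class CloneDuty
  {
    private final LoadQueue loadQueue;

    public CloneDuty(LoadQueue loadQueue)
    {
      this.loadQueue = loadQueue;
    }

    public void cloneTo(String targetServer, List<String> segments)
    {
      segments.forEach(segment -> loadQueue.loadSegment(targetServer, segment));
    }
  }
}
```

This also makes the duty trivially testable: a lambda can stand in for the load queue.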



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings from
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+
+    if (cloneServers.isEmpty()) {
+      // No servers to be cloned.
+      return params;
+    }
+
+    // Create a map of host to historical.
+    final Map<String, ServerHolder> historicalMap = params.getDruidCluster()
+                                                          .getHistoricals()
+                                                          .values()
+                                                          .stream()
+                                                          .flatMap(Collection::stream)
+                                                          .collect(Collectors.toMap(
+                                                              serverHolder -> serverHolder.getServer().getHost(),
+                                                              serverHolder -> serverHolder
+                                                          ));
+
+    for (Map.Entry<String, String> entry : cloneServers.entrySet()) {
+      log.debug("Handling cloning for mapping: [%s]", entry);
+
+      final String sourceHistoricalName = entry.getKey();
+      final ServerHolder sourceServer = historicalMap.get(sourceHistoricalName);
+
+      if (sourceServer == null) {
+        log.warn(
+            "Could not find source historical [%s]. Skipping over clone 
mapping [%s].",
+            sourceHistoricalName,
+            entry
+        );
+        continue;
+      }
+
+      final String targetHistoricalName = entry.getValue();
+      final ServerHolder targetServer = historicalMap.get(targetHistoricalName);
+
+      if (targetServer == null) {
+        log.warn(
+            "Could not find target historical [%s]. Skipping over clone 
mapping [%s].",

Review Comment:
   ```suggestion
               "Could not process clone mapping[%s] as target historical[%s] 
does not exist.",
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/stats/Stats.java:
##########
@@ -65,6 +65,12 @@ public static class Segments
     // Values computed in a run
     public static final CoordinatorStat REPLICATION_THROTTLE_LIMIT
         = CoordinatorStat.toDebugOnly("replicationThrottleLimit");
+
+    // Cloned segments in a run
+    public static final CoordinatorStat CLONE_LOAD

Review Comment:
   ```suggestion
       public static final CoordinatorStat ASSIGNED_TO_CLONE
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java:
##########
@@ -322,6 +328,18 @@ public boolean getReplicateAfterLoadTimeout()
     return replicateAfterLoadTimeout;
   }
 
+  @JsonProperty
+  public Set<String> getUnmanagedNodes()
+  {
+    return unmanagedNodes;
+  }
+
+  @JsonProperty

Review Comment:
   Please add a short javadoc mentioning what is the key and value of this map.
   We should also document this config.
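   One possible shape for that javadoc (a sketch only — wording, and whether the getter is in fact `getCloneServers`, are up to the author; the key/value orientation is inferred from how `CloneHistoricals` reads the map):

   ```java
     /**
      * Map from source historical host to target historical host. Any segment
      * load or unload assigned to the source (key) is mirrored onto the
      * target (value) by the CloneHistoricals coordinator duty.
      */
     @JsonProperty
     public Map<String, String> getCloneServers()
     {
       return cloneServers;
     }
   ```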



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.loading.SegmentLoadQueueManager;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings based on
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+    final SegmentLoadQueueManager loadQueueManager = params.getLoadQueueManager();
+
+    if (cloneServers.isEmpty()) {
+      // No servers to be cloned.
+      return params;
+    }
+
+    // Create a map of host to historical.
+    final Map<String, ServerHolder> historicalMap = params.getDruidCluster()
+                                                          .getHistoricals()
+                                                          .values()
+                                                          .stream()
+                                                          .flatMap(Collection::stream)
+                                                          .collect(Collectors.toMap(
+                                                              serverHolder -> serverHolder.getServer().getHost(),
+                                                              serverHolder -> serverHolder
+                                                          ));
+
+    for (Map.Entry<String, String> entry : cloneServers.entrySet()) {
+      log.debug("Handling cloning for mapping: [%s]", entry);
+
+      final String sourceHistoricalName = entry.getKey();
+      final ServerHolder sourceServer = historicalMap.get(sourceHistoricalName);
+
+      if (sourceServer == null) {
+        log.warn(
+            "Could not find source historical [%s]. Skipping over clone 
mapping [%s].",
+            sourceHistoricalName,
+            entry
+        );
+        continue;
+      }
+
+      final String targetHistoricalName = entry.getValue();
+      final ServerHolder targetServer = historicalMap.get(targetHistoricalName);
+
+      if (targetServer == null) {
+        log.warn(
+            "Could not find target historical [%s]. Skipping over clone 
mapping [%s].",
+            targetHistoricalName,
+            entry
+        );
+        continue;
+      }
+
+      // Load any segments missing in the clone target.
+      for (DataSegment segment : sourceServer.getProjectedSegments()) {
+        if (!targetServer.getProjectedSegments().contains(segment)) {
+          if (loadQueueManager.loadSegment(segment, targetServer, SegmentAction.LOAD)) {

Review Comment:
   The two ifs can be ANDed.
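   For what it's worth, `&&` short-circuits, so the merged form calls `loadSegment` only when the segment is actually missing from the target, exactly like the nested version. A self-contained demonstration with stubbed-out names (hypothetical stand-ins, not the Druid API):

   ```java
   import java.util.HashSet;
   import java.util.Set;

   public class AndedIfSketch
   {
     static int loadCalls = 0;
     static int cloned = 0;

     // Stub standing in for loadQueueManager.loadSegment(...): counts how often
     // it is invoked and reports whether the load was queued.
     static boolean loadSegment(String segment)
     {
       loadCalls++;
       return true;
     }

     public static void main(String[] args)
     {
       Set<String> sourceSegments = Set.of("seg1", "seg2", "seg3");
       Set<String> targetSegments = new HashSet<>(Set.of("seg1"));

       for (String segment : sourceSegments) {
         // Equivalent to the nested form:
         //   if (!targetSegments.contains(segment)) { if (loadSegment(segment)) { ... } }
         if (!targetSegments.contains(segment) && loadSegment(segment)) {
           cloned++;
         }
       }

       // seg1 is already on the target, so loadSegment runs only for seg2 and seg3.
       System.out.println(cloned + " " + loadCalls);
     }
   }
   ```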



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/CloneHistoricals.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.duty;
+
+import org.apache.druid.java.util.common.logger.Logger;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams;
+import org.apache.druid.server.coordinator.ServerHolder;
+import org.apache.druid.server.coordinator.loading.SegmentAction;
+import org.apache.druid.server.coordinator.loading.SegmentLoadQueueManager;
+import org.apache.druid.server.coordinator.stats.CoordinatorRunStats;
+import org.apache.druid.server.coordinator.stats.Dimension;
+import org.apache.druid.server.coordinator.stats.RowKey;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.apache.druid.timeline.DataSegment;
+
+import java.util.Collection;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Handles cloning of historicals. Given the historical-to-historical clone mappings based on
+ * {@link CoordinatorDynamicConfig#getCloneServers()}, copies any segment load or unload requests from the source
+ * historical to the target historical.
+ */
+public class CloneHistoricals implements CoordinatorDuty
+{
+  private static final Logger log = new Logger(CloneHistoricals.class);
+
+  @Override
+  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
+  {
+    final Map<String, String> cloneServers = params.getCoordinatorDynamicConfig().getCloneServers();
+    final CoordinatorRunStats stats = params.getCoordinatorStats();
+    final SegmentLoadQueueManager loadQueueManager = params.getLoadQueueManager();
+
+    if (cloneServers.isEmpty()) {
+      // No servers to be cloned.
+      return params;
+    }
+
+    // Create a map of host to historical.
+    final Map<String, ServerHolder> historicalMap = params.getDruidCluster()
+                                                          .getHistoricals()
+                                                          .values()
+                                                          .stream()
+                                                          .flatMap(Collection::stream)
+                                                          .collect(Collectors.toMap(
+                                                              serverHolder -> serverHolder.getServer().getHost(),
+                                                              serverHolder -> serverHolder
+                                                          ));
+
+    for (Map.Entry<String, String> entry : cloneServers.entrySet()) {
+      log.debug("Handling cloning for mapping: [%s]", entry);
+
+      final String sourceHistoricalName = entry.getKey();
+      final ServerHolder sourceServer = historicalMap.get(sourceHistoricalName);
+
+      if (sourceServer == null) {
+        log.warn(

Review Comment:
   Should be error, I think, or maybe even a `log.makeAlert`.
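   e.g. (a sketch; Druid's `Logger.makeAlert(...)` returns a builder whose `emit()` sends the alert — the message wording here is illustrative):

   ```java
   log.makeAlert("Could not find source historical [%s] for clone mapping [%s].", sourceHistoricalName, entry)
      .emit();
   ```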



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Setup historicals for 2 tiers, size 10 GB each

Review Comment:
   ```suggestion
       // Setup historicals for 1 tier, size 1 TB each
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java:
##########
@@ -558,6 +559,7 @@ private List<CoordinatorDuty> makeHistoricalManagementDuties()
         new MarkOvershadowedSegmentsAsUnused(deleteSegments),
         new MarkEternityTombstonesAsUnused(deleteSegments),
         new BalanceSegments(config.getCoordinatorPeriod()),
+        new CloneHistoricals(),

Review Comment:
   Pass in the `loadQueueManager` into the constructor here.
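   i.e. roughly (a sketch; assumes `CloneHistoricals` gains a matching field and constructor):

   ```java
   // In DruidCoordinator.makeHistoricalManagementDuties():
   new CloneHistoricals(loadQueueManager),

   // In CloneHistoricals (hypothetical):
   private final SegmentLoadQueueManager loadQueueManager;

   public CloneHistoricals(SegmentLoadQueueManager loadQueueManager)
   {
     this.loadQueueManager = loadQueueManager;
   }
   ```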



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Setup historicals for 2 tiers, size 10 GB each
+    historicalT11 = createHistorical(1, Tier.T1, SIZE_1TB);
+    historicalT12 = createHistorical(2, Tier.T1, SIZE_1TB);
+    historicalT13 = createHistorical(3, Tier.T1, SIZE_1TB);
+  }
+
+  @Test
+  public void testSimpleCloning()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+    runCoordinatorCycle();
+
+    verifyValue(Metric.ASSIGNED_COUNT, 10L);
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT11.getName(), "description", "LOAD: 
NORMAL"),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: 
NORMAL"),
+        10L
+    );
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());

Review Comment:
   Also, compare the set of segments present on each server.
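   Something along these lines, assuming `DruidServer#iterateAllSegments()` is available to the test (hypothetical usage; adjust to whatever accessor the simulation harness exposes):

   ```java
   Assert.assertEquals(
       ImmutableSet.copyOf(historicalT11.iterateAllSegments()),
       ImmutableSet.copyOf(historicalT12.iterateAllSegments())
   );
   ```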



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Setup historicals for 2 tiers, size 10 GB each
+    historicalT11 = createHistorical(1, Tier.T1, SIZE_1TB);
+    historicalT12 = createHistorical(2, Tier.T1, SIZE_1TB);
+    historicalT13 = createHistorical(3, Tier.T1, SIZE_1TB);
+  }
+
+  @Test
+  public void testSimpleCloning()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)

Review Comment:
   Why is this false?
   Keep this to true for all tests, except one test where you set `smartSegmentLoading` to false and `replicationThrottleLimit` to a low value (say 1).



##########
server/src/test/java/org/apache/druid/server/coordinator/simulate/HistoricalCloningTest.java:
##########
@@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.server.coordinator.simulate;
+
+import org.apache.druid.client.DruidServer;
+import org.apache.druid.segment.TestDataSource;
+import org.apache.druid.server.coordinator.CoordinatorDynamicConfig;
+import org.apache.druid.server.coordinator.stats.Stats;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.Set;
+
+public class HistoricalCloningTest extends CoordinatorSimulationBaseTest
+{
+  private static final long SIZE_1TB = 1_000_000;
+
+  private DruidServer historicalT11;
+  private DruidServer historicalT12;
+  private DruidServer historicalT13;
+
+  private final String datasource = TestDataSource.WIKI;
+
+  @Override
+  public void setUp()
+  {
+    // Setup historicals for 2 tiers, size 10 GB each
+    historicalT11 = createHistorical(1, Tier.T1, SIZE_1TB);
+    historicalT12 = createHistorical(2, Tier.T1, SIZE_1TB);
+    historicalT13 = createHistorical(3, Tier.T1, SIZE_1TB);
+  }
+
+  @Test
+  public void testSimpleCloning()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 
1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         
.withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         
.withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         
.withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+    runCoordinatorCycle();
+
+    verifyValue(Metric.ASSIGNED_COUNT, 10L);
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT11.getName(), "description", "LOAD: 
NORMAL"),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: 
NORMAL"),
+        10L
+    );
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+  }
+
+  @Test
+  public void testAddingNewHistorical()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    // Run 1: Current state is a historical and clone already in sync.
+    Segments.WIKI_10X1D.forEach(segment -> {
+      historicalT11.addDataSegment(segment);
+      historicalT12.addDataSegment(segment);
+    });
+
+    startSimulation(sim);
+
+    runCoordinatorCycle();
+
+    // Confirm number of segments.
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+
+    // Add a new historical.
+    final DruidServer newHistorical = createHistorical(3, Tier.T1, 10_000);
+    addServer(newHistorical);
+
+    // Run 2: Let the coordinator balance segments.
+    runCoordinatorCycle();
+
+    // Check that segments have been distributed to the new historical and have also been dropped by the clone
+    Assert.assertEquals(5, historicalT11.getTotalSegments());
+    Assert.assertEquals(5, historicalT12.getTotalSegments());
+    Assert.assertEquals(5, newHistorical.getTotalSegments());
+    verifyValue(
+        Stats.Segments.CLONE_DROP.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        5L
+    );
+  }
+
+  @Test
+  public void testCloningServerDisappearsAndRelaunched()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X1D)
+                             .withServers(historicalT11, historicalT12)
+                             .withRules(datasource, Load.on(Tier.T1, 2).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)
+                                                         .build()
+                             )
+                             .withImmediateSegmentLoading(true)
+                             .build();
+
+    startSimulation(sim);
+
+    // Run 1: All segments are loaded.
+    runCoordinatorCycle();
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+
+    // Target server disappears, loses loaded segments.
+    removeServer(historicalT12);
+    Segments.WIKI_10X1D.forEach(segment -> historicalT12.removeDataSegment(segment.getId()));
+
+    // Run 2: No change in source historical.
+    runCoordinatorCycle();
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(0, historicalT12.getTotalSegments());
+
+    // Server readded
+    addServer(historicalT12);
+
+    // Run 3: Segments recloned.
+    runCoordinatorCycle();
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+    verifyValue(
+        Stats.Segments.CLONE_LOAD.getMetricName(),
+        Map.of("server", historicalT12.getName()),
+        10L
+    );
+    verifyValue(
+        Metric.SUCCESS_ACTIONS,
+        Map.of("server", historicalT12.getName(), "description", "LOAD: 
NORMAL"),
+        10L
+    );
+
+    Assert.assertEquals(10, historicalT11.getTotalSegments());
+    Assert.assertEquals(10, historicalT12.getTotalSegments());
+  }
+
+  @Test
+  public void testClonedServerDoesNotFollowReplicationLimit()
+  {
+    final CoordinatorSimulation sim =
+        CoordinatorSimulation.builder()
+                             .withSegments(Segments.WIKI_10X100D)
+                             .withServers(historicalT11)
+                             .withRules(datasource, Load.on(Tier.T1, 1).forever())
+                             .withDynamicConfig(
+                                 CoordinatorDynamicConfig.builder()
+                                                         .withCloneServers(Map.of(historicalT11.getHost(), historicalT12.getHost()))
+                                                         .withUnmanagedNodes(Set.of(historicalT12.getHost()))
+                                                         .withSmartSegmentLoading(false)

Review Comment:
   Please specify replication throttle limit explicitly here.



##########
server/src/main/java/org/apache/druid/server/coordinator/SegmentCountsPerInterval.java:
##########
@@ -61,6 +66,11 @@ public long getTotalSegmentBytes()
     return totalSegmentBytes;
   }
 
+  public Set<DataSegment> getSegments()

Review Comment:
   On second thought, since `isProjectedSegment` does not account for it, maybe we shouldn't account for it here either.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
