[jira] [Work logged] (HDDS-1285) Implement actions need to be taken after chill mode exit wait time

ASF GitHub Bot (JIRA) Thu, 11 Apr 2019 10:07:23 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-1285?focusedWorklogId=226176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226176
 ]


ASF GitHub Bot logged work on HDDS-1285:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Apr/19 17:06
            Start Date: 11/Apr/19 17:06
    Worklog Time Spent: 10m 
      Work Description: nandakumar131 commented on pull request #612: 
HDDS-1285. Implement actions need to be taken after chill mode exit w…
URL: https://github.com/apache/hadoop/pull/612#discussion_r274528380
 
 

 ##########
 File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/chillmode/TestSCMChillModeWithPipelineRules.java
 ##########
 @@ -0,0 +1,203 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.chillmode;
+
+import org.apache.hadoop.hdds.HddsConfigKeys;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.container.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationActivityStatus;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
+import org.apache.hadoop.hdds.scm.server.StorageContainerManager;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.util.List;
+import java.util.concurrent.TimeoutException;
+
+import static org.junit.Assert.fail;
+
+/**
+ * This class tests SCM Chill mode with pipeline rules.
+ */
+
+public class TestSCMChillModeWithPipelineRules {
+
+  private static MiniOzoneCluster cluster;
+  private OzoneConfiguration conf = new OzoneConfiguration();
+  private PipelineManager pipelineManager;
+  private MiniOzoneCluster.Builder clusterBuilder;
+
+  @Rule
+  public TemporaryFolder temporaryFolder = new TemporaryFolder();
+
+  public void setup(int numDatanodes) throws Exception {
+    conf.set(HddsConfigKeys.OZONE_METADATA_DIRS,
+        temporaryFolder.newFolder().toString());
+    conf.setBoolean(
+        HddsConfigKeys.HDDS_SCM_CHILLMODE_PIPELINE_AVAILABILITY_CHECK,
+        true);
+    conf.set(HddsConfigKeys.HDDS_SCM_WAIT_TIME_AFTER_CHILL_MODE_EXIT, "10s");
+    conf.set(ScmConfigKeys.OZONE_SCM_PIPELINE_CREATION_INTERVAL, "10s");
+    clusterBuilder = MiniOzoneCluster.newBuilder(conf)
+        .setNumDatanodes(numDatanodes)
+        .setHbInterval(1000)
+        .setHbProcessorInterval(1000);
+
+    cluster = clusterBuilder.build();
+    cluster.waitForClusterToBeReady();
+    StorageContainerManager scm = cluster.getStorageContainerManager();
+    pipelineManager = scm.getPipelineManager();
+  }
+
+
+  @Test
+  public void testScmChillMode() throws Exception {
+
+    int datanodeCount = 6;
+    setup(datanodeCount);
+
+    waitForRatis3NodePipelines(datanodeCount/3);
+    waitForRatis1NodePipelines(datanodeCount);
+
+    int totalPipelineCount = datanodeCount + (datanodeCount/3);
+
+    //Cluster is started successfully
+    cluster.stop();
+
+    cluster.restartOzoneManager();
+    cluster.restartStorageContainerManager(false);
+
+    pipelineManager =
+        cluster.getStorageContainerManager().getPipelineManager();
+    List<Pipeline> pipelineList =
+        pipelineManager.getPipelines(HddsProtos.ReplicationType.RATIS,
+            HddsProtos.ReplicationFactor.THREE);
+
+
+    pipelineList.get(0).getNodes().forEach(datanodeDetails -> {
+      try {
+        cluster.restartHddsDatanode(datanodeDetails, false);
+      } catch (Exception ex) {
+        fail("Datanode restart failed");
+      }
+    });
+
+
+    SCMChillModeManager scmChillModeManager =
+        cluster.getStorageContainerManager().getScmChillModeManager();
+
+
+    // Ceil(0.1 * 2) is 1; since one pipeline is healthy, the healthy
+    // pipeline rule is satisfied.
+
+    GenericTestUtils.waitFor(() ->
+        scmChillModeManager.getHealthyPipelineChillModeRule()
+            .validate(), 1000, 60000);
+
+    // As Ceil(0.9 * 2) is 2, and no datanodes from the second pipeline have
+    // reported, this rule is not met yet.
+    GenericTestUtils.waitFor(() ->
+        !scmChillModeManager.getOneReplicaPipelineChillModeRule()
+            .validate(), 1000, 60000);
+
+    Assert.assertTrue(cluster.getStorageContainerManager().isInChillMode());
 
 Review comment:
   Can it happen that the restarted datanodes register with SCM before this
   assertion runs, so that SCM is already out of chill mode?
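
   The threshold arithmetic in the test comments above can be sketched as
   follows. This is a minimal illustration, not the SCM implementation: the
   class and method names are hypothetical, and the 0.1 and 0.9 fractions and
   the count of two three-node pipelines are taken from the test and its
   comments.

```java
/** Hypothetical sketch of the rule thresholds described in the test comments. */
final class ChillModeThresholdSketch {

  /** Minimum number of pipelines a rule needs: ceil(fraction * totalPipelines). */
  static int requiredPipelines(double fraction, int totalPipelines) {
    return (int) Math.ceil(fraction * totalPipelines);
  }

  public static void main(String[] args) {
    // datanodeCount / 3 in the test: six datanodes give two three-node pipelines.
    int ratisThreePipelines = 6 / 3;

    // Fractions assumed from the test comments, not read from any config.
    int healthyNeeded = requiredPipelines(0.1, ratisThreePipelines);
    int oneReplicaNeeded = requiredPipelines(0.9, ratisThreePipelines);

    System.out.println("healthy pipeline rule needs " + healthyNeeded);
    System.out.println("one-replica pipeline rule needs " + oneReplicaNeeded);
  }
}
```

   Under these assumptions the first rule needs one pipeline and the second
   needs two, so the assertion on isInChillMode() only holds while the
   second pipeline's datanodes have not yet reported.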
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 226176)
    Time Spent: 2h  (was: 1h 50m)

> Implement actions need to be taken after chill mode exit wait time
> ------------------------------------------------------------------
>
>                 Key: HDDS-1285
>                 URL: https://issues.apache.org/jira/browse/HDDS-1285
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> # Destroy and close the pipelines.
> # Close all the containers on the pipeline.
> # Trigger pipeline creation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
