[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222922
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 04/Apr/19 11:03
Start Date: 04/Apr/19 11:03
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #662: 
HDDS-1207. Refactor Container Report Processing logic and plugin new 
Replication Manager.
URL: https://github.com/apache/hadoop/pull/662
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222922)
Time Spent: 1h 40m  (was: 1.5h)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.
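
For orientation, the shape of the refactored wiring can be sketched from the diffs quoted in the review comments below. This is hypothetical glue code, not part of the patch: the class ScmWiringSketch and the method wire() are invented for illustration, and only the two-argument ContainerReportHandler constructor and ReplicationManager's start()/start(delay) methods come from the patch itself.

import org.apache.hadoop.hdds.scm.container.ContainerManager;
import org.apache.hadoop.hdds.scm.container.ContainerReportHandler;
import org.apache.hadoop.hdds.scm.container.ReplicationManager;
import org.apache.hadoop.hdds.scm.node.NodeManager;

/** Hypothetical wiring sketch; illustrates the refactored API only. */
final class ScmWiringSketch {

  static ContainerReportHandler wire(final NodeManager nodeManager,
      final ContainerManager containerManager,
      final ReplicationManager replicationManager) {
    // After the refactor the handler needs only a NodeManager and a
    // ContainerManager; the old PipelineManager and
    // ReplicationActivityStatus constructor arguments are gone.
    final ContainerReportHandler reportHandler =
        new ContainerReportHandler(nodeManager, containerManager);

    // Replication decisions now live in the separate ReplicationManager,
    // which is started on its own; start(delayMillis) is the overload
    // discussed in the review comments below.
    replicationManager.start();
    return reportHandler;
  }
}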



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=77&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-77
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 03/Apr/19 12:01
Start Date: 03/Apr/19 12:01
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#issuecomment-479459525
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 25 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 7 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 30 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1050 | trunk passed |
   | +1 | compile | 74 | trunk passed |
   | +1 | checkstyle | 27 | trunk passed |
   | +1 | mvnsite | 79 | trunk passed |
   | +1 | shadedclient | 769 | branch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 110 | trunk passed |
   | +1 | javadoc | 55 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 10 | Maven dependency ordering for patch |
   | +1 | mvninstall | 68 | the patch passed |
   | +1 | compile | 66 | the patch passed |
   | +1 | javac | 66 | the patch passed |
   | +1 | checkstyle | 22 | the patch passed |
   | +1 | mvnsite | 59 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 720 | patch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 124 | the patch passed |
   | +1 | javadoc | 58 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 67 | common in the patch passed. |
   | +1 | unit | 100 | server-scm in the patch passed. |
   | +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
   | | | 3592 | |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-662/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/662 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 26351f9c94b1 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 8b6deeb |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-662/2/testReport/ |
   | Max. process+thread count | 434 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm U: hadoop-hdds |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-662/2/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 77)
Time Spent: 1.5h  (was: 1h 20m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=13&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-13
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 03/Apr/19 08:51
Start Date: 03/Apr/19 08:51
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #662: 
HDDS-1207. Refactor Container Report Processing logic and plugin new 
Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271639382
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -146,68 +111,89 @@ public void onMessage(final ContainerReportFromDatanode 
reportFromDatanode,
 
   }
 
+  /**
+   * Processes the ContainerReport.
+   *
+   * @param datanodeDetails Datanode from which this report was received
+   * @param replicas list of ContainerReplicaProto
+   */
   private void processContainerReplicas(final DatanodeDetails datanodeDetails,
-  final List<ContainerReplicaProto> replicas,
-  final EventPublisher publisher) {
-final PendingDeleteStatusList pendingDeleteStatusList =
-new PendingDeleteStatusList(datanodeDetails);
+  final List<ContainerReplicaProto> replicas) {
 for (ContainerReplicaProto replicaProto : replicas) {
   try {
-final ContainerID containerID = ContainerID.valueof(
-replicaProto.getContainerID());
-
-ReportHandlerHelper.processContainerReplica(containerManager,
-containerID, replicaProto, datanodeDetails, publisher, LOG);
-
-final ContainerInfo containerInfo = containerManager
-.getContainer(containerID);
-
-if (containerInfo.getDeleteTransactionId() >
-replicaProto.getDeleteTransactionId()) {
-  pendingDeleteStatusList
-  .addPendingDeleteStatus(replicaProto.getDeleteTransactionId(),
-  containerInfo.getDeleteTransactionId(),
-  containerInfo.getContainerID());
-}
+processContainerReplica(datanodeDetails, replicaProto);
   } catch (ContainerNotFoundException e) {
-LOG.error("Received container report for an unknown container {} from"
-+ " datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Received container report for an unknown container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   } catch (IOException e) {
-LOG.error("Exception while processing container report for container"
-+ " {} from datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Exception while processing container report for container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   }
 }
-if (pendingDeleteStatusList.getNumPendingDeletes() > 0) {
-  publisher.fireEvent(SCMEvents.PENDING_DELETE_STATUS,
-  pendingDeleteStatusList);
-}
   }
 
-  private void checkReplicationState(ContainerID containerID,
-  EventPublisher publisher) {
-try {
-  ContainerInfo container = containerManager.getContainer(containerID);
-  replicateIfNeeded(container, publisher);
-} catch (ContainerNotFoundException ex) {
-  LOG.warn("Container is missing from containerStateManager. Can't request 
"
-  + "replication. {} {}", containerID, ex);
+  /**
+   * Process the missing replica on the given datanode.
+   *
+   * @param datanodeDetails DatanodeDetails
+   * @param missingReplicas ContainerID which are missing on the given datanode
+   */
+  private void processMissingReplicas(final DatanodeDetails datanodeDetails,
+  final Set<ContainerID> missingReplicas) {
+for (ContainerID id : missingReplicas) {
+  try {
+containerManager.getContainerReplicas(id).stream()
+.filter(replica -> replica.getDatanodeDetails()
+.equals(datanodeDetails)).findFirst()
+.ifPresent(replica -> {
+  try {
+containerManager.removeContainerReplica(id, replica);
+  } catch (ContainerNotFoundException |
+  ContainerReplicaNotFoundException ignored) {
+// This should not happen, but even if it happens, not an issue
+  }
+});
+  } catch (ContainerNotFoundException e) {
+LOG.warn("Cannot remove container replica, container {} not found.",
+id, e);
+  }
 }
-
   }
 
-  private void replicateIfNeeded(ContainerInfo container,
-  EventPublisher publisher) throws ContainerNotFoundException {
-if (!container.isOpen() && replicationStatus.isReplicationEnabled()) {
-  final int existingReplicas = containerManager
-  

[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=10&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-10
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 03/Apr/19 08:47
Start Date: 03/Apr/19 08:47
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #662: 
HDDS-1207. Refactor Container Report Processing logic and plugin new 
Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271637452
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -15,129 +15,94 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.hadoop.hdds.scm.container;
 
-import java.io.IOException;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Set;
-import java.util.stream.Collectors;
+package org.apache.hadoop.hdds.scm.container;
 
 import org.apache.hadoop.hdds.protocol.DatanodeDetails;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReplicaProto;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
 import org.apache.hadoop.hdds.scm.block.PendingDeleteStatusList;
-import org.apache.hadoop.hdds.scm.container.replication
-.ReplicationActivityStatus;
-import org.apache.hadoop.hdds.scm.container.replication.ReplicationRequest;
 import org.apache.hadoop.hdds.scm.events.SCMEvents;
 import org.apache.hadoop.hdds.scm.node.NodeManager;
 import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
-import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
-import org.apache.hadoop.hdds.scm.server
-.SCMDatanodeHeartbeatDispatcher.ContainerReportFromDatanode;
+import org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher
+.ContainerReportFromDatanode;
 import org.apache.hadoop.hdds.server.events.EventHandler;
 import org.apache.hadoop.hdds.server.events.EventPublisher;
-
-import com.google.common.base.Preconditions;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
 /**
  * Handles container reports from datanode.
  */
-public class ContainerReportHandler implements
-EventHandler<ContainerReportFromDatanode> {
+public class ContainerReportHandler extends AbstractContainerReportHandler
+implements EventHandler<ContainerReportFromDatanode> {
 
   private static final Logger LOG =
   LoggerFactory.getLogger(ContainerReportHandler.class);
 
   private final NodeManager nodeManager;
-  private final PipelineManager pipelineManager;
   private final ContainerManager containerManager;
-  private final ReplicationActivityStatus replicationStatus;
 
+  /**
+   * Constructs ContainerReportHandler instance with the
+   * given NodeManager and ContainerManager instance.
+   *
+   * @param nodeManager NodeManager instance
+   * @param containerManager ContainerManager instance
+   */
   public ContainerReportHandler(final NodeManager nodeManager,
-  final PipelineManager pipelineManager,
-  final ContainerManager containerManager,
-  final ReplicationActivityStatus replicationActivityStatus) {
-Preconditions.checkNotNull(nodeManager);
-Preconditions.checkNotNull(pipelineManager);
-Preconditions.checkNotNull(containerManager);
-Preconditions.checkNotNull(replicationActivityStatus);
+final ContainerManager containerManager) {
+super(containerManager, LOG);
 this.nodeManager = nodeManager;
-this.pipelineManager = pipelineManager;
 this.containerManager = containerManager;
-this.replicationStatus = replicationActivityStatus;
   }
 
+  /**
+   * Process the container reports from datanodes.
+   *
+   * @param reportFromDatanode Container Report
+   * @param publisher EventPublisher reference
+   */
   @Override
   public void onMessage(final ContainerReportFromDatanode reportFromDatanode,
-  final EventPublisher publisher) {
+final EventPublisher publisher) {
 
 final DatanodeDetails datanodeDetails =
 reportFromDatanode.getDatanodeDetails();
-
 final ContainerReportsProto containerReport =
 reportFromDatanode.getReport();
 
 try {
+  final List<ContainerReplicaProto> replicas =
+  containerReport.getReportsList();
+  final Set<ContainerID> containersInSCM =
+  nodeManager.getContainers(datanodeDetails);
 
-  final List<ContainerReplicaProto> replicas = containerReport
-  .getReportsList();
-
-  // ContainerIDs which SCM expects this datanode to have.
-  final Set<ContainerID> expectedContainerIDs = nodeManager
-  .getContainers(datanodeDetails);
-
-  // ContainerIDs that 

[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222008
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271495280
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -15,129 +15,94 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.hadoop.hdds.scm.container;
 
-import java.io.IOException;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Set;
-import java.util.stream.Collectors;
+package org.apache.hadoop.hdds.scm.container;
 
 import org.apache.hadoop.hdds.protocol.DatanodeDetails;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReplicaProto;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
 import org.apache.hadoop.hdds.scm.block.PendingDeleteStatusList;
-import org.apache.hadoop.hdds.scm.container.replication
-.ReplicationActivityStatus;
-import org.apache.hadoop.hdds.scm.container.replication.ReplicationRequest;
 import org.apache.hadoop.hdds.scm.events.SCMEvents;
 import org.apache.hadoop.hdds.scm.node.NodeManager;
 import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
-import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
-import org.apache.hadoop.hdds.scm.server
-.SCMDatanodeHeartbeatDispatcher.ContainerReportFromDatanode;
+import org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher
+.ContainerReportFromDatanode;
 import org.apache.hadoop.hdds.server.events.EventHandler;
 import org.apache.hadoop.hdds.server.events.EventPublisher;
-
-import com.google.common.base.Preconditions;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
 /**
  * Handles container reports from datanode.
  */
-public class ContainerReportHandler implements
-EventHandler<ContainerReportFromDatanode> {
+public class ContainerReportHandler extends AbstractContainerReportHandler
+implements EventHandler<ContainerReportFromDatanode> {
 
   private static final Logger LOG =
   LoggerFactory.getLogger(ContainerReportHandler.class);
 
   private final NodeManager nodeManager;
-  private final PipelineManager pipelineManager;
   private final ContainerManager containerManager;
-  private final ReplicationActivityStatus replicationStatus;
 
+  /**
+   * Constructs ContainerReportHandler instance with the
+   * given NodeManager and ContainerManager instance.
+   *
+   * @param nodeManager NodeManager instance
+   * @param containerManager ContainerManager instance
+   */
   public ContainerReportHandler(final NodeManager nodeManager,
-  final PipelineManager pipelineManager,
-  final ContainerManager containerManager,
-  final ReplicationActivityStatus replicationActivityStatus) {
-Preconditions.checkNotNull(nodeManager);
-Preconditions.checkNotNull(pipelineManager);
-Preconditions.checkNotNull(containerManager);
-Preconditions.checkNotNull(replicationActivityStatus);
+final ContainerManager containerManager) {
+super(containerManager, LOG);
 this.nodeManager = nodeManager;
-this.pipelineManager = pipelineManager;
 this.containerManager = containerManager;
-this.replicationStatus = replicationActivityStatus;
   }
 
+  /**
+   * Process the container reports from datanodes.
+   *
+   * @param reportFromDatanode Container Report
+   * @param publisher EventPublisher reference
+   */
   @Override
   public void onMessage(final ContainerReportFromDatanode reportFromDatanode,
-  final EventPublisher publisher) {
+final EventPublisher publisher) {
 
 final DatanodeDetails datanodeDetails =
 reportFromDatanode.getDatanodeDetails();
-
 final ContainerReportsProto containerReport =
 reportFromDatanode.getReport();
 
 try {
+  final List<ContainerReplicaProto> replicas =
+  containerReport.getReportsList();
+  final Set<ContainerID> containersInSCM =
+  nodeManager.getContainers(datanodeDetails);
 
-  final List<ContainerReplicaProto> replicas = containerReport
-  .getReportsList();
-
-  // ContainerIDs which SCM expects this datanode to have.
-  final Set<ContainerID> expectedContainerIDs = nodeManager
-  .getContainers(datanodeDetails);
-
-  // ContainerIDs that this 

[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222007
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271487673
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
 ##
 @@ -160,15 +161,43 @@ public ReplicationManager(final Configuration conf,
* Starts Replication Monitor thread.
*/
   public synchronized void start() {
+start(0);
+  }
+
+  /**
+   * Starts Replication Monitor thread after the given initial delay.
+   *
+   * @param delay initial delay in milliseconds
+   */
+  public void start(final long delay) {
 if (!running) {
-  LOG.info("Starting Replication Monitor Thread.");
   running = true;
-  replicationMonitor.start();
+  CompletableFuture.runAsync(() -> {
 
 Review comment:
   Don't use the forkJoin commonPool. It has very few threads and can be easily 
exhausted. We saw this to be a frequent issue in unit tests.
   
   Instead use the overload that accepts an Executor.
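
A minimal, self-contained illustration of that suggestion, assuming a dedicated single-thread executor; the class name DedicatedExecutorExample and the executor name are made up for the example, and only CompletableFuture.runAsync(Runnable, Executor) from the JDK is relied on:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// CompletableFuture.runAsync(Runnable) schedules the task on
// ForkJoinPool.commonPool(), which is shared JVM-wide and easy to starve.
// Passing an explicit Executor keeps the monitor task on a pool the caller owns.
public final class DedicatedExecutorExample {
  public static void main(String[] args) throws Exception {
    ExecutorService replicationMonitorExecutor =
        Executors.newSingleThreadExecutor(r -> {
          Thread t = new Thread(r, "ReplicationMonitor");
          t.setDaemon(true);
          return t;
        });

    CompletableFuture<Void> monitor = CompletableFuture.runAsync(
        () -> System.out.println("replication monitor running on "
            + Thread.currentThread().getName()),
        replicationMonitorExecutor);  // explicit executor, not the common pool

    monitor.join();
    replicationMonitorExecutor.shutdown();
    replicationMonitorExecutor.awaitTermination(5, TimeUnit.SECONDS);
  }
}

Keeping the monitor on its own executor also makes it straightforward to shut the thread down deterministically in unit tests.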
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222007)
Time Spent: 0.5h  (was: 20m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222009
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271487930
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
 ##
 @@ -160,15 +161,43 @@ public ReplicationManager(final Configuration conf,
* Starts Replication Monitor thread.
*/
   public synchronized void start() {
+start(0);
+  }
+
+  /**
+   * Starts Replication Monitor thread after the given initial delay.
+   *
+   * @param delay initial delay in milliseconds
+   */
+  public void start(final long delay) {
 if (!running) {
-  LOG.info("Starting Replication Monitor Thread.");
   running = true;
-  replicationMonitor.start();
+  CompletableFuture.runAsync(() -> {
+try {
+  LOG.info("Replication Monitor Thread will be started" +
+  " in {} milliseconds.", delay);
+  Thread.sleep(delay);
+} catch (InterruptedException ignored) {
+  // InterruptedException is ignored.
 
 Review comment:
   Set interrupted flag here.
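
A minimal standalone sketch of that fix, using only JDK classes; the names InterruptAwareDelayExample and delayMillis are invented for the example:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// When the initial-delay sleep is interrupted, restore the thread's interrupt
// status instead of swallowing it, so the executor and any later blocking call
// can observe the interruption.
public final class InterruptAwareDelayExample {
  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    long delayMillis = 100;

    CompletableFuture<Void> task = CompletableFuture.runAsync(() -> {
      try {
        Thread.sleep(delayMillis);           // initial delay before starting work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // re-set the interrupted flag
        return;                              // bail out instead of starting work
      }
      System.out.println("monitor started after " + delayMillis + " ms");
    }, executor);

    task.join();
    executor.shutdown();
  }
}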
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222009)
Time Spent: 50m  (was: 40m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222010
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271502348
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -146,68 +111,89 @@ public void onMessage(final ContainerReportFromDatanode 
reportFromDatanode,
 
   }
 
+  /**
+   * Processes the ContainerReport.
+   *
+   * @param datanodeDetails Datanode from which this report was received
+   * @param replicas list of ContainerReplicaProto
+   */
   private void processContainerReplicas(final DatanodeDetails datanodeDetails,
-  final List<ContainerReplicaProto> replicas,
-  final EventPublisher publisher) {
-final PendingDeleteStatusList pendingDeleteStatusList =
-new PendingDeleteStatusList(datanodeDetails);
+  final List<ContainerReplicaProto> replicas) {
 for (ContainerReplicaProto replicaProto : replicas) {
   try {
-final ContainerID containerID = ContainerID.valueof(
-replicaProto.getContainerID());
-
-ReportHandlerHelper.processContainerReplica(containerManager,
-containerID, replicaProto, datanodeDetails, publisher, LOG);
-
-final ContainerInfo containerInfo = containerManager
-.getContainer(containerID);
-
-if (containerInfo.getDeleteTransactionId() >
-replicaProto.getDeleteTransactionId()) {
-  pendingDeleteStatusList
-  .addPendingDeleteStatus(replicaProto.getDeleteTransactionId(),
-  containerInfo.getDeleteTransactionId(),
-  containerInfo.getContainerID());
-}
+processContainerReplica(datanodeDetails, replicaProto);
   } catch (ContainerNotFoundException e) {
-LOG.error("Received container report for an unknown container {} from"
-+ " datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Received container report for an unknown container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   } catch (IOException e) {
-LOG.error("Exception while processing container report for container"
-+ " {} from datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Exception while processing container report for container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   }
 }
-if (pendingDeleteStatusList.getNumPendingDeletes() > 0) {
-  publisher.fireEvent(SCMEvents.PENDING_DELETE_STATUS,
-  pendingDeleteStatusList);
-}
   }
 
-  private void checkReplicationState(ContainerID containerID,
-  EventPublisher publisher) {
-try {
-  ContainerInfo container = containerManager.getContainer(containerID);
-  replicateIfNeeded(container, publisher);
-} catch (ContainerNotFoundException ex) {
-  LOG.warn("Container is missing from containerStateManager. Can't request 
"
-  + "replication. {} {}", containerID, ex);
+  /**
+   * Process the missing replica on the given datanode.
+   *
+   * @param datanodeDetails DatanodeDetails
+   * @param missingReplicas ContainerID which are missing on the given datanode
+   */
+  private void processMissingReplicas(final DatanodeDetails datanodeDetails,
+  final Set<ContainerID> missingReplicas) {
+for (ContainerID id : missingReplicas) {
+  try {
+containerManager.getContainerReplicas(id).stream()
+.filter(replica -> replica.getDatanodeDetails()
+.equals(datanodeDetails)).findFirst()
+.ifPresent(replica -> {
+  try {
+containerManager.removeContainerReplica(id, replica);
+  } catch (ContainerNotFoundException |
+  ContainerReplicaNotFoundException ignored) {
+// This should not happen, but even if it happens, not an issue
+  }
+});
+  } catch (ContainerNotFoundException e) {
+LOG.warn("Cannot remove container replica, container {} not found.",
+id, e);
+  }
 }
-
   }
 
-  private void replicateIfNeeded(ContainerInfo container,
-  EventPublisher publisher) throws ContainerNotFoundException {
-if (!container.isOpen() && replicationStatus.isReplicationEnabled()) {
-  final int existingReplicas = containerManager
-  .getContainerReplicas(container.containerID()).size();
-  

[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-03-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=220502&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220502
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 29/Mar/19 12:21
Start Date: 29/Mar/19 12:21
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#issuecomment-477978223
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 32 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 7 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 26 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1088 | trunk passed |
   | +1 | compile | 81 | trunk passed |
   | +1 | checkstyle | 26 | trunk passed |
   | +1 | mvnsite | 74 | trunk passed |
   | +1 | shadedclient | 727 | branch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 114 | trunk passed |
   | +1 | javadoc | 59 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 10 | Maven dependency ordering for patch |
   | +1 | mvninstall | 74 | the patch passed |
   | +1 | compile | 76 | the patch passed |
   | +1 | javac | 76 | the patch passed |
   | +1 | checkstyle | 24 | the patch passed |
   | +1 | mvnsite | 59 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 721 | patch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 120 | the patch passed |
   | +1 | javadoc | 53 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 66 | common in the patch passed. |
   | +1 | unit | 97 | server-scm in the patch passed. |
   | +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
   | | | 3603 | |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-662/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/662 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 05c20b2ef81f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / f41f938 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-662/1/testReport/ |
   | Max. process+thread count | 416 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm U: hadoop-hdds |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-662/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 220502)
Time Spent: 20m  (was: 10m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-03-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=220490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220490
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 29/Mar/19 11:19
Start Date: 29/Mar/19 11:19
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #662: 
HDDS-1207. Refactor Container Report Processing logic and plugin new 
Replication Manager.
URL: https://github.com/apache/hadoop/pull/662
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 220490)
Time Spent: 10m
Remaining Estimate: 0h

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in the new ReplicationManager. This Jira is to refactor the
> container report processing logic in SCM so that it complements the
> ReplicationManager, and to plug in the new ReplicationManager code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org