[ 
https://issues.apache.org/jira/browse/HDDS-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972177#comment-16972177
 ] 

Stephen O'Donnell commented on HDDS-2446:
-----------------------------------------

I think there could be an argument for merging DatanodeDetails and DatanodeInfo 
into a single object, but that is likely a very large change and I'm not sure 
it's the best idea either.

{quote}
Just a thought, what if we get the state of all available datanodes at the start 
of the ReplicationManager cycle? We can avoid multiple lookups for the same datanode.
{quote}

I had considered this, but it doesn't really give us anything, because:

1. We would need to store the state in a HashMap or similar structure, so we 
would still pay the price of a lookup per container.
2. The cached data could change part way through a run.
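
To make the two points concrete, here is a rough sketch. The names 
(NodeOpState, LiveNodeStates) are invented for illustration and are not SCM 
classes. It assumes the cycle-start cache is a plain copy into a HashMap: 
reading it is still one map lookup per replica, and the copy does not see 
state changes made after it was taken.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the operational states mentioned in this Jira.
enum NodeOpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

// Hypothetical stand-in for the live node state that SCM maintains.
class LiveNodeStates {
  private final Map<UUID, NodeOpState> states = new HashMap<>();

  void setState(UUID node, NodeOpState state) {
    states.put(node, state);
  }

  // Live lookup: always reflects the latest state.
  NodeOpState getState(UUID node) {
    return states.get(node);
  }

  // Copy taken once at the start of a replication cycle. Callers still do
  // one Map.get() per replica, and the copy goes stale if a node changes
  // state while the cycle is running.
  Map<UUID, NodeOpState> snapshot() {
    return new HashMap<>(states);
  }
}
```

If a node moves to DECOMMISSIONING after snapshot() is called, the snapshot 
keeps reporting IN_SERVICE until the next cycle.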

In order to make decisions about how to handle any ContainerReplica, we are 
going to need to know the nodeStatus (health and OpState) going forward, and I 
think it's cleaner and more efficient if we reference datanodeInfo directly 
within it. The alternative is to pass the NodeManager object into anything 
that needs to deal with the replicas and do a lookup per container via the 
NodeManager. That would not be terrible, but I think both DatanodeDetails 
and DatanodeInfo are tied very closely to registration in SCM, so we should be 
able to control how DatanodeInfo gets created.
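
A minimal sketch of the direct-reference idea follows. The classes 
(DetailsSketch, InfoSketch, ReplicaSketch) are simplified, hypothetical 
stand-ins rather than the real SCM types, and the sketch assumes the replica 
holds the same instance that SCM updates when a node changes state.

```java
import java.util.UUID;

// Hypothetical, simplified stand-ins; the real SCM classes carry far more.
enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

class DetailsSketch {
  private final UUID uuid;

  DetailsSketch(UUID uuid) {
    this.uuid = uuid;
  }

  UUID getUuid() {
    return uuid;
  }
}

// Extends the details class and adds the mutable operational state.
class InfoSketch extends DetailsSketch {
  private volatile OpState opState = OpState.IN_SERVICE;

  InfoSketch(UUID uuid) {
    super(uuid);
  }

  OpState getOpState() {
    return opState;
  }

  void setOpState(OpState newState) {
    opState = newState;
  }
}

class ReplicaSketch {
  private final long containerId;
  private final InfoSketch datanode;

  ReplicaSketch(long containerId, InfoSketch datanode) {
    this.containerId = containerId;
    this.datanode = datanode;
  }

  long getContainerId() {
    return containerId;
  }

  // Callers that only need identity can keep using the parent type.
  DetailsSketch getDatanodeDetails() {
    return datanode;
  }

  // Status is read through the shared reference; no NodeManager lookup.
  OpState getOpState() {
    return datanode.getOpState();
  }
}
```

This only behaves as intended if every replica for a node points at the one 
instance created at registration, which is why controlling how DatanodeInfo 
gets created matters.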

> ContainerReplica should contain DatanodeInfo rather than DatanodeDetails
> ------------------------------------------------------------------------
>
>                 Key: HDDS-2446
>                 URL: https://issues.apache.org/jira/browse/HDDS-2446
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM
>    Affects Versions: 0.5.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ContainerReplica object is used by the SCM to track containers reported 
> by the datanodes. The current fields stored in ContainerReplica are:
> {code}
> final private ContainerID containerID;
> final private ContainerReplicaProto.State state;
> final private DatanodeDetails datanodeDetails;
> final private UUID placeOfBirth;
> {code}
> Now that we have introduced decommission and maintenance mode, the replication 
> manager (and potentially other parts of the code) needs to know the status of 
> the replica in terms of IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED etc. to 
> make replication decisions.
> The DatanodeDetails object does not carry this information; however, the 
> DatanodeInfo object extends DatanodeDetails and does carry the required 
> information.
> As DatanodeInfo extends DatanodeDetails, any place which needs a 
> DatanodeDetails can accept a DatanodeInfo instead.
> In this Jira I propose we change the DatanodeDetails stored in 
> ContainerReplica to DatanodeInfo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
