[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

Weiwei Yang (JIRA) Mon, 24 Apr 2017 10:01:23 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981506#comment-15981506
 ]


Weiwei Yang commented on HDFS-11493:
------------------------------------

Hi [~anu]

Thanks for replying my questions, I am glad to see that most of concerns were 
in your plan. Some more comments about v2 patch

*CommandQueue*
# {{getCommand()}} seems to always return DEFAULT_LIST.
# Looks like {{Commands#getCommands()}} evicts the commands for a datanode by 
setting {{this.commands}} to a new list. Why not to let 
{{Commands#getCommands()}} return {{Commands}} and modify the code to be 
something like,
{code}
public Commands getCommand(final DatanodeID datanodeID) {
  ...
  Commands cmds= commandMap.remove(datanodeID);
  return r == null ? DEFAULT_LIST : cmds;
}
{code}
# I don't understand the point of maintaining {{commandsInQueue}}, if you want 
to count the number of commands in the queue, why not count the number of 
commands in {{commandMap}}?

*HadoopExecutors*

# The  {{timeout}} argument in {{HadoopExecutors#shutdown()}} doesn't seem to 
work as expected. Imagine passing 5s timeout to shutdown method, and line 117 
waits 5s but could not get tasks gracefully shutdown, then in line 121 it will 
wait for another 5s.

*OzoneConfigKeys*

Can we move these configuration properties to {{ScmConfigKeys}}?

*SCMTestUtils*

line 144 seems unnecessary to call toString() as there is a constructor

{code}
public InetSocketAddress(InetAddress addr, int port)
{code}

accepts InetAddress as argument.

*ContainerReplicationManager*

# Instead of using following check

{code}
if (periodicPool.getPoolName().compareTo(poolName) == 0) {
  poolQueue.remove(periodicPool);
}
{code}

can we implement {{PeriodicPool#equals}} and remove line 167 check?

# Is {{ContainerReplicationManager}} better to extend {{AbstractService}}? 
Currently it starts the thread in the constructor, it's better to handle this 
in its own {{serviceStart}} and {{serviceStop}}, what do you think?

*InProgressPool*

# {{startReconciliation()}} iterates each datanode in a pool on 
{{getNodestate()}}. {{getNodestate()}}, however, would wait and retry for at 
most 100s, this will let rest of datanodes wait. Can we make this parallel?

# There seems to be a lot of variables such as {{nodeProcessed}}, 
{{containerProcessedCount}}, {{nodeCount}} are not thread safe.

I did not look at test code yet, will check those later. Hope it helps, thank 
you.

> Ozone: SCM:  Add the ability to handle container reports 
> ---------------------------------------------------------
>
>                 Key: HDFS-11493
>                 URL: https://issues.apache.org/jira/browse/HDFS-11493
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: container-replication-storage.pdf, 
> HDFS-11493-HDFS-7240.001.patch, HDFS-11493-HDFS-7240.002.patch
>
>
> Once a datanode sends the container report it is SCM's responsibility to 
> determine if the replication levels are acceptable. If it is not, SCM should 
> initiate a replication request to another datanode. This JIRA tracks how SCM  
> handles a container report.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-11493) Ozone: SCM: Add the ability to handle container reports

Reply via email to