[ 
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11493:
--------------------------------
    Attachment: HDFS-11493-HDFS-7240.002.patch

[~vagarychen] and [~xyao] Thank you for the comments. This patch addresses all 
the comments. Specifics inline ...

bq. NIT: Line 91 can we rename 
"ozone.scm.container.report.processing.lag.seconds" to 
"ozone.scm.container.report.processing.interval.seconds"
Fixed.

bq. NIT: Line 99: can we rename 
"ozone.scm.max.wait.for.container.reports.in.seconds" to 
"ozone.scm.container.reports.wait.timeout.seconds"
Fixed.


bq. NIT: Line 135 "command" -> Commands
Fixed.

bq. NIT: unnecessary empty line change
Fixed.

bq. Line 26: should we use AtomicLong for totalProcessedCount?
Fixed.

bq. Line 92: miss javadoc for getLastProcessTime
Fixed.

bq. Line 87: NIT: move before line 79?
Fixed.

bq. Line 119: we should ensure the executorService is shutdown properly.
Fixed.

bq. Line 121: extra leading space in the name string " Container Reports..."
Fixed.

bq. Line 135: NIT: need add an empty line
Fixed.
bq. Line 145: can you elaborate on the two cases?
Fixed.

bq. Line 190: computerPoolDifference assumes the newPool is made a copy in the 
method and the oldPool is made a copy by the caller
I am not sure how the stream.collect usage pattern is going to be 
better than set.removeAll pattern. I would think the memory allocation costs
are pretty same, also we have will have less than 200 pools at any given time.


bq. Line 208: PerodicPool#lastProcessTime is never changed with this patch. 
Fixed. Thanks for flagging this, Updated the timeStamp in 
{{startReconciliation}}

bq. Line 243: we restart the poolProcessThread if UncaughtException is hit. 
Yes, it is possible. But if this thread is not running we cannot resolve 
replication status. An option is to count how many times you have seen this 
error and if it exceeds a particular number, SCM can shutdown. Another 
extension, would be to add a counter, so that programs like Ambari can keep a 
watch. I have added a TODO comment.

bq. Line 53: missing java doc for some parameters
Fixed.

bq. Line 66: can we define some SCM specific exception here?
Since this is test code , do you want to create SCM specific test exceptions ?
Please let me know if you need that done.

bq. Line 70: Random r can be moved to member to avoid new Random() for 
eachgetContainerReport call.
Fixed.

bq. @Test above comments, while one test has it after comments.
Fixed.


> Ozone: SCM:  Add the ability to handle container reports 
> ---------------------------------------------------------
>
>                 Key: HDFS-11493
>                 URL: https://issues.apache.org/jira/browse/HDFS-11493
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: container-replication-storage.pdf, 
> HDFS-11493-HDFS-7240.001.patch, HDFS-11493-HDFS-7240.002.patch
>
>
> Once a datanode sends the container report it is SCM's responsibility to 
> determine if the replication levels are acceptable. If it is not, SCM should 
> initiate a replication request to another datanode. This JIRA tracks how SCM  
> handles a container report.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to