[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606682#comment-15606682
 ] 

Xiaobing Zhou commented on HDFS-11047:
--------------------------------------

I posted patch v000; please review it. Thanks.
1. It adds a new function, getFinalizedBlocksReferences, that returns 
references to the existing replicas. The existing getFinalizedBlocks is left 
unchanged to avoid any compatibility issues.
2. It adds comments to getFinalizedBlocks documenting the deep-copy contract 
it already implements.
3. It removes the List -> Array translation.
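
To illustrate item 1, here is a minimal, self-contained sketch of the 
difference between the two accessors. It is not the code from the patch: 
Replica, the plain lock object, and add() are simplified stand-ins I made up 
for ReplicaInfo, the dataset lock, and volumeMap, and the real method 
signatures may differ. It only shows that getFinalizedBlocks hands back new 
objects while getFinalizedBlocksReferences hands back the stored ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the dataset; not the real FsDatasetImpl.
class FinalizedReplicaSketch {
  enum ReplicaState { FINALIZED, RBW }

  // Stand-in for ReplicaInfo (assumption: only the fields needed here).
  static final class Replica {
    final long blockId;
    final ReplicaState state;

    Replica(long blockId, ReplicaState state) {
      this.blockId = blockId;
      this.state = state;
    }

    // Copy constructor, playing the role of ReplicaBuilder#from(b).build().
    Replica(Replica other) {
      this(other.blockId, other.state);
    }
  }

  private final Object datasetLock = new Object();
  private final List<Replica> replicas = new ArrayList<>();

  void add(Replica r) {
    synchronized (datasetLock) {
      replicas.add(r);
    }
  }

  // Existing contract: every finalized replica is deep copied.
  List<Replica> getFinalizedBlocks() {
    synchronized (datasetLock) {
      List<Replica> finalized = new ArrayList<>(replicas.size());
      for (Replica r : replicas) {
        if (r.state == ReplicaState.FINALIZED) {
          finalized.add(new Replica(r)); // one new object per replica
        }
      }
      return finalized;
    }
  }

  // New accessor: only references are collected, no per-replica allocation.
  List<Replica> getFinalizedBlocksReferences() {
    synchronized (datasetLock) {
      List<Replica> finalized = new ArrayList<>(replicas.size());
      for (Replica r : replicas) {
        if (r.state == ReplicaState.FINALIZED) {
          finalized.add(r); // same object that the dataset holds
        }
      }
      return finalized;
    }
  }

  public static void main(String[] args) {
    FinalizedReplicaSketch dataset = new FinalizedReplicaSketch();
    Replica original = new Replica(1L, ReplicaState.FINALIZED);
    dataset.add(original);
    dataset.add(new Replica(2L, ReplicaState.RBW));

    // Deep copy: a different object from the stored one.
    System.out.println(dataset.getFinalizedBlocks().get(0) == original);
    // Reference: the very same object.
    System.out.println(dataset.getFinalizedBlocksReferences().get(0) == original);
  }
}
```

Because the second accessor exposes the live objects, a caller should keep 
holding the dataset lock while it reads them, which is what 
DirectoryScanner#scan already does according to the issue description.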

[~liuml07] thank you for the comments.



> Remove deep copies of FinalizedReplica to alleviate heap consumption on 
> DataNode
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11047
>                 URL: https://issues.apache.org/jira/browse/HDFS-11047
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, fs
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>         Attachments: HDFS-11047.000.patch
>
>
> DirectoryScanner deep copies every FinalizedReplica when it scans. In a 
> deployment with 500,000+ blocks, we've seen DN heap usage climb to high 
> peaks. These deep copies make DN heap usage even worse when directory scans 
> are scheduled more frequently. This issue proposes removing the unnecessary 
> deep copies, since DirectoryScanner#scan already holds the dataset lock. 
> DirectoryScanner#scan
> {code}
>     try(AutoCloseableLock lock = dataset.acquireDatasetLock()) {
>       for (Entry<String, ScanInfo[]> entry : diskReport.entrySet()) {
>         String bpid = entry.getKey();
>         ScanInfo[] blockpoolReport = entry.getValue();
>         
>         Stats statsRecord = new Stats(bpid);
>         stats.put(bpid, statsRecord);
>         LinkedList<ScanInfo> diffRecord = new LinkedList<ScanInfo>();
>         diffs.put(bpid, diffRecord);
>         
>         statsRecord.totalBlocks = blockpoolReport.length;
>         /* deep copies here */
>         List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid);
> {code}
> FsDatasetImpl#getFinalizedBlocks
> {code}
>   public List<ReplicaInfo> getFinalizedBlocks(String bpid) {
>     try (AutoCloseableLock lock = datasetLock.acquire()) {
>       ArrayList<ReplicaInfo> finalized =
>           new ArrayList<ReplicaInfo>(volumeMap.size(bpid));
>       for (ReplicaInfo b : volumeMap.replicas(bpid)) {
>         if (b.getState() == ReplicaState.FINALIZED) {
>           finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED)
>               .from(b).build()); /* deep copies here*/
>         }
>       }
>       return finalized;
>     }
>   }
> {code}


