[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaobing Zhou updated HDFS-11047:
---------------------------------
    Attachment: HDFS-11047.000.patch

> Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11047
>                 URL: https://issues.apache.org/jira/browse/HDFS-11047
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, fs
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>         Attachments: HDFS-11047.000.patch
>
>
> DirectoryScanner performs its scan by deep copying every FinalizedReplica.
> In a deployment with 500,000+ blocks, we've seen DN heap usage accumulate
> to high peaks. These deep copies will make DN heap usage even worse if
> directory scans are scheduled more frequently. This issue proposes removing
> the unnecessary deep copies, since DirectoryScanner#scan already holds the
> dataset lock.
>
> DirectoryScanner#scan
> {code}
>     try(AutoCloseableLock lock = dataset.acquireDatasetLock()) {
>       for (Entry<String, ScanInfo[]> entry : diskReport.entrySet()) {
>         String bpid = entry.getKey();
>         ScanInfo[] blockpoolReport = entry.getValue();
>
>         Stats statsRecord = new Stats(bpid);
>         stats.put(bpid, statsRecord);
>         LinkedList<ScanInfo> diffRecord = new LinkedList<ScanInfo>();
>         diffs.put(bpid, diffRecord);
>
>         statsRecord.totalBlocks = blockpoolReport.length;
>         List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid); /* deep copies here */
> {code}
>
> FsDatasetImpl#getFinalizedBlocks
> {code}
>   public List<ReplicaInfo> getFinalizedBlocks(String bpid) {
>     try (AutoCloseableLock lock = datasetLock.acquire()) {
>       ArrayList<ReplicaInfo> finalized =
>           new ArrayList<ReplicaInfo>(volumeMap.size(bpid));
>       for (ReplicaInfo b : volumeMap.replicas(bpid)) {
>         if (b.getState() == ReplicaState.FINALIZED) {
>           finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED)
>               .from(b).build()); /* deep copies here */
>         }
>       }
>       return finalized;
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
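
To make the trade-off in the description concrete, here is a minimal, self-contained sketch of the two approaches: building a list of copied replicas versus building a list of references to the existing objects. It is not the attached patch and does not use Hadoop classes. `ReplicaCopyDemo`, `Replica`, `Dataset`, `finalizedCopies` and `finalizedReferences` are hypothetical stand-ins, and a plain `ReentrantLock` is assumed in place of the dataset lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class ReplicaCopyDemo {
    enum State { FINALIZED, RBW }

    /** Hypothetical stand-in for a replica record; not Hadoop's ReplicaInfo. */
    static final class Replica {
        final long blockId;
        final long numBytes;
        final State state;

        Replica(long blockId, long numBytes, State state) {
            this.blockId = blockId;
            this.numBytes = numBytes;
            this.state = state;
        }

        /** Copy constructor: allocates a second object per replica. */
        Replica(Replica other) {
            this(other.blockId, other.numBytes, other.state);
        }
    }

    /** Hypothetical stand-in for the dataset; guards its replica list with one lock. */
    static final class Dataset {
        private final ReentrantLock lock = new ReentrantLock();
        private final List<Replica> replicas = new ArrayList<>();

        ReentrantLock lock() { return lock; }

        void add(Replica r) {
            lock.lock();
            try { replicas.add(r); } finally { lock.unlock(); }
        }

        /** Copying variant: one new Replica per finalized entry. */
        List<Replica> finalizedCopies() {
            lock.lock();
            try {
                List<Replica> out = new ArrayList<>(replicas.size());
                for (Replica r : replicas) {
                    if (r.state == State.FINALIZED) out.add(new Replica(r));
                }
                return out;
            } finally { lock.unlock(); }
        }

        /** Reference variant: only the list is new; the Replica objects are shared. */
        List<Replica> finalizedReferences() {
            lock.lock();
            try {
                List<Replica> out = new ArrayList<>(replicas.size());
                for (Replica r : replicas) {
                    if (r.state == State.FINALIZED) out.add(r);
                }
                return out;
            } finally { lock.unlock(); }
        }
    }

    public static void main(String[] args) {
        Dataset ds = new Dataset();
        ds.add(new Replica(1L, 1024L, State.FINALIZED));
        ds.add(new Replica(2L, 2048L, State.RBW));
        ds.add(new Replica(3L, 4096L, State.FINALIZED));

        // The caller takes the lock first, as the scan described above does;
        // ReentrantLock lets the nested acquisitions succeed on the same thread.
        ds.lock().lock();
        try {
            List<Replica> copies = ds.finalizedCopies();
            List<Replica> refs = ds.finalizedReferences();
            System.out.println(copies.size() + " copies, " + refs.size() + " references");
            System.out.println("copy is a new object: " + (copies.get(0) != refs.get(0)));
        } finally {
            ds.lock().unlock();
        }
    }
}
```

In this sketch the reference variant allocates only the list, so the per-scan garbage no longer grows with one extra object per finalized replica. Sharing references is only safe while the caller keeps holding the lock (or the objects are never mutated), which is the condition the description relies on.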