[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059210#comment-16059210 ] Brahma Reddy Battula commented on HDFS-11047: - IMHO, this will be good candidate to go branch-2.7. [~xiaobingo] can you update the branch-2.7 patch also..? > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch, HDFS-11047-branch-2.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614027#comment-15614027 ] Hadoop QA commented on HDFS-11047: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 42s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 49s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 180 unchanged - 1 fixed = 180 total (was 181) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 3s{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_111. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}145m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:b59b8b7 | | JIRA Issue | HDFS-11047 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835692/HDFS-11047-branch-2.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux dcf06a0b364b 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 /
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613728#comment-15613728 ] Mingliang Liu commented on HDFS-11047: -- +1 pending on Jenkins. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-11047-branch-2.002.patch, HDFS-11047.000.patch, > HDFS-11047.001.patch, HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613695#comment-15613695 ] Xiaobing Zhou commented on HDFS-11047: -- Thanks [~liuml07] for committing it, [~jpallas] and [~arpitagarwal] for reviewing. I posted branch-2 patch. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-11047-branch-2.002.patch, HDFS-11047.000.patch, > HDFS-11047.001.patch, HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613626#comment-15613626 ] Mingliang Liu commented on HDFS-11047: -- Committed to {{trunk}} branch. Thanks [~xiaobingo] for contribution. Thanks [~jpallas] and [~arpitagarwal] for the review. Can you provide a {{branch-2}} patch as I see non-trivial conflicts when cherry-picking. Will leave this JIRA open before that. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613625#comment-15613625 ] Hudson commented on HDFS-11047: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10712 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10712/]) HDFS-11047. Remove deep copies of FinalizedReplica to alleviate heap (liuml07: rev 9e03ee527988ff85af7f2c224c5570b69d09279a) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613557#comment-15613557 ] Mingliang Liu commented on HDFS-11047: -- +1 Will commit in a sec. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613524#comment-15613524 ] Hadoop QA commented on HDFS-11047: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 12s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-11047 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835652/HDFS-11047.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 0f8588aa03e2 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dd4ed6a | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17333/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17333/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch,
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613075#comment-15613075 ] Arpit Agarwal commented on HDFS-11047: -- +1 for the v002 patch pending Jenkins. Thanks [~xiaobingo]. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613025#comment-15613025 ] Xiaobing Zhou commented on HDFS-11047: -- v002 patch addressed that, thanks [~arpitagarwal] > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, > HDFS-11047.002.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613000#comment-15613000 ] Arpit Agarwal commented on HDFS-11047: -- The v2 patch looks good. Nitpick: can you please change the _may need to_ to _should_? {code} /** * Gets a list of references to the finalized blocks for the given block pool. * * Callers of this function should call * {@link FsDatasetSpi#acquireDatasetLock} to avoid blocks' status being * changed during list iteration. {code} Also can you please copy the documentation to {{FsDatasetImpl#getFinalizedBlocks}} so future callers are less likely to miss the caveat? > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. The sibling work is tracked by AMBARI-18694 > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612817#comment-15612817 ] Hadoop QA commented on HDFS-11047: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 37s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestFileCreationDelete | | | hadoop.security.TestPermission | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-11047 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835618/HDFS-11047.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 707a5c1e3764 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ac35ee9 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17328/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17328/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17328/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL:
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612439#comment-15612439 ] Xiaobing Zhou commented on HDFS-11047: -- I posted patch v001. # removed deep copies from getFinalizedBlocks. # added contract doc to it. # removed getFinalizedBlocksReferences added in previous patch v000. Thanks [~arpiagariu]/[~jpallas] for the reviews. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. The sibling work is tracked by AMBARI-18694 > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610179#comment-15610179 ] Arpit Agarwal commented on HDFS-11047: -- Nice catch [~xiaobingo]. Thanks for reporting and fixing this. I agree with [~jpallas] that we can just change the behavior of getFinalizedBlocks as it is a private interface. We can document the requirement that the caller of {{getFinalizedBlocks}} first get the dataset lock via {{FsDatasetSpi#acquireDatasetLock}}. In addition to the deep copy there is an apparently unnecessary list to array conversion that you removed. I wasn't able to follow the source history past 2011 to see why it was introduced. IAC I can't think of any reason to retain it. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. The sibling work is tracked by AMBARI-18694 > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606836#comment-15606836 ] Hadoop QA commented on HDFS-11047: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestFsDatasetCache | | Timed out junit tests | org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy | | | org.apache.hadoop.tools.TestHdfsConfigFields | | | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | | | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-11047 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835215/HDFS-11047.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux b78a03837e2d 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 084bdab | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17285/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17285/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17285/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606794#comment-15606794 ] Xiaobing Zhou commented on HDFS-11047: -- Thanks [~jpallas]. I also did that check and tried to modify getFinalizedBlocks directly, however, to be safe, let's poll more inputs to the final choice. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. The sibling work is tracked by AMBARI-18694 > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606777#comment-15606777 ] Joe Pallas commented on HDFS-11047: --- It looks like the directory scanner and one test are the only callers of getFinalizedBlocks. If that's the case, there's no real reason to keep an old version for compatibility. What do you think, [~xiaobingo]? > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage > even worse if directory scans are scheduled more frequently. This proposes > removing unnecessary deep copies since DirectoryScanner#scan already holds > lock of dataset. The sibling work is tracked by AMBARI-18694 > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606682#comment-15606682 ] Xiaobing Zhou commented on HDFS-11047: -- I posted patch v000, please kindly review it, thanks. 1. It added new function (i.e. getFinalizedBlocksReferences) to declare references only to replicas instead of changing existing getFinalizedBlocks to avoid any compatibility issues. 2. Meanwhile, comments are added to getFinalizedBlocks regarding the deep copies contract it already implemented. 3. removed List -> Array translation. [~liuml07] thank you for the comments. > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-11047.000.patch > > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks. Deep copies of FinalizedReplica will make DN heap usage even worse if > directory scans are scheduled more frequently. This proposes removing > unnecessary deep copies since DirectoryScanner#scan already holds lock of > dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
[ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603216#comment-15603216 ] Mingliang Liu commented on HDFS-11047: -- +1 for the proposal. Nice catch. Wondering if {{getFinalizedBlocks}} has promised anything regarding the returned value. Another point is about the List -> Array in {{DirectoryScanner#scan}}; can we avoid this? > Remove deep copies of FinalizedReplica to alleviate heap consumption on > DataNode > > > Key: HDFS-11047 > URL: https://issues.apache.org/jira/browse/HDFS-11047 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > > DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment > with 500,000+ blocks, we've seen the DN heap usage being accumulated to high > peaks. Deep copies of FinalizedReplica will make DN heap usage even worse if > directory scans are scheduled more frequently. This proposes removing > unnecessary deep copies since DirectoryScanner#scan already holds lock of > dataset. > DirectoryScanner#scan > {code} > try(AutoCloseableLock lock = dataset.acquireDatasetLock()) { > for (Entryentry : diskReport.entrySet()) { > String bpid = entry.getKey(); > ScanInfo[] blockpoolReport = entry.getValue(); > > Stats statsRecord = new Stats(bpid); > stats.put(bpid, statsRecord); > LinkedList diffRecord = new LinkedList(); > diffs.put(bpid, diffRecord); > > statsRecord.totalBlocks = blockpoolReport.length; > List bl = dataset.getFinalizedBlocks(bpid); /* deep > copies here*/ > {code} > FsDatasetImpl#getFinalizedBlocks > {code} > public List getFinalizedBlocks(String bpid) { > try (AutoCloseableLock lock = datasetLock.acquire()) { > ArrayList finalized = > new ArrayList(volumeMap.size(bpid)); > for (ReplicaInfo b : volumeMap.replicas(bpid)) { > if (b.getState() == ReplicaState.FINALIZED) { > finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED) > .from(b).build()); /* deep copies here*/ > } > } > return finalized; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org