[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2017-06-22 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059210#comment-16059210
 ] 

Brahma Reddy Battula commented on HDFS-11047:
-

IMHO, this would be a good candidate for branch-2.7. [~xiaobingo], can you 
update the branch-2.7 patch as well?

> Remove deep copies of FinalizedReplica to alleviate heap consumption on 
> DataNode
> 
>
> Key: HDFS-11047
> URL: https://issues.apache.org/jira/browse/HDFS-11047
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, 
> HDFS-11047.002.patch, HDFS-11047-branch-2.002.patch
>
>
> DirectoryScanner does its scan by deep-copying FinalizedReplica objects. In a 
> deployment with 500,000+ blocks, we've seen DN heap usage climb to high peaks 
> very quickly. Deep copies of FinalizedReplica make DN heap usage even worse 
> when directory scans are scheduled more frequently. This proposes removing the 
> unnecessary deep copies, since DirectoryScanner#scan already holds the 
> dataset lock.
> DirectoryScanner#scan
> {code}
> try (AutoCloseableLock lock = dataset.acquireDatasetLock()) {
>   for (Entry<String, ScanInfo[]> entry : diskReport.entrySet()) {
>     String bpid = entry.getKey();
>     ScanInfo[] blockpoolReport = entry.getValue();
> 
>     Stats statsRecord = new Stats(bpid);
>     stats.put(bpid, statsRecord);
>     LinkedList<ScanInfo> diffRecord = new LinkedList<>();
>     diffs.put(bpid, diffRecord);
> 
>     statsRecord.totalBlocks = blockpoolReport.length;
>     List<FinalizedReplica> bl = dataset.getFinalizedBlocks(bpid); /* deep copies here */
> {code}
> FsDatasetImpl#getFinalizedBlocks
> {code}
>   public List<FinalizedReplica> getFinalizedBlocks(String bpid) {
>     try (AutoCloseableLock lock = datasetLock.acquire()) {
>       ArrayList<FinalizedReplica> finalized =
>           new ArrayList<>(volumeMap.size(bpid));
>       for (ReplicaInfo b : volumeMap.replicas(bpid)) {
>         if (b.getState() == ReplicaState.FINALIZED) {
>           finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED)
>               .from(b).build()); /* deep copies here */
>         }
>       }
>       return finalized;
>     }
>   }
> {code}
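For readers outside the Hadoop codebase, the difference the patch makes can be sketched with stand-in types. The `Replica` class below is a hypothetical simplification, not Hadoop's ReplicaInfo: the old path allocates a fresh object per finalized block on every scan, while the patched path hands back references to the objects already held in the volume map.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in type for illustration only; not Hadoop's actual class.
class Replica {
  final long blockId;
  final long numBytes;
  Replica(long blockId, long numBytes) { this.blockId = blockId; this.numBytes = numBytes; }
}

class CopyVsReference {
  // Old behavior: one new Replica object per finalized block on every scan.
  static List<Replica> deepCopies(List<Replica> volumeMap) {
    List<Replica> out = new ArrayList<>(volumeMap.size());
    for (Replica r : volumeMap) {
      out.add(new Replica(r.blockId, r.numBytes)); // fresh allocation per block
    }
    return out;
  }

  // Patched behavior: share element references; safe only while the
  // dataset lock is held, since the elements can otherwise change state.
  static List<Replica> references(List<Replica> volumeMap) {
    return new ArrayList<>(volumeMap); // copies only the list, not the elements
  }

  public static void main(String[] args) {
    List<Replica> volumeMap = new ArrayList<>();
    for (long i = 0; i < 3; i++) {
      volumeMap.add(new Replica(i, 128L * 1024 * 1024));
    }
    List<Replica> copies = CopyVsReference.deepCopies(volumeMap);
    List<Replica> refs = CopyVsReference.references(volumeMap);
    // Deep copies are distinct objects; references are the same objects.
    System.out.println(copies.get(0) == volumeMap.get(0)); // false
    System.out.println(refs.get(0) == volumeMap.get(0));   // true
  }
}
```

Either way the returned list itself is new; only the element objects are shared, which is why the patch makes callers hold the dataset lock for the duration of the iteration.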



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614027#comment-15614027
 ] 

Hadoop QA commented on HDFS-11047:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
42s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
49s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 180 unchanged - 1 fixed = 180 total (was 181) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m  
3s{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_111. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-11047 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12835692/HDFS-11047-branch-2.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux dcf06a0b364b 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / 

[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613728#comment-15613728
 ] 

Mingliang Liu commented on HDFS-11047:
--

+1, pending Jenkins.







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613695#comment-15613695
 ] 

Xiaobing Zhou commented on HDFS-11047:
--

Thanks [~liuml07] for committing it, and [~jpallas] and [~arpitagarwal] for 
reviewing. 
I have posted the branch-2 patch.







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613626#comment-15613626
 ] 

Mingliang Liu commented on HDFS-11047:
--

Committed to the {{trunk}} branch. Thanks [~xiaobingo] for the contribution, and 
[~jpallas] and [~arpitagarwal] for the review.

Could you provide a {{branch-2}} patch? I see non-trivial conflicts when 
cherry-picking. I will leave this JIRA open until then.







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613625#comment-15613625
 ] 

Hudson commented on HDFS-11047:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10712 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10712/])
HDFS-11047. Remove deep copies of FinalizedReplica to alleviate heap (liuml07: 
rev 9e03ee527988ff85af7f2c224c5570b69d09279a)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java








[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613557#comment-15613557
 ] 

Mingliang Liu commented on HDFS-11047:
--

+1. Will commit in a sec.







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613524#comment-15613524
 ] 

Hadoop QA commented on HDFS-11047:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 
12s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11047 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12835652/HDFS-11047.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0f8588aa03e2 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / dd4ed6a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17333/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17333/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613075#comment-15613075
 ] 

Arpit Agarwal commented on HDFS-11047:
--

+1 for the v002 patch pending Jenkins. Thanks [~xiaobingo].







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613025#comment-15613025
 ] 

Xiaobing Zhou commented on HDFS-11047:
--

The v002 patch addresses that. Thanks, [~arpitagarwal].







[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613000#comment-15613000
 ] 

Arpit Agarwal commented on HDFS-11047:
--

The v2 patch looks good.

Nitpick: could you please change _may need to_ to _should_?

{code}
  /**
   * Gets a list of references to the finalized blocks for the given block pool.
   * <p>
   * Callers of this function should call
   * {@link FsDatasetSpi#acquireDatasetLock} to avoid blocks' status being
   * changed during list iteration.
{code}

Also can you please copy the documentation to 
{{FsDatasetImpl#getFinalizedBlocks}} so future callers are less likely to miss 
the caveat?
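The caveat being documented, that callers iterate the returned references only while holding the dataset lock, can be sketched with stand-in types. The lock and block list below are hypothetical simplifications; the method names merely echo the patch and none of this is Hadoop's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the dataset; not Hadoop's FsDatasetImpl.
class MiniDataset {
  private final ReentrantLock datasetLock = new ReentrantLock();
  private final List<String> finalized = new ArrayList<>();

  MiniDataset(List<String> blocks) { finalized.addAll(blocks); }

  // Echoes acquireDatasetLock(): the caller is responsible for unlocking.
  ReentrantLock acquire() {
    datasetLock.lock();
    return datasetLock;
  }

  // Returns references to the live entries, not copies; per the javadoc,
  // callers should hold the dataset lock for the whole iteration.
  List<String> getFinalizedBlocks() {
    return new ArrayList<>(finalized);
  }

  public static void main(String[] args) {
    MiniDataset dataset = new MiniDataset(List.of("blk_1001", "blk_1002"));
    ReentrantLock lock = dataset.acquire();
    try {
      int count = 0;
      for (String b : dataset.getFinalizedBlocks()) {
        count++; // entries cannot change state under us while locked
      }
      System.out.println(count); // prints 2
    } finally {
      lock.unlock();
    }
  }
}
```

Copying the same javadoc onto the concrete implementation, as suggested above, keeps the locking contract visible to callers who never read the interface.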

> Remove deep copies of FinalizedReplica to alleviate heap consumption on 
> DataNode
> 
>
> Key: HDFS-11047
> URL: https://issues.apache.org/jira/browse/HDFS-11047
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch
>
>
> DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment 
> with 500,000+ blocks, we've seen the DN heap usage being accumulated to high 
> peaks very quickly. Deep copies of FinalizedReplica will make DN heap usage 
> even worse if directory scans are scheduled more frequently. This proposes 
> removing unnecessary deep copies since DirectoryScanner#scan already holds 
> lock of dataset. The sibling work is tracked by AMBARI-18694
> DirectoryScanner#scan
> {code}
> try (AutoCloseableLock lock = dataset.acquireDatasetLock()) {
>   for (Entry<String, ScanInfo[]> entry : diskReport.entrySet()) {
>     String bpid = entry.getKey();
>     ScanInfo[] blockpoolReport = entry.getValue();
>
>     Stats statsRecord = new Stats(bpid);
>     stats.put(bpid, statsRecord);
>     LinkedList<ScanInfo> diffRecord = new LinkedList<ScanInfo>();
>     diffs.put(bpid, diffRecord);
>
>     statsRecord.totalBlocks = blockpoolReport.length;
>     List<FinalizedReplica> bl = dataset.getFinalizedBlocks(bpid); /* deep copies here */
> {code}
> FsDatasetImpl#getFinalizedBlocks
> {code}
>   public List<FinalizedReplica> getFinalizedBlocks(String bpid) {
>     try (AutoCloseableLock lock = datasetLock.acquire()) {
>       ArrayList<FinalizedReplica> finalized =
>           new ArrayList<FinalizedReplica>(volumeMap.size(bpid));
>       for (ReplicaInfo b : volumeMap.replicas(bpid)) {
>         if (b.getState() == ReplicaState.FINALIZED) {
>           finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED)
>               .from(b).build()); /* deep copies here */
>         }
>       }
>       return finalized;
>     }
>   }
> {code}
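The change under discussion replaces the per-replica deep copy with shared references handed out under the dataset lock. A minimal, self-contained sketch of the difference, using a hypothetical {{Replica}} class as a stand-in for the real Hadoop types:

```java
import java.util.ArrayList;
import java.util.List;

public class DeepCopyDemo {
    // Hypothetical stand-in for FinalizedReplica; the real class also carries
    // length, generation stamp, and volume references.
    static class Replica {
        final long blockId;
        Replica(long blockId) { this.blockId = blockId; }
        Replica(Replica other) { this.blockId = other.blockId; } // deep copy
    }

    // Old behavior: one fresh object per replica on every scan.
    static List<Replica> deepCopy(List<Replica> volumeMap) {
        List<Replica> out = new ArrayList<>(volumeMap.size());
        for (Replica r : volumeMap) {
            out.add(new Replica(r));
        }
        return out;
    }

    // New behavior: the list is new, but the elements are shared references,
    // so callers must hold the dataset lock while iterating.
    static List<Replica> references(List<Replica> volumeMap) {
        return new ArrayList<>(volumeMap);
    }

    public static void main(String[] args) {
        List<Replica> map = new ArrayList<>();
        for (long i = 0; i < 3; i++) {
            map.add(new Replica(i));
        }
        // Deep copies are distinct objects; references are the same objects.
        System.out.println(deepCopy(map).get(0) == map.get(0));    // false
        System.out.println(references(map).get(0) == map.get(0));  // true
    }
}
```

Only the element objects are shared; the returned list itself is still a private copy, so the caller can iterate it without the volume map changing shape underneath.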



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612817#comment-15612817
 ] 

Hadoop QA commented on HDFS-11047:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
|   | hadoop.security.TestPermission |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11047 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12835618/HDFS-11047.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 707a5c1e3764 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / ac35ee9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17328/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17328/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17328/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-27 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612439#comment-15612439
 ] 

Xiaobing Zhou commented on HDFS-11047:
--

I posted patch v001:
# Removed deep copies from getFinalizedBlocks.
# Added the contract doc to it.
# Removed getFinalizedBlocksReferences, which was added in the previous patch v000.

Thanks [~arpiagariu]/[~jpallas] for the reviews.





[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-26 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610179#comment-15610179
 ] 

Arpit Agarwal commented on HDFS-11047:
--

Nice catch [~xiaobingo]. Thanks for reporting and fixing this.

I agree with [~jpallas] that we can just change the behavior of 
getFinalizedBlocks as it is a private interface. We can document the 
requirement that the caller of {{getFinalizedBlocks}} first get the dataset 
lock via {{FsDatasetSpi#acquireDatasetLock}}.
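Hadoop's dataset lock is built for exactly this try-with-resources pattern. A self-contained sketch of the caller-side contract, using a simplified {{AutoCloseableLock}} stand-in built on {{ReentrantLock}} (not the real Hadoop class):

```java
import java.util.concurrent.locks.ReentrantLock;

public class DatasetLockDemo {
    // Simplified stand-in for Hadoop's AutoCloseableLock: a ReentrantLock
    // that unlocks automatically when the try-with-resources block exits.
    static class AutoCloseableLock implements AutoCloseable {
        private final ReentrantLock lock = new ReentrantLock();
        AutoCloseableLock acquire() { lock.lock(); return this; }
        @Override public void close() { lock.unlock(); }
        boolean isHeld() { return lock.isHeldByCurrentThread(); }
    }

    public static void main(String[] args) {
        AutoCloseableLock datasetLock = new AutoCloseableLock();
        try (AutoCloseableLock l = datasetLock.acquire()) {
            // Iterate the shared replica references only while the lock is
            // held; replica state cannot change out from under us here.
            System.out.println(l.isHeld());           // true
        }
        // The lock is released as soon as the block exits.
        System.out.println(datasetLock.isHeld());     // false
    }
}
```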

In addition to the deep copy, there is an apparently unnecessary list-to-array 
conversion that you removed. I wasn't able to follow the source history past 
2011 to see why it was introduced. In any case, I can't think of any reason to retain it.
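The round trip being removed amounts to materializing an array from the list before iterating. A tiny illustration, with plain Strings standing in for FinalizedReplica:

```java
import java.util.Arrays;
import java.util.List;

public class ListToArrayDemo {
    public static void main(String[] args) {
        List<String> bl = Arrays.asList("blk_1", "blk_2", "blk_3");

        // Old pattern: copy the list into an intermediate array, then iterate.
        String[] arr = bl.toArray(new String[bl.size()]);
        for (String b : arr) {
            System.out.println(b);
        }

        // Same iteration over the List directly, skipping the array copy.
        for (String b : bl) {
            System.out.println(b);
        }

        // Each toArray call allocates a fresh array, so the old pattern
        // added one throwaway allocation per scan.
        System.out.println(bl.toArray(new String[0]) == bl.toArray(new String[0]));  // false
    }
}
```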




[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606836#comment-15606836
 ] 

Hadoop QA commented on HDFS-11047:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestFsDatasetCache |
| Timed out junit tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
|   | org.apache.hadoop.tools.TestHdfsConfigFields |
|   | 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | 
org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11047 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12835215/HDFS-11047.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux b78a03837e2d 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 084bdab |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17285/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17285/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17285/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-25 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606794#comment-15606794
 ] 

Xiaobing Zhou commented on HDFS-11047:
--

Thanks [~jpallas]. I also did that check and considered modifying getFinalizedBlocks 
directly; however, to be safe, let's gather more input before making the final choice.




[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-25 Thread Joe Pallas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606777#comment-15606777
 ] 

Joe Pallas commented on HDFS-11047:
---

It looks like the directory scanner and one test are the only callers of 
getFinalizedBlocks.  If that's the case, there's no real reason to keep an old 
version for compatibility.  What do you think, [~xiaobingo]?




[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-25 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606682#comment-15606682
 ] 

Xiaobing Zhou commented on HDFS-11047:
--

I posted patch v000; please kindly review it, thanks.
1. It adds a new function, getFinalizedBlocksReferences, that returns only 
references to replicas, rather than changing the existing getFinalizedBlocks, 
to avoid any compatibility issues.
2. Comments are added to getFinalizedBlocks documenting the deep-copy contract 
it already implements.
3. The List -> Array translation is removed.

[~liuml07] thank you for the comments.






[jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode

2016-10-24 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603216#comment-15603216
 ] 

Mingliang Liu commented on HDFS-11047:
--

+1 for the proposal. Nice catch. I wonder whether {{getFinalizedBlocks}} has 
promised anything about the returned value. Another point is the List 
-> Array conversion in {{DirectoryScanner#scan}}; can we avoid it?
