[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2018-04-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450781#comment-16450781
 ] 

Hudson commented on HDFS-10797:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
Revert "HDFS-10797. Disk usage summary of snapshots causes renamed (xyao: rev 
c90e1905072a75141585c91bd4687b2babe09b41)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java


> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2017-05-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024002#comment-16024002
 ] 

Hudson commented on HDFS-10797:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11776 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11776/])
Revert "HDFS-10797. Disk usage summary of snapshots causes renamed (weichiu: 
rev b8b69d797aed8dfeb65ea462c2856f62e9aa1023)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java


> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2017-04-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979494#comment-15979494
 ] 

Junping Du commented on HDFS-10797:
---

Hi [~mackrorysd], can you take a look at HDFS-11661 which could be caused by 
improvement here? Thx!

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2017-04-18 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973105#comment-15973105
 ] 

Nathan Roberts commented on HDFS-10797:
---

After deploying this to a cluster with a few hundred nodes, we have discovered 
that this jira has caused significant memory bloat in the namenode. Filed 2.8.1 
blocker for this issue - HDFS-11661. 

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556828#comment-15556828
 ] 

Hudson commented on HDFS-10797:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10572 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10572/])
HDFS-10797. Disk usage summary of snapshots causes renamed blocks to get (xiao: 
rev 6a38d118d86b7907009bcec34f1b788d076f1d1c)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ContentSummaryComputationContext.java


> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-07 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1630#comment-1630
 ] 

Xiao Chen commented on HDFS-10797:
--

I plan to commit this end of today if no objections.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-05 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549659#comment-15549659
 ] 

Xiao Chen commented on HDFS-10797:
--

Thanks for revving Sean. +1 on patch 10.
[~jingzhao] do you have any comments?

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549617#comment-15549617
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 267 unchanged - 0 fixed = 268 total (was 267) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12831783/HDFS-10797.010.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 13a2e115431f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d6be1e7 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17022/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17022/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17022/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17022/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-05 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549352#comment-15549352
 ] 

Xiao Chen commented on HDFS-10797:
--

Thanks Sean. Agreed correctness is the goal. My comment regarding performance 
was on the nature of the context based summary, which is added to prevent a 
{{du}} locking the NN. And as we both agreed, this patch doesn't touch that.

Also about {{getIncludedNodes}}, it seems we don't call it anywhere... Maybe we 
can remove the getter method, and when a use case emerges, let the new change 
handle it?

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549321#comment-15549321
 ] 

Sean Mackrory commented on HDFS-10797:
--

Thanks, [~xiaochen]. Except as noted below, I'll incorporate all your feedback 
into another patch...

{quote}I don't think that will be a critical path to impact du 
performance{quote}

Yeah - not sure if anything performance critical depends on du, but I would 
think correctness of the final result is far more important here anyway.

{quote}In nodeIncluded, we safeguard includedNodes in a synchronized block, but 
we also provide a getIncludedNodes method, which could potentially be updated 
by the caller. No real usage yet, but I just feel this a bit unsafe in general, 
maybe return a clone of it instead?{quote}

So my concern was not that the contents of the HashSet instance might change, 
but that the reference 'counts' temporarily points to a different object 
entirely when tallying the deleted, snapshotted INodes. Rather than protecting 
the data structures, it ensures no one can call getCounts() while counts would 
point to the wrong object. Beyond that, I think it's just as likely that 
threads calling getCounts in parallel will need their changes to propagate to 
the rest of the program, meaning the correct solution would be a thread-safe 
data structure rather than a clone. So I do think it's best to leave it as is 
until there is a use case for other concurrent accesses.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-04 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547166#comment-15547166
 ] 

Xiao Chen commented on HDFS-10797:
--

Thanks Sean for the new revs! I think the synchronized block is ok, as I don't 
think that will be a critical path to impact du performance. Not sure what 
[~jingzhao]'s take is on this one.

Some minor comments, all in {{ContentSummaryComputationContext}}: 
- In {{nodeIncluded}}, we safeguard {{includedNodes}} in a synchronized block, 
but we also provide a {{getIncludedNodes}} method, which could potentially be 
updated by the caller. No real usage yet, but I just feel this a bit unsafe in 
general, maybe return a clone of it instead?
- {{resolveINodeReference}} can be private
- Need a {{}} to separate paragraphs in javadoc.

Test looks great.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537650#comment-15537650
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 267 unchanged - 0 fixed = 268 total (was 267) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 55s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
| Timed out junit tests | org.apache.hadoop.hdfs.TestSmallBlock |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12831199/HDFS-10797.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2925a49ab866 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2549ee9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16961/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16961/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16961/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16961/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get coun

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537211#comment-15537211
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 267 unchanged - 0 fixed = 270 total (was 267) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
54s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.counts; 
locked 55% of time  Unsynchronized access at 
ContentSummaryComputationContext.java:55% of time  Unsynchronized access at 
ContentSummaryComputationContext.java:[line 91] |
| Failed junit tests | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.TestFileCorruption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12831164/HDFS-10797.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 7ddd29c9e58a 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2549ee9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16953/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16953/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16953/artifact/patchprocess/patch-unit-h

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536805#comment-15536805
 ] 

Sean Mackrory commented on HDFS-10797:
--

findbugs output is correct: I'm using synchronization to prevent any problems 
caused by the fact that we swap the counts object to tally up deleted, 
snapshotted space. We did this before, and it *should* be fine, since this is 
single-threaded everywhere I've seen, but thought it'd be wise to guard just in 
case. I'll synchronize the entire function on the context object instead and 
resubmit - other feedback welcome in the mean time, since that won't affect 
most of the patch.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536780#comment-15536780
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 267 unchanged - 0 fixed = 270 total (was 267) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
58s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 62m 
31s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Synchronization on ContentSummaryComputationContext.counts in futile 
attempt to guard it  At ContentSummaryComputationContext.java:attempt to guard 
it  At ContentSummaryComputationContext.java:[line 224] |
|  |  Synchronization on ContentSummaryComputationContext.counts in futile 
attempt to guard it  At ContentSummaryComputationContext.java:attempt to guard 
it  At ContentSummaryComputationContext.java:[line 232] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12831131/HDFS-10797.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 27f500bb7b8c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0670149 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16948/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16948/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
 |
|  Test Results | 
https://builds.apac

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15534425#comment-15534425
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 267 unchanged - 0 fixed = 270 total (was 267) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 47s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 85m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12830965/HDFS-10797.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2e30f32b8da2 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 10be459 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16933/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16933/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16933/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16933/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-29 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15534405#comment-15534405
 ] 

Xiao Chen commented on HDFS-10797:
--

Thank you for the good discussions here [~mackrorysd] and [~jingzhao]! Sorry I 
missed the scenario Jing mentioned. The semantic looks great to me.
Some nits:
- There's an used var {{count}} in {{INodeDirectory#computeContentSummary}} 
- {{ContentSummaryComputationContext#nodeIncluded}} 's java doc has some typos: 
{{both either}}

And more importantly, I think some update on the 
{{INodeDirectory#computeContentSummary}} logic: the snapshotCounts added by 
HDFS-8986 is supposed to count only contents under snapshots. Looks like this 
change break the unit test from that jira. I think the {{subtreeSummary}} is 
the problem here: we should only add the snapshot portion of the subtree into  
snapshotCounts.

Example unit test is {{TestDFSShell#testDuSnapshots}}. What do you guys think?


> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-29 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533611#comment-15533611
 ] 

Jing Zhao commented on HDFS-10797:
--

Actually my proposal is like your .005 patch. The current semantic and approach 
looks good to me overall.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-28 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15531432#comment-15531432
 ] 

Jing Zhao commented on HDFS-10797:
--

[~mackrorysd], I agree it will be great to have a consistent and user-friendly 
semantic. To me a better semantic can be like this: if the renamed source 
(which is inside of some snapshot) and the renamed target are both under the 
same directory for counting, we count them once. Otherwise they will be counted 
separately.

With this semantic maybe we only need to move your hashset to the context 
object passed from the beginning of the counting call, and use it to avoid 
duplicated counting. What do you think?

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-28 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15531199#comment-15531199
 ] 

Sean Mackrory commented on HDFS-10797:
--

Thanks for pointing that out [~jingzhao]. I added test cases to address some 
inter-directory renames. Of course, some of them are broken and still reported 
the wrong usage. I'd really like to come up with a way for the semantics to be 
both consistent and unsurprising to a user. I improved the situation somewhat 
by computing which nodes were deleted (as opposed to renames) in the context of 
all the diffs for a directory instead of each diff individually. So it's a step 
in the right direction but the real fix would be to have some global context 
when computing usage that ensures each INode in the hierarchy is counted 
exactly once. It looks to me like that's going to require some refactoring, 
since although the counts are cumulative, they can accumulate in multiple 
distinct objects before being combined. We would need to refactor some 
functions that so all counts were added directly to a single object, and that 
same object could prevent nodes from being counted twice, once because they 
were removed from a snapshotted directory, and again because of where they 
reside now.

Thoughts on this approach before I go further?

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524191#comment-15524191
 ] 

Jing Zhao commented on HDFS-10797:
--

Thanks for working on this, Sean. This is actually a known issue and the reason 
we currently choose to double the blocks is to make the semantic consistent 
with the scenario that the rename op moves the file out the original parent 
directory (e.g., into another snapshottable sub-tree or even a 
non-snapshottable subtree). In the later case, the moved files have to be 
counted in both the source directory and the target directory, since it belongs 
to snapshots of the source, and the current target directory.

The scenario described in this jira is a special case. I agree it's strange to 
find that a file is counted twice after renaming under the same directory. 
However, with the current change the semantic will become inconsistent. For 
e.g., we will sometimes count the file twice and sometimes only once in the 
following scenario:
1. move the file out of the original parent directory
2. optional: take a new snapshot
3. move the file back

Without step 2 then the file is counted only once, but with step 2 since the 
newly created item is in another snapshot diff the file will be counted twice. 
Or if the file is renamed into a subdirectory of its parent directory (/foo/bar 
--> /foo/subdir/bar), this file is still double counted. From the end user 
point of view this inconsistency is also strange.

So currently I'm leaning towards always doing double count. Thoughts?



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-23 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15517111#comment-15517111
 ] 

Xiao Chen commented on HDFS-10797:
--

I took another look of the patch, +1.

[~yzhangal] [~jingzhao], do you have any comments? I plan to commit this by 
next Tuesday if no objections. Thanks.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-16 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15496960#comment-15496960
 ] 

Xiao Chen commented on HDFS-10797:
--

Thanks Sean for revving! Patch 3 looks great to me.

Hi [~yzhangal] and [~jingzhao], could you please take a final look since you're 
more familiar with snapshots? Thanks in advance.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15496878#comment-15496878
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828842/HDFS-10797.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 31121d442c83 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / b09a03c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16770/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16770/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16770/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issue

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-15 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495504#comment-15495504
 ] 

Xiao Chen commented on HDFS-10797:
--

Thanks [~mackrorysd] for reporting and working on this!
I think you analysis is right, but would love to have [~jingzhao]'s input as 
well.

The patch looks great, I only have 2 small comments:
- We can add a test timeout
- Do you think it's worth adding a final scenario to the new test, that after 
everything is deleted (including snapshots) the size goes back to 0?

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494402#comment-15494402
 ] 

Hadoop QA commented on HDFS-10797:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 56m 
31s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828709/HDFS-10797.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 96ce17cd020e 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fcbac00 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16754/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16754/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot th

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493695#comment-15493695
 ] 

Hadoop QA commented on HDFS-10797:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 143 unchanged - 0 fixed = 144 total (was 143) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.TestDecommissionWithStriped |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828499/HDFS-10797.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 15e9d1d42a17 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 40b5a59 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16752/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16752/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16752/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16752/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Disk usage summary of snapshots causes renamed blocks to get counted twice
> -

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-09-01 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456729#comment-15456729
 ] 

Sean Mackrory commented on HDFS-10797:
--

Just a note for anyone reading my comments above, I learned that snapshots 
don't really work by maintaining references to immutable blocks, they work by 
maintaining a diff from the current state. So it's not a matter of counting up 
the correct blocks, it's a matter of ensuring renames, appends, etc. or 
properly accounted for from the diff. That can get tricky, since a file *could* 
get renamed multiple times, etc.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-08-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439451#comment-15439451
 ] 

Sean Mackrory commented on HDFS-10797:
--

It's worth pointing out that the df command also shows some potentially 
surprising increases during this procedure, but I believe the behavior is 
correct. It increases more than you'd expect when appending data, but I believe 
this is because the snapshotted block and the amended block now need to be 
different blocks. The increase doesn't happen when using block sizes that 
divide 100MB evenly, which supports this idea.

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-08-25 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438128#comment-15438128
 ] 

Sean Mackrory commented on HDFS-10797:
--

To reproduce the discrepancy you can follow the following procedure. I put a 
100 MB file into HDFS and snapshot it (hadoop fs -du -s reports 100 MB * 
replication after both operations), and then append another 100 MB onto it 
(hadoop fs -du -s will report 200 MB * replication factor at that point). If I 
move the file to trash or simply rename it, hadoop fs -du -s starts reporting 
300 MB * replication factor in the second column. I believe at this point it is 
counting some of the overlap in block between the snapshot and the regular file 
twice, because it views the move operation the same as a delete, but since the 
file wasn't actually deleted it gets counted again.
{quote}
dd if=/dev/zero of=100MB.zero bs=1 count=1
bin/hadoop fs -mkdir -p /user/sean
bin/hadoop fs -chown sean /user/sean
bin/hadoop fs -put 100MB.zero /user/sean/HDFS-10797

bin/hdfs dfsadmin -allowSnapshot /user/sean
bin/hdfs dfs -createSnapshot /user/sean s1

bin/hadoop fs -appendToFile 100MB.zero /user/sean/HDFS-10797

bin/hadoop fs -du -s /user/sean

bin/hadoop fs -rm /user/sean/HDFS-10797 # or simply rename with mv
bin/hadoop fs -du -s /user/sean
{quote}

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org