[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2018-04-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450725#comment-16450725
 ] 

Hudson commented on HDFS-11674:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
HDFS-11674. reserveSpaceForReplicas is not released if append request (xyao: 
rev be303c29906002a4cb1c00a47b7844cce3de591f)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java


> reserveSpaceForReplicas is not released if append request failed due to
> mirror down and replica recovered
> ------------------------------------------------------------------------
>
>                 Key: HDFS-11674
>                 URL: https://issues.apache.org/jira/browse/HDFS-11674
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
>         Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch, HDFS-11674-03.patch, HDFS-11674-branch-2.7-03.patch
>
>
> Scenario:
> 1. 3-node cluster with "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT. A block is written with x data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client was connected to before hitting the mirror failure have reservedSpaceForReplicas incremented, BUT it is never decremented.
> 2. So, in the long run, all of a DN's space ends up in reservedSpaceForReplicas, resulting in OutOfSpace errors.
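
For reference, the scenario above can be reproduced with a MiniDFSCluster along the lines of the sketch below. This is only an illustration under stated assumptions: it relies on the standard HDFS test APIs (MiniDFSCluster, DistributedFileSystem#recoverLease), stops a DataNode by index rather than resetting the pipeline the way the committed TestSpaceReservation test does, and leaves out the assertions on the DataNodes' reservedSpaceForReplicas.

{code:java}
// Minimal reproduction sketch (illustrative only, not the committed test).
Configuration conf = new HdfsConfiguration();
// Step 1: 3-DN cluster, DEFAULT replace-datanode-on-failure policy.
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "DEFAULT");
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
try {
  cluster.waitActive();
  DistributedFileSystem fs = cluster.getFileSystem();
  Path file = new Path("/reservedSpaceLeak");

  // Write the initial block data.
  try (FSDataOutputStream os = fs.create(file, (short) 3)) {
    os.write(new byte[1024]);
  }

  // Step 2: stop one of the mirror DataNodes. Stopping by index is a
  // simplification; the committed test resets the pipeline so that the
  // stopped DN is guaranteed not to be the first one.
  cluster.stopDataNode(1);

  // Step 3: the append fails because the dead pipeline DN cannot be replaced.
  try (FSDataOutputStream os = fs.append(file)) {
    os.write(new byte[1024]);
    os.hflush();
  } catch (IOException expected) {
    // Expected with the DEFAULT replace-datanode-on-failure policy.
  }

  // Steps 4-5: recover the lease until recovery succeeds.
  while (!fs.recoverLease(file)) {
    Thread.sleep(1000);
  }

  // Before this fix, the surviving DataNodes still account the append's
  // space in reservedSpaceForReplicas and never release it.
} finally {
  cluster.shutdown();
}
{code}

With the DEFAULT policy and only two live DNs, the append pipeline cannot be repaired, which is what drives the client to recoverLease() in step 4.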






[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007661#comment-16007661
 ] 

Brahma Reddy Battula commented on HDFS-11674:
-

[~vinayrpet] Nice catch! As the BR will not be closed in this scenario, the bytes will never be released.
+1, LGTM on branch-2.7 too. Test failures are unrelated.







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007658#comment-16007658
 ] 

Hadoop QA commented on HDFS-11674:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
59s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 245 unchanged - 2 fixed = 248 total (was 247) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2141 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m 
54s{color} | {color:red} The patch 70 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 47m 11s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_131 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.web.TestWebHdfsTokens |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
| JDK v1.7.0_121 Failed junit tests | 
hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.web.TestHttpsFileSystem |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.server.namenode.snapsho

[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007595#comment-16007595
 ] 

Hudson commented on HDFS-11674:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11728 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11728/])
HDFS-11674. reserveSpaceForReplicas is not released if append request 
(vinayakumarb: rev 1411612aa4e70c704b941723217ed4efd8a0125b)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java








[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007564#comment-16007564
 ] 

Vinayakumar B commented on HDFS-11674:
--

Thanks [~arpitagarwal] for the confirmation.
Committed to trunk, branch-2, branch-2.8, and branch-2.8.1.

Branch-2.7 has some conflicts; attaching a patch for it.







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007310#comment-16007310
 ] 

Arpit Agarwal commented on HDFS-11674:
--

bq. In the below part of the code, the block locations are queried first and then explicitly set as the pipeline for the test.
That makes sense, thanks. I reran the test a few more times and didn't see 
another failure.

+1 for the v3 patch.







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006001#comment-16006001
 ] 

Hadoop QA commented on HDFS-11674:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m  6s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11674 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12867499/HDFS-11674-03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8b39078151c2 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 51b671e |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19396/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19396/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19396/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19396/artifact/patchprocess/patch-asflicense-problems.txt
 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19396/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-10 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005886#comment-16005886
 ] 

Vinayakumar B commented on HDFS-11674:
--

bq. Could you please clarify how this part works? getBlockLocations sorts the blocks by network distance from the caller, randomizing replicas at the same distance. So lastBlock.getLocations()\[2\] may be the first replica in the pipeline sometimes.

In the below part of the code, the block locations are queried first and then explicitly set as the pipeline for the test. Also note that there is no 'sorting on distance' done for append calls; it is currently done only for getBlockLocations(). Maybe we could do that in a follow-up JIRA.
{code:java}
/*
 * Reset the pipeline for the append in such a way that, datanode which is
 * down is one of the mirror, not the first datanode.
 */
HdfsBlockLocation blockLocation = (HdfsBlockLocation) fs.getClient()
    .getBlockLocations(file.toString(), 0, BLOCK_SIZE)[0];
LocatedBlock lastBlock = blockLocation.getLocatedBlock();
// ...
DFSTestUtil.setPipeline((DFSOutputStream) os.getWrappedStream(),
    lastBlock);{code}

bq. I ran this test 5 times and it timed out once waiting for the file to be 
closed. I didn't debug it further though.
I will also check again; not sure what's wrong. But I am sure it's not because of the current change or the test. Could you paste the console logs if possible?







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-10 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005659#comment-16005659
 ] 

Arpit Agarwal commented on HDFS-11674:
--

I ran this test 5 times and it timed out once waiting for the file to be 
closed. I didn't debug it further though.
{code}
"Thread-254"  prio=5 tid=465 runnable
java.lang.Thread.State: RUNNABLE
at java.lang.Thread.dumpThreads(Native Method)
at java.lang.Thread.getAllStackTraces(Thread.java:1607)
at 
org.apache.hadoop.test.TimedOutTestsListener.buildThreadDump(TimedOutTestsListener.java:87)
at 
org.apache.hadoop.test.TimedOutTestsListener.buildThreadDiagnosticString(TimedOutTestsListener.java:73)
at 
org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:277)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation.testReservedSpaceForLeaseRecovery(TestSpaceReservation.java:730)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
"pool-56-thread-1" daemon prio=5 tid=596 timed_waiting
{code}
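
For context, the wait that timed out here is the GenericTestUtils.waitFor poll visible in the stack trace; a minimal sketch of that idiom is below, assuming the test polls DistributedFileSystem#isFileClosed after lease recovery (the exact predicate, the fs/file variables, and the intervals are assumptions, not necessarily what testReservedSpaceForLeaseRecovery uses).

{code:java}
// Sketch of the polling idiom shown in the stack trace above; the predicate
// and timeouts are assumptions, not the exact values used by the test.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    try {
      // Wait until lease recovery has closed the file.
      return fs.isFileClosed(file);
    } catch (IOException e) {
      return false;
    }
  }
}, 500, 30000); // poll every 500 ms; on timeout waitFor throws a TimeoutException
                // whose message includes a thread dump like the one above
{code}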







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-10 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005613#comment-16005613
 ] 

Arpit Agarwal commented on HDFS-11674:
--

+1 for the patch. I am not clear on one thing in the test case:
{code}
/*
 * Reset the pipeline for the append in such a way that, datanode which is
 * down is one of the mirror, not the first datanode.
 */
HdfsBlockLocation blockLocation = (HdfsBlockLocation) fs.getClient()
    .getBlockLocations(file.toString(), 0, BLOCK_SIZE)[0];
LocatedBlock lastBlock = blockLocation.getLocatedBlock();
// stop 3rd node.
cluster.stopDataNode(lastBlock.getLocations()[2].getName());
{code}
Could you please clarify how this part works? getBlockLocations sorts the blocks by network distance from the caller, randomizing replicas at the same distance. So {{lastBlock.getLocations()\[2\]}} may be the first replica in the pipeline sometimes.







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-04 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997682#comment-15997682
 ] 

Arpit Agarwal commented on HDFS-11674:
--

Hi [~vinayrpet], I'll take a look at it. Thanks for the heads up.







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994262#comment-15994262
 ] 

Vinayakumar B commented on HDFS-11674:
--

Hi [~arpitagarwal], can you please review this?







[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992512#comment-15992512
 ] 

Hadoop QA commented on HDFS-11674:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
57s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestStartup |
|   | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11674 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865878/HDFS-11674-02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0df061d12f9e 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / b0f54ea |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19260/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19260/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19260/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19260/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-04-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15974272#comment-15974272
 ] 

Hadoop QA commented on HDFS-11674:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 185 unchanged - 0 fixed = 187 total (was 185) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 90m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HDFS-11674 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12863960/HDFS-11674-01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 66d14fbf8867 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8c81a16 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19140/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19140/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19140/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19140/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19140/console |
| Powered by | Apac