[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Attachment: HDFS-11031-branch-2.002.patch

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, 
> HDFS-11031-branch-2.002.patch, HDFS-11031.000.patch, HDFS-11031.001.patch, 
> HDFS-11031.002.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Updated] (HDFS-11011) Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'

2016-10-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11011:
-
Attachment: HDFS-11011.006.patch

Posted v006 patch to fix checkstyle issues. Thanks for the reviews.

> Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'
> -
>
> Key: HDFS-11011
> URL: https://issues.apache.org/jira/browse/HDFS-11011
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: fs, shell, test
> Attachments: HDFS-11011.000.patch, HDFS-11011.001.patch, 
> HDFS-11011.002.patch, HDFS-11011.003.patch, HDFS-11011.004.patch, 
> HDFS-11011.005.patch, HDFS-11011.006.patch
>
>
> This proposes adding a bunch of unit tests for the commands 'dfsadmin 
> -setSpaceQuota' and 'dfsadmin -clrSpaceQuota'.
> 1. test to set space quota using a negative number.
> 2. test to set and clear space quota, regular usage.
> 3. test to set and clear space quota by storage type.
> 4. test to set and clear space quota when the directory doesn't exist.
> 5. test to set and clear space quota when the path is a file.
> 6. test to set and clear space quota when the user has no access right.
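As one illustration of case 1 above, a minimal sketch of such a test (assuming JUnit 4 and a running {{MiniDFSCluster}} held in {{cluster}}/{{conf}} fields; the test and path names are illustrative, not taken from the attached patches):

{code}
// Hypothetical test: '-setSpaceQuota' with a negative value must be
// rejected with a non-zero exit code.
@Test
public void testSetSpaceQuotaWithNegativeNumber() throws Exception {
  final DistributedFileSystem dfs = cluster.getFileSystem();
  final Path dir = new Path("/setquota");
  assertTrue(dfs.mkdirs(dir));
  final int ret = ToolRunner.run(new DFSAdmin(conf),
      new String[] {"-setSpaceQuota", "-10", dir.toString()});
  assertNotEquals("a negative space quota must be rejected", 0, ret);
}
{code}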






[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Attachment: HDFS-11031.002.patch

V2 patch addresses [~brahmareddy]'s comments.

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch, 
> HDFS-11031.001.patch, HDFS-11031.002.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Commented] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593926#comment-15593926
 ] 

Hadoop QA commented on HDFS-11031:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m  2s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834591/HDFS-11031.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 106e6d9a80e8 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17246/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17246/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17246/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  

[jira] [Commented] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File in Datanode.

2016-10-20 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593906#comment-15593906
 ] 

Virajith Jalaparti commented on HDFS-10638:
---

Failing test case is unrelated to the patch. 

> Modifications to remove the assumption that StorageLocation is associated 
> with java.io.File in Datanode.
> 
>
> Key: HDFS-10638
> URL: https://issues.apache.org/jira/browse/HDFS-10638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10638.001.patch, HDFS-10638.002.patch, 
> HDFS-10638.003.patch, HDFS-10638.004.patch, HDFS-10638.005.patch
>
>
> Changes to ensure that {{StorageLocation}} need not be associated with a 
> {{java.io.File}}. 






[jira] [Commented] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593873#comment-15593873
 ] 

Mingliang Liu commented on HDFS-11031:
--

[~brahmareddy] that's a good idea. I'll update the patch with a common helper 
method. Thanks.

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch, 
> HDFS-11031.001.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Commented] (HDFS-10730) Fix some failed tests due to BindException

2016-10-20 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593843#comment-15593843
 ] 

Brahma Reddy Battula commented on HDFS-10730:
-

[~linyiqun] thanks for updating the patch. Latest patch LGTM, will commit today.

> Fix some failed tests due to BindException
> --
>
> Key: HDFS-10730
> URL: https://issues.apache.org/jira/browse/HDFS-10730
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-10730.001.patch, HDFS-10730.002.patch
>
>
> In HDFS-10723, [~kihwal] suggested that 
> {quote}
> it is not a good idea to hard-code or reuse the same port number in unit 
> tests. Because the jenkins slave can run multiple jobs at the same time.
> {quote}
> Then I collected some tests which failed for this reason in recent Jenkins 
> builds.
> Finally I found these two failed tests 
> {{TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16301/testReport/)
>  and 
> {{TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16257/testReport/).
> The stack traces:
> {code}
> java.net.BindException: Problem binding to [localhost:57241] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:538)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:811)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2611)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:958)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:562)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:537)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:953)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1361)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2298)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2278)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:482)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1(TestFileChecksum.java:182)
> {code}
> {code}
> java.net.BindException: Problem binding to [localhost:54191] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:530)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:519)
>   at 
> org.apache.hadoop.hdfs.net.TcpPeerServer.<init>(TcpPeerServer.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1348)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
>   at 
> org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup(TestDecommissionWithStriped.java:255)
> {code}
> We can make a change 

[jira] [Commented] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593836#comment-15593836
 ] 

Brahma Reddy Battula commented on HDFS-11031:
-

[~liuml07] thanks for working on this. One straightforward question:
how about having one common method for all these test cases (which could take 
the data dir, the failed-volumes-tolerated number, ...)?
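For illustration, a minimal sketch of what such a shared helper might look like (assuming JUnit 4 and {{MiniDFSCluster}}; the method name and the permission trick are hypothetical, not taken from the attached patches):

{code}
// Hypothetical shared helper: start a one-DataNode cluster with the given
// failed-volumes tolerance, make volumesFailed data dirs unwritable, and
// assert whether the DataNode can (re)start.
private void checkDataNodeStartup(int volumesTolerated, int volumesFailed,
    boolean expectedToStart) throws Exception {
  Configuration conf = new HdfsConfiguration();
  conf.setInt(DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_KEY,
      volumesTolerated);
  MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
      .numDataNodes(1).storagesPerDatanode(4).build();
  File[] dataDirs = new File(cluster.getDataDirectory()).listFiles();
  try {
    for (int i = 0; i < volumesFailed; i++) {
      FileUtil.setWritable(dataDirs[i], false);  // simulate a failed volume
    }
    try {
      cluster.restartDataNode(0);
      assertTrue("DataNode started despite too many failed volumes",
          expectedToStart);
    } catch (IOException e) {
      assertFalse("DataNode unexpectedly failed to start", expectedToStart);
    }
  } finally {
    for (int i = 0; i < volumesFailed; i++) {
      FileUtil.setWritable(dataDirs[i], true);   // restore for cleanup
    }
    cluster.shutdown();
  }
}
{code}

Each case would then reduce to a one-line call, e.g. {{checkDataNodeStartup(0, 1, false)}}.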




> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch, 
> HDFS-11031.001.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Commented] (HDFS-10998) Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593810#comment-15593810
 ] 

Hudson commented on HDFS-10998:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10650 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10650/])
HDFS-10998. Add unit tests for HDFS command 'dfsadmin -fetchImage' in (liuml07: 
rev d7d87deece66333c188e9b7c10b4b56ddb529ce9)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFetchImage.java


> Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA
> 
>
> Key: HDFS-10998
> URL: https://issues.apache.org/jira/browse/HDFS-10998
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10998.000.patch, HDFS-10998.001.patch
>
>
> This proposes adding unit tests to verify fetchImage works well in the case 
> of HA.
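For illustration, the rough shape such a test could take (assuming an HA-enabled {{MiniDFSCluster}} in a {{cluster}} field with its {{conf}}; the test and directory names are illustrative, not from the attached patches):

{code}
// Hypothetical test: with nn0 active, 'dfsadmin -fetchImage' should
// succeed and download an fsimage file into the target directory.
@Test
public void testFetchImageHA() throws Exception {
  final File fetchDir = new File(System.getProperty("java.io.tmpdir"),
      "fetched-images");
  assertTrue(fetchDir.mkdirs() || fetchDir.isDirectory());
  cluster.transitionToActive(0);
  final int ret = ToolRunner.run(new DFSAdmin(conf),
      new String[] {"-fetchImage", fetchDir.getAbsolutePath()});
  assertEquals(0, ret);
  final File[] images = fetchDir.listFiles();
  assertTrue("expected a downloaded fsimage", images != null
      && images.length > 0);
}
{code}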






[jira] [Commented] (HDFS-11011) Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'

2016-10-20 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593785#comment-15593785
 ] 

Mingliang Liu commented on HDFS-11011:
--

+1 after the checkstyle warning is addressed. Thanks.

> Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'
> -
>
> Key: HDFS-11011
> URL: https://issues.apache.org/jira/browse/HDFS-11011
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: fs, shell, test
> Attachments: HDFS-11011.000.patch, HDFS-11011.001.patch, 
> HDFS-11011.002.patch, HDFS-11011.003.patch, HDFS-11011.004.patch, 
> HDFS-11011.005.patch
>
>
> This proposes adding a bunch of unit tests for the commands 'dfsadmin 
> -setSpaceQuota' and 'dfsadmin -clrSpaceQuota'.
> 1. test to set space quota using a negative number.
> 2. test to set and clear space quota, regular usage.
> 3. test to set and clear space quota by storage type.
> 4. test to set and clear space quota when the directory doesn't exist.
> 5. test to set and clear space quota when the path is a file.
> 6. test to set and clear space quota when the user has no access right.






[jira] [Updated] (HDFS-10998) Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10998:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to {{trunk}} through {{branch-2.8}}. Thanks for your 
contribution, [~xiaobingo].

> Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA
> 
>
> Key: HDFS-10998
> URL: https://issues.apache.org/jira/browse/HDFS-10998
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10998.000.patch, HDFS-10998.001.patch
>
>
> This proposes adding unit tests to verify fetchImage works well in the case 
> of HA.






[jira] [Updated] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File in Datanode.

2016-10-20 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-10638:
--
Status: Open  (was: Patch Available)

> Modifications to remove the assumption that StorageLocation is associated 
> with java.io.File in Datanode.
> 
>
> Key: HDFS-10638
> URL: https://issues.apache.org/jira/browse/HDFS-10638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10638.001.patch, HDFS-10638.002.patch, 
> HDFS-10638.003.patch, HDFS-10638.004.patch, HDFS-10638.005.patch
>
>
> Changes to ensure that {{StorageLocation}} need not be associated with a 
> {{java.io.File}}. 






[jira] [Commented] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File in Datanode.

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593750#comment-15593750
 ] 

Hadoop QA commented on HDFS-10638:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 564 unchanged - 7 fixed = 566 total (was 571) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10638 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834583/HDFS-10638.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 348d057a729b 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17245/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17245/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17245/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17245/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Modifications to remove the assumption that StorageLocation is associated 
> with java.io.File in Datanode.
> 

[jira] [Commented] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593747#comment-15593747
 ] 

Mingliang Liu commented on HDFS-11031:
--

I'd like to ping [~jnp] for code review.

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch, 
> HDFS-11031.001.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Attachment: HDFS-11031.001.patch

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch, 
> HDFS-11031.001.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Attachment: HDFS-11031-branch-2.001.patch

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031-branch-2.001.patch, HDFS-11031.000.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volume failures
> - DataNode should not start when it lacks data dir read/write permissions
> - ...






[jira] [Commented] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593639#comment-15593639
 ] 

Hadoop QA commented on HDFS-11031:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834577/HDFS-11031.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ec4acacd2ee7 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17244/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17244/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17244/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>

[jira] [Commented] (HDFS-11011) Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593637#comment-15593637
 ] 

Hadoop QA commented on HDFS-11011:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 19 new + 38 unchanged - 1 fixed = 57 total (was 39) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11011 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834575/HDFS-11011.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f28c7974d34c 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17243/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17243/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17243/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17243/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add unit tests for HDFS command 

[jira] [Updated] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File in Datanode.

2016-10-20 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-10638:
--
Status: Patch Available  (was: Open)

> Modifications to remove the assumption that StorageLocation is associated 
> with java.io.File in Datanode.
> 
>
> Key: HDFS-10638
> URL: https://issues.apache.org/jira/browse/HDFS-10638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10638.001.patch, HDFS-10638.002.patch, 
> HDFS-10638.003.patch, HDFS-10638.004.patch, HDFS-10638.005.patch
>
>
> Changes to ensure that {{StorageLocation}} need not be associated with a 
> {{java.io.File}}. 






[jira] [Updated] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File in Datanode.

2016-10-20 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-10638:
--
Attachment: HDFS-10638.005.patch

Attaching a new patch based on [~eddyxu]'s comments. It completely removes 
{{File}} from {{StorageLocation}} and uses a {{URI}} instead. One result of 
this is that {{VolumeFailureSummary.getFailedStorageLocations()}} now returns 
an array of strings that are URIs, not file paths. While 
{{VolumeFailureSummary}} is reported from the Datanode to the Namenode in 
heartbeats, {{VolumeFailureSummary.getFailedStorageLocations()}} is not used to 
determine which {{StorageLocation}}s actually failed. If this changes in the 
future, it should be noted that these strings are actually URIs.
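A rough sketch of the shape of this change (illustrative only, simplified from the description above rather than copied from the patch):

{code}
// StorageLocation keyed on a URI instead of a java.io.File.
public class StorageLocation {
  private final StorageType storageType;
  private final URI baseURI;   // previously: private final File file;

  StorageLocation(StorageType storageType, URI uri) {
    this.storageType = storageType;
    this.baseURI = uri;
  }

  public URI getUri() {
    return baseURI;
  }

  // Anything that consumes failed-location reports, e.g.
  // VolumeFailureSummary.getFailedStorageLocations(), now sees URI
  // strings rather than file paths.
  @Override
  public String toString() {
    return "[" + storageType + "]" + baseURI;
  }
}
{code}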

> Modifications to remove the assumption that StorageLocation is associated 
> with java.io.File in Datanode.
> 
>
> Key: HDFS-10638
> URL: https://issues.apache.org/jira/browse/HDFS-10638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10638.001.patch, HDFS-10638.002.patch, 
> HDFS-10638.003.patch, HDFS-10638.004.patch, HDFS-10638.005.patch
>
>
> Changes to ensure that {{StorageLocation}} need not be associated with a 
> {{java.io.File}}. 






[jira] [Commented] (HDFS-10998) Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA

2016-10-20 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593541#comment-15593541
 ] 

Mingliang Liu commented on HDFS-10998:
--

+1. Will commit shortly.

> Add unit tests for HDFS command 'dfsadmin -fetchImage' in HA
> 
>
> Key: HDFS-10998
> URL: https://issues.apache.org/jira/browse/HDFS-10998
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10998.000.patch, HDFS-10998.001.patch
>
>
> This proposes adding unit tests to verify fetchImage works well in the case 
> of HA.






[jira] [Commented] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593514#comment-15593514
 ] 

Arpit Agarwal commented on HDFS-11005:
--

Thanks for the heads up [~vagarychen]. I will also review this patch tomorrow.

> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch, 
> HDFS-11005-HDFS-7240.002.patch
>
>
> TestBlockPoolManager.testFederationRefresh  fails in the ozone branch with 
> the following error message.
> {noformat}
> Running org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.231 sec <<< 
> FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 0.043 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager.testFederationRefresh(TestBlockPoolManager.java:123)
> Results :
> Failed tests:
>   TestBlockPoolManager.testFederationRefresh:123 expected: refresh #2]
> > but was: refresh #1]
> >
> {noformat}






[jira] [Commented] (HDFS-10730) Fix some failed tests due to BindException

2016-10-20 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593496#comment-15593496
 ] 

Yiqun Lin commented on HDFS-10730:
--

The Jenkins result looks good; feel free to commit, [~brahmareddy]. If you 
want to make any other change, just let me know. :)

> Fix some failed tests due to BindException
> --
>
> Key: HDFS-10730
> URL: https://issues.apache.org/jira/browse/HDFS-10730
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-10730.001.patch, HDFS-10730.002.patch
>
>
> In HDFS-10723, [~kihwal] suggested that 
> {quote}
> it is not a good idea to hard-code or reuse the same port number in unit 
> tests. Because the jenkins slave can run multiple jobs at the same time.
> {quote}
> Then I collected some tests which failed for this reason in recent Jenkins 
> builds.
> Finally I found these two failed tests 
> {{TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16301/testReport/)
>  and 
> {{TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16257/testReport/).
> The stack traces:
> {code}
> java.net.BindException: Problem binding to [localhost:57241] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:538)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:811)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2611)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:958)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:562)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:537)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:953)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1361)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2298)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2278)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:482)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1(TestFileChecksum.java:182)
> {code}
> {code}
> java.net.BindException: Problem binding to [localhost:54191] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:530)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:519)
>   at 
> org.apache.hadoop.hdfs.net.TcpPeerServer.<init>(TcpPeerServer.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1348)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
>   at 
> 

[jira] [Commented] (HDFS-11025) TestDiskspaceQuotaUpdate fails in trunk due to Bind exception

2016-10-20 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593480#comment-15593480
 ] 

Yiqun Lin commented on HDFS-11025:
--

Thanks [~brahmareddy] for the commit in this JIRA and in HDFS-10699!

> TestDiskspaceQuotaUpdate fails in trunk due to Bind exception
> -
>
> Key: HDFS-11025
> URL: https://issues.apache.org/jira/browse/HDFS-11025
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11025.001.patch
>
>
> The test {{TestDiskspaceQuotaUpdate}} sometimes fails after HDFS-10843; see 
> https://builds.apache.org/job/PreCommit-HDFS-Build/17200/testReport/. The 
> stack trace:
> {code} 
> java.net.BindException: Problem binding to [localhost:49195] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> {code} 
> I found the bind exception happened in the new test method 
> {{TestDiskspaceQuotaUpdate.testQuotaIssuesWhileCommitting}}. The related 
> code:
> {code}
>   public void testQuotaIssuesWhileCommitting() throws Exception {
> ...
> try {
>   for (int i = REPLICATION - 1; i > 0; i--) {
> dnprops.add(cluster.stopDataNode(i));
>   }
>   ...
> } finally {
>   for (MiniDFSCluster.DataNodeProperties dnprop : dnprops) {
> cluster.restartDataNode(dnprop, true);
>   }
>   cluster.waitActive();
> }
>   }
> {code}
> I think we can make a simple fix in {{cluster.restartDataNode(dnprop, 
> true);}}. The tests in {{TestDiskspaceQuotaUpdate}} only care that the 
> cluster is up and running, so I think this change will not affect the 
> current logic.
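Concretely, the simple fix described would presumably flip the {{keepPort}} flag in the {{finally}} block shown above (a sketch; the actual patch may differ):

{code}
} finally {
  for (MiniDFSCluster.DataNodeProperties dnprop : dnprops) {
    // keepPort=false: let the restarted DataNode bind a fresh ephemeral
    // port rather than re-binding the old one, which another job on the
    // same Jenkins slave may have taken in the meantime.
    cluster.restartDataNode(dnprop, false);
  }
  cluster.waitActive();
}
{code}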






[jira] [Commented] (HDFS-11018) Incorrect check and message in FsDatasetImpl#invalidate

2016-10-20 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593466#comment-15593466
 ] 

Yiqun Lin commented on HDFS-11018:
--

Thanks [~jojochuang] for the commit!

> Incorrect check and message in FsDatasetImpl#invalidate
> ---
>
> Key: HDFS-11018
> URL: https://issues.apache.org/jira/browse/HDFS-11018
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11018.001.patch, HDFS-11018.002.patch, 
> HDFS-11018.003.patch
>
>
> The following error check and message are incorrect, because {{info}} is null 
> if (1) the block id does not exist in the ReplicaMap or (2) the generation 
> stamp of the block does not match the replica entry in the ReplicaMap.
> {code:title=FsDatasetImpl#invalidate}
>final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
> if (info == null) {
>   // It is okay if the block is not found -- it may be deleted 
> earlier.
>   LOG.info("Failed to delete replica " + invalidBlks[i]
>   + ": ReplicaInfo not found.");
>   continue;
> }
> if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) 
> {
>   errors.add("Failed to delete replica " + invalidBlks[i]
>   + ": GenerationStamp not matched, info=" + info);
>   continue;
> }
> {code}
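For illustration, a sketch of a check that would distinguish the two cases (assuming {{ReplicaMap}} can also be looked up by block ID alone; this is not necessarily the committed fix):

{code}
// Look the replica up by block ID alone, so a missing replica can be told
// apart from a generation-stamp mismatch and reported accurately.
final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i].getBlockId());
if (info == null) {
  // It is okay if the block is not found -- it may be deleted earlier.
  LOG.info("Failed to delete replica " + invalidBlks[i]
      + ": ReplicaInfo not found.");
  continue;
}
if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) {
  errors.add("Failed to delete replica " + invalidBlks[i]
      + ": GenerationStamp not matched, existing replica is " + info);
  continue;
}
{code}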






[jira] [Commented] (HDFS-10757) KMSClientProvider combined with KeyProviderCache can result in wrong UGI being used

2016-10-20 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593419#comment-15593419
 ] 

Xiao Chen commented on HDFS-10757:
--

Thanks [~xyao] for the new rev, looks good to me.
Could you clarify what testing you have done with this patch?

> KMSClientProvider combined with KeyProviderCache can result in wrong UGI 
> being used
> ---
>
> Key: HDFS-10757
> URL: https://issues.apache.org/jira/browse/HDFS-10757
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-10757.00.patch, HDFS-10757.01.patch, 
> HDFS-10757.02.patch, HDFS-10757.03.patch
>
>
> ClientContext::get gets the context from CACHE via a config-setting-based 
> name, then KeyProviderCache stored in ClientContext gets the key provider 
> cached by URI from the configuration, too. These would return the same 
> KeyProvider regardless of current UGI.
> KMSClientProvider caches the UGI (actualUgi) in ctor; that means in 
> particular that all the users of DFS with KMSClientProvider in a process will 
> get the KMS token (along with other credentials) of the first user, via the 
> above cache.
> Either KMSClientProvider shouldn't store the UGI, or one of the caches should 
> be UGI-aware, like the FS object cache.
> Side note: the comment in createConnection that purports to handle the 
> different UGI doesn't seem to cover what it says it covers. In our case, we 
> have two unrelated UGIs with no auth (createRemoteUser) with a bunch of tokens, 
> including a KMS token, added.
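One possible shape for the "UGI-aware cache" option, analogous to how the {{FileSystem}} cache keys on the UGI (purely illustrative; class and field names are hypothetical, not the approach the patch necessarily takes):

{code}
// Hypothetical cache key: two entries collide only if both the provider
// URI and the UGI match, so different users get different providers.
static final class ProviderCacheKey {
  private final URI uri;
  private final UserGroupInformation ugi;

  ProviderCacheKey(URI uri, UserGroupInformation ugi) {
    this.uri = uri;
    this.ugi = ugi;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ProviderCacheKey)) {
      return false;
    }
    ProviderCacheKey that = (ProviderCacheKey) o;
    return uri.equals(that.uri) && ugi.equals(that.ugi);
  }

  @Override
  public int hashCode() {
    return 31 * uri.hashCode() + ugi.hashCode();
  }
}
{code}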






[jira] [Commented] (HDFS-11011) Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'

2016-10-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593422#comment-15593422
 ] 

Xiaobing Zhou commented on HDFS-11011:
--

Thank you [~liuml07] for the patch. I posted another patch, v005, which is based 
on v002. The reason is that a jumbo reusable function (as in v003 and v004) 
makes the code less readable and maintainable.

> Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'
> -
>
> Key: HDFS-11011
> URL: https://issues.apache.org/jira/browse/HDFS-11011
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: fs, shell, test
> Attachments: HDFS-11011.000.patch, HDFS-11011.001.patch, 
> HDFS-11011.002.patch, HDFS-11011.003.patch, HDFS-11011.004.patch, 
> HDFS-11011.005.patch
>
>
> This proposes adding a bunch of unit tests for the commands 'dfsadmin 
> setSpaceQuota' and 'dfsadmin clrSpaceQuota'.
> 1. test to set space quota using a negative number.
> 2. test to set and clear space quota, regular usage (see the sketch below).
> 3. test to set and clear space quota by storage type.
> 4. test to set and clear space quota when the directory doesn't exist.
> 5. test to set and clear space quota when the path is a file.
> 6. test to set and clear space quota when the user has no access right.
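
For context, a hedged sketch of what the regular set/clear case (no. 2 above) could look like with {{DFSAdmin}} (the directory name and quota value are placeholders):
{code}
DFSAdmin admin = new DFSAdmin(conf);
// set a 1 GB space quota on the directory, then clear it again
assertEquals(0, admin.run(new String[] {"-setSpaceQuota", "1g", "/quota-dir"}));
assertEquals(0, admin.run(new String[] {"-clrSpaceQuota", "/quota-dir"}));
{code}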



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10935) TestFileChecksum tests are failing after HDFS-10460 (Mac only?)

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593423#comment-15593423
 ] 

Wei-Chiu Chuang commented on HDFS-10935:


I have two local Hadoop repos: one is built with the native ISA-L lib and the 
other is not. The one without the native lib failed while the other passed. It 
seems the Java-based codec doesn't work as expected.

I also have other tests where, if I specifically corrupt 1, 2 or 3 stripes out 
of 9, it does not recover if Hadoop is built without the native lib.
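
If it helps narrow this down, the native-dependent assertions could be guarded on ISA-L availability; a hedged sketch, assuming the {{ErasureCodeNative}} helper from the ISA-L integration is available on the classpath:
{code}
import org.apache.hadoop.io.erasurecode.ErasureCodeNative;
import org.junit.Assume;

// Skip (rather than fail) when the native ISA-L coder is absent, so
// pure-Java codec problems show up as a separate signal.
Assume.assumeTrue("ISA-L is not loaded; skipping native-coder checks",
    ErasureCodeNative.isNativeCodeLoaded());
{code}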

> TestFileChecksum tests are failing after HDFS-10460 (Mac only?)
> ---
>
> Key: HDFS-10935
> URL: https://issues.apache.org/jira/browse/HDFS-10935
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: JDK 1.8.0_91 on Mac OS X Yosemite 10.10.5
>Reporter: Wei-Chiu Chuang
>Assignee: SammiChen
>
> On my Mac, TestFileChecksum has been failing since HDFS-10460. However, 
> the Jenkins jobs have not reported the failures. Maybe it's an issue with my 
> Mac or JDK.
> 9 out of 21 tests failed. 
> {noformat}
> java.lang.AssertionError: Checksum mismatches!
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery(TestFileChecksum.java:227)
>   at 
> org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocksRangeQuery10(TestFileChecksum.java:336)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Status: Patch Available  (was: Open)

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031.000.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volumes failure
> - DataNode should not start in case of lacking data dir read/write permission
> - ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11011) Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'

2016-10-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11011:
-
Attachment: HDFS-11011.005.patch

> Add unit tests for HDFS command 'dfsadmin -set/clrSpaceQuota'
> -
>
> Key: HDFS-11011
> URL: https://issues.apache.org/jira/browse/HDFS-11011
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: fs, shell, test
> Attachments: HDFS-11011.000.patch, HDFS-11011.001.patch, 
> HDFS-11011.002.patch, HDFS-11011.003.patch, HDFS-11011.004.patch, 
> HDFS-11011.005.patch
>
>
> This proposes adding a bunch of unit tests for the commands 'dfsadmin 
> setSpaceQuota' and 'dfsadmin clrSpaceQuota'.
> 1. test to set space quota using a negative number.
> 2. test to set and clear space quota, regular usage.
> 3. test to set and clear space quota by storage type.
> 4. test to set and clear space quota when the directory doesn't exist.
> 5. test to set and clear space quota when the path is a file.
> 6. test to set and clear space quota when the user has no access right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11031:
-
Attachment: HDFS-11031.000.patch

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11031.000.patch
>
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volumes failure
> - DataNode should not start in case of lacking data dir read/write permission
> - ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593413#comment-15593413
 ] 

Hadoop QA commented on HDFS-11005:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
23s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 58m 
35s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11005 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834560/HDFS-11005-HDFS-7240.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a1552164aace 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-7240 / c70775a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17241/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17241/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch, 
> 

[jira] [Commented] (HDFS-10757) KMSClientProvider combined with KeyProviderCache can result in wrong UGI being used

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593376#comment-15593376
 ] 

Hadoop QA commented on HDFS-10757:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
42s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10757 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834564/HDFS-10757.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d52b84b66568 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17242/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17242/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> KMSClientProvider combined with KeyProviderCache can result in wrong UGI 
> being used
> ---
>
> Key: HDFS-10757
> URL: https://issues.apache.org/jira/browse/HDFS-10757
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-10757.00.patch, 

[jira] [Commented] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593363#comment-15593363
 ] 

Hadoop QA commented on HDFS-11030:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
51s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
39s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 50 unchanged - 7 fixed = 50 total (was 57) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 48m 
20s{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_111. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | hadoop.hdfs.TestEncryptionZones |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-11030 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834547/HDFS-11030-branch-2.000.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2ebe64861e73 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / 1f384b6 |
| Default 

[jira] [Commented] (HDFS-6440) Support more than 2 NameNodes

2016-10-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593343#comment-15593343
 ] 

Arpit Agarwal commented on HDFS-6440:
-

Thank you for the quick response, Jesse.

> Support more than 2 NameNodes
> -
>
> Key: HDFS-6440
> URL: https://issues.apache.org/jira/browse/HDFS-6440
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: auto-failover, ha, namenode
>Affects Versions: 2.4.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 3.0.0-alpha1
>
> Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, hdfs-6440-trunk-v4.patch, 
> hdfs-6440-trunk-v5.patch, hdfs-6440-trunk-v6.patch, hdfs-6440-trunk-v7.patch, 
> hdfs-6440-trunk-v8.patch, hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.
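
For a sense of what this looks like operationally, a hypothetical hdfs-site.xml fragment for a three-NameNode nameservice (the nameservice and NameNode IDs below are placeholders):
{code:xml}
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2,nn3</value>
</property>
<!-- plus the usual per-NameNode address keys, e.g.
     dfs.namenode.rpc-address.mycluster.nn3 -->
{code}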



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9462) DiskBalancer: Add Scan Command

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593331#comment-15593331
 ] 

Hadoop QA commented on HDFS-9462:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project: The patch generated 6 new + 
3 unchanged - 0 fixed = 9 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
36s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 7 
unchanged - 0 fixed = 8 total (was 7) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
52s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Dead store to planId in 
org.apache.hadoop.hdfs.tools.DiskBalancerCLI.addScanCommands(Options)  At 
DiskBalancerCLI.java:org.apache.hadoop.hdfs.tools.DiskBalancerCLI.addScanCommands(Options)
  At DiskBalancerCLI.java:[line 472] |
| Failed junit tests | hadoop.hdfs.TestFileChecksum |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9462 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834552/HDFS-9462.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 82c0884da770 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 262827c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Updated] (HDFS-10757) KMSClientProvider combined with KeyProviderCache can result in wrong UGI being used

2016-10-20 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10757:
--
Attachment: HDFS-10757.03.patch

Thanks [~xiaochen] for the feedback. Attached a new patch that removes the 
check based on UserGroupInformation.AuthenticationMethod, as it is not reliable 
for non-server usage of KMSClientProvider.

Added logic to use the loginUser only if the currentUGI has neither a Kerberos 
credential nor a KMS delegation token. If the currentUGI has a Kerberos 
credential but no KMS delegation token, we should go through SPNEGO 
authentication rather than using the loginUGI directly.
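
In other words, the selection is roughly the following (a hedged sketch; {{containsKmsToken}} is an illustrative helper, not the actual patch code):
{code}
UserGroupInformation selectAuthUgi() throws IOException {
  UserGroupInformation currentUgi = UserGroupInformation.getCurrentUser();
  if (currentUgi.hasKerberosCredentials() || containsKmsToken(currentUgi)) {
    // Kerberos credentials drive SPNEGO; a KMS delegation token
    // authenticates directly. Either way, stay with the current user.
    return currentUgi;
  }
  // Neither Kerberos credentials nor a KMS token: fall back to loginUser.
  return UserGroupInformation.getLoginUser();
}
{code}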



> KMSClientProvider combined with KeyProviderCache can result in wrong UGI 
> being used
> ---
>
> Key: HDFS-10757
> URL: https://issues.apache.org/jira/browse/HDFS-10757
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-10757.00.patch, HDFS-10757.01.patch, 
> HDFS-10757.02.patch, HDFS-10757.03.patch
>
>
> ClientContext::get gets the context from CACHE via a name based on a config 
> setting, then KeyProviderCache stored in ClientContext gets the key provider 
> cached by URI from the configuration, too. These would return the same 
> KeyProvider regardless of current UGI.
> KMSClientProvider caches the UGI (actualUgi) in ctor; that means in 
> particular that all the users of DFS with KMSClientProvider in a process will 
> get the KMS token (along with other credentials) of the first user, via the 
> above cache.
> Either KMSClientProvider shouldn't store the UGI, or one of the caches should 
> be UGI-aware, like the FS object cache.
> Side note: the comment in createConnection that purports to handle the 
> different UGI doesn't seem to cover what it says it covers. In our case, we 
> have two unrelated UGIs with no auth (createRemoteUser) with a bunch of tokens, 
> including a KMS token, added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593258#comment-15593258
 ] 

Hadoop QA commented on HDFS-11005:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
37s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 
20s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11005 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834549/HDFS-11005-HDFS-7240.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 76cc859d527d 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-7240 / c70775a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17239/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17239/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch, 
> 

[jira] [Updated] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-11005:
--
Attachment: HDFS-11005-HDFS-7240.002.patch

Thanks [~xyao] for the review! The issue here was that if an exception happened 
in the {{DFSUtil.getNN...()}} calls, there would only be a log warning, and the 
following {{putAll}} call would then complain about a non-initialized variable, 
so the HashMap still needs to be created. In the v002 patch I changed it to only 
create the map when the exception happens, which is hopefully cleaner.
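
A minimal sketch of that shape (the method name is abbreviated as in the comment above; treat the details as an approximation of the patch rather than the patch itself):
{code}
Map<String, Map<String, InetSocketAddress>> newAddressMap = null;
try {
  newAddressMap = DFSUtil.getNNServiceRpcAddressesForCluster(conf);
} catch (IOException ioe) {
  LOG.warn("Unable to resolve NameNode addresses", ioe);
}
if (newAddressMap == null) {
  // Create the empty map only when resolution failed, so the later
  // putAll() call never sees an uninitialized variable.
  newAddressMap = new HashMap<>();
}
{code}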

> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch, 
> HDFS-11005-HDFS-7240.002.patch
>
>
> TestBlockPoolManager.testFederationRefresh  fails in the ozone branch with 
> the following error message.
> {noformat}
> Running org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.231 sec <<< 
> FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 0.043 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager.testFederationRefresh(TestBlockPoolManager.java:123)
> Results :
> Failed tests:
>   TestBlockPoolManager.testFederationRefresh:123 expected: refresh #2]
> > but was: refresh #1]
> >
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-11017) dfsadmin set/clrSpaceQuota fail to recognize StorageType option

2016-10-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou resolved HDFS-11017.
--
Resolution: Invalid

It seems there was a flaky error in my script. It's not an issue anymore; thank 
you [~xyao] for checking.

> dfsadmin set/clrSpaceQuota fail to recognize StorageType option
> ---
>
> Key: HDFS-11017
> URL: https://issues.apache.org/jira/browse/HDFS-11017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: cli
>
> dfsadmin setSpaceQuota or clrSpaceQuota don't recognize valid StorageType 
> options, such as DISK or SSD, however, It's been supported by DFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-11038) DiskBalancer: support running multiple commands under one setup of disk balancer

2016-10-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou resolved HDFS-11038.
--
Resolution: Implemented

As explained, this is already implemented in the HDFS-9462 patch. Closing this ticket.

> DiskBalancer: support running multiple commands under one setup of disk 
> balancer
> 
>
> Key: HDFS-11038
> URL: https://issues.apache.org/jira/browse/HDFS-11038
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> Disk balancer follows/reuses a rule designed for the HDFS balancer, that is, 
> only one instance is allowed to run at a time. This is correct in a 
> production system to avoid inconsistencies, but it's not ideal for writing 
> and running unit tests. For example, it should be possible to run the plan, 
> execute, and scan commands under one setup of the disk balancer. The 
> one-instance rule will throw an exception complaining 'Another instance is 
> running'. In such a case, there's no way to do a full life-cycle test that 
> involves a sequence of commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11038) DiskBalancer: support running multiple commands under one setup of disk balancer

2016-10-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593134#comment-15593134
 ] 

Xiaobing Zhou commented on HDFS-11038:
--

This can be fixed by closing the FileSystem instance maintained by 
org.apache.hadoop.hdfs.server.diskbalancer.command. See also 
[HDFS-9462.002.patch|https://issues.apache.org/jira/secure/attachment/12834552/HDFS-9462.002.patch].
{code}
/**
 * Cleans any resources held by this command.
 *
 * The main goal is to delete the id file created in
 * {@link org.apache.hadoop.hdfs.server.balancer.checkAndMarkRunning},
 * otherwise it's not allowed to run multiple commands in a row.
 */
@Override
public void close() throws IOException {
  if (fs != null) {
    fs.close();
  }
}
{code}
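
With {{close()}} in place, a test can run commands back to back, e.g. via try-with-resources (a hypothetical usage sketch; the command class and method names are assumptions):
{code}
try (Command plan = new PlanCommand(conf)) {
  plan.execute(cmd);   // the id file is removed when the command closes
}
try (Command scan = new ScanCommand(conf)) {
  scan.execute(cmd);   // no 'Another instance is running' error now
}
{code}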


> DiskBalancer: support running multiple commands under one setup of disk 
> balancer
> 
>
> Key: HDFS-11038
> URL: https://issues.apache.org/jira/browse/HDFS-11038
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> Disk balancer follows/reuses a rule designed for the HDFS balancer, that is, 
> only one instance is allowed to run at a time. This is correct in a 
> production system to avoid inconsistencies, but it's not ideal for writing 
> and running unit tests. For example, it should be possible to run the plan, 
> execute, and scan commands under one setup of the disk balancer. The 
> one-instance rule will throw an exception complaining 'Another instance is 
> running'. In such a case, there's no way to do a full life-cycle test that 
> involves a sequence of commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9462) DiskBalancer: Add Scan Command

2016-10-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593122#comment-15593122
 ] 

Xiaobing Zhou commented on HDFS-9462:
-

Thanks [~anu] for the reviews. Posted patch v002, which adds a couple of tests.

> DiskBalancer: Add Scan Command
> --
>
> Key: HDFS-9462
> URL: https://issues.apache.org/jira/browse/HDFS-9462
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9462-HDFS-10576.001.patch, 
> HDFS-9462-HDFS-1312.000.patch, HDFS-9462.002.patch
>
>
> This is to propose being able to scan all the nodes that we send various 
> plans to. In order to do the scan, the scan command will talk to all involved 
> data nodes through the cluster interface (HDFS-9449) and data models 
> (HDFS-9420), compare the hash tag it gets back to make sure the plan is the 
> one we are interested in, and print out the results.
> As a bonus, it should support printing out a diff of what happened when a 
> DiskBalancer run is complete, assuming the state of the cluster is saved to a 
> file before.json. There should be two kinds of diffs:
> 1. overall what happened in the cluster vs. before.json -- just a summary
> 2. for a specific node -- just like the report command, we should be able to 
> pass in a node and see the changes against before.json



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9462) DiskBalancer: Add Scan Command

2016-10-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9462:

Attachment: HDFS-9462.002.patch

> DiskBalancer: Add Scan Command
> --
>
> Key: HDFS-9462
> URL: https://issues.apache.org/jira/browse/HDFS-9462
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9462-HDFS-10576.001.patch, 
> HDFS-9462-HDFS-1312.000.patch, HDFS-9462.002.patch
>
>
> This is to propose being able to scan all the nodes that we send various 
> plans to. In order to do the scan, the scan command will talk to all involved 
> data nodes through the cluster interface (HDFS-9449) and data models 
> (HDFS-9420), compare the hash tag it gets back to make sure the plan is the 
> one we are interested in, and print out the results.
> As a bonus, it should support printing out a diff of what happened when a 
> DiskBalancer run is complete, assuming the state of the cluster is saved to a 
> file before.json. There should be two kinds of diffs:
> 1. overall what happened in the cluster vs. before.json -- just a summary
> 2. for a specific node -- just like the report command, we should be able to 
> pass in a node and see the changes against before.json



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593068#comment-15593068
 ] 

Chen Liang commented on HDFS-11005:
---

[~arpitagarwal], do you mind taking a look at the patch, since the {{putAll}} 
call was introduced in HDFS-10363? Thanks!

> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch
>
>
> TestBlockPoolManager.testFederationRefresh  fails in the ozone branch with 
> the following error message.
> {noformat}
> Running org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.231 sec <<< 
> FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 0.043 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager.testFederationRefresh(TestBlockPoolManager.java:123)
> Results :
> Failed tests:
>   TestBlockPoolManager.testFederationRefresh:123 expected: refresh #2]
> > but was: refresh #1]
> >
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11005) Ozone: TestBlockPoolManager fails in ozone branch.

2016-10-20 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-11005:
--
Assignee: Chen Liang
  Status: Patch Available  (was: Open)

> Ozone: TestBlockPoolManager fails in ozone branch.
> --
>
> Key: HDFS-11005
> URL: https://issues.apache.org/jira/browse/HDFS-11005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Chen Liang
> Attachments: HDFS-11005-HDFS-7240.001.patch
>
>
> TestBlockPoolManager.testFederationRefresh  fails in the ozone branch with 
> the following error message.
> {noformat}
> Running org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.231 sec <<< 
> FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 0.043 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager.testFederationRefresh(TestBlockPoolManager.java:123)
> Results :
> Failed tests:
>   TestBlockPoolManager.testFederationRefresh:123 expected: refresh #2]
> > but was: refresh #1]
> >
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Status: Patch Available  (was: Open)

> TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)
> -
>
> Key: HDFS-11030
> URL: https://issues.apache.org/jira/browse/HDFS-11030
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 2.7.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11030-branch-2.000.patch, HDFS-11030.000.patch
>
>
> TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
> blocks and files are replicated correctly.
> # To fail a volume, it deletes all the blocks and sets the data dir read only.
> {code:title=testVolumeFailure() snippet}
> // fail the volume
> // delete/make non-writable one of the directories (failed volume)
> data_fail = new File(dataDir, "data3");
> failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
> cluster.getNamesystem().getBlockPoolId());
> if (failedDir.exists() &&
> //!FileUtil.fullyDelete(failedDir)
> !deteteBlocks(failedDir)
> ) {
>   throw new IOException("Could not delete hdfs directory '" + failedDir + 
> "'");
> }
> data_fail.setReadOnly();
> failedDir.setReadOnly();
> {code}
> However, there are two bugs here, which leave the blocks undeleted.
> #- The {{failedDir}} directory for finalized blocks is not calculated 
> correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
> directory.
> #- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
> there are no subdirectories in the data dir. This assumption was also noted 
> in the comments.
> {quote}
> // we use only small number of blocks to avoid creating subdirs in the 
> data dir..
> {quote}
> This is not true. On my local cluster and in MiniDFSCluster, there will be 
> two levels of directories (subdir0/subdir0/) regardless of the number of 
> blocks.
> # Meanwhile, to fail a volume, the test also needs to trigger the DataNode to 
> remove the volume and send a block report to the NN. This is basically done 
> in the {{triggerFailure()}} method.
> {code}
>   private void triggerFailure(String path, long size) throws IOException {
> NamenodeProtocols nn = cluster.getNameNodeRpc();
> List locatedBlocks =
>   nn.getBlockLocations(path, 0, size).getLocatedBlocks();
> 
> for (LocatedBlock lb : locatedBlocks) {
>   DatanodeInfo dinfo = lb.getLocations()[1];
>   ExtendedBlock b = lb.getBlock();
>   try {
> accessBlock(dinfo, lb);
>   } catch (IOException e) {
> System.out.println("Failure triggered, on block: " + b.getBlockId() + 
>  
> "; corresponding volume should be removed by now");
> break;
>   }
> }
>   }
> {code}
> Accessing those blocks will not trigger failures if the directory is 
> read-only (while the block files are all there). I ran the tests multiple 
> times without triggering this failure. We have to write new block files to 
> the data directories, or make sure the blocks were deleted correctly. I think 
> we need to add some assertion code after triggering the volume failure. The 
> assertions should check the datanode volume failure summary explicitly to 
> make sure a volume failure is triggered (and noticed).
> # To make sure the NameNode is aware of the volume failure, the code 
> explicitly sends block reports to the NN.
> {code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
> cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
> new BlockReportContext(1, 0, System.nanoTime(), 0, false));
> {code}
> The block report generating code is complex; it is actually the internal 
> logic of {{BPServiceActor}}, and we may have to update this code when it 
> changes. In fact, the volume failure is now sent by the DataNode via 
> heartbeats. We should trigger a heartbeat request here, and make sure the 
> NameNode handles the heartbeat before we verify the block states.
> # When verifying via {{verify()}}, it counts the real block files and asserts 
> that real block files plus underreplicated blocks should cover all blocks. 
> Before counting underreplicated blocks, it triggered the {{BlockManager}} to 
> compute the datanode work:
> {code}
> // force update of all the metric counts by calling computeDatanodeWork
> BlockManagerTestUtil.getComputedDatanodeWork(fsn.getBlockManager());
> {code}
> However, counting physical block files and underreplicated blocks is not 
> atomic. The NameNode will inform the DataNode of the computed work at the 
> next heartbeat. So I think this part of the code may fail when some blocks 
> are replicated and the number of physical block files 

[jira] [Assigned] (HDFS-11031) Add additional unit test for DataNode startup behavior when volumes fail

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HDFS-11031:


Assignee: Mingliang Liu

> Add additional unit test for DataNode startup behavior when volumes fail
> 
>
> Key: HDFS-11031
> URL: https://issues.apache.org/jira/browse/HDFS-11031
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> There are several cases to add in {{TestDataNodeVolumeFailure}}:
> - DataNode should not start in case of volumes failure
> - DataNode should not start in case of lacking data dir read/write permission
> - ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Attachment: HDFS-11030-branch-2.000.patch

> TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)
> -
>
> Key: HDFS-11030
> URL: https://issues.apache.org/jira/browse/HDFS-11030
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 2.7.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11030-branch-2.000.patch, HDFS-11030.000.patch
>
>
> TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
> blocks and files are replicated correctly.
> # To fail a volume, it deletes all the blocks and sets the data dir read-only.
> {code:title=testVolumeFailure() snippet}
> // fail the volume
> // delete/make non-writable one of the directories (failed volume)
> data_fail = new File(dataDir, "data3");
> failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
> cluster.getNamesystem().getBlockPoolId());
> if (failedDir.exists() &&
> //!FileUtil.fullyDelete(failedDir)
> !deteteBlocks(failedDir)
> ) {
>   throw new IOException("Could not delete hdfs directory '" + failedDir + 
> "'");
> }
> data_fail.setReadOnly();
> failedDir.setReadOnly();
> {code}
> However, there are two bugs here which prevent the blocks from being deleted.
> #- The {{failedDir}} directory for finalized blocks is not calculated 
> correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
> directory.
> #- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
> there are no subdirectories in the data dir. This assumption was also noted 
> in the comments.
> {quote}
> // we use only small number of blocks to avoid creating subdirs in the 
> data dir..
> {quote}
> This is not true. On my local cluster and in MiniDFSCluster, there are two 
> levels of directories (subdir0/subdir0/) regardless of the number of blocks.
> # Meanwhile, to fail a volume, the test also needs to trigger the DataNode 
> to remove the volume and send a block report to the NN. This is basically 
> done in the {{triggerFailure()}} method.
> {code}
>   private void triggerFailure(String path, long size) throws IOException {
> NamenodeProtocols nn = cluster.getNameNodeRpc();
> List<LocatedBlock> locatedBlocks =
>   nn.getBlockLocations(path, 0, size).getLocatedBlocks();
> 
> for (LocatedBlock lb : locatedBlocks) {
>   DatanodeInfo dinfo = lb.getLocations()[1];
>   ExtendedBlock b = lb.getBlock();
>   try {
> accessBlock(dinfo, lb);
>   } catch (IOException e) {
> System.out.println("Failure triggered, on block: " + b.getBlockId() + 
>  
> "; corresponding volume should be removed by now");
> break;
>   }
> }
>   }
> {code}
> Accessing those blocks will not trigger failures if the directory is 
> read-only (while the block files are all there). I ran the tests multiple 
> times without triggering this failure. Either new block files have to be 
> written to the data directories, or the blocks should have been deleted 
> correctly in the first place. I think we need to add some assertion code 
> after triggering the volume failure. The assertions should check the 
> datanode volume failure summary explicitly to make sure a volume failure is 
> triggered (and noticed).
> # To make sure the NameNode is aware of the volume failure, the code 
> explicitly sends block reports to the NN.
> {code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
> cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
> new BlockReportContext(1, 0, System.nanoTime(), 0, false));
> {code}
> The block report generation code is complex; it is really the internal logic 
> of {{BPServiceActor}}, and we may have to update this code whenever it 
> changes. In fact, the volume failure is now sent by the DataNode via 
> heartbeats. We should trigger a heartbeat request here, and make sure the 
> NameNode handles the heartbeat before we verify the block states.
> # When verifying via {{verify()}}, it counts the real block files and 
> asserts that the real block files plus under-replicated blocks cover all 
> blocks. Before counting under-replicated blocks, it triggers the 
> {{BlockManager}} to compute the datanode work:
> {code}
> // force update of all the metric counts by calling computeDatanodeWork
> BlockManagerTestUtil.getComputedDatanodeWork(fsn.getBlockManager());
> {code}
> However, counting physical block files and counting under-replicated blocks 
> is not atomic. The NameNode will inform the DataNode of the computed work at 
> the next heartbeat. So I think this part of the code may fail when some 
> blocks are replicated and the count of physical block files becomes stale.

[jira] [Commented] (HDFS-11036) Ozone : reuse Xceiver connection

2016-10-20 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592986#comment-15592986
 ] 

Anu Engineer commented on HDFS-11036:
-

[~vagarychen] Thanks for this patch. It is a greatly needed improvement. I had 
a minor suggestion. Instead of holding a lock on the map while iterating 
through it, would it make sense to use a real LRU, so that we know which 
clients need closing? My worry is that this thread will start up during a 
heavily loaded session and will block I/O threads. So if we can do it in a way 
that is guaranteed not to affect I/O threads, that would be good.
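
As a rough illustration of the LRU idea (a minimal sketch, not the patch: it 
assumes the cached clients are {{Closeable}} and that callers synchronize 
access externally):

{code:title=LRU sketch}
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Access-ordered LRU that closes and evicts the eldest client inline on
// insert, so no background sweeper has to lock and scan the whole map
// while I/O threads wait.
class LruClientCache<K, V extends Closeable> extends LinkedHashMap<K, V> {
  private final int maxClients;

  LruClientCache(int maxClients) {
    super(16, 0.75f, true); // accessOrder=true gives LRU iteration order
    this.maxClients = maxClients;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    if (size() > maxClients) {
      try {
        eldest.getValue().close(); // close the coldest connection
      } catch (IOException ignored) {
        // best-effort close
      }
      return true; // evict it from the map
    }
    return false;
  }
}
{code}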



> Ozone : reuse Xceiver connection
> 
>
> Key: HDFS-11036
> URL: https://issues.apache.org/jira/browse/HDFS-11036
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11036-HDFS-7240.001.patch, 
> HDFS-11036-HDFS-7240.002.patch, HDFS-11036-HDFS-7240.003.patch
>
>
> Currently, every IO operation calling into XceiverClientManager will 
> open/close a connection; this JIRA proposes to reuse connections to reduce 
> connection setup/shutdown overhead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Attachment: HDFS-11030.000.patch

Could anyone review the JIRA and the patch? Thanks.

Ping [~jnp] and [~arpitagarwal].

> TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)
> -
>
> Key: HDFS-11030
> URL: https://issues.apache.org/jira/browse/HDFS-11030
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 2.7.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-11030.000.patch
>
>
> TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
> blocks and files are replicated correctly.
> # To fail a volume, it deletes all the blocks and sets the data dir read-only.
> {code:title=testVolumeFailure() snippet}
> // fail the volume
> // delete/make non-writable one of the directories (failed volume)
> data_fail = new File(dataDir, "data3");
> failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
> cluster.getNamesystem().getBlockPoolId());
> if (failedDir.exists() &&
> //!FileUtil.fullyDelete(failedDir)
> !deteteBlocks(failedDir)
> ) {
>   throw new IOException("Could not delete hdfs directory '" + failedDir + 
> "'");
> }
> data_fail.setReadOnly();
> failedDir.setReadOnly();
> {code}
> However, there are two bugs here which prevent the blocks from being deleted.
> #- The {{failedDir}} directory for finalized blocks is not calculated 
> correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
> directory.
> #- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
> there are no subdirectories in the data dir. This assumption was also noted 
> in the comments.
> {quote}
> // we use only small number of blocks to avoid creating subdirs in the 
> data dir..
> {quote}
> This is not true. On my local cluster and in MiniDFSCluster, there are two 
> levels of directories (subdir0/subdir0/) regardless of the number of blocks.
> # Meanwhile, to fail a volume, the test also needs to trigger the DataNode 
> to remove the volume and send a block report to the NN. This is basically 
> done in the {{triggerFailure()}} method.
> {code}
>   private void triggerFailure(String path, long size) throws IOException {
> NamenodeProtocols nn = cluster.getNameNodeRpc();
> List<LocatedBlock> locatedBlocks =
>   nn.getBlockLocations(path, 0, size).getLocatedBlocks();
> 
> for (LocatedBlock lb : locatedBlocks) {
>   DatanodeInfo dinfo = lb.getLocations()[1];
>   ExtendedBlock b = lb.getBlock();
>   try {
> accessBlock(dinfo, lb);
>   } catch (IOException e) {
> System.out.println("Failure triggered, on block: " + b.getBlockId() + 
>  
> "; corresponding volume should be removed by now");
> break;
>   }
> }
>   }
> {code}
> Accessing those blocks will not trigger failures if the directory is 
> read-only (while the block files are all there). I ran the tests multiple 
> times without triggering this failure. Either new block files have to be 
> written to the data directories, or the blocks should have been deleted 
> correctly in the first place. I think we need to add some assertion code 
> after triggering the volume failure. The assertions should check the 
> datanode volume failure summary explicitly to make sure a volume failure is 
> triggered (and noticed).
> # To make sure the NameNode is aware of the volume failure, the code 
> explicitly sends block reports to the NN.
> {code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
> cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
> new BlockReportContext(1, 0, System.nanoTime(), 0, false));
> {code}
> The block report generation code is complex; it is really the internal logic 
> of {{BPServiceActor}}, and we may have to update this code whenever it 
> changes. In fact, the volume failure is now sent by the DataNode via 
> heartbeats. We should trigger a heartbeat request here, and make sure the 
> NameNode handles the heartbeat before we verify the block states.
> # When verifying via {{verify()}}, it counts the real block files and 
> asserts that the real block files plus under-replicated blocks cover all 
> blocks. Before counting under-replicated blocks, it triggers the 
> {{BlockManager}} to compute the datanode work:
> {code}
> // force update of all the metric counts by calling computeDatanodeWork
> BlockManagerTestUtil.getComputedDatanodeWork(fsn.getBlockManager());
> {code}
> However, counting physical block files and counting under-replicated blocks 
> is not atomic. The NameNode will inform the DataNode of the computed work at 
> the next heartbeat. So I think this part of the code may fail when some 
> blocks are replicated and the count of physical block files becomes stale.

[jira] [Updated] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Description: 
TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
blocks and files are replicated correctly.

# To fail a volume, it deletes all the blocks and sets the data dir read-only.
{code:title=testVolumeFailure() snippet}
// fail the volume
// delete/make non-writable one of the directories (failed volume)
data_fail = new File(dataDir, "data3");
failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
cluster.getNamesystem().getBlockPoolId());
if (failedDir.exists() &&
//!FileUtil.fullyDelete(failedDir)
!deteteBlocks(failedDir)
) {
  throw new IOException("Could not delete hdfs directory '" + failedDir + 
"'");
}
data_fail.setReadOnly();
failedDir.setReadOnly();
{code}
However, there are two bugs here which prevent the blocks from being deleted.
#- The {{failedDir}} directory for finalized blocks is not calculated 
correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
directory.
#- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
there are no subdirectories in the data dir. This assumption was also noted in 
the comments.
{quote}
// we use only small number of blocks to avoid creating subdirs in the data 
dir..
{quote}
This is not true. On my local cluster and in MiniDFSCluster, there are two 
levels of directories (subdir0/subdir0/) regardless of the number of blocks.
# Meanwhile, to fail a volume, the test also needs to trigger the DataNode to 
remove the volume and send a block report to the NN. This is basically done in 
the {{triggerFailure()}} method.
{code}
  private void triggerFailure(String path, long size) throws IOException {
NamenodeProtocols nn = cluster.getNameNodeRpc();
List<LocatedBlock> locatedBlocks =
  nn.getBlockLocations(path, 0, size).getLocatedBlocks();

for (LocatedBlock lb : locatedBlocks) {
  DatanodeInfo dinfo = lb.getLocations()[1];
  ExtendedBlock b = lb.getBlock();
  try {
accessBlock(dinfo, lb);
  } catch (IOException e) {
System.out.println("Failure triggered, on block: " + b.getBlockId() +  
"; corresponding volume should be removed by now");
break;
  }
}
  }
{code}
Accessing those blocks will not trigger failures if the directory is read-only 
(while the block files are all there). I ran the tests multiple times without 
triggering this failure. Either new block files have to be written to the data 
directories, or the blocks should have been deleted correctly in the first 
place. I think we need to add some assertion code after triggering the volume 
failure. The assertions should check the datanode volume failure summary 
explicitly to make sure a volume failure is triggered (and noticed).
# To make sure the NameNode is aware of the volume failure, the code explicitly 
sends block reports to the NN.
{code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
new BlockReportContext(1, 0, System.nanoTime(), 0, false));
{code}
The block report generation code is complex; it is really the internal logic of 
{{BPServiceActor}}, and we may have to update this code whenever it changes. In 
fact, the volume failure is now sent by the DataNode via heartbeats. We should 
trigger a heartbeat request here, and make sure the NameNode handles the 
heartbeat before we verify the block states.
# When verifying via {{verify()}}, it counts the real block files and asserts 
that the real block files plus under-replicated blocks cover all blocks. 
Before counting under-replicated blocks, it triggers the {{BlockManager}} to 
compute the datanode work:
{code}
// force update of all the metric counts by calling computeDatanodeWork
BlockManagerTestUtil.getComputedDatanodeWork(fsn.getBlockManager());
{code}
However, counting physical block files and counting under-replicated blocks is 
not atomic. The NameNode will inform the DataNode of the computed work at the 
next heartbeat. So I think this part of the code may fail when some blocks are 
replicated and the count of physical block files becomes stale. To avoid this, 
I think we should keep the DataNode from sending heartbeats after that point. A 
simple solution is to set {{dfs.heartbeat.interval}} long enough (see the 
sketch below).
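
As a hedged sketch of the assertion and the configuration change suggested 
above (assuming JUnit, the {{conf}} used to build the MiniDFSCluster, and the 
standard test utilities; helper names may vary across branches):

{code:title=sketch}
// set a very long heartbeat interval (in seconds) up front, so the NameNode
// cannot hand out replication work between the two counting steps
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 3600L);

// after triggering the failure, assert the DataNode actually noticed it
DataNode dn = cluster.getDataNodes().get(1); // corresponds to dir data3
assertTrue("expected at least one failed volume",
    dn.getFSDataset().getNumFailedVolumes() > 0);
{code}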


This unit test has been around for years and seldom fails, simply because it 
has never triggered a real volume failure.


  was:
TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
blocks and files are replicated correctly.

# To fail a volume, it deletes all the blocks and sets the data dir read only.
{code:title=testVolumeFailure() snippet}
// fail the volume
// delete/make non-writable one of the directories (failed volume)
data_fail = new File(dataDir, "data3");
failedDir = 

[jira] [Issue Comment Deleted] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Comment: was deleted

(was: The block report sending code seems complex. This is the internal logic 
of {{BPServiceActor}}, and we may have to update this code whenever it changes. 
I think {{cluster.triggerBlockReport()}} is a good alternative.
{code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
// make sure a block report is sent 
DataNode dn = cluster.getDataNodes().get(1); //corresponds to dir data3
String bpid = cluster.getNamesystem().getBlockPoolId();
DatanodeRegistration dnR = dn.getDNRegistrationForBP(bpid);
Map<DatanodeStorage, BlockListAsLongs> perVolumeBlockLists =
dn.getFSDataset().getBlockReports(bpid);

// Send block report
StorageBlockReport[] reports =
new StorageBlockReport[perVolumeBlockLists.size()];

int reportIndex = 0;
for (Map.Entry<DatanodeStorage, BlockListAsLongs> kvPair :
    perVolumeBlockLists.entrySet()) {
DatanodeStorage dnStorage = kvPair.getKey();
BlockListAsLongs blockList = kvPair.getValue();
reports[reportIndex++] =
new StorageBlockReport(dnStorage, blockList);
}

cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
new BlockReportContext(1, 0, System.nanoTime(), 0, false));
{code})
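
For comparison, a minimal sketch of that simpler alternative (assuming the 
{{DataNodeTestUtils}} trigger helpers, whose exact names and signatures may 
differ across branches):

{code:title=sketch}
// let the DataNode report the failure itself instead of hand-building reports
DataNodeTestUtils.triggerHeartbeat(dn);   // failed-volume info rides on heartbeats
DataNodeTestUtils.triggerBlockReport(dn); // then a full block report to the NN
{code}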

> TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)
> -
>
> Key: HDFS-11030
> URL: https://issues.apache.org/jira/browse/HDFS-11030
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, test
>Affects Versions: 2.7.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
> blocks and files are replicated correctly.
> # To fail a volume, it deletes all the blocks and sets the data dir read-only.
> {code:title=testVolumeFailure() snippet}
> // fail the volume
> // delete/make non-writable one of the directories (failed volume)
> data_fail = new File(dataDir, "data3");
> failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
> cluster.getNamesystem().getBlockPoolId());
> if (failedDir.exists() &&
> //!FileUtil.fullyDelete(failedDir)
> !deteteBlocks(failedDir)
> ) {
>   throw new IOException("Could not delete hdfs directory '" + failedDir + 
> "'");
> }
> data_fail.setReadOnly();
> failedDir.setReadOnly();
> {code}
> However, there are two bugs here which prevent the blocks from being deleted.
> #- The {{failedDir}} directory for finalized blocks is not calculated 
> correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
> directory.
> #- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
> there are no subdirectories in the data dir. This assumption was also noted 
> in the comments.
> {quote}
> // we use only small number of blocks to avoid creating subdirs in the 
> data dir..
> {quote}
> This is not true. On my local cluster and in MiniDFSCluster, there are two 
> levels of directories (subdir0/subdir0/) regardless of the number of blocks.
> # Meanwhile, to fail a volume, the test also needs to trigger the DataNode 
> to remove the volume and send a block report to the NN. This is basically 
> done in the {{triggerFailure()}} method.
> {code}
>   private void triggerFailure(String path, long size) throws IOException {
> NamenodeProtocols nn = cluster.getNameNodeRpc();
> List<LocatedBlock> locatedBlocks =
>   nn.getBlockLocations(path, 0, size).getLocatedBlocks();
> 
> for (LocatedBlock lb : locatedBlocks) {
>   DatanodeInfo dinfo = lb.getLocations()[1];
>   ExtendedBlock b = lb.getBlock();
>   try {
> accessBlock(dinfo, lb);
>   } catch (IOException e) {
> System.out.println("Failure triggered, on block: " + b.getBlockId() + 
>  
> "; corresponding volume should be removed by now");
> break;
>   }
> }
>   }
> {code}
> Accessing those blocks will not trigger failures if the directory is 
> read-only (while the block files are all there). I ran the tests multiple 
> times without triggering this failure. Either new block files have to be 
> written to the data directories, or the blocks should have been deleted 
> correctly in the first place.
> # To make sure the NameNode is aware of the volume failure, the code 
> explicitly sends block reports to the NN.
> {code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
> cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
> new BlockReportContext(1, 0, System.nanoTime(), 0, false));
> {code}
> The block report generation code is complex; it is really the internal logic 
> of {{BPServiceActor}}, and we may have to update this code whenever it 
> changes. In fact, the volume 

[jira] [Updated] (HDFS-11030) TestDataNodeVolumeFailure#testVolumeFailure is flaky (though passing)

2016-10-20 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11030:
-
Description: 
TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
blocks and files are replicated correctly.

# To fail a volume, it deletes all the blocks and sets the data dir read-only.
{code:title=testVolumeFailure() snippet}
// fail the volume
// delete/make non-writable one of the directories (failed volume)
data_fail = new File(dataDir, "data3");
failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
cluster.getNamesystem().getBlockPoolId());
if (failedDir.exists() &&
//!FileUtil.fullyDelete(failedDir)
!deteteBlocks(failedDir)
) {
  throw new IOException("Could not delete hdfs directory '" + failedDir + 
"'");
}
data_fail.setReadOnly();
failedDir.setReadOnly();
{code}
However, there are two bugs here which prevent the blocks from being deleted.
#- The {{failedDir}} directory for finalized blocks is not calculated 
correctly. It should use {{data_fail}} instead of {{dataDir}} as the base 
directory.
#- When deleting block files in {{deteteBlocks(failedDir)}}, it assumes that 
there are no subdirectories in the data dir. This assumption was also noted in 
the comments.
{quote}
// we use only small number of blocks to avoid creating subdirs in the data 
dir..
{quote}
This is not true. On my local cluster and in MiniDFSCluster, there are two 
levels of directories (subdir0/subdir0/) regardless of the number of blocks.
# Meanwhile, to fail a volume, the test also needs to trigger the DataNode to 
remove the volume and send a block report to the NN. This is basically done in 
the {{triggerFailure()}} method.
{code}
  private void triggerFailure(String path, long size) throws IOException {
NamenodeProtocols nn = cluster.getNameNodeRpc();
List<LocatedBlock> locatedBlocks =
  nn.getBlockLocations(path, 0, size).getLocatedBlocks();

for (LocatedBlock lb : locatedBlocks) {
  DatanodeInfo dinfo = lb.getLocations()[1];
  ExtendedBlock b = lb.getBlock();
  try {
accessBlock(dinfo, lb);
  } catch (IOException e) {
System.out.println("Failure triggered, on block: " + b.getBlockId() +  
"; corresponding volume should be removed by now");
break;
  }
}
  }
{code}
Accessing those blocks will not trigger failures if the directory is read-only 
(while the block files are all there). I ran the tests multiple times without 
triggering this failure. Either new block files have to be written to the data 
directories, or the blocks should have been deleted correctly in the first 
place.
# To make sure the NameNode is aware of the volume failure, the code explicitly 
sends block reports to the NN.
{code:title=TestDataNodeVolumeFailure#testVolumeFailure()}
cluster.getNameNodeRpc().blockReport(dnR, bpid, reports,
new BlockReportContext(1, 0, System.nanoTime(), 0, false));
{code}
The block report generation code is complex; it is really the internal logic of 
{{BPServiceActor}}, and we may have to update this code whenever it changes. In 
fact, the volume failure is now sent by the DataNode via heartbeats. We should 
trigger a heartbeat request here, and make sure the NameNode handles the 
heartbeat before we verify the block states.
# When verifying via {{verify()}}, it counts the real block files and asserts 
that the real block files plus under-replicated blocks cover all blocks. 
Before counting under-replicated blocks, it triggers the {{BlockManager}} to 
compute the datanode work:
{code}
// force update of all the metric counts by calling computeDatanodeWork
BlockManagerTestUtil.getComputedDatanodeWork(fsn.getBlockManager());
{code}
However, counting physical block files and counting under-replicated blocks is 
not atomic. The NameNode will inform the DataNode of the computed work at the 
next heartbeat. So I think this part of the code may fail when some blocks are 
replicated and the count of physical block files becomes stale. To avoid this, 
I think we should keep the DataNode from sending heartbeats after that point. A 
simple solution is to set {{dfs.heartbeat.interval}} long enough.


This unit test has been around for years and seldom fails, simply because it 
has never triggered a real volume failure.


  was:
TestDataNodeVolumeFailure#testVolumeFailure fails a volume and verifies the 
blocks and files are replicated correctly.

To fail a volume, it deletes all the blocks and sets the data dir read only.
{code:title=testVolumeFailure() snippet}
// fail the volume
// delete/make non-writable one of the directories (failed volume)
data_fail = new File(dataDir, "data3");
failedDir = MiniDFSCluster.getFinalizedDir(dataDir, 
cluster.getNamesystem().getBlockPoolId());
if (failedDir.exists() &&
//!FileUtil.fullyDelete(failedDir)
!deteteBlocks(failedDir)
) {
  throw new IOException("Could 

[jira] [Updated] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10423:
-
   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Committed to branch-2. Thanks everyone for the contribution.

> Increase default value of httpfs maxHttpHeaderSize
> --
>
> Key: HDFS-10423
> URL: https://issues.apache.org/jira/browse/HDFS-10423
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.6.4, 3.0.0-alpha1
>Reporter: Nicolae Popa
>Assignee: Nicolae Popa
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-10423.01.patch, HDFS-10423.02.patch, 
> HDFS-10423.branch-2.patch, testing-after-HDFS-10423.txt, 
> testing-after-HDFS-10423_withCustomHeader4.txt, 
> testing-before-HDFS-10423.txt
>
>
> The Tomcat default value of maxHttpHeaderSize is 8k, which is too low for 
> certain Hadoop workloads in Kerberos-enabled environments. This JIRA changes 
> it to 65536 in server.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6440) Support more than 2 NameNodes

2016-10-20 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592925#comment-15592925
 ] 

Jesse Yates commented on HDFS-6440:
---

Upgrades/downgrades between major versions aren't supported AFAIK. Those seem 
like the 2 major places for upgrade issues.

> Support more than 2 NameNodes
> -
>
> Key: HDFS-6440
> URL: https://issues.apache.org/jira/browse/HDFS-6440
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: auto-failover, ha, namenode
>Affects Versions: 2.4.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 3.0.0-alpha1
>
> Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, hdfs-6440-trunk-v4.patch, 
> hdfs-6440-trunk-v5.patch, hdfs-6440-trunk-v6.patch, hdfs-6440-trunk-v7.patch, 
> hdfs-6440-trunk-v8.patch, hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.
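
As a hedged illustration of the configuration side of this (the key prefix is 
the standard HA one; the exact parsing helper used by the patch may differ):

{code:title=sketch}
// dfs.ha.namenodes.<nameservice> can list more than two NN ids,
// e.g. "nn1,nn2,nn3"; parsing just collects all of them
String key = "dfs.ha.namenodes." + nsId;
Collection<String> nnIds = conf.getTrimmedStringCollection(key);
{code}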



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6440) Support more than 2 NameNodes

2016-10-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592918#comment-15592918
 ] 

Arpit Agarwal commented on HDFS-6440:
-

Hi [~jesse_yates], was it a design goal to ensure compatibility for rolling 
upgrades/downgrades? Alternatively do you know of anything that can result in 
upgrade incompatibilities?

From a quick look at the patch I saw two potential sources of incompatibility 
but haven't analyzed closely enough to be sure - (1) changes to the image 
transfer protocol and (2) BlockTokenSecretManager index range partitioning.

> Support more than 2 NameNodes
> -
>
> Key: HDFS-6440
> URL: https://issues.apache.org/jira/browse/HDFS-6440
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: auto-failover, ha, namenode
>Affects Versions: 2.4.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 3.0.0-alpha1
>
> Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, hdfs-6440-trunk-v4.patch, 
> hdfs-6440-trunk-v5.patch, hdfs-6440-trunk-v6.patch, hdfs-6440-trunk-v7.patch, 
> hdfs-6440-trunk-v8.patch, hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10976) Report erasure coding policy of EC files in Fsck

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592894#comment-15592894
 ] 

Hudson commented on HDFS-10976:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10648 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10648/])
HDFS-10976. Report erasure coding policy of EC files in Fsck. (weichiu: rev 
5e83a21cb66c78e89ac5af9a130ab0aee596a9f4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java


> Report erasure coding policy of EC files in Fsck
> 
>
> Key: HDFS-10976
> URL: https://issues.apache.org/jira/browse/HDFS-10976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10976.001.patch, HDFS-10976.002.patch
>
>
> Currently fsck reports corrupt EC files as follows; it does not distinguish 
> erasure-coded files from replicated files. In addition, it would be nice to 
> print out the EC policy of the corrupt EC file.
> {quote}
> /striped 
> /striped/corrupted 393216 bytes, 1 block(s): 
> /striped/corrupted: CORRUPT blockpool BP-1564681138-127.0.0.1-1475793860787 
> block blk_-9223372036854775792
>  Under replicated 
> BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001. Target 
> Replicas is 9 but found 5 live replica(s), 0 decommissioned replica(s) and 0 
> decommissioning replica(s).
>  CORRUPT 1 blocks of total size 393216 B
> 0. BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001 
> len=393216 Live_repl=5  
> [DatanodeInfoWithStorage[127.0.0.1:62192,DS-81dcbc38-755e-446e-a028-71a79e4de6d9,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62180,DS-98fe193d-6342-4b2c-ad61-4586e2530b1e,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62167,DS-53031f88-0c63-4839-ab18-efea2f1bb063,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62162,DS-e8b418fd-165d-4d6f-886b-f75c21be096d,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62176,DS-03e51584-5b33-4bb6-89b5-f519cda57429,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62158,DS-ce9ca7b3-5b00-4351-8537-822eed532b46,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62184,DS-668c500e-eb3d-4d40-b900-814076d5e160,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62171,DS-763a3961-b214-4601-81c0-abdaecf539c4,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62188,DS-d4ea6399-bead-452e-8dd5-bc8c5ebd4f45,DISK](LIVE)]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11018) Incorrect check and message in FsDatasetImpl#invalidate

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11018:
---
Release Note: Improves the error message when the datanode is asked to remove 
a replica that is not found.

> Incorrect check and message in FsDatasetImpl#invalidate
> ---
>
> Key: HDFS-11018
> URL: https://issues.apache.org/jira/browse/HDFS-11018
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11018.001.patch, HDFS-11018.002.patch, 
> HDFS-11018.003.patch
>
>
> The following error check and message are incorrect, because {{info}} is 
> null both if (1) the block id does not exist in the ReplicaMap and if (2) 
> the generation stamp of the block does not match the replica entry in the 
> ReplicaMap.
> {code:title=FsDatasetImpl#invalidate}
> final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
> if (info == null) {
>   // It is okay if the block is not found -- it may be deleted earlier.
>   LOG.info("Failed to delete replica " + invalidBlks[i]
>       + ": ReplicaInfo not found.");
>   continue;
> }
> if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) {
>   errors.add("Failed to delete replica " + invalidBlks[i]
>       + ": GenerationStamp not matched, info=" + info);
>   continue;
> }
> {code}
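
One way to disambiguate the two cases (a hedged sketch of a possible fix, not 
necessarily the committed patch; it assumes {{ReplicaMap}} also exposes a 
lookup by block id):

{code:title=possible fix (sketch)}
final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
if (info == null) {
  ReplicaInfo infoByBlockId =
      volumeMap.get(bpid, invalidBlks[i].getBlockId());
  if (infoByBlockId == null) {
    // case (1): block id not in ReplicaMap at all -- it may be deleted earlier
    LOG.info("Failed to delete replica " + invalidBlks[i]
        + ": ReplicaInfo not found.");
  } else {
    // case (2): id exists but the generation stamp does not match
    errors.add("Failed to delete replica " + invalidBlks[i]
        + ": GenerationStamp not matched, existing replica is "
        + infoByBlockId);
  }
  continue;
}
{code}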



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10976) Report erasure coding policy of EC files in Fsck

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10976:
---
Release Note: Fsck now reports whether a file is replicated or erasure-coded. 
If it is replicated, fsck reports the replication factor of the file. If it is 
erasure-coded, fsck reports the erasure coding policy of the file.

> Report erasure coding policy of EC files in Fsck
> 
>
> Key: HDFS-10976
> URL: https://issues.apache.org/jira/browse/HDFS-10976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10976.001.patch, HDFS-10976.002.patch
>
>
> Currently fsck reports corrupt EC files as follows; it does not distinguish 
> erasure-coded files from replicated files. In addition, it would be nice to 
> print out the EC policy of the corrupt EC file.
> {quote}
> /striped 
> /striped/corrupted 393216 bytes, 1 block(s): 
> /striped/corrupted: CORRUPT blockpool BP-1564681138-127.0.0.1-1475793860787 
> block blk_-9223372036854775792
>  Under replicated 
> BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001. Target 
> Replicas is 9 but found 5 live replica(s), 0 decommissioned replica(s) and 0 
> decommissioning replica(s).
>  CORRUPT 1 blocks of total size 393216 B
> 0. BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001 
> len=393216 Live_repl=5  
> [DatanodeInfoWithStorage[127.0.0.1:62192,DS-81dcbc38-755e-446e-a028-71a79e4de6d9,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62180,DS-98fe193d-6342-4b2c-ad61-4586e2530b1e,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62167,DS-53031f88-0c63-4839-ab18-efea2f1bb063,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62162,DS-e8b418fd-165d-4d6f-886b-f75c21be096d,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62176,DS-03e51584-5b33-4bb6-89b5-f519cda57429,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62158,DS-ce9ca7b3-5b00-4351-8537-822eed532b46,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62184,DS-668c500e-eb3d-4d40-b900-814076d5e160,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62171,DS-763a3961-b214-4601-81c0-abdaecf539c4,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62188,DS-d4ea6399-bead-452e-8dd5-bc8c5ebd4f45,DISK](LIVE)]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10976) Report erasure coding policy of EC files in Fsck

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10976:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks to [~andrew.wang] and [~tasanuma0829] for reviewing the patch. Committed 
the v002 patch to trunk.

> Report erasure coding policy of EC files in Fsck
> 
>
> Key: HDFS-10976
> URL: https://issues.apache.org/jira/browse/HDFS-10976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-10976.001.patch, HDFS-10976.002.patch
>
>
> Currently fsck reports corrupt EC files as follows; it does not distinguish 
> erasure-coded files from replicated files. In addition, it would be nice to 
> print out the EC policy of the corrupt EC file.
> {quote}
> /striped 
> /striped/corrupted 393216 bytes, 1 block(s): 
> /striped/corrupted: CORRUPT blockpool BP-1564681138-127.0.0.1-1475793860787 
> block blk_-9223372036854775792
>  Under replicated 
> BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001. Target 
> Replicas is 9 but found 5 live replica(s), 0 decommissioned replica(s) and 0 
> decommissioning replica(s).
>  CORRUPT 1 blocks of total size 393216 B
> 0. BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001 
> len=393216 Live_repl=5  
> [DatanodeInfoWithStorage[127.0.0.1:62192,DS-81dcbc38-755e-446e-a028-71a79e4de6d9,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62180,DS-98fe193d-6342-4b2c-ad61-4586e2530b1e,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62167,DS-53031f88-0c63-4839-ab18-efea2f1bb063,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62162,DS-e8b418fd-165d-4d6f-886b-f75c21be096d,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62176,DS-03e51584-5b33-4bb6-89b5-f519cda57429,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62158,DS-ce9ca7b3-5b00-4351-8537-822eed532b46,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62184,DS-668c500e-eb3d-4d40-b900-814076d5e160,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62171,DS-763a3961-b214-4601-81c0-abdaecf539c4,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62188,DS-d4ea6399-bead-452e-8dd5-bc8c5ebd4f45,DISK](LIVE)]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10976) Report erasure coding policy of EC files in Fsck

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10976:
---
Fix Version/s: 3.0.0-alpha2

> Report erasure coding policy of EC files in Fsck
> 
>
> Key: HDFS-10976
> URL: https://issues.apache.org/jira/browse/HDFS-10976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10976.001.patch, HDFS-10976.002.patch
>
>
> Currently fsck reports corrupt EC files as follows; it does not distinguish 
> erasure-coded files from replicated files. In addition, it would be nice to 
> print out the EC policy of the corrupt EC file.
> {quote}
> /striped 
> /striped/corrupted 393216 bytes, 1 block(s): 
> /striped/corrupted: CORRUPT blockpool BP-1564681138-127.0.0.1-1475793860787 
> block blk_-9223372036854775792
>  Under replicated 
> BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001. Target 
> Replicas is 9 but found 5 live replica(s), 0 decommissioned replica(s) and 0 
> decommissioning replica(s).
>  CORRUPT 1 blocks of total size 393216 B
> 0. BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001 
> len=393216 Live_repl=5  
> [DatanodeInfoWithStorage[127.0.0.1:62192,DS-81dcbc38-755e-446e-a028-71a79e4de6d9,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62180,DS-98fe193d-6342-4b2c-ad61-4586e2530b1e,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62167,DS-53031f88-0c63-4839-ab18-efea2f1bb063,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62162,DS-e8b418fd-165d-4d6f-886b-f75c21be096d,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62176,DS-03e51584-5b33-4bb6-89b5-f519cda57429,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62158,DS-ce9ca7b3-5b00-4351-8537-822eed532b46,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62184,DS-668c500e-eb3d-4d40-b900-814076d5e160,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62171,DS-763a3961-b214-4601-81c0-abdaecf539c4,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62188,DS-d4ea6399-bead-452e-8dd5-bc8c5ebd4f45,DISK](LIVE)]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10976) Report erasure coding policy of EC files in Fsck

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10976:
---
Summary: Report erasure coding policy of EC files in Fsck  (was: Fsck 
should mark EC files explicitly)

> Report erasure coding policy of EC files in Fsck
> 
>
> Key: HDFS-10976
> URL: https://issues.apache.org/jira/browse/HDFS-10976
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-10976.001.patch, HDFS-10976.002.patch
>
>
> Currently fsck reports corrupt EC files as follows; it does not distinguish 
> erasure-coded files from replicated files. In addition, it would be nice to 
> print out the EC policy of the corrupt EC file.
> {quote}
> /striped 
> /striped/corrupted 393216 bytes, 1 block(s): 
> /striped/corrupted: CORRUPT blockpool BP-1564681138-127.0.0.1-1475793860787 
> block blk_-9223372036854775792
>  Under replicated 
> BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001. Target 
> Replicas is 9 but found 5 live replica(s), 0 decommissioned replica(s) and 0 
> decommissioning replica(s).
>  CORRUPT 1 blocks of total size 393216 B
> 0. BP-1564681138-127.0.0.1-1475793860787:blk_-9223372036854775792_1001 
> len=393216 Live_repl=5  
> [DatanodeInfoWithStorage[127.0.0.1:62192,DS-81dcbc38-755e-446e-a028-71a79e4de6d9,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62180,DS-98fe193d-6342-4b2c-ad61-4586e2530b1e,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62167,DS-53031f88-0c63-4839-ab18-efea2f1bb063,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62162,DS-e8b418fd-165d-4d6f-886b-f75c21be096d,DISK](CORRUPT),
>  
> DatanodeInfoWithStorage[127.0.0.1:62176,DS-03e51584-5b33-4bb6-89b5-f519cda57429,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62158,DS-ce9ca7b3-5b00-4351-8537-822eed532b46,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62184,DS-668c500e-eb3d-4d40-b900-814076d5e160,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62171,DS-763a3961-b214-4601-81c0-abdaecf539c4,DISK](LIVE),
>  
> DatanodeInfoWithStorage[127.0.0.1:62188,DS-d4ea6399-bead-452e-8dd5-bc8c5ebd4f45,DISK](LIVE)]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11033) Add documents for native raw erasure coder in XOR codes

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592827#comment-15592827
 ] 

Wei-Chiu Chuang commented on HDFS-11033:


Thanks for the patch.
The latest patch looks good to me. Would anyone else like to comment?


> Add documents for native raw erasure coder in XOR codes
> ---
>
> Key: HDFS-11033
> URL: https://issues.apache.org/jira/browse/HDFS-11033
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation, erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-11033-v1.patch, HDFS-11033-v2.patch
>
>
> Add documentation for the native raw erasure coder in XOR codes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592812#comment-15592812
 ] 

Hadoop QA commented on HDFS-10423:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
7s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
41s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 7s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
40s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed with JDK 
v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-10423 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834464/HDFS-10423.branch-2.patch
 |
| Optional Tests |  asflicense  mvnsite  unit  shellcheck  shelldocs  compile  
javac  javadoc  mvninstall  xml  |
| uname | Linux c47ccad9ccdd 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / ab36519 |
| Default Java | 1.7.0_111 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111 |
| shellcheck | v0.4.4 |
| 

[jira] [Updated] (HDFS-11018) Incorrect check and message in FsDatasetImpl#invalidate

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11018:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha2
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-2.8 and branch-2. Thanks [~linyiqun] for the 
patch!

> Incorrect check and message in FsDatasetImpl#invalidate
> ---
>
> Key: HDFS-11018
> URL: https://issues.apache.org/jira/browse/HDFS-11018
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11018.001.patch, HDFS-11018.002.patch, 
> HDFS-11018.003.patch
>
>
> The following error check and message are incorrect, because {{info}} is 
> null both if (1) the block id does not exist in the ReplicaMap and if (2) 
> the generation stamp of the block does not match the replica entry in the 
> ReplicaMap.
> {code:title=FsDatasetImpl#invalidate}
> final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
> if (info == null) {
>   // It is okay if the block is not found -- it may be deleted earlier.
>   LOG.info("Failed to delete replica " + invalidBlks[i]
>       + ": ReplicaInfo not found.");
>   continue;
> }
> if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) {
>   errors.add("Failed to delete replica " + invalidBlks[i]
>       + ": GenerationStamp not matched, info=" + info);
>   continue;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7343) HDFS smart storage management

2016-10-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592678#comment-15592678
 ] 

Andrew Wang commented on HDFS-7343:
---

Thanks for the replies Wei Zhou, inline:

{quote}
 To solve this, we make IO statistics from both HDFS level and system level 
(system wide, like data given by system tool ‘iostat’). IO caused by SCR can be 
measured from system level data.
{quote}

Just iostat-level data won't tell you what file or file range is being read, 
though. Do you also plan to capture strace or other information? What's the 
performance impact?

{quote}
For SSM it have to consider the factors like file length, access history and 
memory availability before making a decision to cache a file. It tries to 
minimize the possibility to cache a file that not needed to.
{quote}

{quote}
For performance consideration, SSM makes the first action have higher priority 
than the second one. It depends and not always the case.
{quote}

Unfortunately this doesn't really clarify things for me. The rules engine is 
the most important part of this work, and if it's a black box, it's much 
harder for admins to use it effectively. This is especially true in debugging 
scenarios when the rules engine isn't doing what the admin wants.

Have we done any prototyping and simulation of the rules engine? There are 
workload generators like SWIM which could be useful here.

{quote}
For example, Jingcheng Du and I did a study on HSM last year, we found that the 
throughput of cluster with 4 SSDs + 4 HDDs on each DN is 1.36x larger than 
cluster with 8 HDDs on each DN, it’s almost as good as cluster with 8 SSDs on 
each DN.
{quote}

Thanks for the reference. Some questions about this study though:

* What is the comparative cost-per-byte of SSD vs. HDD? I'm pretty sure it's 
greater than 1.36x, meaning we might be better off buying more HDD to get more 
throughput. Alternatively, buying more RAM depending on the dataset size.
* This is an example of application-specific tuning for HSM, which is a best 
case. If the SSM doesn't correctly recognize the workload pattern, we won't 
achieve the full 1.36x improvement.
* I'm also unable to find the "com.yahoo.ycsb.workloads.CareWorkload" 
mentioned, do you have a reference?

{quote}
For example, the amount of memory available has to be checked before caching a 
file, if not enough memory available then the action will be canceled.
{quote}

The NameNode already does this checking, so it seems better for the enforcement 
of quotas to be done in one place for consistency.

A related question, how does this interact with user-generated actions? Maybe a 
user changes some files from ALL_SSD to DISK since they want to free up SSD 
quota for an important job they're going to run later. The SSM then sees there 
is available SSD and uses it. Then the user is out of SSD quota and their 
important job runs slowly.

{quote}
SSM pays more attention on the efficiency of the whole cluster than a 
particular workload, it may not improve the end-to-end execution time of one 
workload but it may improve another workload in the cluster. 
{quote}

Sorry to be this direct, but is improving average cluster throughput useful to 
end users? Broadly speaking, for ad-hoc user-submitted jobs, you care about 
end-to-end latency. For large batch jobs, they aren't that performance 
sensitive, and their working sets are unlikely to fit in memory/SSD anyway. In 
this case, we care very much about improving a particular workload.

I'll end with some overall comments:

If the goal is improving performance, would our time be better spent on the I/O 
paths, HSM and caching? I mentioned sub-block caching and client-side metrics 
as potential improvements for in-memory caching. Integrating it with the 
storage policies API and YARN resource management would also be useful. I'm 
sure there's work to be done in the I/O path too, particularly the write path 
which hasn't seen as much love as reads. This means we'd get more upside from 
fast storage like SSD.

I'm also not convinced that our average I/O utilization is that high to begin 
with. Typical YARN CPU utilization is <50%, and that's with many jobs being CPU 
bound. On the storage side, most clusters are capacity-bound. Optimistic 
scheduling and better resource isolation might lead to big improvements here.

I'm also concerned about scope creep, particularly since the replies to my 
comments indicate a system even bigger than the one described in the design 
document. It involves:

* A policy engine that can understand a wide variety of OS, HDFS, and 
application-level hints and performance metrics, as well as additional 
constraints from user-provided rules, system quotas, and data movement costs.
* Adding a metrics collection system for OS-level metrics which needs to be 
operated, managed, and deployed.
* The SSM itself, a stateful service, which again needs 

[jira] [Commented] (HDFS-11018) Incorrect check and message in FsDatasetImpl#invalidate

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592572#comment-15592572
 ] 

Hudson commented on HDFS-11018:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10646 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10646/])
HDFS-11018. Incorrect check and message in FsDatasetImpl#invalidate. (weichiu: 
rev 6d2da38d16cebe9b82f1048f87127eecee33664c)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java


> Incorrect check and message in FsDatasetImpl#invalidate
> ---
>
> Key: HDFS-11018
> URL: https://issues.apache.org/jira/browse/HDFS-11018
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Attachments: HDFS-11018.001.patch, HDFS-11018.002.patch, 
> HDFS-11018.003.patch
>
>
> The following error check and message is incorrect, because {{info}} is null 
> if (1) the block id does not exist in ReplicaMap or (2) the generation stamp 
> of block does not match the replica entry in ReplicaMap.
> {code:title=FsDatasetImpl#invalidate}
>final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
> if (info == null) {
>   // It is okay if the block is not found -- it may be deleted 
> earlier.
>   LOG.info("Failed to delete replica " + invalidBlks[i]
>   + ": ReplicaInfo not found.");
>   continue;
> }
> if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) 
> {
>   errors.add("Failed to delete replica " + invalidBlks[i]
>   + ": GenerationStamp not matched, info=" + info);
>   continue;
> }
> {code}
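
To make the two failure modes concrete, here is a small, self-contained model 
(simplified names; a sketch of the idea, not the committed patch): looking the 
replica up by block id alone keeps a missing replica distinguishable from a 
generation-stamp mismatch.

{code:title=InvalidateLookupModel.java (sketch)|borderStyle=solid}
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup in FsDatasetImpl#invalidate. A combined lookup that
// also matches the generation stamp returns null for BOTH a missing block and
// a stamp mismatch, so the "GenerationStamp not matched" branch above can
// never fire. Looking up by block id alone keeps the two cases apart.
// (Illustrative sketch only; names are simplified, not the committed patch.)
public class InvalidateLookupModel {
  private static final Map<Long, Long> genStampByBlockId = new HashMap<>();

  static String checkDeletable(long blockId, long genStamp) {
    Long stored = genStampByBlockId.get(blockId);  // lookup by id only
    if (stored == null) {
      return "ReplicaInfo not found";              // truly absent
    }
    if (stored != genStamp) {
      return "GenerationStamp not matched, stored=" + stored;
    }
    return null;                                   // safe to delete
  }

  public static void main(String[] args) {
    genStampByBlockId.put(1073803461L, 74553L);
    System.out.println(checkDeletable(1073803461L, 74513L)); // stamp mismatch
    System.out.println(checkDeletable(42L, 1L));             // not found
  }
}
{code}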



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11018) Incorrect check and message in FsDatasetImpl#invalidate

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592523#comment-15592523
 ] 

Wei-Chiu Chuang commented on HDFS-11018:


Committed this to trunk. I'll commit this to branch-2 and branch-2.8 later.

> Incorrect check and message in FsDatasetImpl#invalidate
> ---
>
> Key: HDFS-11018
> URL: https://issues.apache.org/jira/browse/HDFS-11018
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Attachments: HDFS-11018.001.patch, HDFS-11018.002.patch, 
> HDFS-11018.003.patch
>
>
> The following error check and message is incorrect, because {{info}} is null 
> if (1) the block id does not exist in ReplicaMap or (2) the generation stamp 
> of block does not match the replica entry in ReplicaMap.
> {code:title=FsDatasetImpl#invalidate}
>final ReplicaInfo info = volumeMap.get(bpid, invalidBlks[i]);
> if (info == null) {
>   // It is okay if the block is not found -- it may be deleted 
> earlier.
>   LOG.info("Failed to delete replica " + invalidBlks[i]
>   + ": ReplicaInfo not found.");
>   continue;
> }
> if (info.getGenerationStamp() != invalidBlks[i].getGenerationStamp()) 
> {
>   errors.add("Failed to delete replica " + invalidBlks[i]
>   + ": GenerationStamp not matched, info=" + info);
>   continue;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592508#comment-15592508
 ] 

Hadoop QA commented on HDFS-10423:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m 
10s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  3m 
26s{color} | {color:red} root in branch-2 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 9s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
41s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed with JDK 
v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 18m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-10423 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834464/HDFS-10423.branch-2.patch
 |
| Optional Tests |  asflicense  mvnsite  unit  shellcheck  shelldocs  compile  
javac  javadoc  mvninstall  xml  |
| uname | Linux af4353419799 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / bf6379f |
| Default Java | 1.7.0_111 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111 |
| mvninstall | 

[jira] [Resolved] (HDFS-11019) Inconsistent number of corrupt replicas if a corrupt replica is reported multiple times

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-11019.

Resolution: Duplicate

I am pretty sure this is a dup of HDFS-9958. Thanks [~kshukla] for confirming 
this!

> Inconsistent number of corrupt replicas if a corrupt replica is reported 
> multiple times
> ---
>
> Key: HDFS-11019
> URL: https://issues.apache.org/jira/browse/HDFS-11019
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
> Environment: CDH5.7.2 
>Reporter: Wei-Chiu Chuang
> Attachments: HDFS-11019.test.patch
>
>
> While investigating a block corruption issue, I found the following warning 
> message in the namenode log:
> {noformat}
> (a client reports a block replica is corrupt)
> 2016-10-12 10:07:37,166 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1073803461 added as corrupt on 
> 10.0.0.63:50010 by /10.0.0.62  because client machine reported it
> 2016-10-12 10:07:37,166 INFO BlockStateChange: BLOCK* invalidateBlock: 
> blk_1073803461_74513(stored=blk_1073803461_74553) on 10.0.0.63:50010
> 2016-10-12 10:07:37,166 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
> blk_1073803461_74513 to 10.0.0.63:50010
> (another client reports a block replica is corrupt)
> 2016-10-12 10:07:37,728 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1073803461 added as corrupt on 
> 10.0.0.63:50010 by /10.0.0.64  because client machine reported it
> 2016-10-12 10:07:37,728 INFO BlockStateChange: BLOCK* invalidateBlock: 
> blk_1073803461_74513(stored=blk_1073803461_74553) on 10.0.0.63:50010
> (ReplicationMonitor thread kicks in to invalidate the replica and add a new 
> one)
> 2016-10-12 10:07:37,888 INFO BlockStateChange: BLOCK* ask 10.0.0.56:50010 to 
> replicate blk_1073803461_74553 to datanode(s) 10.0.0.63:50010
> 2016-10-12 10:07:37,888 INFO BlockStateChange: BLOCK* BlockManager: ask 
> 10.0.0.63:50010 to delete [blk_1073803461_74513]
> (the two maps are inconsistent)
> 2016-10-12 10:08:00,335 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Inconsistent 
> number of corrupt replicas for blk_1073803461_74553 blockMap has 0 but 
> corrupt replicas map has 1
> {noformat}
> It seems that when a corrupt block replica is reported twice, the blocksMap 
> and the corrupt replicas map become inconsistent.
> Looking at the log, I suspect the bug is in 
> {{BlockManager#removeStoredBlock}}. When a corrupt replica is reported, 
> BlockManager removes the block from blocksMap. If the block is already 
> removed (that is, the corrupt replica is reported twice), it returns early; 
> otherwise (that is, the corrupt replica is reported the first time), it 
> removes the block from corruptReplicasMap (the block is added to 
> corruptReplicasMap in BlockManager#markBlockAsCorrupt). Therefore, after the 
> second corruption report, the corrupt replica is removed from blocksMap, but 
> the entry in corruptReplicasMap is not removed.
> I can’t tell what the impact of the inconsistency is, but I feel it's a good 
> idea to fix it.
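
The suspected sequence can be shown in a small, self-contained simulation (two 
plain collections stand in for blocksMap and corruptReplicasMap; the flow is 
simplified from the description above, not the actual BlockManager code):

{code:title=DoubleCorruptReportModel.java (sketch)|borderStyle=solid}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the double-report sequence: the early return on the second
// report skips the corruptReplicasMap cleanup, leaving the maps inconsistent.
public class DoubleCorruptReportModel {
  private static final Map<Long, String> blocksMap = new HashMap<>();
  private static final Set<Long> corruptReplicasMap = new HashSet<>();

  static void markBlockAsCorrupt(long blockId) {
    corruptReplicasMap.add(blockId);
    removeStoredBlock(blockId);
  }

  static void removeStoredBlock(long blockId) {
    if (blocksMap.remove(blockId) == null) {
      return;  // already removed: the corrupt entry is NOT cleaned up
    }
    corruptReplicasMap.remove(blockId);  // first report keeps the maps in sync
  }

  public static void main(String[] args) {
    blocksMap.put(1073803461L, "10.0.0.63:50010");
    markBlockAsCorrupt(1073803461L);  // first client report
    markBlockAsCorrupt(1073803461L);  // duplicate report from a second client
    // Mirrors the WARN: "blockMap has 0 but corrupt replicas map has 1".
    System.out.println("blocksMap=" + blocksMap.size()
        + ", corruptReplicasMap=" + corruptReplicasMap.size());
  }
}
{code}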



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10423:
-
Status: Patch Available  (was: Reopened)

> Increase default value of httpfs maxHttpHeaderSize
> --
>
> Key: HDFS-10423
> URL: https://issues.apache.org/jira/browse/HDFS-10423
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1, 2.6.4
>Reporter: Nicolae Popa
>Assignee: Nicolae Popa
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10423.01.patch, HDFS-10423.02.patch, 
> HDFS-10423.branch-2.patch, testing-after-HDFS-10423.txt, 
> testing-after-HDFS-10423_withCustomHeader4.txt, 
> testing-before-HDFS-10423.txt
>
>
> The Tomcat default value of maxHttpHeaderSize is 8k, which is too low for 
> certain Hadoop workloads in Kerberos-enabled environments. This JIRA changes 
> it to 65536 in server.xml.
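
For reference, the setting lives on the Tomcat HTTP Connector; a hedged sketch 
of the relevant server.xml fragment (every attribute except maxHttpHeaderSize 
is an illustrative placeholder, not the shipped defaults):

{code:title=server.xml (sketch)|borderStyle=solid}
<!-- Illustrative Connector entry; only maxHttpHeaderSize is the point here. -->
<Connector port="${httpfs.http.port}" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxHttpHeaderSize="65536"/>
{code}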



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10423:
-
Attachment: HDFS-10423.branch-2.patch

Sorry, reopening to run a pre-commit. Attaching a branch-2 patch.

> Increase default value of httpfs maxHttpHeaderSize
> --
>
> Key: HDFS-10423
> URL: https://issues.apache.org/jira/browse/HDFS-10423
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.6.4, 3.0.0-alpha1
>Reporter: Nicolae Popa
>Assignee: Nicolae Popa
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10423.01.patch, HDFS-10423.02.patch, 
> HDFS-10423.branch-2.patch, testing-after-HDFS-10423.txt, 
> testing-after-HDFS-10423_withCustomHeader4.txt, 
> testing-before-HDFS-10423.txt
>
>
> The Tomcat default value of maxHttpHeaderSize is 8k, which is too low for 
> certain Hadoop workloads in Kerberos-enabled environments. This JIRA changes 
> it to 65536 in server.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-10423) Increase default value of httpfs maxHttpHeaderSize

2016-10-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen reopened HDFS-10423:
--

> Increase default value of httpfs maxHttpHeaderSize
> --
>
> Key: HDFS-10423
> URL: https://issues.apache.org/jira/browse/HDFS-10423
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.6.4, 3.0.0-alpha1
>Reporter: Nicolae Popa
>Assignee: Nicolae Popa
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10423.01.patch, HDFS-10423.02.patch, 
> HDFS-10423.branch-2.patch, testing-after-HDFS-10423.txt, 
> testing-after-HDFS-10423_withCustomHeader4.txt, 
> testing-before-HDFS-10423.txt
>
>
> The Tomcat default value of maxHttpHeaderSize is 8k, which is too low for 
> certain Hadoop workloads in Kerberos-enabled environments. This JIRA changes 
> it to 65536 in server.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10885) [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on

2016-10-20 Thread Wei Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592413#comment-15592413
 ] 

Wei Zhou commented on HDFS-10885:
-

Hi [~umamaheswararao], thank you very much for reviewing this patch.
{quote}
I think depending on config option really does not work here. Because Mover can 
run from any process where config items can be different from Namenode. So, 
mover may have this item disabled in its configs, but at NN it might be true 
and running.
{quote}
Yes, that's indeed a big issue that needs to be fixed.
{quote}
How about we use mover id file for communicating this. 
{quote}
I do think it's a very good way, but what should we do when the user disables 
the xattr feature through {{dfs.namenode.xattrs.enabled}}? Is there any 
potential issue if we require users to enable that feature? 




> [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier 
> is on
> --
>
> Key: HDFS-10885
> URL: https://issues.apache.org/jira/browse/HDFS-10885
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: HDFS-10285
>
> Attachments: HDFS-10800-HDFS-10885-00.patch, 
> HDFS-10800-HDFS-10885-01.patch, HDFS-10800-HDFS-10885-02.patch, 
> HDFS-10885-HDFS-10285.03.patch, HDFS-10885-HDFS-10285.04.patch
>
>
> These two must not run at the same time, to avoid conflicts where they 
> fight each other.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7343) HDFS smart storage management

2016-10-20 Thread Wei Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592342#comment-15592342
 ] 

Wei Zhou commented on HDFS-7343:


Continuing with the comments from [~andrew.wang]:
{quote}
Could you talk a little bit more about the rules solver? What happens when a 
rule cannot be satisfied?
{quote}
A rule is a declaration which defines actions to be applied to some objects 
under a certain condition. It’s a guideline for SSM to function. The rule 
solver parses a rule and takes the specified action if the rule’s predefined 
condition is fulfilled. But this does not mean that the action will be 
executed physically; that depends on many factors. For example, the amount of 
memory available has to be checked before caching a file, and if not enough 
memory is available then the action will be canceled. 
From the above, a rule is essentially a hint for SSM. When a rule cannot be 
satisfied, SSM can log the reason to a log file/console/dashboard, and the 
admin can check that information for further processing.
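
To make that concrete, a purely hypothetical sketch (the rule shape, names, 
and thresholds are all invented; this is not SSM's actual rule syntax or 
engine):

{code:title=CacheRuleSketch.java (hypothetical)|borderStyle=solid}
import java.util.function.Predicate;

// Hypothetical sketch: a rule is a declarative condition plus an action, but
// execution is still gated on cluster state, and a cancelled action is logged
// for the admin to inspect.
public class CacheRuleSketch {
  static final class FileStats {
    final String path; final long lengthBytes; final int accessesLastHour;
    FileStats(String path, long lengthBytes, int accessesLastHour) {
      this.path = path; this.lengthBytes = lengthBytes;
      this.accessesLastHour = accessesLastHour;
    }
  }

  private final Predicate<FileStats> condition;

  CacheRuleSketch(Predicate<FileStats> condition) { this.condition = condition; }

  /** Returns true only if the rule matched AND the action actually ran. */
  boolean apply(FileStats f, long freeCacheBytes) {
    if (!condition.test(f)) {
      return false;                        // condition not fulfilled
    }
    if (f.lengthBytes > freeCacheBytes) {  // physical check before executing
      System.out.println("Rule matched but cancelled (no memory): " + f.path);
      return false;
    }
    System.out.println("Caching " + f.path);
    return true;
  }

  public static void main(String[] args) {
    CacheRuleSketch hotFiles = new CacheRuleSketch(f -> f.accessesLastHour > 100);
    hotFiles.apply(new FileStats("/hot/data", 64L << 20, 500), 128L << 20);
  }
}
{code}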
{quote}
improve average throughput, but not improve end-to-end execution time (the SLO 
metric).
{quote}
SSM pays more attention to the efficiency of the whole cluster than to a 
particular workload; it may not improve the end-to-end execution time of one 
workload, but it may improve another workload in the cluster. Another case is 
that it won’t help a CPU-intensive workload even though we optimize the IO. To 
make SSM work better, we could expose an interface for workloads to provide 
hints to SSM. 
{quote}
Also on the rules solver, how do we quantify the cost of executing an action? 
It's important to avoid unnecessarily migrating data back and forth.
{quote}
It’s very hard to quantify the cost generally in a dynamic environment. Moving 
hot data to faster storage may impact the performance now but may boost it 
later. What we do now is try to minimize the cost based on access history, the 
current status of the cluster, rules, and other mechanisms like hints from the 
user. Strict conditions have to be fulfilled (rules, cluster state, history, 
hints, etc.) before actually executing an action. Generally, the greater the 
cost, the stricter the conditions. For example, actions like archiving a file 
or balancing the cluster may depend more heavily on the rules or the user’s 
hints compared with actions like caching a file. Yes, it's very important to 
avoid unnecessarily migrating data back and forth, and SSM tries to minimize 
that from the very beginning.
{quote}
Could you talk some more about the value of Kafka in this architecture, 
compared to a naive implementation that just polls the NN and DN for 
information? 
HDFS's inotify mechanism might also be interesting here.
{quote}
Please also see the reply to question #3 from [~anu]. For SSM:
1. It’s a message collector for SSM. It provides a highly efficient and 
reliable way for nodes to send messages out. If all the nodes sent messages to 
SSM directly, it would be very hard for SSM to handle issues such as message 
buffering, persistence to avoid losing messages, unstable service time due to 
too many nodes, and so on. Kafka decouples SSM from the cluster and lets it 
focus on the message-processing logic; see the sketch after this list.
2. It’s a message recorder for SSM. If SSM is stopped by the user or crashes 
while the HDFS cluster is still working, messages from the nodes can be stored 
in Kafka. These messages are good material for SSM to warm up quickly; without 
Kafka this precious data would be lost. It makes SSM more robust.
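
As a concrete (hypothetical) sketch of point 1, a node-side agent could 
publish events through the plain Kafka producer API; the class, topic name, 
and event format below are invented for illustration:

{code:title=SsmEventReporter.java (hypothetical)|borderStyle=solid}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical node-side reporter: fire-and-forget publishing, so a slow or
// stopped SSM never blocks the node; Kafka buffers the events in the meantime.
public class SsmEventReporter implements AutoCloseable {
  private final KafkaProducer<String, String> producer;

  public SsmEventReporter(String bootstrapServers) {
    Properties props = new Properties();
    props.put("bootstrap.servers", bootstrapServers);
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    producer = new KafkaProducer<>(props);
  }

  public void report(String nodeId, String event) {
    // Keyed by node id, so per-node ordering is preserved within a partition.
    producer.send(new ProducerRecord<>("ssm-node-events", nodeId, event));
  }

  @Override
  public void close() {
    producer.close();
  }
}
{code}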
{quote}
Also wondering if with Kafka we still need a periodic snapshot of state, since 
Kafka is just a log.
{quote}
SSM snapshots the data digested from those raw logs and other managed info, 
but the raw logs themselves are not stored. The snapshotted data is essential 
for SSM to function well. 
{quote}
The doc talks a lot about improving performance, but I think the more important 
usecase is actually saving cost by migrating data to archival or EC storage. 
This is because of the above difficulties surrounding actually understanding 
application-level performance with just FS-level information.
{quote}
Agreed that it’s an important use case, and that it’s impossible for SSM 
itself to improve performance in all cases, as you mentioned. But the trend is 
that DNs will have larger memory and faster storage, and how to use this 
hardware to improve performance is also an important issue to solve. For 
example, [~jingcheng...@intel.com] and I did a [study on HSM | 
http://blog.cloudera.com/blog/2016/06/new-study-evaluating-apache-hbase-performance-on-modern-storage-media/]
 last year; we found that the throughput of a cluster with 4 SSDs + 4 HDDs on 
each DN is 1.36x that of a cluster with 8 HDDs on each DN, which is almost as 
good as a cluster with 8 SSDs on each DN. It’s the same case for latency. So 
SSM should improve performance by using fast storage efficiently.  I think 

[jira] [Commented] (HDFS-11027) libhdfs++: Don't retry if there is an authentication failure

2016-10-20 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592340#comment-15592340
 ] 

James Clampffer commented on HDFS-11027:


[~xiaowei.zhu] I agree, no point in retrying if it's not something recoverable. 
 Did you want to extend your current patch to include that?

> libhdfs++: Don't retry if there is an authentication failure
> 
>
> Key: HDFS-11027
> URL: https://issues.apache.org/jira/browse/HDFS-11027
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-11027.HDFS-8707.000.patch
>
>
> "Authentication failed" status falls into the general !status.ok() block in 
> the HA retry policy so it will keep attempting to failover.  If the client 
> isn't kerberized, or doesn't have the right ticket it should give up and 
> return a meaningful error message (right now it returns a generic bad 
> connection failure string).
> Wouldn't hurt to check the FixedDelayRetryPolicy to make sure that doesn't 
> also keep attempting to retry in the same way.  I suspect it does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11027) libhdfs++: Don't retry if there is an authentication failure

2016-10-20 Thread Xiaowei Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592273#comment-15592273
 ] 

Xiaowei Zhu commented on HDFS-11027:


We should consider whether there are more cases we do not want to retry, for 
example permission denied.

> libhdfs++: Don't retry if there is an authentication failure
> 
>
> Key: HDFS-11027
> URL: https://issues.apache.org/jira/browse/HDFS-11027
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-11027.HDFS-8707.000.patch
>
>
> "Authentication failed" status falls into the general !status.ok() block in 
> the HA retry policy so it will keep attempting to failover.  If the client 
> isn't kerberized, or doesn't have the right ticket it should give up and 
> return a meaningful error message (right now it returns a generic bad 
> connection failure string).
> Wouldn't hurt to check the FixedDelayRetryPolicy to make sure that doesn't 
> also keep attempting to retry in the same way.  I suspect it does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10699) Log object instance get incorrectly in TestDFSAdmin

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592267#comment-15592267
 ] 

Hudson commented on HDFS-10699:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10644 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10644/])
HDFS-10699. Log object instance get incorrectly in TestDFSAdmin. (brahma: rev 
6fb6b651e8d3b58a903a792e7d55f73f8b4032d2)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java


> Log object instance get incorrectly in TestDFSAdmin
> ---
>
> Key: HDFS-10699
> URL: https://issues.apache.org/jira/browse/HDFS-10699
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10699.001.patch, HDFS-10699.002.patch
>
>
> In class TestDFSAdmin, it gets an incorrect object instance. The code:
> {code}
>  public class TestDFSAdmin {
>private static final Log LOG = LogFactory.getLog(DFSAdmin.class);
>private Configuration conf = null;
>private MiniDFSCluster cluster;
>private DFSAdmin admin;
>...
> {code}
> Here the class name {{DFSAdmin}} should be {{TestDFSAdmin}}.
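
The one-line fix implied by the description is to bind the logger to the test 
class itself; a minimal compilable sketch:

{code:title=TestDFSAdmin.java (sketch)|borderStyle=solid}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TestDFSAdmin {
  // Bound to the test class itself, as the description suggests.
  private static final Log LOG = LogFactory.getLog(TestDFSAdmin.class);
}
{code}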



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10627) Volume Scanner marks a block as "suspect" even if the exception is network-related

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592265#comment-15592265
 ] 

Hudson commented on HDFS-10627:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10644 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10644/])
HDFS-10627. Volume Scanner marks a block as "suspect" even if the (kihwal: rev 
5c0bffddc0cb824a8a2751bcd0dc3e15ce081727)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java


> Volume Scanner marks a block as "suspect" even if the exception is 
> network-related
> --
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection 
> reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>   volumeRef.getVolume().getStorageID(),
>   block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception 
> message doesn't start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as corrupt irrespective of the 
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go 
> through all the suspect blocks to scan one corrupt block.
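
A self-contained sketch of the message-prefix guard that the pre-HDFS-7686 
code applied (prefixes copied from the snippet above; the helper name is 
invented). With such a guard, markSuspectBlock would run only for failures 
that are not just the reader going away:

{code:title=SuspectBlockGuard.java (sketch)|borderStyle=solid}
import java.io.IOException;

// Sketch: only mark a block "suspect" when the failure looks disk-related,
// not when the remote reader dropped the connection.
public class SuspectBlockGuard {
  static boolean isNetworkRelated(IOException e) {
    String msg = e.getMessage();
    return msg != null
        && (msg.startsWith("Broken pipe") || msg.startsWith("Connection reset"));
  }

  public static void main(String[] args) {
    System.out.println(isNetworkRelated(new IOException("Broken pipe")));    // true
    System.out.println(isNetworkRelated(new IOException("checksum error"))); // false
  }
}
{code}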



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10699) Log object instance get incorrectly in TestDFSAdmin

2016-10-20 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10699:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks [~linyiqun] for your contribution.

> Log object instance get incorrectly in TestDFSAdmin
> ---
>
> Key: HDFS-10699
> URL: https://issues.apache.org/jira/browse/HDFS-10699
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10699.001.patch, HDFS-10699.002.patch
>
>
> In class TestDFSAdmin, it gets an incorrect object instance. The code:
> {code}
>  public class TestDFSAdmin {
>private static final Log LOG = LogFactory.getLog(DFSAdmin.class);
>private Configuration conf = null;
>private MiniDFSCluster cluster;
>private DFSAdmin admin;
>...
> {code}
> Here the class name {{DFSAdmin}} should be {{TestDFSAdmin}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10627) Volume Scanner marks a block as "suspect" even if the exception is network-related

2016-10-20 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10627:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.7.4
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-2, branch-2.8 and branch-2.7.

> Volume Scanner marks a block as "suspect" even if the exception is 
> network-related
> --
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection 
> reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>   volumeRef.getVolume().getStorageID(),
>   block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception 
> message doesn't start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as corrupt irrespective of the 
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go 
> through all the suspect blocks to scan one corrupt block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10699) Log object instance get incorrectly in TestDFSAdmin

2016-10-20 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10699:

Attachment: HDFS-10699.002.patch

Uploading the committed patch; there was just a conflict in the package name.

> Log object instance get incorrectly in TestDFSAdmin
> ---
>
> Key: HDFS-10699
> URL: https://issues.apache.org/jira/browse/HDFS-10699
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10699.001.patch, HDFS-10699.002.patch
>
>
> In class TestDFSAdmin, it gets an incorrect object instance. The code:
> {code}
>  public class TestDFSAdmin {
>private static final Log LOG = LogFactory.getLog(DFSAdmin.class);
>private Configuration conf = null;
>private MiniDFSCluster cluster;
>private DFSAdmin admin;
>...
> {code}
> Here the class name {{DFSAdmin}} should be {{TestDFSAdmin}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10627) Volume Scanner marks a block as "suspect" even if the exception is network-related

2016-10-20 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592195#comment-15592195
 ] 

Wei-Chiu Chuang commented on HDFS-10627:


I think this is okay. In a data transfer scenario (non-pipeline writing), if 
the destination detects corruption in the replica it receives, it reports the 
corruption to the namenode. That is a more accurate response than marking a 
block as suspect when the socket is broken. 

> Volume Scanner marks a block as "suspect" even if the exception is 
> network-related
> --
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection 
> reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>   volumeRef.getVolume().getStorageID(),
>   block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception 
> message doesn't start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as corrupt irrespective of the 
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go 
> through all the suspect blocks to scan one corrupt block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10627) Volume Scanner marks a block as "suspect" even if the exception is network-related

2016-10-20 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10627:
--
Summary: Volume Scanner marks a block as "suspect" even if the exception is 
network-related  (was: Volume Scanner marks a block as "suspect" even if the 
block sender encounters 'Broken pipe' or 'Connection reset by peer' exception)

> Volume Scanner marks a block as "suspect" even if the exception is 
> network-related
> --
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection 
> reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>   volumeRef.getVolume().getStorageID(),
>   block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception 
> message doesn't start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as corrupt irrespective of the 
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go 
> through all the suspect blocks to scan one corrupt block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10627) Volume Scanner marks a block as "suspect" even if the block sender encounters 'Broken pipe' or 'Connection reset by peer' exception

2016-10-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592168#comment-15592168
 ] 

Kihwal Lee commented on HDFS-10627:
---

bq. +1 Will check in tomorrow afternoon unless there are further objections to 
discuss.
[~daryn] probably meant it in Mercury days (1408 hours).  I say we waited long 
enough.

> Volume Scanner marks a block as "suspect" even if the block sender encounters 
> 'Broken pipe' or 'Connection reset by peer' exception
> ---
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection 
> reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>   volumeRef.getVolume().getStorageID(),
>   block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception 
> message doesn't start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as corrupt irrespective of the 
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go 
> through all the suspect blocks to scan one corrupt block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9480) Expose nonDfsUsed via StorageTypeStats

2016-10-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592164#comment-15592164
 ] 

Hudson commented on HDFS-9480:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10643 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10643/])
HDFS-9480. Expose nonDfsUsed via StorageTypeStats. Contributed by Brahma 
(brahma: rev 4c73be135ca6ee2ba0b075a507097900db206b09)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/StorageTypeStats.java


>  Expose nonDfsUsed via StorageTypeStats 
> 
>
> Key: HDFS-9480
> URL: https://issues.apache.org/jira/browse/HDFS-9480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-9480-002.patch, HDFS-9480.patch
>
>
>  Expose nonDfsUsed via StorageTypeStats. See the comment [here | 
> https://issues.apache.org/jira/browse/HDFS-9038?focusedCommentId=15018761=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15018761]
>  from Arpit. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11034) Provide a command line tool to clear decommissioned DataNode information from the NameNode without restarting.

2016-10-20 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-11034:
-
Assignee: Gergely Novák

> Provide a command line tool to clear decommissioned DataNode information from 
> the NameNode without restarting.
> --
>
> Key: HDFS-11034
> URL: https://issues.apache.org/jira/browse/HDFS-11034
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Gergely Novák
>
> Information about decommissioned DataNodes remains tracked in the NameNode 
> for the entire NameNode process lifetime.  Currently, the only way to clear 
> this information is to restart the NameNode.  This issue proposes to add a 
> way to clear this information online, without requiring a process restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9480) Expose nonDfsUsed via StorageTypeStats

2016-10-20 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9480:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-2.8.

>  Expose nonDfsUsed via StorageTypeStats 
> 
>
> Key: HDFS-9480
> URL: https://issues.apache.org/jira/browse/HDFS-9480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-9480-002.patch, HDFS-9480.patch
>
>
>  Expose nonDfsUsed via StorageTypeStats. See the comment [here | 
> https://issues.apache.org/jira/browse/HDFS-9038?focusedCommentId=15018761=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15018761]
>  from Arpit. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10997) Reduce number of path resolving methods

2016-10-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591923#comment-15591923
 ] 

Daryn Sharp commented on HDFS-10997:


Test failures are unrelated and pass locally. TestFileCreationDelete failed 
due to an address-already-in-use race condition. TestAddStripedBlockInFBR 
failed due to an EC block report issue.

> Reduce number of path resolving methods
> ---
>
> Key: HDFS-10997
> URL: https://issues.apache.org/jira/browse/HDFS-10997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10997.1.patch, HDFS-10997.2.patch, HDFS-10997.patch
>
>
> FSDirectory contains many methods for resolving paths to an IIP and/or inode. 
>  These should be unified into a couple methods that will consistently do the 
> basics of resolving reserved paths, blocking write ops from snapshot paths, 
> verifying ancestors as directories, and throwing if symlinks are encountered.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6708) StorageType should be encoded in the block token

2016-10-20 Thread Pieter Reuse (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591757#comment-15591757
 ] 

Pieter Reuse commented on HDFS-6708:


The changes in the protobuf for the BlockTokenIdentifier will be discussed and 
potentially implemented in 
[HDFS-11026|https://issues.apache.org/jira/browse/HDFS-11026]. I've made great 
progress in implementing a patch for this ticket, but will rebase the 
implementation on top of the patch for HDFS-11026 before uploading it here 
(avoiding double review work).

> StorageType should be encoded in the block token
> 
>
> Key: HDFS-6708
> URL: https://issues.apache.org/jira/browse/HDFS-6708
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 2.4.1
>Reporter: Arpit Agarwal
>Assignee: Pieter Reuse
>
> HDFS-6702 is adding support for file creation based on StorageType.
> The block token is used as a tamper-proof channel for communicating block 
> parameters from the NN to the DN during block creation. The StorageType 
> should be included in this block token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9096) Issue in Rollback (after rolling upgrade) from hadoop 2.7.1 to 2.4.0

2016-10-20 Thread Dinesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591592#comment-15591592
 ] 

Dinesh commented on HDFS-9096:
--

Hi [~kihwal],

Yes, I did that by mistake. Let me change the version on both namenodes, do 
the rollback with the older version 2.5.2, and get back to you.

Thanks,
Dinesh Kumar P

> Issue in Rollback (after rolling upgrade) from hadoop 2.7.1 to 2.4.0
> 
>
> Key: HDFS-9096
> URL: https://issues.apache.org/jira/browse/HDFS-9096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.4.0
>Reporter: Harpreet Kaur
>
> I tried to do a rolling upgrade from Hadoop 2.4.0 to Hadoop 2.7.1. As per 
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#dfsadmin_-rollingUpgrade
>  one can roll back to the previous release provided the finalise step is not 
> done. I upgraded the setup but did not finalise the upgrade and tried to 
> roll back HDFS to 2.4.0.
> I tried the following steps:
>   1.  Shutdown all NNs and DNs.
>   2.  Restore the pre-upgrade release on all machines.
>   3.  Start NN1 as Active with the "-rollingUpgrade rollback" option.
> I am getting the following error after the 3rd step:
> 15/09/01 17:53:35 INFO namenode.AclConfigFlag: ACLs enabled? false
> 15/09/01 17:53:35 INFO common.Storage: Lock on <>/in_use.lock 
> acquired by nodename 12152@VM-2
> 15/09/01 17:53:35 WARN namenode.FSNamesystem: Encountered exception loading 
> fsimage
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data/yarn/namenode. Reported: -63. Expecting = 
> -56.
> at 
> org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178)
> at 
> org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:608)
> at 
> org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:309)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:639)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:455)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:511)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:670)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:655)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1304)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1370)
> 15/09/01 17:53:35 INFO mortbay.log: Stopped 
> SelectChannelConnector@0.0.0.0:50070
> 15/09/01 17:53:35 INFO impl.MetricsSystemImpl: Stopping NameNode metrics 
> system...
> 15/09/01 17:53:35 INFO impl.MetricsSystemImpl: NameNode metrics system 
> stopped.
> 15/09/01 17:53:35 INFO impl.MetricsSystemImpl: NameNode metrics system 
> shutdown complete.
> 15/09/01 17:53:35 FATAL namenode.NameNode: Exception in namenode join
> From the rolling upgrade documentation it can be inferred that rolling 
> upgrade is supported from Hadoop 2.4.0 onwards, but a rollingUpgrade 
> rollback to Hadoop 2.4.0 seems to be broken; it throws the above-mentioned 
> error.
> Are there any other steps to perform a rollback (from a rolling upgrade), or 
> is rolling back to Hadoop 2.4.0 not supported?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf

2016-10-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591402#comment-15591402
 ] 

Steve Loughran commented on HDFS-6984:
--

wow, thanks for that extra work! appreciated. I didn't go into the details 
enough to give it a full review... not familiar enough with the code.

# I see in FileStatus.java line 300+ that the input size is checked for being 
negative. What if the size came back as 2^31? I think you need an upper bounds 
check if you really want to defend against malicious endpoints
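
A hedged sketch of the kind of upper-bounds check being suggested; the limit 
and method below are invented for illustration and are not FileStatus's actual 
code:

{code:title=BoundedRead.java (sketch)|borderStyle=solid}
import java.io.DataInput;
import java.io.IOException;

// Sketch of a defensive length read before allocating a buffer.
public final class BoundedRead {
  // Illustrative cap; a real deserializer would pick a limit that fits
  // the largest legitimate field it expects.
  private static final int MAX_FIELD_LEN = 1 << 20; // 1 MiB

  static byte[] readLengthPrefixed(DataInput in) throws IOException {
    int len = in.readInt();
    if (len < 0 || len > MAX_FIELD_LEN) {
      // Rejects both negative lengths and absurd ones like 2^31 - 1,
      // so a malicious endpoint can't force a huge allocation.
      throw new IOException("Invalid field length: " + len);
    }
    byte[] buf = new byte[len];
    in.readFully(buf);
    return buf;
  }
}
{code}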

> In Hadoop 3, make FileStatus serialize itself via protobuf
> --
>
> Key: HDFS-6984
> URL: https://issues.apache.org/jira/browse/HDFS-6984
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, 
> HDFS-6984.003.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
> to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
> have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
> information.  The protobuf form is preferable, since it allows us to add new 
> fields in a backwards-compatible way.  Another issue is that a lot of 
> subclasses of FileStatus already don't override the Writable methods of the 
> superclass, breaking the interface contract that read(status.write) should be 
> equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so 
> that we don't have to deal with these issues.  It's probably too late to do 
> this in Hadoop 2, since user code may be relying on the existing FileStatus 
> serialization there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10815) The state of the EC file is erroneously recognized when you restart the NameNode.

2016-10-20 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591320#comment-15591320
 ] 

SammiChen commented on HDFS-10815:
--

Hi Eisuke Umeda, thanks for providing more information! Have you tried with 
only one namenode involved; is this issue still reproducible? Is the second 
namenode's involvement a necessary condition to reproduce the issue? 

> The state of the EC file is erroneously recognized when you restart the 
> NameNode.
> -
>
> Key: HDFS-10815
> URL: https://issues.apache.org/jira/browse/HDFS-10815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
> Environment: 2 NameNodes, 5 DataNodes, erasure coding policy is set as 
> "RS-DEFAULT-3-2-64k"
>Reporter: Eisuke Umeda
>
> After carrying out an examination with the following procedure, EC files 
> came to be recognized as corrupt files.
> These files could still be retrieved with "hdfs dfs -get".
> The NameNode might be causing the false recognition.
> DataNodes: datanode[1-5]
> Rack awareness: not set
> Copy target files: /tmp/tpcds-generate/25/store_sales/*
> {code}
> $ hdfs dfs -ls /tmp/tpcds-generate/25/store_sales
> Found 25 items
> -rw-r--r--   0 root supergroup  399430918 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-0
> -rw-r--r--   0 root supergroup  399054598 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-1
> -rw-r--r--   0 root supergroup  399329373 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-2
> -rw-r--r--   0 root supergroup  399528459 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-3
> -rw-r--r--   0 root supergroup  399329624 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-4
> -rw-r--r--   0 root supergroup  399085924 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-5
> -rw-r--r--   0 root supergroup  399337384 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-6
> -rw-r--r--   0 root supergroup  399199458 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-7
> -rw-r--r--   0 root supergroup  399679096 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-8
> -rw-r--r--   0 root supergroup  399440431 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-9
> -rw-r--r--   0 root supergroup  399403931 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00010
> -rw-r--r--   0 root supergroup  399472465 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00011
> -rw-r--r--   0 root supergroup  399451784 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00012
> -rw-r--r--   0 root supergroup  399240168 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00013
> -rw-r--r--   0 root supergroup  399370507 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00014
> -rw-r--r--   0 root supergroup  399633351 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00015
> -rw-r--r--   0 root supergroup  396532952 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00016
> -rw-r--r--   0 root supergroup  396258715 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00017
> -rw-r--r--   0 root supergroup  396382486 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00018
> -rw-r--r--   0 root supergroup  399016456 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00019
> -rw-r--r--   0 root supergroup  399465745 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00020
> -rw-r--r--   0 root supergroup  399208235 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00021
> -rw-r--r--   0 root supergroup  399198296 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00022
> -rw-r--r--   0 root supergroup  399599711 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00023
> -rw-r--r--   0 root supergroup  395150855 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00024
> {code}
> NameNodes:
>   namenode1(active)
>   namenode2(standby)
> The directory in which there are "Under-erasure-coded block groups": 
> /tmp/tpcds-generate/test
> {code}
> $ sudo -u hdfs hdfs erasurecode -getPolicy /tmp/tpcds-generate/test
> ErasureCodingPolicy=[Name=RS-DEFAULT-3-2-64k, 
> Schema=[ECSchema=[Codec=rs-default, numDataUnits=3, numParityUnits=2]], 
> CellSize=65536 ]
> {code}
> The following are the steps to reproduce:
> 1) hdfs dfs -cp /tmp/tpcds-generate/25/store_sales/* /tmp/tpcds-generate/test
> 2) datanode1: (in the middle of the copy) sudo pkill -9 -f datanode
> 3) start a process of datanode1 two minutes later
> 4) carry out hdfs fsck and confirm that Under-Replicated Blocks occurred
> 5) wait until Under-Replicated Blocks becomes 0
> 6) 

[jira] [Commented] (HDFS-10815) The state of the EC file is erroneously recognized when you restart the NameNode.

2016-10-20 Thread Eisuke Umeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591310#comment-15591310
 ] 

Eisuke Umeda commented on HDFS-10815:
-

I'm sorry for the late reply.
I was able to reproduce the bug with a new procedure.

The following are the steps to reproduce (a consolidated shell sketch follows the list):

{code:title=hdfs-site.xml}
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data1/hdfs/data,file:///data2/hdfs/data,file:///data3/hdfs/data</value>
</property>
{code}

1) datanode1: sudo rm -rf /data1/hdfs/data/* /data2/hdfs/data/* 
/data3/hdfs/data/*
2) datanode1: sudo /etc/init.d/hadoop-hdfs-datanode restart
3) datanode2: sudo rm -rf /data1/hdfs/data/* /data2/hdfs/data/* 
/data3/hdfs/data/*
4) datanode2: sudo /etc/init.d/hadoop-hdfs-datanode restart
5) namenode1: sudo -u hdfs hdfs dfsadmin -triggerBlockReport datanode1:9867
6) namenode1: sudo -u hdfs hdfs dfsadmin -triggerBlockReport datanode2:9867
7) namenode1: /etc/init.d/hadoop-hdfs-namenode restart
8) namenode2: /etc/init.d/hadoop-hdfs-namenode restart
9) Carry out hdfs fsck and confirm that Under-Replicated Blocks occurred.
10) Wait for about 24 hours.
11) namenode1: /etc/init.d/hadoop-hdfs-namenode restart
12) namenode2: /etc/init.d/hadoop-hdfs-namenode restart
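
For convenience, here is the same procedure as a minimal shell sketch 
(hostnames, init scripts, and data directories are taken from the steps above; 
adjust them for your cluster):

{code}
#!/usr/bin/env bash
# Minimal sketch of the reproduction steps above; assumes passwordless ssh
# to the DataNode/NameNode hosts and the data directories from hdfs-site.xml.

# Steps 1-4: wipe and restart both DataNodes.
for dn in datanode1 datanode2; do
  ssh "$dn" "sudo rm -rf /data1/hdfs/data/* /data2/hdfs/data/* /data3/hdfs/data/*"
  ssh "$dn" "sudo /etc/init.d/hadoop-hdfs-datanode restart"
done

# Steps 5-6: trigger fresh block reports from the wiped DataNodes.
sudo -u hdfs hdfs dfsadmin -triggerBlockReport datanode1:9867
sudo -u hdfs hdfs dfsadmin -triggerBlockReport datanode2:9867

# Steps 7-9: restart both NameNodes, then check for under-replicated blocks.
for nn in namenode1 namenode2; do
  ssh "$nn" "sudo /etc/init.d/hadoop-hdfs-namenode restart"
done
sudo -u hdfs hdfs fsck / | grep -i "under-replicated"

# Steps 10-12: wait about 24 hours, then restart both NameNodes again.
{code}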

> The state of the EC file is erroneously recognized when you restart the 
> NameNode.
> -
>
> Key: HDFS-10815
> URL: https://issues.apache.org/jira/browse/HDFS-10815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
> Environment: 2 NameNodes, 5 DataNodes, Erasure coding policy is set as 
> "RS-DEFAULT-3-2-64k"
>Reporter: Eisuke Umeda
>
> After carrying out an examination with the following procedures, EC files 
> came to be recognized as corrupt files.
> These files could still be retrieved with "hdfs dfs -get".
> The NameNode might be causing the false recognition.
> DataNodes: datanode[1-5]
> Rack awareness: not set
> Copy target files: /tmp/tpcds-generate/25/store_sales/*
> {code}
> $ hdfs dfs -ls /tmp/tpcds-generate/25/store_sales
> Found 25 items
> -rw-r--r--   0 root supergroup  399430918 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-0
> -rw-r--r--   0 root supergroup  399054598 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-1
> -rw-r--r--   0 root supergroup  399329373 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-2
> -rw-r--r--   0 root supergroup  399528459 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-3
> -rw-r--r--   0 root supergroup  399329624 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-4
> -rw-r--r--   0 root supergroup  399085924 2016-08-16 15:11 
> /tmp/tpcds-generate/25/store_sales/data-m-5
> -rw-r--r--   0 root supergroup  399337384 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-6
> -rw-r--r--   0 root supergroup  399199458 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-7
> -rw-r--r--   0 root supergroup  399679096 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-8
> -rw-r--r--   0 root supergroup  399440431 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-9
> -rw-r--r--   0 root supergroup  399403931 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00010
> -rw-r--r--   0 root supergroup  399472465 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00011
> -rw-r--r--   0 root supergroup  399451784 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00012
> -rw-r--r--   0 root supergroup  399240168 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00013
> -rw-r--r--   0 root supergroup  399370507 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00014
> -rw-r--r--   0 root supergroup  399633351 2016-08-16 15:12 
> /tmp/tpcds-generate/25/store_sales/data-m-00015
> -rw-r--r--   0 root supergroup  396532952 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00016
> -rw-r--r--   0 root supergroup  396258715 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00017
> -rw-r--r--   0 root supergroup  396382486 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00018
> -rw-r--r--   0 root supergroup  399016456 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00019
> -rw-r--r--   0 root supergroup  399465745 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00020
> -rw-r--r--   0 root supergroup  399208235 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00021
> -rw-r--r--   0 root supergroup  399198296 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00022
> -rw-r--r--   0 root supergroup  399599711 2016-08-16 15:13 
> /tmp/tpcds-generate/25/store_sales/data-m-00023
> -rw-r--r--   0 root supergroup  395150855 2016-08-16 15:13 
> 

[jira] [Commented] (HDFS-11033) Add documents for native raw erasure coder in XOR codes

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591086#comment-15591086
 ] 

Hadoop QA commented on HDFS-11033:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 28s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}114m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-11033 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834328/HDFS-11033-v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  |
| uname | Linux 9b7fe0e2d764 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73504b1 |
| Default Java | 1.8.0_101 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17235/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17235/testReport/ |
| modules | C: hadoop-common-project/hadoop-common 
hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17235/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add documents for native raw erasure coder in XOR codes
> ---
>
> Key: HDFS-11033
> URL: 

[jira] [Commented] (HDFS-8410) Add computation time metrics to datanode for ECWorker

2016-10-20 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591061#comment-15591061
 ] 

SammiChen commented on HDFS-8410:
-

1. The failed test is irrelevant; the failure reason is "Address already in use".
2. The checkstyle issue suggests the variable should have get/set functions. That 
is not the case here, since the other variables don't have get/set functions either. 
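
For readers who want to verify the new metric once this lands, a minimal sketch; 
the metric name and the DataNode web port below are assumptions, not taken from 
the patch:

{code}
# Sample the DataNode's JMX servlet and look for the EC decode-time metric.
# Port 9864 is the trunk (3.x) DataNode HTTP port; the metric name here is
# a guess at what the patch exposes.
curl -s "http://datanode1:9864/jmx" | grep -i "ecDecoding"
{code}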

> Add computation time metrics to datanode for ECWorker
> -
>
> Key: HDFS-8410
> URL: https://issues.apache.org/jira/browse/HDFS-8410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: SammiChen
> Attachments: HDFS-8410-001.patch, HDFS-8410-002.patch, 
> HDFS-8410-003.patch, HDFS-8410-004.patch, HDFS-8410-005.patch
>
>
> This is a sub task of HDFS-7674. It adds time metric for ec decode work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8410) Add computation time metrics to datanode for ECWorker

2016-10-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591033#comment-15591033
 ] 

Hadoop QA commented on HDFS-8410:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 63 unchanged - 0 fixed = 64 total (was 63) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 30s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-8410 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12834325/HDFS-8410-005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5c07a2ec3f04 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73504b1 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17234/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17234/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17234/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17234/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add computation time metrics to datanode for ECWorker
> -
>
> Key: HDFS-8410
>   

[jira] [Commented] (HDFS-7343) HDFS smart storage management

2016-10-20 Thread Wei Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590949#comment-15590949
 ] 

Wei Zhou commented on HDFS-7343:


Hi [~andrew.wang], thank you for reviewing the document and for the great comments. 
Sorry for the delay!
{quote}
First was that most reads happen via SCR, thus we do not have reliable IO 
statistics.
{quote}
IO statistics are indeed a concern. To address this, we collect IO statistics at 
both the HDFS level and the system level (system-wide, like the data given by the 
system tool ‘iostat’). IO caused by SCR can then be measured from the system-level data. 
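
As an illustration, the system-wide counters referred to here can be sampled with 
stock tools (a sketch; device names and sampling intervals will vary per host):

{code}
# Per-device throughput and utilization, one sample every 5 seconds.
# Reads served via short-circuit read (SCR) bypass the DataNode's own IO
# accounting but still show up in these device-level counters.
iostat -dx 5
{code}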
{quote}
Do you plan to address these issues as part of this work?
{quote}
Agreed that it’s a waste of precious resources.
SSM has to consider factors like file length, access history, and memory 
availability before deciding to cache a file. It tries to minimize the chance of 
caching a file that does not need to be cached.
It’s a great idea to extend the HDFS cache to cache partial blocks instead of the 
whole file; this should help in many scenarios.
SSM can support that extended feature by collecting block read offset information 
to determine which part should be cached.
{quote}
It's difficult to prioritize at the HDFS-level since performance is measured at 
the app-level.  
{quote}
Agreed that it's difficult to prioritize at the HDFS level. Priority is not a 
general, static concept in SSM; it is used to schedule certain specific cases. For 
example, one rule may trigger an action that moves a hot file to faster storage 
while another rule triggers an ‘archive cold file’ action at the same time. For 
performance reasons, SSM gives the first action higher priority than the second. 
It depends on the situation and is not always the case.
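
To make this concrete, here is a purely hypothetical pair of conflicting rules; 
the syntax is invented for illustration and is not SSM's actual rule format:

{code}
# Hypothetical rules, for illustration only.
rule hot  : accessCount(1d) > 100  -> move to SSD   # promote hot files
rule cold : age > 90d              -> archive       # demote cold files
# If both rules match the same file at the same time, SSM would schedule the
# "hot" action with higher priority than the "cold" one.
{code}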
{quote}
If you're looking at purely HDFS-level information, without awareness of users, 
jobs, and their corresponding priorities, admins will have a hard time mapping 
rules to their actual SLOs.
{quote}
Yes, it’s not possible to handle all issues from purely HDFS-level information. 
It would be a great plus for SSM to collect high-level application info for 
better efficiency. We could also expose some APIs for admins/users to provide 
hints to SSM so that it works better. Taking your case as an example, a user 
could hint that an operation belongs to a time-sensitive job, and SSM could then 
cache the file in memory or move it to faster storage.

Your remaining questions will be answered in the comments that follow. Thanks!

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management.pdf
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference and etc.
> Modified the title for re-purpose.
> We'd extend this effort some bit and aim to work on a comprehensive solution 
> to provide smart storage management service in order for convenient, 
> intelligent and effective utilizing of erasure coding or replicas, HDFS cache 
> facility, HSM offering, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org