date:20160419


[ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249390#comment-15249390
 ] 

Hadoop QA commented on HDFS-8057:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s 
{color} | {color:red} hadoop-hdfs-project: patch generated 72 new + 212 
unchanged - 71 fixed = 284 total (was 283) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 
0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 20s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 37s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} a

[jira] [Updated] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-19 Thread Andres Perez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-5280:
---
Attachment: HDFS-5280.patch

> Corrupted meta files on data nodes prevents DFClient from connecting to data 
> nodes and updating corruption status to name node.
> ---
>
> Key: HDFS-5280
> URL: https://issues.apache.org/jira/browse/HDFS-5280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs-client
>Affects Versions: 1.1.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha, 2.7.2
> Environment: Red hat enterprise 6.4
> Hadoop-2.1.0
>Reporter: Jinghui Wang
>Assignee: Andres Perez
> Attachments: HDFS-5280.patch
>
>
> Corrupted meta files leave the DFSClient unable to connect to the 
> datanodes to access the blocks, so the DFSClient never performs a read on the 
> block. That read is what throws the ChecksumException when file blocks are 
> corrupted and reports to the namenode to mark the block as corrupt. Since the 
> client never gets that far, the file status remains healthy, and so 
> do all the blocks.
> To replicate the error, put a file onto HDFS.
> Running hadoop fsck /tmp/bogus.csv -files -blocks -location gives the 
> following output:
> FSCK started for path /tmp/bogus.csv at 11:33:29
> /tmp/bogus.csv 109 bytes, 1 block(s):  OK
> 0. blk_-4255166695856420554_5292 len=109 repl=3
> Find the block/meta files for 4255166695856420554 by running 
> ssh datanode1.address find /hadoop/ -name "*4255166695856420554*", which 
> gives the following output:
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta
> Now corrupt the meta file by running 
> ssh datanode1.address "sed -i -e '1i 1234567891' 
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta" 
> Now run hadoop fs -cat /tmp/bogus.csv, which 
> will show the stack trace of the DFSClient failing to connect to the datanode 
> with the corrupted meta file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-19 Thread Andres Perez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-5280:
---
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

The {{IllegalArgumentException}} is caught and a dummy {{DataChecksum}} object 
is created. That way the checksum test fails later in the pipeline and marks 
the block as nonexistent on that node, instead of marking the entire node as 
dead because the client was supposedly unable to connect.
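
The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from HDFS-5280.patch: the {{BlockChecksum}} interface, the one-byte header format and every method name are hypothetical stand-ins, and only the idea (catch the {{IllegalArgumentException}}, substitute a checksum that can never match) comes from the comment.

```java
import java.util.zip.CRC32;

/** Hypothetical stand-in for the checksum handling; not Hadoop's DataChecksum. */
final class ChecksumFallbackSketch {

  /** Minimal checksum abstraction used only by this sketch. */
  interface BlockChecksum {
    boolean matches(byte[] data, long expected);
  }

  /** Real checksum: CRC32 over the block data. */
  static final BlockChecksum CRC = (data, expected) -> crcOf(data) == expected;

  /** Dummy checksum that never matches, so verification fails later on. */
  static final BlockChecksum DUMMY = (data, expected) -> false;

  static long crcOf(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return crc.getValue();
  }

  /** Assumed header format: the first byte must be the type code 1. */
  static BlockChecksum parseHeader(byte[] metaHeader) {
    if (metaHeader.length == 0 || metaHeader[0] != 1) {
      throw new IllegalArgumentException("unknown checksum type in meta header");
    }
    return CRC;
  }

  /** A corrupt header yields the dummy checksum instead of a failed connection. */
  static BlockChecksum loadChecksum(byte[] metaHeader) {
    try {
      return parseHeader(metaHeader);
    } catch (IllegalArgumentException e) {
      return DUMMY;
    }
  }
}
```

With this shape, a corrupted header no longer aborts setup; the read proceeds and the mismatch surfaces at checksum verification, which is the point where a block can be reported as corrupt.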

> Corrupted meta files on data nodes prevents DFClient from connecting to data 
> nodes and updating corruption status to name node.
> ---
>
> Key: HDFS-5280
> URL: https://issues.apache.org/jira/browse/HDFS-5280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs-client
>Affects Versions: 2.7.2, 2.0.4-alpha, 2.1.0-beta, 1.1.1, 3.0.0
> Environment: Red hat enterprise 6.4
> Hadoop-2.1.0
>Reporter: Jinghui Wang
>Assignee: Andres Perez
> Attachments: HDFS-5280.patch
>
>
> Corrupted meta files leave the DFSClient unable to connect to the 
> datanodes to access the blocks, so the DFSClient never performs a read on the 
> block. That read is what throws the ChecksumException when file blocks are 
> corrupted and reports to the namenode to mark the block as corrupt. Since the 
> client never gets that far, the file status remains healthy, and so 
> do all the blocks.
> To replicate the error, put a file onto HDFS.
> Running hadoop fsck /tmp/bogus.csv -files -blocks -location gives the 
> following output:
> FSCK started for path /tmp/bogus.csv at 11:33:29
> /tmp/bogus.csv 109 bytes, 1 block(s):  OK
> 0. blk_-4255166695856420554_5292 len=109 repl=3
> Find the block/meta files for 4255166695856420554 by running 
> ssh datanode1.address find /hadoop/ -name "*4255166695856420554*", which 
> gives the following output:
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta
> Now corrupt the meta file by running 
> ssh datanode1.address "sed -i -e '1i 1234567891' 
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta" 
> Now run hadoop fs -cat /tmp/bogus.csv, which 
> will show the stack trace of the DFSClient failing to connect to the datanode 
> with the corrupted meta file.




[jira] [Commented] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-19 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249372#comment-15249372
 ] 

Andrew Wang commented on HDFS-9869:
---

I like Zhe's proposal. I'd prefer to keep the old keys around if it's just an 
extra deprecation.

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}




[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode


[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249371#comment-15249371
 ] 

Hadoop QA commented on HDFS-7859:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 36s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 58s {color} 
| {color:red} hadoop-hdfs-project-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 
new + 50 unchanged - 1 fixed = 51 total (was 51) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 37s 
{color} | {color:red} hadoop-hdfs-project: patch generated 17 new + 1085 
unchanged - 1 fixed = 1102 total (was 1086) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 54s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77

[jira] [Commented] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-19 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249363#comment-15249363
 ] 

Zhe Zhang commented on HDFS-9869:
-

bq. How about keeping only the recent property name | 
dfs.namenode.replication.pending.timeout-sec | 
dfs.namenode.reconstruction.pending.timeout-sec |
I was thinking of keeping both entries, but pointing the right-hand side of 
both to the new config key {{dfs.namenode.reconstruction.pending.timeout-sec}}. 
Pinging [~andrew.wang] for thoughts.

bq. As a followup we should also deprecate other replication-related config 
keys.
bq. Shall we do this deprecation through another follow-up jira task?
Yes, given the size of this change I think we should do other deprecations in a 
separate JIRA.
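
The "both entries point to the new key" idea can be sketched as a small lookup table. This is a plain-Java illustration under stated assumptions, not Hadoop's {{Configuration}} deprecation machinery: the class, its methods and the key {{example.oldest.pending.timeout-sec}} are hypothetical, and only the two {{dfs.namenode.*}} key names come from the discussion.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a deprecation table; not Hadoop's Configuration class. */
final class DeprecatedKeysSketch {

  static final String NEW_KEY = "dfs.namenode.reconstruction.pending.timeout-sec";

  /** Every deprecated key maps directly to the newest key (no chaining). */
  static final Map<String, String> DEPRECATIONS = new LinkedHashMap<>();
  static {
    // Hypothetical older entry, included only to show two rows sharing a target.
    DEPRECATIONS.put("example.oldest.pending.timeout-sec", NEW_KEY);
    // Key named in the discussion; it becomes deprecated by the rename.
    DEPRECATIONS.put("dfs.namenode.replication.pending.timeout-sec", NEW_KEY);
  }

  private final Map<String, String> values = new HashMap<>();

  /** Returns the current name for a key, or the key itself if not deprecated. */
  static String resolve(String key) {
    return DEPRECATIONS.getOrDefault(key, key);
  }

  /** Stores the value under the current key name, whichever name was used. */
  void set(String key, String value) {
    values.put(resolve(key), value);
  }

  /** Looks the value up under the current key name. */
  String get(String key) {
    return values.get(resolve(key));
  }
}
```

Because each old name resolves in one step, a site file that still sets the replication-based key keeps working, and readers of the new key see the same value.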

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}




[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249351#comment-15249351
 ] 

Hadoop QA commented on HDFS-10313:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 34s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 35s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 36s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799675/HDFS-10313.001.patch |
| JIRA Issue | HDFS-10313 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 197e2099372f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / af9bdbe |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-or

[jira] [Comment Edited] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories

2016-04-19 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249340#comment-15249340
 ] 

Vinayakumar B edited comment on HDFS-10256 at 4/20/16 6:09 AM:
---

bq. MiniDFSCluster#shutdown() registers base_dir to be deleted on shutdown. If 
this gets slow, the next test JVM will start to run before the shutdown hook 
completes. But forcing every test to call shutdown(true) can slowdown things. 
Instead, each instance should get a random base_dir, so that the deletion 
through shutdown hook and the subsequent new test setup can overlap.
I have investigated this a bit. Registering a dir to be deleted during 
shutdown will not actually delete the directory unless it's empty. I have tried 
to register all files/directories recursively to be deleted on shutdown, but 
this sometimes leaves the empty directories behind.
Instead, I have used a unique base directory (testclassname.methodname) for each 
MiniDFSCluster instance and {{FileUtils.fullyDelete()}} during shutdown itself, 
instead of registering for deleteOnExit() in {{MiniDFSCluster#shutdown()}}.
All tests which actually use MiniDFSCluster will delete their basedir. Apart 
from this, there are some more tests which create directories on their own 
(using some random string), and these may not get deleted after the test run.

At the end of the test run, ~300MB of data had been created in the 
"hadoop-hdfs/target/test/data" directory, which is way less than the ~6GB 
reported by Chris.


So I am not sure why {{base_dir.deleteOnExit()}} was used during 
{{MiniDFSCluster#shutdown()}}, which will not actually delete anything because 
base_dir will not be empty.
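
The approach in the comment above (a unique base directory per test, deleted recursively at shutdown rather than registered with deleteOnExit()) might look roughly like this. It is a standalone sketch using only java.nio, not the MiniDFSCluster change itself; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; not part of Hadoop's test utilities. */
final class TestDirCleanupSketch {

  /** Creates root/testClass.method so each test gets its own base directory. */
  static Path uniqueBaseDir(Path root, String testClass, String method) {
    try {
      return Files.createDirectories(root.resolve(testClass + "." + method));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Deletes dir and everything below it, children before parents. */
  static void fullyDelete(Path dir) {
    if (!Files.exists(dir)) {
      return;
    }
    try {
      List<Path> paths;
      try (Stream<Path> walk = Files.walk(dir)) {
        // Reverse order puts every descendant ahead of its parent directory.
        paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : paths) {
        Files.delete(p);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Builds a non-empty base dir, deletes it, and reports whether it is gone. */
  static boolean demo() {
    try {
      Path root = Files.createTempDirectory("cleanup-sketch");
      Path base = uniqueBaseDir(root, "TestExample", "testMethod");
      Files.write(Files.createDirectories(base.resolve("data")).resolve("blk_1"),
          new byte[] {1, 2, 3});
      fullyDelete(base);
      boolean gone = !Files.exists(base);
      fullyDelete(root);
      return gone;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Deleting eagerly sidesteps the limitation noted above: java.io.File#deleteOnExit() only removes a directory if it is already empty when the JVM exits.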


was (Author: vinayrpet):
bq. MiniDFSCluster#shutdown() registers base_dir to be deleted on shutdown. If 
this gets slow, the next test JVM will start to run before the shutdown hook 
completes. But forcing every test to call shutdown(true) can slowdown things. 
Instead, each instance should get a random base_dir, so that the deletion 
through shutdown hook and the subsequent new test setup can overlap.
I have investigated a bit on this. Registering a dir to be deleted during 
shutdown will not actually delete the directory unless its empty. I have tried 
to register all files/directories recursively to be deleted on shutdown. But 
this will sometimes keeps the empty directories.
Instead, I have used unique base directory (testclassname.methodname) for each 
MiniDFSCluster instance and FileUtils.fullyDelete()}} during shutdown itself, 
instead of registering for deleteOnExit() in MiniDFSCluster#shutdown().
All tests which actually uses MiniDfsCluster will delete its basedir. Apart 
from this, there are some more tests, which creates directories on its own ( 
using some Random String), and these may not get deleted after test run.

At the end of test run, there was ~300MB of data created in 
"hadoop-hdfs/target/test/data" directory, which is way less than ~6GB reported 
by Chris.


So I am not sure why {{base_dir.deleteOnExit()}} was used during 
{{MiniDfsCluster#shutdown()}}, which will not actually delete anything because 
base_dir will not be empty.

> Use GenericTestUtils.getTestDir method in tests for temporary directories
> -
>
> Key: HDFS-10256
> URL: https://issues.apache.org/jira/browse/HDFS-10256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>





[jira] [Commented] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories

2016-04-19 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249340#comment-15249340
 ] 

Vinayakumar B commented on HDFS-10256:
--

bq. MiniDFSCluster#shutdown() registers base_dir to be deleted on shutdown. If 
this gets slow, the next test JVM will start to run before the shutdown hook 
completes. But forcing every test to call shutdown(true) can slowdown things. 
Instead, each instance should get a random base_dir, so that the deletion 
through shutdown hook and the subsequent new test setup can overlap.
I have investigated this a bit. Registering a dir to be deleted during 
shutdown will not actually delete the directory unless it's empty. I have tried 
to register all files/directories recursively to be deleted on shutdown, but 
this sometimes leaves the empty directories behind.
Instead, I have used a unique base directory (testclassname.methodname) for each 
MiniDFSCluster instance and {{FileUtils.fullyDelete()}} during shutdown itself, 
instead of registering for deleteOnExit() in MiniDFSCluster#shutdown().
All tests which actually use MiniDFSCluster will delete their basedir. Apart 
from this, there are some more tests which create directories on their own 
(using some random string), and these may not get deleted after the test run.

At the end of the test run, ~300MB of data had been created in the 
"hadoop-hdfs/target/test/data" directory, which is way less than the ~6GB 
reported by Chris.


So I am not sure why {{base_dir.deleteOnExit()}} was used during 
{{MiniDFSCluster#shutdown()}}, which will not actually delete anything because 
base_dir will not be empty.

> Use GenericTestUtils.getTestDir method in tests for temporary directories
> -
>
> Key: HDFS-10256
> URL: https://issues.apache.org/jira/browse/HDFS-10256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>





[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing

2016-04-19 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249321#comment-15249321
 ] 

Rakesh R commented on HDFS-10308:
-

Thanks [~brahmareddy]

> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}




[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-19 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249312#comment-15249312
 ] 

Walter Su commented on HDFS-9958:
-

bq. we fix countNodes().corruptReplicas() to return the number after going thru 
all storages( irrespective of their state) that have the corruptNodes (in this 
case), since numNodes() is storage state agnostic.
I think {{countNodes(blk)}} going through all storages is unnecessary. Also, I 
think {{numMachines}} should only include NORMAL and READ_ONLY storages. So 
{{createLocatedBlock(..)}} going through all storages is unnecessary as well.
{code}
if (numMachines > 0) {
  for(DatanodeStorageInfo storage : blocksMap.getStorages(blk)) {
{code}

By the way, and unrelated to this topic, I think 
{{findAndMarkBlockAsCorrupt(..)}} shouldn't support adding blk to the map if 
the storage is not found.

ping [~jingzhao] to check if he has any comment.
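
One way to read the suggestion above is that the machine count and the {{machines}} array should come from the same state filter, so no slot is left unpopulated. The sketch below illustrates that idea only; the enum, the {{Replica}} class and the method names are hypothetical and do not mirror the real BlockManager or DatanodeStorageInfo code.

```java
import java.util.List;

/** Hypothetical model of replica storages; not the real BlockManager types. */
final class UsableStorageSketch {

  /** Simplified storage states, named after the ones mentioned in the comment. */
  enum StorageState { NORMAL, READ_ONLY, FAILED }

  static final class Replica {
    final String node;
    final StorageState state;

    Replica(String node, StorageState state) {
      this.node = node;
      this.state = state;
    }
  }

  /** Only NORMAL and READ_ONLY storages count as locations. */
  static boolean isUsable(StorageState state) {
    return state == StorageState.NORMAL || state == StorageState.READ_ONLY;
  }

  /** Counts replicas with the same filter that is used to fill the array. */
  static int numMachines(List<Replica> replicas) {
    int n = 0;
    for (Replica r : replicas) {
      if (isUsable(r.state)) {
        n++;
      }
    }
    return n;
  }

  /** Builds the location array; its size always matches the filled slots. */
  static String[] machines(List<Replica> replicas) {
    String[] machines = new String[numMachines(replicas)];
    int j = 0;
    for (Replica r : replicas) {
      if (isUsable(r.state)) {
        machines[j++] = r.node;
      }
    }
    return machines;
  }
}
```

Since the size and the fill loop share {{isUsable}}, a replica on a FAILED storage can neither inflate the count nor leave a null entry behind.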

> BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed 
> storages.
> 
>
> Key: HDFS-9958
> URL: https://issues.apache.org/jira/browse/HDFS-9958
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9958-Test-v1.txt, HDFS-9958.001.patch, 
> HDFS-9958.002.patch
>
>
> In a scenario where the corrupt replica is on a failed storage, before it is 
> taken out of blocksMap, there is a race which causes the creation of a 
> LocatedBlock on a {{machines}} array element that is not populated. 
> The root cause is the following:
> {code}
> final int numCorruptNodes = countNodes(blk).corruptReplicas();
> {code}
> countNodes only looks at nodes whose storage state is NORMAL, so when the 
> corrupt replica is on a failed storage, numCorruptNodes ends up 
> being zero. 
> {code}
> final int numNodes = blocksMap.numNodes(blk);
> {code}
> However, numNodes will count all nodes/storages irrespective of the state of 
> the storage. Therefore numMachines will include such (failed) nodes. The 
> assert would fail only if the system is enabled to catch assertion errors; 
> otherwise it goes ahead and tries to create a LocatedBlock object for a node 
> that is not put in the {{machines}} array.
> Here is the stack trace:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:45)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:40)
>   at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.<init>(LocatedBlock.java:84)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:878)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:826)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:799)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:899)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1849)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}
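
The mismatch described above can be modelled in a few lines. This is a minimal, self-contained sketch and not the BlockManager code: the {{Replica}} and {{State}} types and both method names are hypothetical stand-ins, and the counting rule (only NORMAL storages contribute to the corrupt count) is taken from the description above.

```java
import java.util.List;

class LocatedBlockSketch {
  enum State { NORMAL, READ_ONLY, FAILED }

  /** Hypothetical stand-in for one replica location: storage state plus a corrupt flag. */
  static final class Replica {
    final State state;
    final boolean corrupt;
    Replica(State state, boolean corrupt) {
      this.state = state;
      this.corrupt = corrupt;
    }
  }

  /** Models countNodes(blk).corruptReplicas() as described: only NORMAL storages are inspected. */
  static int corruptOnNormalStorages(List<Replica> replicas) {
    int n = 0;
    for (Replica r : replicas) {
      if (r.state == State.NORMAL && r.corrupt) {
        n++;
      }
    }
    return n;
  }

  /** Sizes and fills a machines-like array; returns how many slots stay null. */
  static int unfilledSlots(List<Replica> replicas) {
    int numNodes = replicas.size();                      // state-agnostic total
    int numCorruptNodes = corruptOnNormalStorages(replicas);
    boolean isCorrupt = numCorruptNodes == numNodes;
    int numMachines = isCorrupt ? numNodes : numNodes - numCorruptNodes;
    Replica[] machines = new Replica[numMachines];
    int j = 0;
    for (Replica r : replicas) {
      if (isCorrupt || !r.corrupt) {                     // corrupt replicas are skipped
        machines[j++] = r;
      }
    }
    return machines.length - j;                          // > 0 means a null element remains
  }
}
```

With two healthy replicas on NORMAL storages and one corrupt replica on a FAILED storage, the array is sized for three entries but only two are filled, which is the kind of null element the stack trace points at.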



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: HDFS-10313.001.patch

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.
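
As a rough illustration of the proposed check, the sketch below validates the order of two snapshots given their creation times. It is only an assumption about the shape of such a check: the class name, method name, and the idea of passing timestamps in directly are hypothetical, and how distcp would obtain those times is not shown.

```java
class SnapshotOrderCheck {
  /**
   * Hypothetical validation for "-diff from to": rejects the pair unless
   * the "to" snapshot was created strictly after the "from" snapshot.
   */
  static void checkOrder(String from, long fromTimeMs, String to, long toTimeMs) {
    if (toTimeMs <= fromTimeMs) {
      throw new IllegalArgumentException(
          "Snapshot " + to + " should be newer than " + from);
    }
  }
}
```

Failing fast with the snapshot names in the message gives the user something actionable when the two arguments are swapped.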




[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: (was: HDFS-10313.001.patch)

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249302#comment-15249302
 ] 

Chris Nauroth commented on HDFS-10312:
--

The remaining Checkstyle warning is for a long method.  It's best not to 
address it in scope of this patch.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249299#comment-15249299
 ] 

Chris Nauroth commented on HDFS-10312:
--

It appears the discussions in those other JIRAs missed the point that 
{{ipc.maximum.data.length}} controls only the maximum payload accepted by the 
RPC server.  Without this patch, it is not sufficient to work around the size 
enforcement by protobuf, demonstrated in the stack trace that I included in 
prior comments.  Asking admins to repartition blocks across multiple storages 
on the same drive isn't a viable workaround for them.  HDFS-9011 is a much 
deeper change that will require further review.  This patch is a simple way to 
unblock clusters that have already gotten into this state accidentally.
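
To illustrate why raising {{ipc.maximum.data.length}} alone is not sufficient, here is a toy model of two independent size checks. It does not call Hadoop or protobuf; the class, method, and parameter names are hypothetical, and the two limits simply stand in for the RPC server cap and the decoder's internal cap.

```java
class TwoStageLimits {
  static final long MB = 1024L * 1024L;

  /**
   * A payload is processed only if it passes both checks: the RPC-layer
   * limit first, then the decoder's own limit.
   */
  static boolean accepted(long payloadBytes, long rpcLimitBytes, long decoderLimitBytes) {
    boolean passesRpc = payloadBytes <= rpcLimitBytes;
    boolean passesDecoder = payloadBytes <= decoderLimitBytes;
    return passesRpc && passesDecoder;
  }
}
```

In this model a 100 MB report with the RPC limit raised to 128 MB is still rejected while the decoder limit stays at 64 MB, and it goes through once the same override is applied to both.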

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing


[ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249290#comment-15249290
 ] 

Brahma Reddy Battula commented on HDFS-10308:
-

[~rakeshr] thanks for reporting this. The latest patch LGTM, +1 (non-binding).

> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}




[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2016-04-19 Thread Jitendra Nath Pandey (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249288#comment-15249288
 ] 

Jitendra Nath Pandey commented on HDFS-9276:


[~daryn] are you planning to review this patch? The patch shouldn't impact 
IP-failover because the new private token preserves the service of the original 
private token.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, 
> HDFS-9276.12.patch, HDFS-9276.13.patch, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applications. 
> HDFS Client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause a token-expired error.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
>     val keytab = "/path/to/keytab/xxx.keytab"
>     val principal = "x...@abc.com"
>     val creds1 = new org.apache.hadoop.security.Credentials()
>     val ugi1 = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>     ugi1.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         val fs = FileSystem.get(new Configuration())
>         fs.addDelegationTokens("test", creds1)
>         null
>       }
>     })
>     val ugi = UserGroupInformation.createRemoteUser("test")
>     ugi.addCredentials(creds1)
>     ugi.doAs(new PrivilegedExceptionAction[Void] {
>       // Get a copy of the credentials
>       override def run(): Void = {
>         var i = 0
>         while (true) {
>           val creds1 = new org.apache.hadoop.security.Credentials()
>           val ugi1 = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>           ugi1.doAs(new PrivilegedExceptionAction[Void] {
>             // Get a copy of the credentials
>             override def run(): Void = {
>               val fs = FileSystem.get(new Configuration())
>               fs.addDelegationTokens("test", creds1)
>               null
>             }
>           })
>           UserGroupInformation.getCurrentUser.addCredentials(creds1)
>           val fs = FileSystem.get(new Configuration())
>           i += 1
>           println()
>           println(i)
>           println(fs.listFiles(new Path("/user"), false))
>           Thread.sleep(60 * 1000)
>         }
>         null
>       }
>     })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration on the NameNode:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler

[jira] [Resolved] (HDFS-10315) Fix TestRetryCacheWithHA and TestNamenodeRetryCache failures


 [ 
https://issues.apache.org/jira/browse/HDFS-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula resolved HDFS-10315.
-
Resolution: Duplicate

> Fix TestRetryCacheWithHA and TestNamenodeRetryCache failures
> 
>
> Key: HDFS-10315
> URL: https://issues.apache.org/jira/browse/HDFS-10315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
>
> {noformat}
> FAILED:  
> org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild
> Error Message:
> expected:<25> but was:<26>
> Stack Trace:
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild(TestNamenodeRetryCache.java:419)
> FAILED:  
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN
> Error Message:
> expected:<25> but was:<26>
> Stack Trace:
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {noformat}




[jira] [Created] (HDFS-10316) revisit corrupt replicas count

2016-04-19 Thread Walter Su (JIRA)

Walter Su created HDFS-10316:


 Summary: revisit corrupt replicas count
 Key: HDFS-10316
 URL: https://issues.apache.org/jira/browse/HDFS-10316
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Walter Su


A DN has 4 types of storages:
1. NORMAL
2. READ_ONLY
3. FAILED
4. (missing/pruned)

blocksMap.numNodes(blk) counts 1,2,3
blocksMap.getStorages(blk) counts 1,2,3

countNodes(blk).corruptReplicas() counts 1,2
corruptReplicas counts 1,2,3,4, because findAndMarkBlockAsCorrupt(..) supports 
adding blk to the map even if the storage is not found.

The inconsistency causes bugs like HDFS-9958.
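
The differing views listed above can be written down as sets of storage states, which makes the disagreement easy to see for a concrete block. This is only a sketch under the assumptions stated in this issue: the enum, the three view constants, and the {{count}} helper are hypothetical names, not HDFS classes.

```java
import java.util.EnumSet;
import java.util.List;

class CorruptCountViews {
  enum StorageState { NORMAL, READ_ONLY, FAILED, MISSING }

  // States visible to blocksMap.numNodes(blk) and blocksMap.getStorages(blk): types 1,2,3.
  static final EnumSet<StorageState> BLOCKS_MAP_VIEW =
      EnumSet.of(StorageState.NORMAL, StorageState.READ_ONLY, StorageState.FAILED);

  // States visible to countNodes(blk).corruptReplicas(): types 1,2.
  static final EnumSet<StorageState> COUNT_NODES_VIEW =
      EnumSet.of(StorageState.NORMAL, StorageState.READ_ONLY);

  // States that can appear in the corruptReplicas map: types 1,2,3,4.
  static final EnumSet<StorageState> CORRUPT_MAP_VIEW = EnumSet.allOf(StorageState.class);

  /** Counts how many of the given corrupt replicas a particular view can see. */
  static int count(List<StorageState> corruptReplicaStates, EnumSet<StorageState> view) {
    int n = 0;
    for (StorageState s : corruptReplicaStates) {
      if (view.contains(s)) {
        n++;
      }
    }
    return n;
  }
}
```

For a block with corrupt replicas on one NORMAL, one FAILED, and one missing storage, the three views report 2, 1, and 3 respectively, so any code that mixes them can size or fill its arrays inconsistently.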





[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249281#comment-15249281
 ] 

Hadoop QA commented on HDFS-10313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 15s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 12s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 14s 
{color} | {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 14s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 17s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 13s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 15s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 16s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799668/HDFS-10313.001.patch |
| JIRA Issue | HDFS-10313 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fbecc92d6bf0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | t

[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing

2016-04-19 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249276#comment-15249276
 ] 

Rakesh R commented on HDFS-10308:
-

The test failure is unrelated to my patch; please ignore it. I can see that HDFS-2043 
is addressing the {{TestHFlush.testHFlushInterrupted}} failure.

> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}




[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-19 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249258#comment-15249258
 ] 

Rakesh R commented on HDFS-10265:
-

Thanks [~brahmareddy]. Yesterday I raised HDFS-10308 to fix this. 
It would be great to get your feedback on the proposed patch in that jira.

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Fix For: 2.8.0
>
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has no 
> BLOCK tag, the OEV tool doesn't handle the case and exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
> {code}
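> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>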
> {code}
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asked the NN to add a block to the file.
> 2. The client failed to write to the DN and asked the NameNode to abandon the block.
> 3. The NN removed the block and wrote an OP_UPDATE_BLOCKS to the editlog.
> Finally, the NN generated an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.
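
One way to handle the missing-tag case is to treat an absent BLOCK entry as an empty list rather than an error. The sketch below shows that idea on a small stand-in class; {{StanzaSketch}} and its methods are hypothetical and only mimic the lookup behaviour seen in the stack trace, they are not the real XMLUtils.Stanza API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StanzaSketch {
  private final Map<String, List<String>> children = new HashMap<>();

  void addChild(String name, String value) {
    children.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
  }

  /** Strict lookup: fails when the tag is absent, like the error shown above. */
  List<String> getChildren(String name) {
    List<String> found = children.get(name);
    if (found == null) {
      throw new IllegalStateException("no entry found for " + name);
    }
    return found;
  }

  /** Lenient lookup: an absent tag is read as "zero entries". */
  List<String> getChildrenOrEmpty(String name) {
    List<String> found = children.get(name);
    return found == null ? Collections.<String>emptyList() : found;
  }
}
```

A decoder that uses the lenient lookup for BLOCK would produce an operation with an empty block array, which matches what the NN wrote after the block was abandoned.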




[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: HDFS-10313.001.patch

Sorry, the previous patch was incomplete; uploading the latest patch.

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.




[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: (was: HDFS-10313.001.patch)

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249247#comment-15249247
 ] 

Brahma Reddy Battula commented on HDFS-10312:
-

We've seen the same issue and reported it as HDFS-8574. As per the discussion 
there, it can be solved by HDFS-9011, but we haven't seen any progress on that.
As Colin suggested there, "It would be simpler for the admin to create two (or 
more) storages on the same drive, and it wouldn't involve any code modification 
by us." 

Also, the number of blocks per volume is now exposed (HDFS-9425), so admins can 
monitor this.


> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249235#comment-15249235
 ] 

Hadoop QA commented on HDFS-10313:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 11s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 8s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 11s 
{color} | {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 11s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s 
{color} | {color:red} hadoop-distcp in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 9s {color} | 
{color:red} hadoop-distcp in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 11s {color} 
| {color:red} hadoop-distcp in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 3s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799664/HDFS-10313.001.patch |
| JIRA Issue | HDFS-10313 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f1f97aa008dc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk

[jira] [Updated] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-19 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-8057:
---
Status: Patch Available  (was: Open)

> Move BlockReader implementation to the client implementation package
> 
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch
>
>
> BlockReaderLocal, RemoteBlockReader, etc should be moved to 
> org.apache.hadoop.hdfs.client.impl.  We may as well rename RemoteBlockReader 
> to BlockReaderRemote.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-19 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-8057:
---
Attachment: HDFS-8057.1.patch

Hi, [~szetszwo]

Sorry for leaving this JIRA untouched for so long; I was working on other tasks.

I have uploaded the first patch for this refactoring. Since it is difficult to 
move the test classes out of hadoop-hdfs, I made some methods and variables 
public on the BlockReader side. If we should use reflection or something 
similar instead, please let me know. Thanks!
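
For reference, a minimal sketch of the reflection alternative mentioned above, assuming a test only needs to read a private field. {{SampleReader}}, {{bytesRead}} and {{ReflectionAccess}} are hypothetical names used for illustration; they are not part of the patch.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a reader class whose state stays private.
class SampleReader {
  private int bytesRead = 42;
}

class ReflectionAccess {
  // Reads a private int field by name without widening its visibility.
  static int readPrivateInt(Object target, String fieldName) {
    try {
      Field f = target.getClass().getDeclaredField(fieldName);
      f.setAccessible(true);
      return f.getInt(target);
    } catch (ReflectiveOperationException e) {
      // Wrap the checked exception so test code stays short.
      throw new IllegalStateException("Cannot read field " + fieldName, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(readPrivateInt(new SampleReader(), "bytesRead"));
  }
}
```

The trade-off is that a field rename breaks such a test only at run time, whereas a public accessor breaks it at compile time.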





[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: HDFS-10313.001.patch

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira is to propose adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.
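
A minimal sketch of the proposed check, assuming the creation time of each snapshot is already known. The {{creationTimes}} map and the class name are hypothetical; this is not the code from the attached patch, and it does not call any HDFS API.

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotOrderCheck {
  // Throws if 'to' is not strictly newer than 'from'.
  // creationTimes is a hypothetical map: snapshot name -> creation time (ms).
  static void checkOrder(Map<String, Long> creationTimes,
      String from, String to) {
    Long fromTime = creationTimes.get(from);
    Long toTime = creationTimes.get(to);
    if (fromTime == null || toTime == null) {
      throw new IllegalArgumentException(
          "Unknown snapshot: " + (fromTime == null ? from : to));
    }
    if (toTime <= fromTime) {
      throw new IllegalArgumentException("Snapshot " + to
          + " should be newer than " + from + " when passed to -diff");
    }
  }

  public static void main(String[] args) {
    Map<String, Long> times = new HashMap<>();
    times.put("s1", 100L);
    times.put("s2", 200L);
    checkOrder(times, "s1", "s2");   // accepted
    try {
      checkOrder(times, "s2", "s1"); // rejected: wrong order
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```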




[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Status: Patch Available  (was: Open)

Attaching an initial patch. Thanks in advance for the review.





[jira] [Created] (HDFS-10315) Fix TestRetryCacheWithHA and TestNamenodeRetryCache failures

Brahma Reddy Battula created HDFS-10315:
---

 Summary: Fix TestRetryCacheWithHA and TestNamenodeRetryCache 
failures
 Key: HDFS-10315
 URL: https://issues.apache.org/jira/browse/HDFS-10315
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0


{noformat}
FAILED:  
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild

Error Message:
expected:<25> but was:<26>

Stack Trace:
java.lang.AssertionError: expected:<25> but was:<26>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild(TestNamenodeRetryCache.java:419)


FAILED:  
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN

Error Message:
expected:<25> but was:<26>

Stack Trace:
java.lang.AssertionError: expected:<25> but was:<26>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
{noformat}




[jira] [Updated] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2016-04-19 Thread Xinwei Qin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-7859:
--
Attachment: HDFS-7859.004.patch

Rebase the patch with the latest trunk.

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Xinwei Qin 
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
> HDFS-7859.001.patch, HDFS-7859.002.patch, HDFS-7859.004.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.




[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag


[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249220#comment-15249220
 ] 

Brahma Reddy Battula commented on HDFS-10265:
-

The following two test cases have been failing since this change went in. The 
Jenkins report above shows these failures as well. I will raise a separate 
JIRA to track this.

{noformat}
FAILED:  
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild

Error Message:
expected:<25> but was:<26>

Stack Trace:
java.lang.AssertionError: expected:<25> but was:<26>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache.testRetryCacheRebuild(TestNamenodeRetryCache.java:419)


FAILED:  
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN

Error Message:
expected:<25> but was:<26>

Stack Trace:
java.lang.AssertionError: expected:<25> but was:<26>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
{noformat}

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Fix For: 2.8.0
>
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has 
> no BLOCK tag, the OEV tool doesn't handle the case and exits with 
> InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
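> {code}
> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>
> {code}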
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the 
> block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> Finally the NN generated an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.
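
A minimal sketch of how a decoder could tolerate a missing BLOCK entry, assuming the stanza can be asked whether a child exists before it is fetched. {{MiniStanza}} and {{UpdateBlocksDecoder}} are simplified, hypothetical stand-ins written for this example; they are not the real {{XMLUtils.Stanza}} or {{FSEditLogOp}} classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for an XML stanza with named children.
class MiniStanza {
  private final Map<String, List<MiniStanza>> children = new HashMap<>();

  void addChild(String name, MiniStanza child) {
    children.computeIfAbsent(name, k -> new ArrayList<>()).add(child);
  }

  boolean hasChildren(String name) {
    return children.containsKey(name);
  }

  // Mirrors the failure in the stack trace: a missing entry is an error.
  List<MiniStanza> getChildren(String name) {
    List<MiniStanza> list = children.get(name);
    if (list == null) {
      throw new IllegalStateException("no entry found for " + name);
    }
    return list;
  }
}

class UpdateBlocksDecoder {
  // Returns the BLOCK children, or an empty list when the op has none.
  static List<MiniStanza> readBlocks(MiniStanza st) {
    return st.hasChildren("BLOCK")
        ? st.getChildren("BLOCK")
        : Collections.<MiniStanza>emptyList();
  }

  public static void main(String[] args) {
    System.out.println(readBlocks(new MiniStanza()).size());
  }
}
```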




[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-04-19 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249133#comment-15249133
 ] 

Kai Zheng commented on HDFS-8449:
-

Some comments. Could you help check? Thanks Bo.
1. In {{ErasureCodingWorker}} (see the change below), the counters do not 
belong here: at this point we only know whether the task was submitted 
successfully, not whether it actually executed successfully. The right place 
would be the {{run()}} method of the {{Runnable}} task, 
{{StripedReconstructor}}. We need not worry too much about tasks with invalid 
targets, because such tasks should eventually be avoided on the NN side.
{code}
  public void processErasureCodingTasks(
      Collection<BlockECReconstructionInfo> ecTasks) {
    for (BlockECReconstructionInfo reconstructionInfo : ecTasks) {
      try {
        final StripedReconstructor task =
            new StripedReconstructor(this, reconstructionInfo);
        if (task.hasValidTargets()) {
          stripedReconstructionPool.submit(task);
+         datanode.getMetrics().incrECReconstructionTasks();
        } else {
          LOG.warn("No missing internal block. Skip reconstruction for task:{}",
              reconstructionInfo);
        }
      } catch (Throwable e) {
        LOG.warn("Failed to reconstruct striped block {}",
            reconstructionInfo.getExtendedBlock().getLocalBlock(), e);
+       datanode.getMetrics().incrECFailedReconstructionTasks();
      }
    }
  }
{code}

2. It's good to see new tests for this. As {{TestReconstructStripedFile}} 
already covers the various cases in which reconstruction tasks can occur, 
could we extend it with the metrics-related checks?
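
To illustrate point 1, here is a minimal sketch of counting inside {{run()}} rather than at submission time. {{ReconstructionMetrics}} and {{CountingReconstructor}} are hypothetical stand-ins, not the actual datanode metrics or {{StripedReconstructor}} classes, and the {{work}} parameter stands for the real reconstruction logic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the datanode metrics object.
class ReconstructionMetrics {
  final AtomicLong total = new AtomicLong();
  final AtomicLong failed = new AtomicLong();
}

// Hypothetical task: counters reflect execution, not submission.
class CountingReconstructor implements Runnable {
  private final ReconstructionMetrics metrics;
  private final Runnable work; // placeholder for the real reconstruction

  CountingReconstructor(ReconstructionMetrics metrics, Runnable work) {
    this.metrics = metrics;
    this.work = work;
  }

  @Override
  public void run() {
    // Counted only once the task actually starts executing.
    metrics.total.incrementAndGet();
    try {
      work.run();
    } catch (Throwable t) {
      // A failure during execution, not a failure to submit.
      metrics.failed.incrementAndGet();
    }
  }

  public static void main(String[] args) {
    ReconstructionMetrics m = new ReconstructionMetrics();
    new CountingReconstructor(m, () -> { }).run();
    System.out.println(m.total.get() + " total, " + m.failed.get() + " failed");
  }
}
```

With this shape, a task that is queued but never run does not inflate the totals, and successful tasks can be derived as total minus failed.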

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch
>
>
> This sub-task records the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.




[jira] [Commented] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1

2016-04-19 Thread Ravi Prakash (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249142#comment-15249142
 ] 

Ravi Prakash commented on HDFS-9530:


Perhaps we can postpone the question of whether RBW blocks which are recovered 
during a DN start / refresh of storages should have space reserved to another 
JIRA (since that is not causing the symptoms mentioned in this JIRA).

Thanks for the explanations Brahma! They are very helpful for me to understand 
the code.

Should we also reduce the reservation in {{FsDatasetImpl.removeVolumes}} after 
{{it.remove();}}? How about {{checkAndUpdate}}?

I'm trying to figure out why we missed releasing the space during 
{{invalidate}} as you found out. As you correctly point out, we reserve space 
only when a BlockReceiver is created. 
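
A small worked check of the figures quoted in the issue description, assuming the report derives Non DFS Used as capacity minus DFS Used minus DFS Remaining. The quoted numbers for worker-2 are consistent with that assumption in both reports. If reserved space is subtracted from the remaining space, a reservation that is never released would surface as Non DFS Used in this calculation. The class and method names are hypothetical.

```java
class NonDfsUsedCheck {
  // Assumed derivation: capacity - dfsUsed - remaining, floored at zero.
  static long nonDfsUsed(long capacity, long dfsUsed, long remaining) {
    long value = capacity - dfsUsed - remaining;
    return value < 0 ? 0 : value;
  }

  public static void main(String[] args) {
    // worker-2, idle cluster (values from the first report).
    System.out.println(
        nonDfsUsed(80256417792L, 7190958080L, 72343986176L));
    // worker-2, while the hive job runs (values from the second report).
    System.out.println(
        nonDfsUsed(80256417792L, 9015627776L, 26937047552L));
  }
}
```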

> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -
>
> Key: HDFS-9530
> URL: https://issues.apache.org/jira/browse/HDFS-9530
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Fei Hui
> Attachments: HDFS-9530-01.patch
>
>
> I think there are bugs in HDFS
> ===
> Here is the config:
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>
>       file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
>     </value>
>   </property>
> Here is the dfsadmin report:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> 
> When running a Hive job, dfsadmin reports the following:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 108266011136 (100.83 GB)
> DFS Remaining: 80078416384 (74.58 GB)
> DFS Used: 28187594752 (26.25 GB)
> DFS Used%: 26.04%
> Under replicated blocks: 7
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9015627776 (8.40 GB)
> Non DFS Used: 44303742464 (41.26 GB)
> DFS Remaining: 26937047552 (25.09 GB)
> DFS Used%: 11.23%
> DFS Remaining%: 33.56%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 693
> Last contact: Wed Dec 09 15:37:35 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9163116544 (8.53 GB)
> Non DFS Used: 47895897600 (44.61 GB)
> DFS Remaining: 23197403648 (21.60 GB)
> DFS Used%: 11.42%
> DFS Remaining%: 28.90%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249123#comment-15249123
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 24s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
242 unchanged - 3 fixed = 243 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 16s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 1s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 140m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
htt

[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249114#comment-15249114
 ] 

Lin Yiqun commented on HDFS-10313:
--

Hi, [~yzhangal], I will upload an initial patch later.

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
>
> This jira is to propose adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.




[jira] [Assigned] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff


 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun reassigned HDFS-10313:


Assignee: Lin Yiqun





[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249106#comment-15249106
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 22s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 
244 unchanged - 1 fixed = 247 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 35s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 11s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
31s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 204m 0s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.ba

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249060#comment-15249060
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 5m 31s 
{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 9s 
{color} | {color:red} hadoop-hdfs in trunk failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 6m 18s {color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77 with JDK v1.8.0_77 
generated 33 new + 0 unchanged - 0 fixed = 33 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 
244 unchanged - 1 fixed = 247 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 476 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 9s 
{color} | {color:red} The patch has 384 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 4s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 25s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 143m 10s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249030#comment-15249030
 ] 

Arpit Agarwal commented on HDFS-10312:
--

Yeah I agree that is rather unfortunate since the change to the message length 
is not plumbed without your patch.

I think the missing code paths can be tested with a targeted pre-condition that 
ensures any change to the config setting is propagated to the BufferDecoder 
(and the CodedInputStream), and that pre-condition will fail without your 
src/main changes. However, it's okay to evaluate it in a follow-up Jira and we 
don't need to hold up this one.

+1 from me.
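
One way to picture the targeted pre-condition is sketched below. Everything except the `ipc.maximum.data.length` key is hypothetical: the `Decoder` class, `newDecoder`, and `checkPropagated` are illustrative stand-ins and are not the Hadoop BufferDecoder or the protobuf CodedInputStream.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a propagation pre-condition; not Hadoop code. */
public class PropagationSketch {
  /** Real configuration key named in the issue description. */
  public static final String KEY = "ipc.maximum.data.length";
  /** 64 MB default, as described in the issue. */
  public static final int DEFAULT_LIMIT = 64 * 1024 * 1024;

  /** Stand-in decoder that remembers the size limit it was built with. */
  public static final class Decoder {
    private final int sizeLimit;
    public Decoder(int sizeLimit) { this.sizeLimit = sizeLimit; }
    public int getSizeLimit() { return sizeLimit; }
  }

  /** The plumbing under test: the configured limit is handed to the decoder. */
  public static Decoder newDecoder(Map<String, Integer> conf) {
    return new Decoder(conf.getOrDefault(KEY, DEFAULT_LIMIT));
  }

  /** The pre-condition: fail fast if the decoder did not receive the configured limit. */
  public static void checkPropagated(Map<String, Integer> conf, Decoder decoder) {
    int expected = conf.getOrDefault(KEY, DEFAULT_LIMIT);
    if (decoder.getSizeLimit() != expected) {
      throw new IllegalStateException("decoder limit " + decoder.getSizeLimit()
          + " does not match configured limit " + expected);
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> conf = new HashMap<>();
    conf.put(KEY, 128 * 1024 * 1024); // 128 MB
    checkPropagated(conf, newDecoder(conf)); // passes: the limit was plumbed through
    try {
      checkPropagated(conf, new Decoder(DEFAULT_LIMIT)); // models the unpatched path
    } catch (IllegalStateException e) {
      System.out.println("pre-condition failed as expected");
    }
  }
}
```

A check of this shape needs no large payload, which is why it could be cheaper than sending an oversized block report.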

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249007#comment-15249007
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 
244 unchanged - 1 fixed = 247 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 53s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 169m 55s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch U

[jira] [Commented] (HDFS-9943) Support reconfiguring namenode replication confs


[ 
https://issues.apache.org/jira/browse/HDFS-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248996#comment-15248996
 ] 

Hadoop QA commented on HDFS-9943:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 5 new + 
290 unchanged - 4 fixed = 295 total (was 294) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 56s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 58s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 139m 39s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https:/

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248984#comment-15248984
 ] 

Chris Nauroth commented on HDFS-10312:
--

[~arpitagarwal], I like your suggestion for speeding up the test.  
Unfortunately, I think this doesn't quite give us the same test coverage.  To 
demonstrate this, apply patch v004, then revert the src/main changes, and then 
run the test.  It will fail on a protobuf decoding exception.  That's exactly 
the condition we want to test, and the src/main changes make the test pass.  
After applying the delta, that's no longer true.  The test passes with or 
without the src/main changes.  That's because with the smaller block report 
sizes, we don't hit the internal protobuf default of 64 MB maximum.  Using a 
block report size of 600, we definitely push over 64 MB for the RPC message 
size, so we definitely trigger the right condition.
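
The argument above can be modelled in a few lines. This is a sketch under stated assumptions, not the NameNode code path: `LimitSketch`, `decode`, and `reportAccepted` are hypothetical names, and the message sizes are illustrative rather than the real sizes produced by `createReports`.

```java
import java.io.IOException;

/** Hypothetical model of the two size checks; not Hadoop or protobuf code. */
public class LimitSketch {
  /** 64 MB, the default described in the issue for both checks. */
  public static final int DEFAULT_LIMIT = 64 * 1024 * 1024;

  /** Stand-in for the decoder: it only enforces its size limit. */
  static void decode(long messageLength, int sizeLimit) throws IOException {
    if (messageLength > sizeLimit) {
      throw new IOException("message of " + messageLength + " bytes exceeds " + sizeLimit);
    }
  }

  /**
   * With the fix, the configured limit reaches the decoder.
   * Without it, the decoder keeps its built-in default.
   */
  public static boolean reportAccepted(long messageLength, int configuredLimit,
      boolean fixApplied) {
    if (messageLength > configuredLimit) {
      return false; // rejected by the RPC-level length check
    }
    int decoderLimit = fixApplied ? configuredLimit : DEFAULT_LIMIT;
    try {
      decode(messageLength, decoderLimit);
      return true;
    } catch (IOException e) {
      return false; // rejected by the decoder's own limit
    }
  }

  public static void main(String[] args) {
    long small = 1536L * 1024;        // 1.5 MB, illustrative
    long large = 100L * 1024 * 1024;  // 100 MB, illustrative
    // Small report, 2 MB limit: same outcome with or without the fix.
    System.out.println(reportAccepted(small, 2 * 1024 * 1024, true));
    System.out.println(reportAccepted(small, 2 * 1024 * 1024, false));
    // Large report, 128 MB limit: only the fixed path accepts it.
    System.out.println(reportAccepted(large, 128 * 1024 * 1024, true));
    System.out.println(reportAccepted(large, 128 * 1024 * 1024, false));
  }
}
```

In this model only the large report distinguishes the patched path from the unpatched one, which is the coverage the slower test keeps.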


[jira] [Comment Edited] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1


[ 
https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248968#comment-15248968
 ] 

Arpit Agarwal edited comment on HDFS-9530 at 4/19/16 11:57 PM:
---

Thanks for continuing to work on this [~brahmareddy] and your detailed 
analyses. Your reasoning sounds correct but I'd need more time to check this 
thoroughly.

I agree that this is complex, so I am also open to the option of removing the 
reservation if there is a simpler alternative.


was (Author: arpitagarwal):
Thanks for continuing to work on this [~brahmareddy] and your detailed 
analyses. Your reasoning sounds correct but I'd need more time to check this 
thoroughly.

I agree that this is complex, so I am also open to the option of removing the 
reservation if there may be a simpler alternative.

> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -
>
> Key: HDFS-9530
> URL: https://issues.apache.org/jira/browse/HDFS-9530
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Fei Hui
> Attachments: HDFS-9530-01.patch
>
>
> I think there are bugs in HDFS.
> ===
> Here is the config:
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>
>       file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
>     </value>
>   </property>
> Here is the dfsadmin report:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> 
> When running a Hive job, the dfsadmin report is as follows:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 108266011136 (100.83 GB)
> DFS Remaining: 80078416384 (74.58 GB)
> DFS Used: 28187594752 (26.25 GB)
> DFS Used%: 26.04%
> Under replicated blocks: 7
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9015627776 (8.40 GB)
> Non DFS Used: 44303742464 (41.26 GB)
> DFS Remaining: 26937047552 (25.09 GB)
> DFS Used%: 11.23%
> DFS Remaining%: 33.56%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 693
> Last contact: Wed Dec 09 15:37:35 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9163116544 (8.53 GB)
> Non DFS Used: 47895897600 (44.61 GB)
> DFS Remaining: 23197403648 (21.60 GB)
> DFS Used%: 11.42%
> DFS Remaining%: 28.90%
> Configured Cache Capacity: 0 (0 B)
> Ca

[jira] [Commented] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1


[ 
https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248968#comment-15248968
 ] 

Arpit Agarwal commented on HDFS-9530:
-

Thanks for continuing to work on this [~brahmareddy] and your detailed 
analyses. Your reasoning sounds correct but I'd need more time to check this 
thoroughly.

I agree that this is complex, so I am also open to the option of removing the 
reservation if there may be a simpler alternative.

> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -
>
> Key: HDFS-9530
> URL: https://issues.apache.org/jira/browse/HDFS-9530
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Fei Hui
> Attachments: HDFS-9530-01.patch
>
>
> I think there are bugs in HDFS.
> ===
> Here is the config:
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>
>       file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
>     </value>
>   </property>
> Here is the dfsadmin report:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> 
> When running a Hive job, the dfsadmin report is as follows:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 108266011136 (100.83 GB)
> DFS Remaining: 80078416384 (74.58 GB)
> DFS Used: 28187594752 (26.25 GB)
> DFS Used%: 26.04%
> Under replicated blocks: 7
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9015627776 (8.40 GB)
> Non DFS Used: 44303742464 (41.26 GB)
> DFS Remaining: 26937047552 (25.09 GB)
> DFS Used%: 11.23%
> DFS Remaining%: 33.56%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 693
> Last contact: Wed Dec 09 15:37:35 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9163116544 (8.53 GB)
> Non DFS Used: 47895897600 (44.61 GB)
> DFS Remaining: 23197403648 (21.60 GB)
> DFS Used%: 11.42%
> DFS Remaining%: 28.90%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 750
> Last contact: Wed Dec 09 15:37:36 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 10008850432 (9.32 GB)
> Non DFS Used: 40303602176 (37.54 GB)
> DFS Re
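
The per-node figures quoted above are consistent with Non DFS Used being derived as Configured Capacity minus DFS Used minus DFS Remaining, and the cluster-level Present Capacity being DFS Used plus DFS Remaining. The sketch below checks that arithmetic against the worker-2 and cluster numbers in the report. The class and method names are hypothetical, and the formulas are an observation about these numbers, not a claim about every Hadoop version.

```java
/** Hypothetical arithmetic check against the quoted dfsadmin figures. */
public class NonDfsUsedCheck {
  /** Assumed relation: Non DFS Used = Configured Capacity - DFS Used - DFS Remaining. */
  public static long nonDfsUsed(long configuredCapacity, long dfsUsed, long dfsRemaining) {
    return Math.max(configuredCapacity - dfsUsed - dfsRemaining, 0L);
  }

  /** Assumed relation: Present Capacity = DFS Used + DFS Remaining. */
  public static long presentCapacity(long dfsUsed, long dfsRemaining) {
    return dfsUsed + dfsRemaining;
  }

  public static void main(String[] args) {
    long capacity = 80256417792L; // worker-2 Configured Capacity
    // Idle cluster: prints 721473536, the reported 688.05 MB.
    System.out.println(nonDfsUsed(capacity, 7190958080L, 72343986176L));
    // During the Hive job: prints 44303742464, the reported 41.26 GB.
    System.out.println(nonDfsUsed(capacity, 9015627776L, 26937047552L));
    // Cluster-level Present Capacity while idle: prints 238604832768.
    System.out.println(presentCapacity(22831878144L, 215772954624L));
  }
}
```

Because Non DFS Used is a remainder in this relation, anything that lowers the reported DFS Remaining without raising DFS Used shows up as Non DFS Used.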

[jira] [Commented] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.


[ 
https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248950#comment-15248950
 ] 

Colin Patrick McCabe commented on HDFS-10225:
-

It seems cleaner to remove the volume and re-add it with the new storage type, 
rather than mutating the existing volume.  Otherwise, we need to think about 
synchronization here around every use of storage type.
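
A minimal sketch of the remove-and-re-add approach, assuming an immutable volume object held in a concurrent map. `VolumeSwapSketch`, `Volume`, and this `StorageType` enum are hypothetical stand-ins, not the DataNode classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: change a volume's storage type by replacing the volume. */
public class VolumeSwapSketch {
  /** Stand-in enum; not Hadoop's StorageType. */
  public enum StorageType { DISK, SSD, ARCHIVE }

  /** Immutable, so a reader holding a Volume never sees its type change. */
  public static final class Volume {
    public final String path;
    public final StorageType type;
    public Volume(String path, StorageType type) {
      this.path = path;
      this.type = type;
    }
  }

  private final Map<String, Volume> volumes = new ConcurrentHashMap<>();

  public void addVolume(String path, StorageType type) {
    volumes.put(path, new Volume(path, type));
  }

  public Volume removeVolume(String path) {
    return volumes.remove(path);
  }

  public Volume get(String path) {
    return volumes.get(path);
  }

  /** Removes the old volume and adds a fresh one instead of mutating a field. */
  public void changeStorageType(String path, StorageType newType) {
    if (removeVolume(path) == null) {
      throw new IllegalArgumentException("unknown volume: " + path);
    }
    addVolume(path, newType);
  }
}
```

With this shape, only the map operations need to be thread-safe. Mutating a `type` field in place would instead require every reader of that field to synchronize.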

There are also some changes that look unrelated, like moving the call to 
parseChangedVolumes up several lines.

> DataNode hot swap drives should recognize storage type tags. 
> -
>
> Key: HDFS-10225
> URL: https://issues.apache.org/jira/browse/HDFS-10225
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-10225.000.patch, HDFS-10225.001.patch
>
>
> The current hot swap code only differentiates data dirs by their paths. People 
> might want to change the types of certain data dirs from the default value in 
> an existing cluster.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248947#comment-15248947
 ] 

Arpit Agarwal commented on HDFS-10312:
--

Pasting the delta inline to avoid confusing Jenkins. I'll kick off a build 
manually.
{code}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestLargeBlockReport.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestLargeBlockReport.java
index bd9c0a2..0dff33f 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestLargeBlockReport.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestLargeBlockReport.java
@@ -74,10 +74,10 @@ public void tearDown() {
 
   @Test
   public void testBlockReportExceedsLengthLimit() throws Exception {
-    initCluster();
+    initCluster(1024 * 1024);
     // Create a large enough report that we expect it will go beyond the RPC
     // server's length validation, and also protobuf length validation.
-    StorageBlockReport[] reports = createReports(600);
+    StorageBlockReport[] reports = createReports(20);
     try {
       nnProxy.blockReport(bpRegistration, bpId, reports,
           new BlockReportContext(1, 0, reportId, fullBrLeaseId, sorted));
@@ -91,9 +91,8 @@ public void testBlockReportExceedsLengthLimit() throws Exception {
 
   @Test
   public void testBlockReportSucceedsWithLargerLengthLimit() throws Exception {
-    conf.setInt(IPC_MAXIMUM_DATA_LENGTH, 128 * 1024 * 1024); // 128 MB
-    initCluster();
-    StorageBlockReport[] reports = createReports(600);
+    initCluster(2 * 1024 * 1024);
+    StorageBlockReport[] reports = createReports(20);
     nnProxy.blockReport(bpRegistration, bpId, reports,
         new BlockReportContext(1, 0, reportId, fullBrLeaseId, sorted));
   }
@@ -129,7 +128,8 @@ public void testBlockReportSucceedsWithLargerLengthLimit() throws Exception {
    *
    * @throws Exception if initialization fails
    */
-  private void initCluster() throws Exception {
+  private void initCluster(int ipcMaxDataLength) throws Exception {
+    conf.setInt(IPC_MAXIMUM_DATA_LENGTH, ipcMaxDataLength);
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
     cluster.waitActive();
     dn = cluster.getDataNodes().get(0);
{code}


[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10312:
-
Attachment: (was: test-delta.patch)

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248935#comment-15248935
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 15s {color} 
| {color:red} HDFS-10312 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799625/test-delta.patch |
| JIRA Issue | HDFS-10312 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15207/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch, test-delta.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10312:
-
Attachment: test-delta.patch

Hi [~cnauroth], +1 for the v4 patch. Thanks for this improvement.

The attached test-delta.patch reduces the test runtime from ~30 seconds to ~3 
seconds by using a lower message size limit. What do you think?

Also (and Chris etc. know this of course!), it is far from ideal to have ~6 
million blocks on one storage directory. We should add a warning when we 
document this setting.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch, test-delta.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics


[ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248877#comment-15248877
 ] 

Colin Patrick McCabe commented on HDFS-10175:
-

I'll take a look tomorrow, [~liuml07].  Thanks for working on this.

> add per-operation stats to FileSystem.Statistics
> 
>
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Ram Venkatesh
>Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in-turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing, for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.
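
As a rough illustration of the proposed enhancement, per-operation counters could be kept in a structure like the one below. This is only a sketch under assumptions: the class `PerOperationStatsSketch` and the `Op` enum are hypothetical names, not the API added by the patch; the operation list is taken from the description above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; not the FileSystem.Statistics implementation.
class PerOperationStatsSketch {
    /** Operations named in the issue description. */
    enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

    // The map is fully populated in the constructor and never modified
    // structurally afterwards; each LongAdder handles concurrent increments.
    private final Map<Op, LongAdder> counters = new EnumMap<>(Op.class);

    PerOperationStatsSketch() {
        for (Op op : Op.values()) {
            counters.put(op, new LongAdder());
        }
    }

    /** Records one invocation of the given operation. */
    void increment(Op op) {
        counters.get(op).increment();
    }

    /** Returns how many times the given operation has been recorded. */
    long get(Op op) {
        return counters.get(op).sum();
    }
}
```

A framework such as MapReduce could then read `get(Op.CREATE)` at the end of a job to spot jobs that create a large number of files.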




[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
Attachment: HDFS-10312.004.patch

Patch v004 addresses the Checkstyle warnings.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248830#comment-15248830
 ] 

Hadoop QA commented on HDFS-10312:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 
244 unchanged - 1 fixed = 247 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 25s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 34s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 29s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799603/HDFS-10312.002.patch |
| JIRA Issue | HDFS-10312 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 677ff777c2a3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / af9b

[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248813#comment-15248813
 ] 

Chris Nauroth commented on HDFS-10312:
--

Yes, that's going to be a small change, but technically it should be grouped as 
a HADOOP JIRA, not HDFS.  I created HADOOP-13039.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Comment Edited] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster

[
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248778#comment-15248778
]

Yongjun Zhang edited comment on HDFS-10314 at 4/19/16 10:31 PM:

The idea is to wrap distcp in a tool that achieves the functionality of
distcp's switch -rdiff (if we do the same for -diff, it will be a
different jira). Here is a description and comparison of the -diff and
unimplemented -rdiff switches.

{code}
Definition: Assuming we have two snapshots, s1 and s2, where s1 is created
earlier, and s2 is newer.

- SnapshotDiff(s1, s2) represents the delta between s1 and s2. That is, if we
apply SnapshotDiff(s1, s2) on top of s1, we can go to the state of s2.
- SnapshotDiff(s2, s1) represents the reversed delta between s1 and s2. That
is, if we apply SnapshotDiff(s2, s1) on top of s2, we can go back to the state
of s1.

Note: When we talk about source and target, we mean distcp source and distcp
target.

A. -diff allows distcp to efficiently copy incremental changes made (on top of
previously copied snapshot s1) in the source cluster to the target cluster.
Assuming snapshot s2 is created at the source to capture s1 + incremental
changes, snapshotDiff(s1, s2) is the incremental changes. The output of this
operation is that the target will be at s2 state. This operation involves
three steps:

A.1 calculate snapshotDiff(s1, s2) at the source
A.2 apply the rename and delete portion of the snapshotDiff at the target;
this step is called "sync"
A.3 copy created/modified files from source's s2 to target

B. -rdiff allows distcp to efficiently copy data from snapshot s1 to overwrite
changes made in the target after snapshot sx was created in the target.
Assuming snapshot s2 is created at the target to capture the changes that need
to be overwritten, snapshotDiff(s2, s1) is what we want to apply to the target.
The output of this operation is that the target is at s1 state. Similar to
-diff, but with differences, this operation also involves three steps:

B.1 calculate snapshotDiff(s2, s1) at the target
B.2 apply the rename and delete portion of the snapshot diff at the target;
this step is called "sync"
B.3 copy created/modified files from source's s1 to target (the source here
can be a different cluster, or the target itself; when it is a different
cluster, that cluster has to have a snapshot s1 with exactly the same name and
content as the s1 at the target)

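A tabularized comparison:

          required snapshots      DiffCalc   Output After Operation
----------------------------------------------------------------
          source        target
----------------------------------------------------------------
-diff     s1, s2   ->   s1        source     target is at s2
-rdiff    s1       ->   s1,s2     target     target is at s1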

(note, for -rdiff, the source could be the same as target)

So the "r" (reversed) in the -rdiff means the following and is very symmetric
to -diff:

- swap the snapshot requirement of source and target in -diff
(from "s1, s2 -> s1 " to "s1 -> s1,s2")
- swap the result snapshot after operation (from s2 to s1)
- swap the snapshot diff calculation place (from source to target)

We require source and target to have same snapshot s1 (same snapshot name, same
content).
{code}
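
The definition of SnapshotDiff above can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not distcp or HDFS code: a snapshot is modeled as a map from path to file content, renames are not modeled separately (a rename shows up as a delete plus a write), and the names `SnapshotDiffSketch`, `Diff`, `snapshotDiff`, and `apply` are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: a snapshot is a map of path -> content (values must be non-null).
class SnapshotDiffSketch {
    /** A delta: paths to delete, plus paths to create or overwrite. */
    static final class Diff {
        final Set<String> deleted = new HashSet<>();
        final Map<String, String> written = new HashMap<>();
    }

    /** Computes the delta that turns snapshot 'from' into snapshot 'to'. */
    static Diff snapshotDiff(Map<String, String> from, Map<String, String> to) {
        Diff diff = new Diff();
        for (String path : from.keySet()) {
            if (!to.containsKey(path)) {
                diff.deleted.add(path);
            }
        }
        for (Map.Entry<String, String> e : to.entrySet()) {
            // Covers both newly created paths and modified content.
            if (!e.getValue().equals(from.get(e.getKey()))) {
                diff.written.put(e.getKey(), e.getValue());
            }
        }
        return diff;
    }

    /** Applies a delta on top of a state and returns the resulting state. */
    static Map<String, String> apply(Map<String, String> state, Diff diff) {
        Map<String, String> result = new HashMap<>(state);
        result.keySet().removeAll(diff.deleted);
        result.putAll(diff.written);
        return result;
    }
}
```

In this model, applying `snapshotDiff(s1, s2)` to s1 yields s2 (the -diff direction), and applying `snapshotDiff(s2, s1)` to s2 yields s1 again (the -rdiff direction).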

was (Author: yzhangal):
The idea is to wrap distcp in a tool that achieves the functionality of
distcp's switch -rdiff (if we do the same for -diff, it will be a
different jira). Here is a description and comparison of the -diff and
unimplemented -rdiff switches.

{code}
Definition: Assuming we have two snapshots, s1 and s2, where s1 is created
earlier, and s2 is newer.

Note: When we talk about source and target, we mean distcp source and distcp
target.

B. -rdiff allows distcp to efficiently copy data from snapshot s1

[jira] [Commented] (HDFS-9894) Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives


[ 
https://issues.apache.org/jira/browse/HDFS-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248810#comment-15248810
 ] 

Xiaobing Zhou commented on HDFS-9894:
-

The patch v000 is posted, please kindly review, thanks.

> Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives
> 
>
> Key: HDFS-9894
> URL: https://issues.apache.org/jira/browse/HDFS-9894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: 2.8.0
> Attachments: HDFS-9894.000.patch
>
>
> This is to augment FileContext/AbstractFileSystem and derivatives with newly 
> added API unsetStoragePolicy.




[jira] [Updated] (HDFS-10304) Implement moveToLocal


 [ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10304:
-
Status: Patch Available  (was: Open)

> Implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
> Attachments: HDFS-10304.000.patch
>
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Updated] (HDFS-9894) Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives


 [ 
https://issues.apache.org/jira/browse/HDFS-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9894:

Attachment: HDFS-9894.000.patch

> Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives
> 
>
> Key: HDFS-9894
> URL: https://issues.apache.org/jira/browse/HDFS-9894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: 2.8.0
> Attachments: HDFS-9894.000.patch
>
>
> This is to augment FileContext/AbstractFileSystem and derivatives with newly 
> added API unsetStoragePolicy.




[jira] [Updated] (HDFS-9894) Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives


 [ 
https://issues.apache.org/jira/browse/HDFS-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9894:

Status: Patch Available  (was: Open)

> Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives
> 
>
> Key: HDFS-9894
> URL: https://issues.apache.org/jira/browse/HDFS-9894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: 2.8.0
> Attachments: HDFS-9894.000.patch
>
>
> This is to augment FileContext/AbstractFileSystem and derivatives with newly 
> added API unsetStoragePolicy.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-19 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248807#comment-15248807
 ] 

Mingliang Liu commented on HDFS-10312:
--

Shall we create a new jira for this?

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-19 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248798#comment-15248798
 ] 

Xiaoyu Yao commented on HDFS-10207:
---

[~xiaobingo], thanks for updating the patch. Here are my new comments on patch 
v004

1. Remove unused imports in Namenode.java
+import java.util.Set;

import java.util.Collections; 
since the RECONFIGURABLE_PROPERTIES is changed from 
Collections.unmodifiableList to Sets.newTreeSet(Lists.newArrayList(...)

2. Do we want to keep the unmodifiableList to avoid creating a new list upon 
each NamenodeRpcServer#listReconfigurableProperties() call? 

3. NIT: NameNode#initBackoffEnableKeys can be changed to 
NameNode#initReconfigurableBackoffKey to avoid confusion. 

4. NIT: There is an extra space in TestNameNodeReconfigure (line 109)
* Test to reconfigure enable/disable IPC backoff
 */

5. Test code below can be simplified:
{code}
String IPC_CLIENT_RPC_BACKOFF_ENABLE;

/**
 * Test IPC_CLIENT_RPC_BACKOFF_ENABLE
 */
IPC_CLIENT_RPC_BACKOFF_ENABLE = NameNode.buildBackoffEnableKey(nnrs
    .getClientRpcServer().getPort());
{code}

into 
{code}
String IPC_CLIENT_RPC_BACKOFF_ENABLE = NameNode.buildBackoffEnableKey(nnrs
    .getClientRpcServer().getPort());
{code}

6. This test verification logic can be wrapped into helper functions with a 
single boolean parameter for better reuse and clarity. 

{code}
assertEquals(IPC_CLIENT_RPC_BACKOFF_ENABLE + " has wrong value", false,
    nnrs.getClientRpcServer().isClientBackoffEnabled());
assertEquals(
    IPC_CLIENT_RPC_BACKOFF_ENABLE + " has wrong value",
    false,
    nameNode.getConf().getBoolean(IPC_CLIENT_RPC_BACKOFF_ENABLE,
        IPC_BACKOFF_ENABLE_DEFAULT));
{code}
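
One possible shape for the helper suggested in item 6 is sketched below. It is self-contained and therefore uses hypothetical stand-ins (a static flag for the RPC server state, a plain map for the NameNode configuration) instead of the real test fixtures and JUnit; the class name, method names, and the port 8020 in the usage note are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the test fixtures; not the actual test code.
class BackoffVerifySketch {
    /** Stand-in for nnrs.getClientRpcServer().isClientBackoffEnabled(). */
    static boolean serverBackoffEnabled = false;
    /** Stand-in for nameNode.getConf(). */
    static final Map<String, String> conf = new HashMap<>();
    static final boolean BACKOFF_ENABLE_DEFAULT = false;

    /** Simulates a reconfiguration that updates both the conf and the server. */
    static void reconfigure(String key, boolean enabled) {
        conf.put(key, Boolean.toString(enabled));
        serverBackoffEnabled = enabled;
    }

    /** Single helper taking the expected state, replacing the repeated assertions. */
    static void verifyBackoffState(String key, boolean expected) {
        check(key + " has wrong value", expected, serverBackoffEnabled);
        String raw = conf.get(key);
        boolean configured =
            raw == null ? BACKOFF_ENABLE_DEFAULT : Boolean.parseBoolean(raw);
        check(key + " has wrong value", expected, configured);
    }

    private static void check(String message, boolean expected, boolean actual) {
        if (expected != actual) {
            throw new AssertionError(
                message + ": expected " + expected + " but was " + actual);
        }
    }
}
```

A test could then call `verifyBackoffState("ipc.8020.backoff.enable", false)` before reconfiguring and `verifyBackoffState("ipc.8020.backoff.enable", true)` afterwards, instead of repeating both assertions each time.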



> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.




[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
Attachment: HDFS-10312.003.patch

Here is patch v003 with one more change in the test.  I found that all of the 
bogus block IDs were causing a lot of log spam and slowing down the test, 
particularly for the block state change messages and the {{FsDatasetImpl}} 
"Failed to delete replica" messages.  I've changed the test to set log level to 
WARN for these.  That skips the log spam and speeds up the test quite a bit.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot


[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248797#comment-15248797
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Many thanks to [~jingzhao] for the offline discussion. I created HDFS-10313 and 
HDFS-10314 as a result. 


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> A common use scenario (scenario 1): 
> # create snapshot sx in clusterX, 
> # do some experiments in clusterX, which creates some files. 
> # throw away the files changed and go back to sx.
> Another scenario (scenario 2) is, there is a production cluster and a backup 
> cluster, we periodically sync up the data from production cluster to the 
> backup cluster with distcp. 
> The cluster in scenario 1 could be the backup cluster in scenario 2.
> For scenario 1:
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges.  Before that jira is implemented, we count on 
> distcp to copy from snapshot to the current state. However, the performance 
> of this operation could be very bad because we have to go through all files 
> even if we only changed a few files.
> For scenario 2:
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying files changed since the last 
> backup. It works by using the snapshot diff to find all changed files and 
> copying only those.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in target cluster since last snapshot sx, and copy these from 
> snapshot sx of either the source or the target cluster, to restore target 
> cluster's current state to sx. 
> Specifically,
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> As a native restore feature, HDFS-4167 would still be ideal to have. However, 
>  HDFS-9820 would hopefully be easier to implement, before HDFS-4167 is in 
> place.
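
The restore steps listed in the description above (rename back, delete, put on the copy list) can be sketched as a small planning routine. This is an illustrative model only, not the HDFS-9820 patch: the class `RestorePlanSketch`, its `Change` and `Plan` types, and the string form of the actions are all hypothetical, and only the three change kinds named in the description are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the restore planning described above.
class RestorePlanSketch {
    /** The change kinds named in the issue description. */
    enum ChangeType { RENAMED, CREATED, MODIFIED }

    static final class Change {
        final ChangeType type;
        final String path;          // current path in the target cluster
        final String originalPath;  // path in snapshot sx; only used for RENAMED

        Change(ChangeType type, String path, String originalPath) {
            this.type = type;
            this.path = path;
            this.originalPath = originalPath;
        }
    }

    static final class Plan {
        /** Renames and deletes to apply at the target. */
        final List<String> syncActions = new ArrayList<>();
        /** Paths to copy from the snapshot with distcp. */
        final List<String> copyList = new ArrayList<>();
    }

    /** Turns the changes made since snapshot sx into a restore plan. */
    static Plan plan(List<Change> changes) {
        Plan plan = new Plan();
        for (Change c : changes) {
            switch (c.type) {
                case RENAMED:
                    plan.syncActions.add("rename " + c.path + " " + c.originalPath);
                    break;
                case CREATED:
                    plan.syncActions.add("delete " + c.path);
                    break;
                case MODIFIED:
                    plan.copyList.add(c.path);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown change: " + c.type);
            }
        }
        return plan;
    }
}
```

The final step in the description, running distcp over the copy list, would then consume `plan.copyList`.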




[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics

2016-04-19 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248792#comment-15248792
 ] 

Mingliang Liu commented on HDFS-10175:
--

Hi [~cmccabe], would you kindly have a look at the current patch please? Is it 
OK according to our discussion?

> add per-operation stats to FileSystem.Statistics
> 
>
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Ram Venkatesh
>Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in-turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing, for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.




[jira] [Comment Edited] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster

[
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248778#comment-15248778
]

Yongjun Zhang edited comment on HDFS-10314 at 4/19/16 10:13 PM:

{code}
Definition: Assuming we have two snapshots, s1 and s2, where s1 is created
earlier, and s2 is newer.

Note: When we talk about source and target, we mean distcp source and distcp
target.

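A tabularized comparison:

         required snapshots      DiffCalc   Output After Operation
         ------------------------------------------------------------
         source       target
         ------------------------------------------------------------
-diff    s1, s2  ->   s1         source     target is at s2
-rdiff   s1      ->   s1, s2     target     target is at s1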

(note, for -rdiff, the source could be the same as target)

So the "r" (reversed) in the -rdiff means the following:

We require source and target to have same snapshot s1 (same snapshot name, same
content).
{code}

{code}
Definition: Assuming we have two snapshots, s1 and s2, where s1 is created
earlier, and s2 is newer.

Note: When we talk about source and target, we mean distcp source and distcp
target.

B. -rdiff allows distcp to efficiently copy data from snapshot s1 to overwrite
changes made in ta

[jira] [Commented] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster


[ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248784#comment-15248784
 ] 

Yongjun Zhang commented on HDFS-10314:
--

Hi [~jingzhao],

I'm thinking about implementing -rdiff as a hidden switch of distcp, and 
implementing this tool as a script that calls distcp. What do you think?

Thanks.
 

> Propose a new tool that wraps around distcp to "restore" changes on target 
> cluster
> --
>
> Key: HDFS-10314
> URL: https://issues.apache.org/jira/browse/HDFS-10314
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
> -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to unix/linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync  
> {code}
> This command ensures   is newer than .
> I think, in the future, we can add another command to have the functionality 
> of the -diff switch of distcp.
> {code}
> sync  
> {code}
> that ensures   is older than .
> Thanks [~jingzhao].




[jira] [Commented] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster


[ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248778#comment-15248778
 ] 

Yongjun Zhang commented on HDFS-10314:
--

The idea is to wrap around distcp as a tool to achieve the functionality of 
distcp's switch -rdiff (if we do the same for -diff, it will be a 
different jira). Here is a description and comparison of the -diff and 
unimplemented -rdiff switches. 

{code}
Definition: Assuming we have two snapshots, s1 and s2, where s1 is created
earlier, and s2 is newer.

- SnapshotDiff(s1, s2) represents the delta between s1 and s2. That is, if we
  apply SnapshotDiff(s1, s2) on top of s1, we can go to the state of s2.
- SnapshotDiff(s2, s1) represents the reversed delta between s1 and s2. That
  is, if we apply SnapshotDiff(s2, s1) on top of s2, we can go back to the
  state of s1.

Note: When we talk about source and target, we mean distcp source and distcp 
target.

A. -diff allows distcp to efficiently copy incremental changes made (on top of
previously copied snapshot s1) in the source cluster to the target cluster.
Assuming snapshot s2 is created at the source to capture s1 + incremental
changes, snapshotDiff(s1, s2) is the incremental changes. The output of this
operation is that the target will be at s2 state. This operation involves
three steps:

  A.1 calculate snapshotDiff(s1, s2) at the source
  A.2 apply the rename and delete portion of the snapshotDiff at the target;
      this step is called "sync"
  A.3 copy created/modified files from source's s2 to target

B. -rdiff allows distcp to efficiently copy data from snapshot s1 to overwrite
changes made in target after snapshot sx was created in target. Assuming
snapshot s2 is created at the target to capture the changes that need to be
overwritten, snapshotDiff(s2, s1) is what we want to apply to target. The
output of this operation is that the target is at s1 state. Similar to -diff,
but with differences, this operation involves three steps too:

  B.1 calculate snapshotDiff(s2, s1) at the target
  B.2 apply the rename and delete portion of the snapshot diff at the target;
      this step is called "sync"
  B.3 copy created/modified files from source's s1 to target (the source here
      can be a different cluster, or the target itself; when it's a different
      cluster, that cluster has to have a snapshot s1 that has the exact same
      name and content as the s1 at the target)

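A tabularized comparison:

         required snapshots      DiffCalc   Output After Operation
         ------------------------------------------------------------
         source       target
         ------------------------------------------------------------
-diff    s1, s2  ->   s1         source     target is at s2
-rdiff   s1      ->   s1, s2     target     target is at s1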

(note, for -rdiff, the source could be the same as target)

So the "r" (reversed) in the -rdiff means the following:

- swap the snapshot requirement of source and target in -diff 
  (from "s1, s2   ->   s1 "  to  "s1  ->   s1,s2")
- swap the result snapshot after operation (from s2 to s1)
- swap the snapshot diff calculation place  (from source to target)

We require source and target to have same snapshot s1 (same snapshot name, same 
content).
{code}
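
To make the SnapshotDiff definitions above concrete, here is a toy model in Java. It is a sketch under loud assumptions: a snapshot is reduced to a map from path to file content, renames are not modeled (only deletes and creates/modifies), and the names SnapshotDiffSketch, Diff, snapshotDiff and apply are all hypothetical and unrelated to the real distcp or HDFS snapshot code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: a snapshot is a map from path to content; renames are ignored. */
final class SnapshotDiffSketch {
    /** The delta that turns snapshot "from" into snapshot "to". */
    static final class Diff {
        final Set<String> deleted = new HashSet<>();
        final Map<String, String> createdOrModified = new HashMap<>();
    }

    /** Computes snapshotDiff(from, to). Values are assumed to be non-null. */
    static Diff snapshotDiff(Map<String, String> from, Map<String, String> to) {
        Diff d = new Diff();
        for (String path : from.keySet()) {
            if (!to.containsKey(path)) {
                d.deleted.add(path);
            }
        }
        for (Map.Entry<String, String> e : to.entrySet()) {
            // from.get() is null for new paths, so equals() is false and the path is recorded.
            if (!e.getValue().equals(from.get(e.getKey()))) {
                d.createdOrModified.put(e.getKey(), e.getValue());
            }
        }
        return d;
    }

    /** Applies a diff on top of a base state and returns the resulting state. */
    static Map<String, String> apply(Map<String, String> base, Diff d) {
        Map<String, String> result = new HashMap<>(base);
        result.keySet().removeAll(d.deleted);   // the "sync" step (deletes only in this model)
        result.putAll(d.createdOrModified);     // the copy step
        return result;
    }
}
```

In this model, apply(s1, snapshotDiff(s1, s2)) yields s2, which is the -diff direction, and apply(s2, snapshotDiff(s2, s1)) yields s1 again, which is the -rdiff direction.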


> Propose a new tool that wraps around distcp to "restore" changes on target 
> cluster
> --
>
> Key: HDFS-10314
> URL: https://issues.apache.org/jira/browse/HDFS-10314
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
> -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to unix/linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync  
> {code}
> This command ensures   is newer than .
> I think, in the future, we can add another command to have the functionality 
> of the -diff switch of distcp.
> {code}
> sync  
> {code}
> that ensures   is older than .
> Thanks [~jingzhao].




[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck

2016-04-19 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248774#comment-15248774
 ] 

Andrew Wang commented on HDFS-9016:
---

I'm okay with the config flag, since it only affects users who are explicitly 
opting in to this information.

> Display upgrade domain information in fsck
> --
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when 
> upgrade domain is enabled.




[jira] [Updated] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster


 [ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10314:
-
Description: 
HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
-diff switch. 

Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
around distcp to achieve the same purpose.

I'm thinking about calling the new tool "rsync", similar to unix/linux command 
"rsync". The "r" here means remote.

The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
{code}
rsync  
{code}
This command ensures   is newer than .

I think, in the future, we can add another command to have the functionality of 
the -diff switch of distcp.
{code}
sync  
{code}
that ensures   is older than .

Thanks [~jingzhao].

  was:
HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
-diff switch. 

Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
around distcp to achieve the same purpose.

I'm thinking about calling the new tool "rsync", similar to unix/linux command 
"rsync". The "r" here means remote.

The syntax that simulate -rdiff behavior proposed in HDFS-9820 is
{code}
rsync  
{code}
This command ensure   is newer than .

I think, In the future, we can add another command to have the functionality of 
-diff switch of distcp.
{code}
sync  
{code}
where   must be older than .

Thanks [~jingzhao].


> Propose a new tool that wraps around distcp to "restore" changes on target 
> cluster
> --
>
> Key: HDFS-10314
> URL: https://issues.apache.org/jira/browse/HDFS-10314
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
> -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to unix/linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync  
> {code}
> This command ensures   is newer than .
> I think, in the future, we can add another command to have the functionality 
> of the -diff switch of distcp.
> {code}
> sync  
> {code}
> that ensures   is older than .
> Thanks [~jingzhao].




[jira] [Created] (HDFS-10314) Propose a new tool that wraps around distcp to "restore" changes on target cluster

Yongjun Zhang created HDFS-10314:


 Summary: Propose a new tool that wraps around distcp to "restore" 
changes on target cluster
 Key: HDFS-10314
 URL: https://issues.apache.org/jira/browse/HDFS-10314
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang


HDFS-9820 proposed adding -rdiff switch to distcp, as a reversed operation of 
-diff switch. 

Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
around distcp to achieve the same purpose.

I'm thinking about calling the new tool "rsync", similar to unix/linux command 
"rsync". The "r" here means remote.

The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
{code}
rsync  
{code}
This command ensures   is newer than .

I think, in the future, we can add another command to have the functionality of 
the -diff switch of distcp.
{code}
sync  
{code}
where   must be older than .

Thanks [~jingzhao].




[jira] [Created] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

Yongjun Zhang created HDFS-10313:


 Summary: Distcp does not check the order of snapshot names passed 
to -diff
 Key: HDFS-10313
 URL: https://issues.apache.org/jira/browse/HDFS-10313
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Reporter: Yongjun Zhang


This jira proposes adding a check to distcp: when {{-diff s1 s2}} is passed, 
we need to ensure that s2 is newer than s1; otherwise, abort with an 
informative error message.

This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
Thanks Jing.
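
A check of this kind might look roughly like the sketch below. It assumes the creation times of the snapshots are already available in a map; how distcp would actually obtain them is not specified here. The class SnapshotOrderCheck, the method checkDiffOrder and the error text are hypothetical.

```java
import java.util.Map;

/** Hypothetical sketch of the proposed ordering check; not distcp code. */
final class SnapshotOrderCheck {
    /**
     * Rejects "-diff s1 s2" unless s2 was created strictly after s1.
     * creationTimeMillis maps snapshot name to creation time (an assumption).
     */
    static void checkDiffOrder(Map<String, Long> creationTimeMillis, String s1, String s2) {
        Long t1 = creationTimeMillis.get(s1);
        Long t2 = creationTimeMillis.get(s2);
        if (t1 == null || t2 == null) {
            throw new IllegalArgumentException(
                "Unknown snapshot: " + (t1 == null ? s1 : s2));
        }
        if (t2 <= t1) {
            throw new IllegalArgumentException(
                "Snapshot " + s2 + " must be newer than " + s1
                    + " when passing -diff " + s1 + " " + s2);
        }
    }
}
```

Throwing before any copy work starts keeps the failure cheap and gives the user a message that names both snapshots.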






[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
Attachment: HDFS-10312.002.patch

[~liuml07] and [~xyao], thank you for the code reviews.  That's a great catch 
on the lack of {{fail}} in the test.  I'm attaching patch v002 with the fix.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-9943) Support reconfiguring namenode replication confs


[ 
https://issues.apache.org/jira/browse/HDFS-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248716#comment-15248716
 ] 

Xiaobing Zhou commented on HDFS-9943:
-

v002 is rebased on trunk.

> Support reconfiguring namenode replication confs
> 
>
> Key: HDFS-9943
> URL: https://issues.apache.org/jira/browse/HDFS-9943
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9943-HDFS-9000.000.patch, 
> HDFS-9943-HDFS-9000.001.patch, HDFS-9943-HDFS-9000.002.patch
>
>
> The following confs should be re-configurable at runtime.
> - dfs.namenode.replication.work.multiplier.per.iteration
> - dfs.namenode.replication.interval
> - dfs.namenode.replication.max-streams
> - dfs.namenode.replication.max-streams-hard-limit




[jira] [Updated] (HDFS-9943) Support reconfiguring namenode replication confs


 [ 
https://issues.apache.org/jira/browse/HDFS-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9943:

Attachment: HDFS-9943-HDFS-9000.002.patch

> Support reconfiguring namenode replication confs
> 
>
> Key: HDFS-9943
> URL: https://issues.apache.org/jira/browse/HDFS-9943
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9943-HDFS-9000.000.patch, 
> HDFS-9943-HDFS-9000.001.patch, HDFS-9943-HDFS-9000.002.patch
>
>
> The following confs should be re-configurable at runtime.
> - dfs.namenode.replication.work.multiplier.per.iteration
> - dfs.namenode.replication.interval
> - dfs.namenode.replication.max-streams
> - dfs.namenode.replication.max-streams-hard-limit




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-19 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248713#comment-15248713
 ] 

Mingliang Liu commented on HDFS-10312:
--

+1 (non-binding)

One nit is that, in the unit test {{testBlockReportExceedsLengthLimit()}}, we 
can add a {{fail("Should have failed because of the too long RPC data 
length");}} as the last statement of the {{try}} block.
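
For readers unfamiliar with the pattern, the point of the nit is that without a fail as the last statement of the try block, the test passes silently when no exception is thrown. A plain-Java sketch follows; the send method and its limits are hypothetical, and the local fail helper only stands in for the JUnit one used in the real test.

```java
import java.io.IOException;

/** Sketch of the "fail at the end of try" test pattern, using plain Java only. */
final class FailInTryPattern {
    /** Stand-in for a test framework's fail(String). */
    static void fail(String message) {
        throw new AssertionError(message);
    }

    /** Hypothetical operation under test: rejects payloads above the limit. */
    static void send(int payloadLength, int maxLength) throws IOException {
        if (payloadLength > maxLength) {
            throw new IOException("data length " + payloadLength + " exceeds " + maxLength);
        }
    }

    /** Passes only if send() throws; otherwise the fail() call flags the test. */
    static void expectTooLong(int payloadLength, int maxLength) {
        try {
            send(payloadLength, maxLength);
            fail("Should have failed because of the too long RPC data length");
        } catch (IOException expected) {
            // Expected path: the oversized payload was rejected.
        }
    }
}
```

Because AssertionError is not an IOException, the catch block does not swallow it, so a missing exception surfaces as a test failure.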

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-19 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248712#comment-15248712
 ] 

Xiaoyu Yao commented on HDFS-10312:
---

As a follow up, I suggest we document this *ipc.maximum.data.length* key. 
Currently, I can't find information about it in the core-default.xml.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-19 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248696#comment-15248696
 ] 

Xiaoyu Yao commented on HDFS-10312:
---

Thanks [~cnauroth] for posting the fix along with your analysis. The patch 
looks good to me. +1 pending Jenkins.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10304) Implement moveToLocal


[ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248606#comment-15248606
 ] 

Xiaobing Zhou commented on HDFS-10304:
--

I posted the patch v000, please kindly review, thanks.

> Implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
> Attachments: HDFS-10304.000.patch
>
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Updated] (HDFS-10304) Implement moveToLocal


 [ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10304:
-
Attachment: HDFS-10304.000.patch

> Implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
> Attachments: HDFS-10304.000.patch
>
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Commented] (HDFS-10304) Implement moveToLocal


[ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248601#comment-15248601
 ] 

Xiaobing Zhou commented on HDFS-10304:
--

Sorry [~steve_l], I mean it's the way I am going to implement it. :))

> Implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Updated] (HDFS-10304) Implement moveToLocal


 [ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10304:
-
Summary: Implement moveToLocal  (was: implement moveToLocal)

> Implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Updated] (HDFS-10304) implement moveToLocal


 [ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10304:
-
Summary: implement moveToLocal  (was: implement moveToLocal or remove it 
from the usage list)

> implement moveToLocal
> -
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""




[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
Status: Patch Available  (was: Open)

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
Attachment: HDFS-10312.001.patch

The attached patch passes the value of {{ipc.maximum.data.length}} through to 
the block list decoding layer, and then applies it as an override to the 
protobuf classes.  I considered introducing a new configuration property, but 
ultimately I decided against it, because the admin would just have to tune 2 
things in sync if they encountered this problem.  I maintained a few of the old 
method signatures that don't include the max length and annotated them 
{{VisibleForTesting}} to avoid larger impact on existing tests.  The new test 
suite demonstrates the problem and the fix.
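
The idea of passing one configured limit down to the decoding layer can be illustrated with the stand-alone sketch below. It is not the patch and does not use protobuf: LimitedVarintDecoder is a hypothetical stand-in for a size-limited decoder, and its sizeLimit parameter plays the role that the value of {{ipc.maximum.data.length}} plays in the description above.

```java
import java.io.IOException;

/** Hypothetical stand-in for a size-limited decoder; NOT protobuf's CodedInputStream. */
final class LimitedVarintDecoder {
    /** Mirrors the 64 MB default mentioned in the issue. */
    static final int DEFAULT_SIZE_LIMIT = 64 * 1024 * 1024;

    private final byte[] buf;
    private final int sizeLimit;
    private int pos;

    /** sizeLimit is the caller-supplied override, in bytes. */
    LimitedVarintDecoder(byte[] buf, int sizeLimit) {
        this.buf = buf;
        this.sizeLimit = sizeLimit;
    }

    boolean hasNext() {
        return pos < buf.length;
    }

    /** Reads one base-128 varint, refusing to read past the size limit. */
    long readVarint64() throws IOException {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos >= sizeLimit) {
                throw new IOException("Message was too large; limit=" + sizeLimit);
            }
            if (pos >= buf.length) {
                throw new IOException("Truncated varint");
            }
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("Malformed varint");
    }
}
```

A caller that constructs the decoder with the default limit hits the error once the input exceeds it, while a caller that passes a larger configured value decodes the same input successfully, so the admin only has one setting to tune.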

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.




[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.


[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248577#comment-15248577
 ] 

Chris Nauroth commented on HDFS-10312:
--

I saw this happen with a block report from a DataNode containing ~6 million 
blocks.  All blocks were on a single data directory, so unfortunately, the 
block report splitting by storage didn't help.  Here is a sample stack trace:

{code}
org.apache.hadoop.ipc.RemoteException: java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.runBlockOp(BlockManager.java:4404)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1436)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:173)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:30059)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417)
Caused by: java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiffSorted(BlockManager.java:2478)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:2313)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:2121)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$1.call(NameNodeRpcServer.java:1439)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$1.call(NameNodeRpcServer.java:1436)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:4463)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4442)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
    at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
    at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
    at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:365)
    ... 9 more

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1443)
    at org.apache.hadoop.ipc.Client.call(Client.java:1402)
    at org.apache.hadoop.ipc.Client.call(Client.java:1352)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy21.blockReport(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:204)
    at org.apache.hadoop.hdfs.server.datanode.TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit(TestLargeBlockReport.java:86)
{code}

This is an unusual situation, but we should provide a way for it to succeed.


> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>

[jira] [Created] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

Chris Nauroth created HDFS-10312:


 Summary: Large block reports may fail to decode at NameNode due to 
64 MB protobuf maximum length restriction.
 Key: HDFS-10312
 URL: https://issues.apache.org/jira/browse/HDFS-10312
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Chris Nauroth
Assignee: Chris Nauroth


Our RPC server caps the maximum size of incoming messages at 64 MB by default.  
In exceptional circumstances, this limit can be raised using 
{{ipc.maximum.data.length}}.  However, for block reports, there is still an 
internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
stack trace to follow in comments.)  This issue proposes to apply the same 
override to our block list decoding, so that large block reports can proceed.
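
To illustrate the idea of a decoder whose size limit can be overridden, here is a minimal, self-contained sketch. It is not the protobuf or HDFS code: {{SizeLimitedVarintDecoder}}, {{decode}}, {{encode}} and {{DEFAULT_SIZE_LIMIT}} are hypothetical names, and the varint format is only a stand-in for the block list encoding. The real override would presumably go through {{CodedInputStream.setSizeLimit()}}, which is the call named by the exception message.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the protobuf or HDFS implementation. */
final class SizeLimitedVarintDecoder {
  /** Stand-in for the 64 MB default described in the issue. */
  static final int DEFAULT_SIZE_LIMIT = 64 * 1024 * 1024;

  /**
   * Decodes every base-128 varint in buf, refusing to consume more than
   * sizeLimit bytes. Callers that expect large inputs pass a larger limit.
   */
  static List<Long> decode(byte[] buf, int sizeLimit) {
    List<Long> values = new ArrayList<>();
    int pos = 0;
    while (pos < buf.length) {
      long value = 0;
      for (int shift = 0; ; shift += 7) {
        if (pos >= sizeLimit) {
          throw new IllegalStateException(
              "message exceeded size limit of " + sizeLimit + " bytes");
        }
        if (pos >= buf.length || shift >= 64) {
          throw new IllegalArgumentException("malformed varint");
        }
        byte b = buf[pos++];
        value |= (long) (b & 0x7F) << shift;   // low 7 bits carry data
        if ((b & 0x80) == 0) {
          break;                               // high bit clear: last byte
        }
      }
      values.add(value);
    }
    return values;
  }

  /** Encodes values as base-128 varints (helper for trying out decode). */
  static byte[] encode(long... values) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (long v : values) {
      while ((v & ~0x7FL) != 0) {
        out.write((int) ((v & 0x7F) | 0x80));
        v >>>= 7;
      }
      out.write((int) v);
    }
    return out.toByteArray();
  }
}
```

With a fixed limit, a sufficiently large input always fails; passing the configured maximum into {{decode}} lets the same input succeed, which is the behaviour this issue asks for.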



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-10311) libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket

2016-04-19 Thread James Clampffer (JIRA)

James Clampffer created HDFS-10311:
--

 Summary: libhdfs++: DatanodeConnection::Cancel should not delete 
the underlying socket
 Key: HDFS-10311
 URL: https://issues.apache.org/jira/browse/HDFS-10311
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


DataNodeConnectionImpl calls reset on the unique_ptr that references the 
underlying asio::tcp::socket.  If this happens after the continuation pipeline 
checks the cancel state but before asio uses the socket, it will segfault 
because unique_ptr::reset will explicitly change its value to nullptr.

Cancel should only call shutdown() and close() on the socket but keep the 
instance of it alive.  The socket can probably also be turned into a member of 
DataNodeConnectionImpl to get rid of the unique pointer and simplify things a 
bit.  




[jira] [Created] (HDFS-10310) libhdfs++: hdfsConnect needs timeout logic

2016-04-19 Thread James Clampffer (JIRA)

James Clampffer created HDFS-10310:
--

 Summary: libhdfs++: hdfsConnect needs timeout logic
 Key: HDFS-10310
 URL: https://issues.apache.org/jira/browse/HDFS-10310
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


hdfsConnect will hang when it attempts to connect to a non-existent NN; right 
now the client has to wait on a TCP timeout to get unstuck.  Adding some 
reasonable timeout on FileSystem::Connect will fix this.
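
As a rough illustration of bounding a blocking connect with a deadline, here is a sketch in Java rather than in libhdfs++'s C++/asio. {{BoundedConnect}} and {{connectWithTimeout}} are hypothetical names, and the {{Callable}} merely stands in for whatever blocking call performs the connection; the actual change to FileSystem::Connect may look quite different.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; not libhdfs++ code. */
final class BoundedConnect {
  /**
   * Runs the blocking connect on a helper thread and waits at most
   * timeoutMillis for it. Returns the (non-null) result, or empty if the
   * deadline passed or the caller was interrupted.
   */
  static <T> Optional<T> connectWithTimeout(Callable<T> blockingConnect,
                                            long timeoutMillis) {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "connect-attempt");
      t.setDaemon(true);            // never keep the JVM alive for a stuck attempt
      return t;
    });
    Future<T> attempt = executor.submit(blockingConnect);
    try {
      return Optional.of(attempt.get(timeoutMillis, TimeUnit.MILLISECONDS));
    } catch (TimeoutException e) {
      attempt.cancel(true);         // interrupt the attempt instead of waiting on TCP
      return Optional.empty();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return Optional.empty();
    } catch (ExecutionException e) {
      throw new IllegalStateException("connect failed", e.getCause());
    } finally {
      executor.shutdownNow();
    }
  }
}
```

The point of the design is that the caller's wait is bounded by its own deadline rather than by the operating system's TCP timeout.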




[jira] [Comment Edited] (HDFS-10276) Different results for exist call for file.ext/name


[ 
https://issues.apache.org/jira/browse/HDFS-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248402#comment-15248402
 ] 

Colin Patrick McCabe edited comment on HDFS-10276 at 4/19/16 6:49 PM:
--

bq. Thanks a lot for your comment. I've reproduced this error and it should be 
marked as a bug. I found that name node would check the access before checking 
whether the file existed.

The NameNode needs to check access before checking whether the file exists.  
Otherwise, unprivileged users could get information about files and directories 
they should not have access to.  In this specific case, though, we do want 
exists to return false, since the user apparently does have permissions to find 
out that the path doesn't exist.


was (Author: cmccabe):
bq. Thanks a lot for your comment. I've reproduced this error and it should be 
marked as a bug. I found that name node would check the access before checking 
whether the file existed.

The NameNode needs to check access before checking whether the file exists.  
Otherwise, unprivileged users could get information about files and directories 
they should not have access to.  In this specific case, though, we do want 
exists to return false, since the user apparently does have permissions to find 
out that the file doesn't exist.

> Different results for exist call for file.ext/name
> --
>
> Key: HDFS-10276
> URL: https://issues.apache.org/jira/browse/HDFS-10276
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kevin Cox
>Assignee: Yuanbo Liu
>
> Given a file {{/file}}, an existence check for the path 
> {{/file/whatever}} will give different responses for different 
> implementations of FileSystem.
> LocalFileSystem will return false while DistributedFileSystem will throw 
> {{org.apache.hadoop.security.AccessControlException: Permission denied: ..., 
> access=EXECUTE, ...}}




[jira] [Commented] (HDFS-10276) Different results for exist call for file.ext/name


[ 
https://issues.apache.org/jira/browse/HDFS-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248402#comment-15248402
 ] 

Colin Patrick McCabe commented on HDFS-10276:
-

bq. Thanks a lot for your comment. I've reproduced this error and it should be 
marked as a bug. I found that name node would check the access before checking 
whether the file existed.

The NameNode needs to check access before checking whether the file exists.  
Otherwise, unprivileged users could get information about files and directories 
they should not have access to.  In this specific case, though, we do want 
exists to return false, since the user apparently does have permissions to find 
out that the file doesn't exist.
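
For reference, the "return false" behaviour can be seen on a local file system with a small sketch. This uses plain {{java.nio.file}} as a stand-in (an assumption on my part; it does not exercise Hadoop's LocalFileSystem or DistributedFileSystem), and {{ExistsUnderFileDemo}} is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; uses java.nio, not the Hadoop FileSystem API. */
final class ExistsUnderFileDemo {
  /**
   * Creates a temporary regular file, then asks whether a child path
   * beneath that file exists (the analogue of /file/whatever).
   */
  static boolean childOfRegularFileExists() {
    try {
      Path file = Files.createTempFile("file", ".ext");
      try {
        // Files.exists reports false when the path cannot be resolved.
        return Files.exists(file.resolve("whatever"));
      } finally {
        Files.delete(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("exists: " + childOfRegularFileExists());
  }
}
```

The check reports that the path does not exist instead of raising a permission error, which is the result we would want from the distributed case as well when the caller is allowed to learn that.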

> Different results for exist call for file.ext/name
> --
>
> Key: HDFS-10276
> URL: https://issues.apache.org/jira/browse/HDFS-10276
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kevin Cox
>Assignee: Yuanbo Liu
>
> Given a file {{/file}}, an existence check for the path 
> {{/file/whatever}} will give different responses for different 
> implementations of FileSystem.
> LocalFileSystem will return false while DistributedFileSystem will throw 
> {{org.apache.hadoop.security.AccessControlException: Permission denied: ..., 
> access=EXECUTE, ...}}




[jira] [Updated] (HDFS-7240) Object store in HDFS

2016-04-19 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-7240:
---
Attachment: ozone_user_v0.pdf

Proposed user interfaces for Ozone. Documentation of REST and CLI interfaces.


> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.




[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248379#comment-15248379
 ] 

Hudson commented on HDFS-10264:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9634 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9634/])
HDFS-10264. Logging improvements in FSImageFormatProtobuf.Saver. (arp: rev 
af9bdbe447b119bff10ec5281993bfc36b6dea71)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java


> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Fix For: 2.7.3
>
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.




[jira] [Updated] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-19 Thread Amit Anand (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Anand updated HDFS-10309:
--
Description: 
While running the HDFS Balancer, I get the error below when {{dfs.blocksize}} is 
defined with a suffix {{k(kilo), m(mega), g(giga)}} in {{hdfs-site.xml}}. In my 
deployment, {{dfs.blocksize}} is set to {{128m}}. 

{code}
hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
nodes = 0, #blockpools = 0, run during upgrade = false]
16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
Bytes Being Moved
16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
30mins, 0sec
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 540 
(default=540)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
(default=1000)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 
(default=200)
16/04/19 08:49:52 INFO balancer.Balancer: 
dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
2147483648 (default=2147483648)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size 
= 10485760 (default=10485760)
16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
10737418240 (default=10737418240)
Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.NumberFormatException: For input string: "128m"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:281)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
{code}

As a workaround, run {{hdfs balancer}} with a numeric value passed for 
{{dfs.blocksize}}, or set a numeric value in your {{hdfs-site.xml}}.

{code}
hdfs balancer -Ddfs.blocksize=134217728
{code}


  was:
While running HDFS Balancer I get error given below when {{dfs.blockSize}} is 
defined with suffix {{k(kilo), m(mega), g(giga)}}. In my deployment 
{{dfs.blocksize}} is set to {{128m}}. 

{code}
hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
nodes = 0, #blockpools = 0, run during upgrade = false]
16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
Bytes Being Moved
16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
30mins, 0sec
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 540 
(default=540)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
(default=1000)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 
(default=200)
16/04/19 08:49:52 INFO balancer.Balancer: 
dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
2147483648 (default=2147483648)
16/04/19 08:49:52 INFO balance

[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver


 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10264:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.7.3
Target Version/s:   (was: 2.6.5)
  Status: Resolved  (was: Patch Available)

I've committed this. Thank you for the contribution [~xiaobingo].

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Fix For: 2.7.3
>
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.




[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver


 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10264:
-
Issue Type: Improvement  (was: Bug)

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.




[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver


[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248357#comment-15248357
 ] 

Arpit Agarwal commented on HDFS-10264:
--

+1 I will commit this shortly.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.




[jira] [Created] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-19 Thread Amit Anand (JIRA)

Amit Anand created HDFS-10309:
-

 Summary: HDFS Balancer doesn't honor dfs.blocksize value defined 
with suffix k(kilo), m(mega), g(giga)
 Key: HDFS-10309
 URL: https://issues.apache.org/jira/browse/HDFS-10309
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 2.8.0
Reporter: Amit Anand
Assignee: Amit Anand


While running the HDFS Balancer, I get the error below when {{dfs.blocksize}} is 
defined with a suffix {{k(kilo), m(mega), g(giga)}}. In my deployment, 
{{dfs.blocksize}} is set to {{128m}}. 

{code}
hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
nodes = 0, #blockpools = 0, run during upgrade = false]
16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
Bytes Being Moved
16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
30mins, 0sec
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 540 
(default=540)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
(default=1000)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 
(default=200)
16/04/19 08:49:52 INFO balancer.Balancer: 
dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
2147483648 (default=2147483648)
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size 
= 10485760 (default=10485760)
16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
10737418240 (default=10737418240)
Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.NumberFormatException: For input string: "128m"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:281)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
{code}

As a workaround, run {{hdfs balancer}} with a numeric value passed for 
{{dfs.blocksize}}:

{code}
hdfs balancer -Ddfs.blocksize=134217728
{code}
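
For illustration, the failure and a suffix-aware alternative can be reproduced in a few lines. This is only a sketch: {{BlockSizeParser}} and {{parseBytes}} are hypothetical names, not Balancer code, and the helper handles just the {{k}}, {{m}} and {{g}} suffixes named above. Hadoop's {{Configuration}} also has a {{getLongBytes}} accessor that, as far as I know, understands such suffixes; whether the eventual fix uses it is not decided here.

```java
import java.util.Locale;

/** Hypothetical illustration only; not the Balancer or Configuration code. */
final class BlockSizeParser {
  /** Parses values such as "134217728", "64k", "128m" or "1g" into bytes. */
  static long parseBytes(String text) {
    String s = text.trim().toLowerCase(Locale.ROOT);
    if (s.isEmpty()) {
      throw new NumberFormatException("empty size");
    }
    long multiplier;
    switch (s.charAt(s.length() - 1)) {
      case 'k': multiplier = 1L << 10; break;
      case 'm': multiplier = 1L << 20; break;
      case 'g': multiplier = 1L << 30; break;
      default:  return Long.parseLong(s);      // plain number, no suffix
    }
    long number = Long.parseLong(s.substring(0, s.length() - 1));
    return Math.multiplyExact(number, multiplier);  // fail loudly on overflow
  }

  public static void main(String[] args) {
    try {
      Long.parseLong("128m");                  // what a plain long parse does
    } catch (NumberFormatException e) {
      System.out.println("Long.parseLong rejects it: " + e.getMessage());
    }
    System.out.println("parseBytes(\"128m\") = " + parseBytes("128m"));
  }
}
```

{{parseBytes("128m")}} yields 134217728, the same value passed on the command line in the workaround.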





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver


[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248333#comment-15248333
 ] 

Xiaobing Zhou commented on HDFS-10264:
--

I verified the test failures are not related to this patch. Could anyone help 
to commit it? Thanks.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.




[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing