[jira] [Commented] (HDFS-9353) Code and comment mismatch in JavaKeyStoreProvider

2016-07-11 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372244#comment-15372244
 ] 

Daniel Templeton commented on HDFS-9353:


LGTM.  +1 (non-binding)

> Code and comment mismatch in  JavaKeyStoreProvider 
> ---
>
> Key: HDFS-9353
> URL: https://issues.apache.org/jira/browse/HDFS-9353
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: nijel
>Assignee: Andras Bokor
>Priority: Trivial
> Attachments: HDFS-9353.01.patch
>
>
> In
> org.apache.hadoop.crypto.key.JavaKeyStoreProvider.JavaKeyStoreProvider(URI 
> uri, Configuration conf) throws IOException
> the comment says:
> {code}
> // Get the password file from the conf, if not present from the user's
> // environment var
> {code}
> But the code takes the value from the ENV first.
> I think this makes sense, since the user can pass the ENV for a particular run.
> My suggestion is to change the comment.
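
A minimal sketch of the order the constructor actually uses, with the corrected comment; the constant names and the helper below are assumptions for illustration, not the exact source:

{code}
// Get the password from the user's environment variable; if not present,
// fall back to the password file named in the conf.
char[] password = null;
String envValue = System.getenv("HADOOP_KEYSTORE_PASSWORD"); // assumed env var name
if (envValue != null) {
  password = envValue.toCharArray();
}
if (password == null) {
  // assumed conf key for the password file
  String pwFile =
      conf.get("hadoop.security.keystore.java-keystore-provider.password-file");
  if (pwFile != null) {
    password = readPasswordFile(conf, pwFile); // hypothetical helper
  }
}
{code}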



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8991) Provide information on BPOfferService in DN JMX

2016-07-11 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8991:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

This is a duplicate of [HDFS-10440], which has been committed recently. 
Resolving this JIRA.

> Provide information on BPOfferService in DN JMX
> ---
>
> Key: HDFS-8991
> URL: https://issues.apache.org/jira/browse/HDFS-8991
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8991.000.patch, HDFS-8991.001.patch
>
>
> In cases like HDFS-7714, where the BPOfferService thread is missing, it takes 
> nontrivial effort to debug which NN the DN considers active / standby.
> It would make sense to make the information more accessible through JMX or the 
> Web UI.
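
A rough sketch of the kind of JMX surface this could take; the interface, method, and field names are assumptions for illustration:

{code}
// Hypothetical MXBean on the DataNode exposing per-BPOfferService state.
public interface BPOfferServiceMXBean {
  /** JSON list of {namenodeAddress, state} pairs, one per BPServiceActor. */
  String getBPServiceActorInfo();
}

// Inside BPOfferService: report which NN this DN currently considers active.
public String getActiveNamenode() {
  BPServiceActor active = bpServiceToActive; // assumed field tracking the active NN
  return active == null ? null : active.getNNSocketAddress().toString();
}
{code}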



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10579) HDFS web interfaces lack configs for X-FRAME-OPTIONS protection

2016-07-11 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372063#comment-15372063
 ] 

Larry McCay commented on HDFS-10579:


[~jnp] - do we need this in branch-2.8 as well?

> HDFS web interfaces lack configs for X-FRAME-OPTIONS protection
> ---
>
> Key: HDFS-10579
> URL: https://issues.apache.org/jira/browse/HDFS-10579
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.9.0
>
> Attachments: HDFS-10579.001.patch, HDFS-10579.002.patch, 
> HDFS-10579.003.patch
>
>
> This JIRA proposes to extend the work done in HADOOP-12964 and introduce a 
> configuration value that enables or disables that option.
> This allows HDFS to remain backward compatible, as required by branch-2.
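
A rough sketch of the shape of such a change; the key names, defaults, and builder methods below are assumptions for illustration, not the committed API:

{code}
// Read the toggle and header value from hdfs-site.xml ...
boolean xFrameEnabled = conf.getBoolean("dfs.xframe.enabled", true);
String xFrameValue = conf.getTrimmed("dfs.xframe.value", "SAMEORIGIN");

// ... and pass them to the HTTP server builder so the NN/DN web UIs emit
// (or omit) the X-FRAME-OPTIONS header.
HttpServer2.Builder builder = new HttpServer2.Builder()
    .setName("hdfs")
    .configureXFrame(xFrameEnabled)  // enable/disable the header
    .setXFrameOption(xFrameValue);   // DENY | SAMEORIGIN | ALLOW-FROM
{code}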



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10579) HDFS web interfaces lack configs for X-FRAME-OPTIONS protection

2016-07-11 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372045#comment-15372045
 ] 

Jitendra Nath Pandey commented on HDFS-10579:
-

I have committed this to trunk and branch-2. Thanks [~anu].

> HDFS web interfaces lack configs for X-FRAME-OPTIONS protection
> ---
>
> Key: HDFS-10579
> URL: https://issues.apache.org/jira/browse/HDFS-10579
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.9.0
>
> Attachments: HDFS-10579.001.patch, HDFS-10579.002.patch, 
> HDFS-10579.003.patch
>
>
> This JIRA proposes to extend the work done in HADOOP-12964 and introduce a 
> configuration value that enables or disables that option.
> This allows HDFS to remain backward compatible, as required by branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372031#comment-15372031
 ] 

Hudson commented on HDFS-10300:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10077 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10077/])
HDFS-10300. TestDistCpSystem should share MiniDFSCluster. Contributed by John 
Zhuge. (wang: rev f292624bd8dbdc1841f225a34346d0392fa76a47)
* 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSystem.java


> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Fix For: 2.8.0
>
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372025#comment-15372025
 ] 

John Zhuge commented on HDFS-10300:
---

Thanks [~andrew.wang].

> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Fix For: 2.8.0
>
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing

2016-07-11 Thread Jiayi Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiayi Zhou updated HDFS-10519:
--
Attachment: HDFS-10519.003.patch

Added a fake sendEdits to make the committed txn id more up-to-date. This means 
the standby NameNode will only tail the in-progress edits that are committed. 
Thanks to [~tlipcon] for the idea during an offline discussion.

> Add a configuration option to enable in-progress edit log tailing
> -
>
> Key: HDFS-10519
> URL: https://issues.apache.org/jira/browse/HDFS-10519
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha
>Reporter: Jiayi Zhou
>Assignee: Jiayi Zhou
>Priority: Minor
> Attachments: HDFS-10519.001.patch, HDFS-10519.002.patch, 
> HDFS-10519.003.patch
>
>
> The Standby NameNode has the option to do in-progress edit log tailing to 
> improve data freshness. In-progress tailing is already implemented, but it is 
> not enabled by default, and there is no related configuration key to turn it 
> on.
> Adding a configuration key to turn it on is reasonable and would be a basis 
> for further improvements to the Standby NameNode.
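
A sketch of what such a key could look like in hdfs-site.xml; the key name here is an assumption, not necessarily what the patch introduces:

{code}
<property>
  <name>dfs.ha.tail-edits.in-progress</name>
  <value>true</value>
  <description>Whether the Standby NameNode tails in-progress edit log
  segments. Defaults to false, which preserves the current behavior.</description>
</property>
{code}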



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-10300:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2, branch-2.8. Thanks for working on this John!

> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Fix For: 2.8.0
>
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372010#comment-15372010
 ] 

Andrew Wang commented on HDFS-10300:


LGTM thanks John! Will check this one in.

> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372005#comment-15372005
 ] 

Hadoop QA commented on HDFS-10300:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} hadoop-tools/hadoop-distcp: The patch generated 0 
new + 15 unchanged - 4 fixed = 15 total (was 19) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
41s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817303/HDFS-10300.002.patch |
| JIRA Issue | HDFS-10300 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 919fd8f538b9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 7bd5d42 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16022/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16022/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.

[jira] [Updated] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10300:
--
Attachment: HDFS-10300.002.patch

Patch 002:
* Check not null before calling {{cluster.shutdown}}, as sketched below
* Fix 2 checkstyle errors
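
A minimal sketch of the null-guarded teardown for the shared cluster (field and method names assumed):

{code}
import org.junit.AfterClass;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Shared across all test cases in the class; shut down once, with a null
// guard in case setup failed before the cluster was created.
private static MiniDFSCluster cluster;

@AfterClass
public static void tearDownClass() {
  if (cluster != null) {
    cluster.shutdown();
  }
}
{code}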

> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Attachments: HDFS-10300.001.patch, HDFS-10300.002.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371924#comment-15371924
 ] 

Hadoop QA commented on HDFS-10300:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} hadoop-tools/hadoop-distcp: The patch generated 
2 new + 15 unchanged - 4 fixed = 17 total (was 19) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
46s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12800134/HDFS-10300.001.patch |
| JIRA Issue | HDFS-10300 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 75256d308902 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c447efe |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16021/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16021/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16021/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>   

[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster

2016-07-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371881#comment-15371881
 ] 

Andrew Wang commented on HDFS-10300:


Normally we'd guard the {{cluster.shutdown}} with a null check (like in the 
{{finally}} block), just in case. Let's keep doing that in the AfterClass.

Otherwise LGTM, thanks for working on this John!

> TestDistCpSystem should share MiniDFSCluster
> 
>
> Key: HDFS-10300
> URL: https://issues.apache.org/jira/browse/HDFS-10300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: quality, test
> Attachments: HDFS-10300.001.patch
>
>
> The test cases in this class should share MiniDFSCluster if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10579) HDFS web interfaces lack configs for X-FRAME-OPTIONS protection

2016-07-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371865#comment-15371865
 ] 

Hudson commented on HDFS-10579:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10075 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10075/])
HDFS-10579. HDFS web interfaces lack configs for X-FRAME-OPTIONS protection. 
Contributed by Anu Engineer. (jitendra: rev c447efebdb92dcdf3d95e983036f53bfbed2c0b4)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/TestDatanodeHttpXFrame.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/DatanodeHttpServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeHttpServerXFrame.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> HDFS web interfaces lack configs for X-FRAME-OPTIONS protection
> ---
>
> Key: HDFS-10579
> URL: https://issues.apache.org/jira/browse/HDFS-10579
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.9.0
>
> Attachments: HDFS-10579.001.patch, HDFS-10579.002.patch, 
> HDFS-10579.003.patch
>
>
> This JIRA proposes to extend the work done in HADOOP-12964 and introduce a 
> configuration value that enables or disables that option.
> This allows HDFS to remain backward compatible, as required by branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371826#comment-15371826
 ] 

Hadoop QA commented on HDFS-10441:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
24s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
21s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
30s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817262/HDFS-10441.HDFS-8707.010.patch
 |
| JIRA Issue | HDFS-10441 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 42d34237aefa 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d643d8c |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16019/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16019/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch

[jira] [Commented] (HDFS-10488) Update WebHDFS documentation regarding CREATE and MKDIR default permissions

2016-07-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371806#comment-15371806
 ] 

Andrew Wang commented on HDFS-10488:


Thanks Vinod, sorry for the miss.

> Update WebHDFS documentation regarding CREATE and MKDIR default permissions
> ---
>
> Key: HDFS-10488
> URL: https://issues.apache.org/jira/browse/HDFS-10488
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, webhdfs
>Affects Versions: 2.6.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: HDFS-10488.002.patch, HDFS-10488.003.patch, 
> HDFS-10488.005.patch, HDFS-10488.006.patch, HDFS-10488.patch
>
>
> WebHDFS methods for creating files/directories were always applying 755 
> permissions as the default for both files and directories.
> The configured *fs.permissions.umask-mode* is intentionally ignored.
> This jira is to update the documentation properly, explaining that *umask* is 
> not applied when using WebHDFS-related methods.
> HDFS-6434 also modified the default permission for files, which is now *644*. 
> This will also be updated in the current documentation.
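
As an illustration of the behavior being documented (the NN address and paths are placeholders), a minimal sketch:

{code}
// Directories created over WebHDFS get 755 regardless of the configured
// fs.permissions.umask-mode; files default to 644 (per HDFS-6434).
Configuration conf = new Configuration();
conf.set("fs.permissions.umask-mode", "077"); // ignored by WebHDFS creates
FileSystem webhdfs =
    FileSystem.get(URI.create("webhdfs://namenode:50070"), conf);
Path dir = new Path("/tmp/webhdfs-demo");
webhdfs.mkdirs(dir); // no explicit permission argument
FsPermission perm = webhdfs.getFileStatus(dir).getPermission();
// perm is rwxr-xr-x (755), not the 700 that a umask of 077 would imply
{code}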



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10488) Update WebHDFS documentation regarding CREATE and MKDIR default permissions

2016-07-11 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371800#comment-15371800
 ] 

Vinod Kumar Vavilapalli commented on HDFS-10488:


This never made it to branch-2.7.3. I just merged it in.

> Update WebHDFS documentation regarding CREATE and MKDIR default permissions
> ---
>
> Key: HDFS-10488
> URL: https://issues.apache.org/jira/browse/HDFS-10488
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, webhdfs
>Affects Versions: 2.6.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: HDFS-10488.002.patch, HDFS-10488.003.patch, 
> HDFS-10488.005.patch, HDFS-10488.006.patch, HDFS-10488.patch
>
>
> WebHDFS methods for creating files/directories were always applying 755 
> permissions as the default for both files and directories.
> The configured *fs.permissions.umask-mode* is intentionally ignored.
> This jira is to update the documentation properly, explaining that *umask* is 
> not applied when using WebHDFS-related methods.
> HDFS-6434 also modified the default permission for files, which is now *644*. 
> This will also be updated in the current documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption

2016-07-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371798#comment-15371798
 ] 

Yongjun Zhang commented on HDFS-10587:
--

Hi [~szetszwo] and [~kihwal],

I would like to bring this jira to your attention. Would you please help review 
the report and the comments I made?

In particular, I wonder why we have to extend the size of the data sent from 
BlockSender to the end of a chunk (please see my comments above for details).

The problem here is that the receiving DN treats the size of the sent data as 
the visibleLength, which is wrong.

Thanks a lot.




> Incorrect offset/length calculation in pipeline recovery causes block 
> corruption
> 
>
> Key: HDFS-10587
> URL: https://issues.apache.org/jira/browse/HDFS-10587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> We found that incorrect offset and length calculation in pipeline recovery 
> may cause block corruption and result in missing blocks under a very 
> unfortunate scenario.
> (1) A client established a pipeline and started writing data to it.
> (2) One of the datanodes in the pipeline restarted, closing the socket, and 
> some written data was left unacknowledged.
> (3) The client replaced the failed datanode with a new one, initiating a 
> block transfer to copy the existing data in the block to the new datanode.
> (4) The block was transferred to the new node. Crucially, the entire block, 
> including the unacknowledged data, was transferred.
> (5) The last chunk (512 bytes) was not a full chunk, but the destination 
> still reserved the whole chunk in its buffer and wrote the entire buffer to 
> disk, so some of the written data was garbage.
> (6) When the transfer was done, the destination datanode converted the 
> replica from temporary to rbw, which made its visible length equal to the 
> length of the bytes on disk. That is to say, it thought whatever was 
> transferred was acknowledged. However, the visible length of the replica 
> differed from that of the transfer source (rounded up to the next multiple 
> of 512).
> (7) The client then truncated the block in an attempt to remove the 
> unacknowledged data. However, because the visible length equaled the bytes 
> on disk, it did not truncate the unacknowledged data.
> (8) When new data was appended to the destination, it skipped the bytes 
> already on disk. Therefore, whatever was written as garbage was never 
> replaced.
> (9) The volume scanner detected the corrupt replica, but due to HDFS-10512 
> it wouldn't tell the NameNode to mark the replica as corrupt, so the client 
> continued to form a pipeline using the corrupt replica.
> (10) Finally, the DN that had the only healthy replica was restarted. The 
> NameNode then updated the pipeline to contain only the corrupt replica.
> (11) The client continued to write to the corrupt replica, because neither 
> the client nor the datanode itself knew the replica was corrupt. When the 
> restarted datanodes came back, their replicas were stale, though not 
> corrupt. Therefore, none of the replicas was both good and up to date.
> The sequence of events was reconstructed based on DataNode/NameNode logs and 
> my understanding of the code.
> Incidentally, we have observed the same sequence of events on two 
> independent clusters.
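
To make the round-up in step (6) concrete, a small worked example (the numbers are made up):

{code}
// Source replica: 1000 bytes visible; the last chunk (bytes 512..999) is partial.
final int CHUNK_SIZE = 512;
long srcVisibleLength = 1000;
// The destination writes the whole reserved chunk buffer to disk:
long bytesOnDisk =
    (srcVisibleLength + CHUNK_SIZE - 1) / CHUNK_SIZE * CHUNK_SIZE; // = 1024
// Converting temporary -> rbw makes the visible length equal bytesOnDisk
// (1024), so bytes 1000..1023 (garbage padding) count as "acknowledged"
// and the truncation in step (7) never removes them.
{code}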



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Attachment: HDFS-10544.04.patch

Thanks [~shv] for the review! Updating the patch to address the comments.

bq. ... Not sure if this is what you expected.
I updated the test to set a config entry to map {{ns1}} to a physical URI. So 
the {{else}} statement in {{DFSUtil#getNameServiceUris}} will be able to 
resolve {{ns1}} to a physical URI. I think this is the correct configuration in 
an environment that uses {{IPFailoverProxyProvider}}.

Also updated the comments. Please kindly review.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, 
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> I think the bug is at {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
> // Add the logical URI of the nameservice.
> try {
>   ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} 
> has {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.
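
A rough sketch of the suggested check, assuming a helper along the lines of {{HAUtil#useLogicalUri}} and a hypothetical physical-URI lookup:

{code}
for (String nsId : getNameServiceIds(conf)) {
  URI nsUri = new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId);
  if (HAUtil.isHAEnabled(conf, nsId) && useLogicalUri(conf, nsUri)) {
    // The proxy provider understands logical URIs: add the logical URI.
    ret.add(nsUri);
  } else {
    // e.g. IPFailoverProxyProvider: resolve the physical RPC address instead.
    ret.addAll(getPhysicalUris(conf, nsId)); // hypothetical helper
  }
}
{code}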



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10537) eclipse running NameNode Class Exception

2016-07-11 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-10537:
---
Fix Version/s: (was: 2.7.3)

> eclipse running NameNode Class Exception
> 
>
> Key: HDFS-10537
> URL: https://issues.apache.org/jira/browse/HDFS-10537
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: .D.
>Priority: Minor
>
> I imported the hadoop 2.7.0 source code into my Eclipse workspace.
> In Eclipse I ran NameNode.java with
> args = "-format"
> and got this error message:
> 2016-06-16 22:50:09,074 ERROR namenode.NameNode (NameNode.java:main(1558)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: URI has an authority component
>   at java.io.File.<init>(File.java:423)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageDirectory(NNStorage.java:329)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:247)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:984)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1428)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1553)
> 2016-06-16 22:50:09,076 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-06-16 22:50:09,078 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG: 
> /
> SHUTDOWN_MSG: Shutting down NameNode at -Pro.local/127.0.0.1
> /
> My core-site.xml and hdfs-site.xml in the Hadoop project are below.
> core-site.xml
> {code}
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://master:9000/</value>
>   </property>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>file:///Users/Joker/tmp</value>
>     <description>A base for other temporary directories.</description>
>   </property>
> </configuration>
> {code}
> hdfs-site.xml
> {code}
> <configuration>
>   <property>
>     <name>dfs.replication</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>dfs.namenode.name.dir</name>
>     <value>file:///Users/Joker/Documents/code_framework/java/hadoop-2.7.0/dfs/name</value>
>   </property>
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>file:///Users/Joker/Documents/code_framework/java/hadoop-2.7.0/dfs/data</value>
>   </property>
> </configuration>
> {code}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10610) DfsClient doesn't add hdfs-site.xml as a resource

2016-07-11 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371711#comment-15371711
 ] 

Mingliang Liu commented on HDFS-10610:
--

+[~wheat9] +[~ste...@apache.org] for making {{HdfsConfiguration}} audience 
public.

> DfsClient doesn't add hdfs-site.xml as a resource
> -
>
> Key: HDFS-10610
> URL: https://issues.apache.org/jira/browse/HDFS-10610
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Eric Badger
>
> Instantiating a new DfsClient used to add hdfs-site.xml as a resource in 2.7, 
> but that compatibility has been broken in 2.8. This only accidentally worked 
> in 2.7, since DfsClient would load HdfsConstants, which would in turn create 
> the static IO_FILE_BUFFER_SIZE, which would instantiate an HdfsConfiguration, 
> which would finally add hdfs-site.xml. 
> In 2.8, IO_FILE_BUFFER_SIZE no longer exists in HdfsConstants.java, and so 
> this no longer works by coincidence.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-07-11 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371697#comment-15371697
 ] 

Colin P. McCabe commented on HDFS-10301:


I apologize for the delays in reviewing.  I am looking at HDFS-10301.007.patch. 
 Is this the latest patch?

I don't understand the purpose behind {{BlockListAsLongs#isStorageReportOnly}}. 
 This function is never called.  This state doesn't seem to be stored anywhere 
in what is sent over the wire, either.  Is this an idea that was 
half-implemented, or did I miss something?

{code}
  if (blocks != BlockListAsLongs.STORAGE_REPORT_ONLY) {
{code}
Again, this is comparing by object reference equality, not deep equality. This 
is a comment I also made in the last review that wasn't addressed.
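
To spell out the reference-equality concern (the encode/decode round trip below is hypothetical):

{code}
// The sentinel survives == only within one JVM object graph. A report that
// has been encoded to protobuf and decoded on the NN side is a new object:
BlockListAsLongs sent = BlockListAsLongs.STORAGE_REPORT_ONLY;
BlockListAsLongs received = decode(encode(sent)); // hypothetical round trip
assert received != sent; // reference inequality despite identical contents
// so "blocks != STORAGE_REPORT_ONLY" is always true after the wire; a deep
// property (e.g. a flag carried in the encoded form) is needed instead.
{code}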

My comment earlier is that I didn't want to overload block reports to be 
storage reports.  A storage report is not a kind of block report.  They 
shouldn't be using the same protobuf objects or Java data structures.  This 
isn't addressed in the current patch, which continues the confusing practice of 
using the same data structure for both.

bq. In the upgrade case, there is no way to detect the zombie storages since 
the old DNs do not send the information about the storages in the BR in the 
last RPC. In practice, hot-swapping of DN drives and upgrading the DN may not 
happen at the same time.

The set of storages that the DN reports can change for a variety of reasons, 
most of which are not hotswap related.  One reason is because a drive has 
become bad and got kicked out of the set of currently active volumes.  Another 
reason is because the DN got taken down by the administrator, a volume got 
removed, and the DN was brought back up.  It's rather frustrating that your 
patch doesn't support zombie storage removal during upgrade, and mine does, and 
yet [~shv] is blocking my patch.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, 
> zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report, 
> in which case it sends the block report again. The NameNode, while 
> processing these two reports at the same time, can interleave the processing 
> of storages from different reports. This screws up the blockReportId field, 
> which makes the NameNode think that some storages are zombie. Replicas from 
> zombie storages are immediately removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-07-11 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10441:
---
Attachment: HDFS-10441.HDFS-8707.010.patch

New patch.  Addresses all of the "must haves" from Bob's last review.

bq. status.h: is having both is_server_exception_ and exception_class_ 
redundant?
Yep, got rid of is_server_exception.

bq. hdfs_configuration.c: We have a (faster) split function in uri.cc; let's 
refactor that into a Util method
I went and implemented this, but was getting valgrind errors in 
configuration_test and hdfs_configuration_test due to statically initialized 
protobuf stuff, even after calling the protobuf shutdown method. Going to push 
this into another jira. Adding it to the current util.h/cc means tests that 
don't really need protobuf and openssl have to link against them, so I might 
try to separate out the util methods that don't need external libs.

bq. HdfsConfiguration::LookupNameService: if the URI parsing failed, we should 
just ignore the URI as mal-formed, not bail out of the entire function. There 
may be a well-formed URI in a later value.
Will lump this in with the above improvement in a different jira.  Want to 
check out how the java client handles that.

bq. HdfsConfiguration: I'm a little uncomfortable using the URI parser to break 
apart host:port. If the user enters "foo:bar@baz", it will interpret that as a 
password and silently drop everything before the baz. Just using split(':') and 
converting the port to int if it exists is solid enough.
Lump this in as well since it's related.

bq. status.cc: I don't think the java exception name should go in the 
(user-visible) output message. A string describing the error ("Invalid 
Argument") would be nice, though.
I agree.  In the short term I'd like to keep them around for debugging though.

bq. filesystem.cc: why do we call InitRpc before checking if there's an 
io_service_?
This was a mistake, but I got rid of InitRpc so it's no longer an issue.

bq. rpc_engine.h: Are ha_persisted_info_ and ha_enabled_ redundant?
Pretty much, except for the initial check that nulls out the ha_persisted_info_ 
if parsing failed.

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10609) Uncaught InvalidEncryptionKeyException during pipeline recovery may abort downstream applications

2016-07-11 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371685#comment-15371685
 ] 

Wei-Chiu Chuang commented on HDFS-10609:


{{SaslDataTransferClient.socketSend}} is also used in a number of places in 
Hadoop. It would be great to fix those places as well.
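
For reference, a sketch of the benign-retry pattern from the quoted {{createBlockOutputStream}} path that the other call sites would need; the field and method names follow the quoted code and should be treated as assumptions:

{code}
try {
  saslClient.socketSend(sock, unbufOut, unbufIn, dfsClient, accessToken, datanode);
} catch (InvalidEncryptionKeyException iee) {
  if (refetchEncryptionKey > 0) {
    // Likely a long-lived client holding a stale key: drop it so a fresh
    // key is fetched from the NN, then retry instead of surfacing the
    // exception to the caller.
    refetchEncryptionKey--;
    dfsClient.clearDataEncryptionKey();
    // ... retry the send ...
  } else {
    throw iee;
  }
}
{code}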

> Uncaught InvalidEncryptionKeyException during pipeline recovery may abort 
> downstream applications
> -
>
> Key: HDFS-10609
> URL: https://issues.apache.org/jira/browse/HDFS-10609
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.6.0
> Environment: CDH5.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> In normal operations, if SASL negotiation fails due to 
> {{InvalidEncryptionKeyException}}, it is typically a benign exception, which 
> is caught and retried :
> {code:title=SaslDataTransferServer#doSaslHandshake}
>   if (ioe instanceof SaslException &&
>   ioe.getCause() != null &&
>   ioe.getCause() instanceof InvalidEncryptionKeyException) {
> // This could just be because the client is long-lived and hasn't gotten
> // a new encryption key from the NN in a while. Upon receiving this
> // error, the client will get a new encryption key from the NN and retry
> // connecting to this DN.
> sendInvalidKeySaslErrorMessage(out, ioe.getCause().getMessage());
>   } 
> {code}
> {code:title=DFSOutputStream.DataStreamer#createBlockOutputStream}
> if (ie instanceof InvalidEncryptionKeyException && refetchEncryptionKey > 0) {
> DFSClient.LOG.info("Will fetch a new encryption key and retry, " 
> + "encryption key was invalid when connecting to "
> + nodes[0] + " : " + ie);
> {code}
> However, if the exception is thrown during pipeline recovery, the 
> corresponding code does not handle it properly, and the exception is spilled 
> out to downstream applications, such as SOLR, aborting its operation:
> {quote}
> 2016-07-06 12:12:51,992 ERROR org.apache.solr.update.HdfsTransactionLog: 
> Exception closing tlog.
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=557709482) doesn't exist. Current key: 1350592619
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:1308)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1272)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)
> 2016-07-06 12:12:51,997 ERROR org.apache.solr.update.CommitTracker: auto 
> commit error...:org.apache.solr.common.SolrException: 
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=557709482) doesn't exist. Current key: 1350592619
> at 
> org.apache.solr.update.HdfsTransactionLog.close(HdfsTransactionLog.java:316)
> at 
> org.apache.solr.update.TransactionLog.decref(TransactionLog.java:505)
> at org.apache.solr.update.UpdateLog.addOldLog(UpdateLog.java:380)
> at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:676)
> at 
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:623)
> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.

[jira] [Commented] (HDFS-10610) DfsClient doesn't add hdfs-site.xml as a resource

2016-07-11 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371669#comment-15371669
 ] 

Daryn Sharp commented on HDFS-10610:


Although not strictly HDFS-8314's fault, it's the other half of the "two 
wrongs accidentally make a right".

The simplest answer is to remove {{@InterfaceAudience.Private}} from 
{{HdfsConfiguration}}; otherwise there's no clean way to trigger loading 
hdfs-site.xml without explicitly adding the resource.

Spark complained about the visibility of this class when they encountered a 
similar problem. In their case they pass serialized confs. If a filesystem 
isn't accessed prior to serializing the conf - since it is the fs service 
loader's creation of 
{{DistributedFileSystem}}/{{DFSClient}}/{{HdfsConfiguration}} that triggers 
the loading - the hdfs parameters are missing. Very confusing to debug.
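
Until the visibility question is settled, a sketch of the explicit workarounds available to callers today:

{code}
// Option 1: touch HdfsConfiguration (currently @InterfaceAudience.Private)
// so hdfs-default.xml / hdfs-site.xml join the default resource list.
Configuration conf = new HdfsConfiguration();

// Option 2: stay on the public API and add the resource by hand.
Configuration conf2 = new Configuration();
conf2.addResource("hdfs-site.xml");
{code}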

> DfsClient doesn't add hdfs-site.xml as a resource
> -
>
> Key: HDFS-10610
> URL: https://issues.apache.org/jira/browse/HDFS-10610
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Eric Badger
>
> Instantiating a new DfsClient used to add hdfs-site.xml as a resource in 2.7, 
> but that compatibility has been broken in 2.8. This only accidentally worked 
> in 2.7, since DfsClient would load HdfsConstants, which would in turn create 
> the static IO_FILE_BUFFER_SIZE, which would instantiate an HdfsConfiguration, 
> which would finally add hdfs-site.xml. 
> In 2.8, IO_FILE_BUFFER_SIZE no longer exists in HdfsConstants.java, and so 
> this no longer works by coincidence.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10610) DfsClient doesn't add hdfs-site.xml as a resource

2016-07-11 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371573#comment-15371573
 ] 

Mingliang Liu commented on HDFS-10610:
--

Are you aware of [HDFS-8314]? Was it related to your concern? Thanks.

> DfsClient doesn't add hdfs-site.xml as a resource
> -
>
> Key: HDFS-10610
> URL: https://issues.apache.org/jira/browse/HDFS-10610
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Eric Badger
>
> Instantiating a new DfsClient used to add hdfs-site.xml as a resource in 2.7, 
> but that compatibility has been broken in 2.8. This only accidentally worked 
> in 2.7, since DfsClient would load HdfsConstants, which would in turn create 
> the static IO_FILE_BUFFER_SIZE, which would instantiate an HdfsConfiguration, 
> which would finally add hdfs-site.xml. 
> In 2.8, IO_FILE_BUFFER_SIZE no longer exists in HdfsConstants.java, and so 
> this no longer works by coincidence.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2016-07-11 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371546#comment-15371546
 ] 

churro morales edited comment on HDFS-10608 at 7/11/16 8:29 PM:


Here is a patch against trunk; if this looks good to everyone I can provide 
backports.  [~cmccabe], would you mind taking a look to see if this is 
sufficient?  Thanks


was (Author: churromorales):
Here is a patch against trunk, if this looks good to everyone I can provide 
backports.  [~colinmccabe] would you mind taking a look to see if this is 
sufficient.  Thanks

> Include event for AddBlock in Inotify Event Stream
> --
>
> Key: HDFS-10608
> URL: https://issues.apache.org/jira/browse/HDFS-10608
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: churro morales
>Priority: Minor
> Attachments: HDFS-10608.patch
>
>
> It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
> discussions from mailing list:
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2016-07-11 Thread churro morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

churro morales updated HDFS-10608:
--
Attachment: HDFS-10608.patch

Here is a patch against trunk; if this looks good to everyone I can provide 
backports.  [~colinmccabe], would you mind taking a look to see if this is 
sufficient?  Thanks

> Include event for AddBlock in Inotify Event Stream
> --
>
> Key: HDFS-10608
> URL: https://issues.apache.org/jira/browse/HDFS-10608
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: churro morales
>Priority: Minor
> Attachments: HDFS-10608.patch
>
>
> It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
> discussions from mailing list:
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10611) libhdfs++: Add support for HA configurations with more than 2 namenodes

2016-07-11 Thread James Clampffer (JIRA)
James Clampffer created HDFS-10611:
--

 Summary: libhdfs++: Add support for HA configurations with more 
than 2 namenodes
 Key: HDFS-10611
 URL: https://issues.apache.org/jira/browse/HDFS-10611
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


Placeholder for now; it doesn't look like you can use more than two namenodes 
per nameservice at the moment.  It shouldn't be too hard to extend 
HANamenodeTracker to support this in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-07-11 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371494#comment-15371494
 ] 

James Clampffer commented on HDFS-10595:


I think this is a duplicate of HDFS-9453: different source but same issue.  
From what I gathered looking through the C++ protobuf source, if there are 
embedded nulls it can fall back to a reflection-based parser (slower).  I'm 
not sure if the Java protobuf implementation has less restrictive rules on 
string fields. 
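
To illustrate the constraint, a standalone sketch (hypothetical demo, not code 
from either client): protobuf {{string}} fields must carry valid UTF-8, so a 
textual UUID is safe while raw UUID bytes are not.

{code}
import java.nio.ByteBuffer;
import java.util.UUID;

public class ClientNameDemo {
  public static void main(String[] args) {
    // Textual UUID: pure ASCII, always a valid protobuf 'string' value.
    String safeName = "libhdfs++_" + UUID.randomUUID();

    // Raw UUID bytes: arbitrary octets, often invalid UTF-8 -- the kind of
    // value that triggers the wire_format.cc warning unless the field is
    // declared 'bytes' instead of 'string'.
    UUID id = UUID.randomUUID();
    byte[] raw = ByteBuffer.allocate(16)
        .putLong(id.getMostSignificantBits())
        .putLong(id.getLeastSignificantBits())
        .array();

    System.out.println(safeName + " / raw byte count: " + raw.length);
  }
}
{code}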

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However it executes correctly. Looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we need to possibly also have two unique client identifiers, or 
> fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-9809) Abstract implementation-specific details from the datanode

2016-07-11 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371440#comment-15371440
 ] 

Virajith Jalaparti edited comment on HDFS-9809 at 7/11/16 7:35 PM:
---

Hi [~eddyxu],

Thank you for the comments! Replies below. 

bq. {{BlockSender#waitForMinLength}}, it is not clear to me why you need 
to change {{RBW}} to {{RIP}}?

This should have actually been {{ReplicaInPipelineInterface}} (or rather a 
class that extends {{ReplicaInfo}} and implements 
{{ReplicaInPipelineInterface}}) instead of {{ReplicaInPipeline}}. The idea in 
replacing this was: when we add a {{ProvidedReplica}} as part of HDFS-9806, the 
{{ProvidedReplica}} would be implementing {{ReplicaInPipelineInterface}} so 
that it can tie into the existing replication pipeline in the datanode. So, if 
references to {{ReplicaInPipeline}} and {{ReplicaBeingWritten}} are replaced by 
{{ReplicaInPipelineInterface}}, parts of the current code can be used for both 
{{LocalReplica}} and {{ProvidedReplica}}. The current patch does not have these 
replacements as implementing writes for {{ProvidedReplica}} was future work of 
HDFS-9806 and not in its scope. I understand that this should not preclude 
replacement by {{ReplicaInPipelineInterface}} as it would eventually be needed. 
I will make these modifications and post a new patch soon. 
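
To make the intended hierarchy concrete, a rough sketch (my reading of the 
proposal, with simplified signatures -- not the actual patch):

{code}
// Rough sketch only; the real interfaces carry many more methods.
interface ReplicaInPipelineInterface {
  void setNumBytes(long bytesReceived);  // write-path mutation
  long getBytesAcked();
}

abstract class ReplicaInfo { /* common replica state, no java.io.File */ }

// Today's on-disk replica participates in the write pipeline.
class LocalReplica extends ReplicaInfo implements ReplicaInPipelineInterface {
  private long numBytes, bytesAcked;
  public void setNumBytes(long n) { numBytes = n; }
  public long getBytesAcked() { return bytesAcked; }
}

// Future work (HDFS-9806): a replica backed by external storage reuses the
// same write path by implementing the same interface.
class ProvidedReplica extends ReplicaInfo implements ReplicaInPipelineInterface {
  public void setNumBytes(long n) { /* delegate to the external store */ }
  public long getBytesAcked() { return 0; }
}
{code}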

bq. {{DataStorage#getTrashDirectoryForBlockFile}} we can take this chance to 
rename it to {{getTrashDirectoryForBlock}} or {{..ForReplica}}.

Agreed. Will do. 

bq. There are many {{UnsupportedOperationException}} in the Replica class 
hierarchy. It might indicate that these functions are not supposed to be 
overridden.

Since some of the functions were moved up the Replica class hierarchy (e.g., 
from RIP to {{ReplicaInfo}}) and other sub-classes of {{ReplicaInfo}} may not 
implement them, they were declared to throw {{UnsupportedOperationException}}. I 
agree that this is not necessary, as it is a {{RuntimeException}}. I will update 
the patch without the {{UnsupportedOperationException}}.

bq. ASF License for every new file.

Will add it. 

bq. Let's take this chance to use JDK 7 {{try-with-resources}} in 
{{breakHardlinks()}}?

Sure. Will change to use {{try-with-resources}}. 

bq. {{ReplicaInPipeline#moveReplicaFrom}} it still has a {{File}} parameter.

Yes, this needs to be removed. This will be part of the changes to be made 
following the first point above. 




was (Author: virajith):
Hi [~eddyxu],

Thank you for the comments! Replies below. 

bq. {{BlockSender#waitForMinLength}}, it is not clear to me why you need 
to change {{RBW}} to {{RIP}}?

This should have actually been {{ReplicaInPipelineInterface}} instead of 
{{ReplicaInPipeline}}. The idea in replacing this was: when we add a 
{{ProvidedReplica}} as part of HDFS-9806, the {{ProvidedReplica}} would be 
implementing {{ReplicaInPipelineInterface}} so that it can tie into the 
existing replication pipeline in the datanode. So, if references to 
{{ReplicaInPipeline}} and {{ReplicaBeingWritten}} are replaced by 
{{ReplicaInPipelineInterface}}, parts of the current code can be used for both 
{{LocalReplica}} and {{ProvidedReplica}}. The current patch does not have these 
replacements as implementing writes for {{ProvidedReplica}} was future work of 
HDFS-9806 and not in its scope. I understand that this should not preclude 
replacement by {{ReplicaInPipelineInterface}} as it would eventually be needed. 
I will make these modifications and post a new patch soon. 

bq. {{DataStorage#getTrashDirectoryForBlockFile}} we can take this chance to 
rename it to {{getTrashDirectoryForBlock}} or {{..ForReplica}}.

Agreed. Will do. 

bq. There are many {{UnsupportedOperationException}} in the Replica class 
hierarchy. It might indicate that these functions are not supposed to be 
overridden.

Since some of the functions were moved up the Replica class hierarchy (e.g., 
from RIP to {{ReplicaInfo}}) and other sub-classes of {{ReplicaInfo}} may not 
implement them, they were declared to throw {{UnsupportedOperationException}}. I 
agree that this is not necessary, as it is a {{RuntimeException}}. I will update 
the patch without the {{UnsupportedOperationException}}.

bq. ASF License for every new file.

Will add it. 

bq. Let's take this chance to use JDK 7 {{try-with-resources}} in 
{{breakHardlinks()}}?

Sure. Will change to use {{try-with-resources}}. 

bq. {{ReplicaInPipeline#moveReplicaFrom}} it still has a {{File}} parameter.

Yes, this needs to be removed. This will be part of the changes to be made 
following the first point above. 



> Abstract implementation-specific details from the datanode
> --
>
> Key: HDFS-9809
> URL: https://issues.apache.org/jira/browse/HDFS-9809
> Project: Hadoop HDFS
>  Issue Type: Task
>  Co

[jira] [Created] (HDFS-10610) DfsClient doesn't add hdfs-site.xml as a resource

2016-07-11 Thread Eric Badger (JIRA)
Eric Badger created HDFS-10610:
--

 Summary: DfsClient doesn't add hdfs-site.xml as a resource
 Key: HDFS-10610
 URL: https://issues.apache.org/jira/browse/HDFS-10610
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Eric Badger


Instantiating a new DfsClient used to add hdfs-site.xml as a resource in 2.7, 
but that compatibility has been broken in 2.8. This only accidentally worked in 
2.7, since DfsClient would load HdfsConstants, which would in turn create the 
static IO_FILE_BUFFER_SIZE, which would instantiate an HdfsConfiguration, which 
would finally add hdfs-site.xml. 

In 2.8, IO_FILE_BUFFER_SIZE no longer exists in HdfsConstants.java and so this 
no longer works by coincidence. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9809) Abstract implementation-specific details from the datanode

2016-07-11 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371440#comment-15371440
 ] 

Virajith Jalaparti commented on HDFS-9809:
--

Hi [~eddyxu],

Thank you for the comments! Replies below. 

bq. {{BlockSender#waitForMinLength}}, it is not clear to me why you need 
to change {{RBW}} to {{RIP}}?

This should have actually been {{ReplicaInPipelineInterface}} instead of 
{{ReplicaInPipeline}}. The idea in replacing this was: when we add a 
{{ProvidedReplica}} as part of HDFS-9806, the {{ProvidedReplica}} would be 
implementing {{ReplicaInPipelineInterface}} so that it can tie into the 
existing replication pipeline in the datanode. So, if references to 
{{ReplicaInPipeline}} and {{ReplicaBeingWritten}} are replaced by 
{{ReplicaInPipelineInterface}}, parts of the current code can be used for both 
{{LocalReplica}} and {{ProvidedReplica}}. The current patch does not have these 
replacements as implementing writes for {{ProvidedReplica}} was future work of 
HDFS-9806 and not in its scope. I understand that this should not preclude 
replacement by {{ReplicaInPipelineInterface}} as it would eventually be needed. 
I will make these modifications and post a new patch soon. 

bq. {{DataStorage#getTrashDirectoryForBlockFile}} we can take this chance to 
rename it to {{getTrashDirectoryForBlock}} or {{..ForReplica}}.

Agreed. Will do. 

bq. There are many {{UnsupportedOperationException}} in the Replica class 
hierarchy. It might indicate that these functions are not supposed to be 
overridden.

Since some of the functions were moved up the Replica class hierarchy (e.g., 
from RIP to {{ReplicaInfo}}) and other sub-classes of {{ReplicaInfo}} may not 
implement them, they were declared to throw {{UnsupportedOperationException}}. I 
agree that this is not necessary, as it is a {{RuntimeException}}. I will update 
the patch without the {{UnsupportedOperationException}}.

bq. ASF License for every new file.

Will add it. 

bq. Let's take this chance to use JDK 7 {{try-with-resources}} in 
{{breakHardlinks()}}?

Sure. Will change to use {{try-with-resources}}. 

bq. {{ReplicaInPipeline#moveReplicaFrom}} it still has a {{File}} parameter.

Yes, this needs to be removed. This will be part of the changes to be made 
following the first point above. 



> Abstract implementation-specific details from the datanode
> --
>
> Key: HDFS-9809
> URL: https://issues.apache.org/jira/browse/HDFS-9809
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, 
> HDFS-9809.003.patch
>
>
> Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) 
> implicitly assume that blocks are stored in java.io.File(s) and that volumes 
> are divided into directories. We propose to abstract these details, which 
> would help in supporting other storages. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-07-11 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371327#comment-15371327
 ] 

Konstantin Shvachko commented on HDFS-10301:


[~cmccabe] this jira needs some action from you, because you are blocking it.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.01.patch, HDFS-10301.sample.patch, 
> zombieStorageLogs.rtf
>
>
> When NameNode is busy a DataNode can time out sending a block report. Then it 
> sends the block report again. The NameNode, while processing these two reports 
> at the same time, can interleave processing storages from different reports. 
> This screws up the blockReportId field, which makes NameNode think that some 
> storages are zombie. Replicas from zombie storages are immediately removed, 
> causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10512) VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks

2016-07-11 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371315#comment-15371315
 ] 

Wei-Chiu Chuang commented on HDFS-10512:


Thanks [~linyiqun], [~yzhangal] and [~ajisakaa] for the collaboration!

> VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
> --
>
> Key: HDFS-10512
> URL: https://issues.apache.org/jira/browse/HDFS-10512
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-10512.001.patch, HDFS-10512.002.patch, 
> HDFS-10512.004.patch, HDFS-10512.005.patch, HDFS-10512.006.patch
>
>
> VolumeScanner may terminate due to unexpected NullPointerException thrown in 
> {{DataNode.reportBadBlocks()}}. This is different from HDFS-8850/HDFS-9190
> I observed this bug in a production CDH 5.5.1 cluster and the same bug still 
> persist in upstream trunk.
> {noformat}
> 2016-04-07 20:30:53,830 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> BP-1800173197-10.204.68.5-125156296:blk_1170134484_96468685 on /dfs/dn
> 2016-04-07 20:30:53,831 ERROR 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, 
> DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting because of exception
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
> at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
> 2016-04-07 20:30:53,832 INFO 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, 
> DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting.
> {noformat}
> I think the NPE comes from the volume variable in the following code snippet. 
> Somehow the volume scanner knows the volume, but the datanode cannot look up 
> the volume using the block.
> {code}
>   public void reportBadBlocks(ExtendedBlock block) throws IOException {
>     BPOfferService bpos = getBPOSForBlock(block);
>     FsVolumeSpi volume = getFSDataset().getVolume(block);
>     bpos.reportBadBlocks(
>         block, volume.getStorageID(), volume.getStorageType());
>   }
> {code}
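
A minimal defensive sketch of the fix direction implied by the description 
(assuming {{getVolume}} can legitimately return null here; simplified, not 
necessarily the committed patch):

{code}
public void reportBadBlocks(ExtendedBlock block) throws IOException {
  BPOfferService bpos = getBPOSForBlock(block);
  FsVolumeSpi volume = getFSDataset().getVolume(block);
  if (volume == null) {
    // The block may have been removed between the scan and this lookup;
    // warn and return instead of letting the VolumeScanner die on an NPE.
    LOG.warn("Cannot find FsVolumeSpi to report bad block: " + block);
    return;
  }
  bpos.reportBadBlocks(block, volume.getStorageID(), volume.getStorageType());
}
{code}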



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption

2016-07-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371294#comment-15371294
 ] 

Yongjun Zhang commented on HDFS-10587:
--

Hi [~jojochuang],

I think it'd be nice to work out a unit test that demonstrates the block 
corruption: for example, create a block with visibleLength X and a replica 
with X+delta bytes of data written to disk, then use the involved code to copy 
the replica to a different one and see whether the corruption happens. If so, 
we can then verify that my proposed change above addresses the issue. 

Of course, we still need to better understand the "chunk end enforcement" 
mentioned in my earlier comment. 

What do you think?

Thanks.


> Incorrect offset/length calculation in pipeline recovery causes block 
> corruption
> 
>
> Key: HDFS-10587
> URL: https://issues.apache.org/jira/browse/HDFS-10587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> We found incorrect offset and length calculation in pipeline recovery may 
> cause block corruption and results in missing blocks under a very unfortunate 
> scenario. 
> (1) A client established pipeline and started writing data to the pipeline.
> (2) One of the data node in the pipeline restarted, closing the socket, and 
> some written data were unacknowledged.
> (3) Client replaced the failed data node with a new one, initiating block 
> transfer to copy existing data in the block to the new datanode.
> (4) The block is transferred to the new node. Crucially, the entire block, 
> including the unacknowledged data, was transferred.
> (5) The last chunk (512 bytes) was not a full chunk, but the destination 
> still reserved the whole chunk in its buffer, and wrote the entire buffer to 
> disk, therefore some written data is garbage.
> (6) When the transfer was done, the destination data node converted the 
> replica from temporary to rbw, which made its visible length equal to the 
> length of bytes on disk. That is to say, it thought whatever was transferred 
> was acknowledged. However, the visible length of the replica is different 
> (rounded up to the next multiple of 512) from that at the source of the transfer.
> (7) Client then truncated the block in the attempt to remove unacknowledged 
> data. However, because the visible length is equivalent of the bytes on disk, 
> it did not truncate unacknowledged data.
> (8) When new data was appended to the destination, it skipped the bytes 
> already on disk. Therefore, whatever was written as garbage was not replaced.
> (9) the volume scanner detected corrupt replica, but due to HDFS-10512, it 
> wouldn’t tell NameNode to mark the replica as corrupt, so the client 
> continued to form a pipeline using the corrupt replica.
> (10) Finally the DN that had the only healthy replica was restarted. NameNode 
> then updated the pipeline to only contain the corrupt replica.
> (11) Client continued to write to the corrupt replica, because neither the 
> client nor the data node itself knew the replica was corrupt. When the 
> restarted datanodes came back, their replicas were stale, despite not being 
> corrupt. Therefore, none of the replicas was good and up to date.
> The sequence of events was reconstructed based on DataNode/NameNode log and 
> my understanding of code.
> Incidentally, we have observed the same sequence of events on two independent 
> clusters.
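
To make steps (5)-(7) concrete, a small worked sketch with hypothetical numbers:

{code}
public class ChunkPaddingDemo {
  public static void main(String[] args) {
    final int CHUNK = 512;  // checksum chunk size
    long acked = 1000;      // bytes the client actually saw acknowledged
    // The destination pads the partial last chunk to a full chunk on disk.
    long onDisk = (acked + CHUNK - 1) / CHUNK * CHUNK;  // 1024
    long garbage = onDisk - acked;                      // 24 garbage bytes
    // The visible length becomes onDisk (1024), so truncating to the visible
    // length removes nothing, and later appends skip past the garbage bytes
    // instead of overwriting them.
    System.out.println("onDisk=" + onDisk + ", garbage=" + garbage);
  }
}
{code}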



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371267#comment-15371267
 ] 

Hadoop QA commented on HDFS-9890:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
49s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
9s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
27s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
6s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  6m  6s{color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_91 with JDK 
v1.8.0_91 generated 1 new + 2 unchanged - 1 fixed = 3 total (was 3) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
30s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817195/HDFS-9890.HDFS-8707.016.patch
 |
| JIRA Issue | HDFS-9890 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 9fd65ae45616 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d643d8c |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| cc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16018/artifact/patchprocess/diff-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_91.txt
 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16018/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16018/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: Add test suite to simulate network issues
> 
>
> K

[jira] [Created] (HDFS-10609) Uncaught InvalidEncryptionKeyException during pipeline recovery may abort downstream applications

2016-07-11 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-10609:
--

 Summary: Uncaught InvalidEncryptionKeyException during pipeline 
recovery may abort downstream applications
 Key: HDFS-10609
 URL: https://issues.apache.org/jira/browse/HDFS-10609
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 2.6.0
 Environment: CDH5.8.0
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


In normal operations, if SASL negotiation fails due to 
{{InvalidEncryptionKeyException}}, it is typically a benign exception, which is 
caught and retried:

{code:title=SaslDataTransferServer#doSaslHandshake}
  if (ioe instanceof SaslException &&
  ioe.getCause() != null &&
  ioe.getCause() instanceof InvalidEncryptionKeyException) {
// This could just be because the client is long-lived and hasn't gotten
// a new encryption key from the NN in a while. Upon receiving this
// error, the client will get a new encryption key from the NN and retry
// connecting to this DN.
sendInvalidKeySaslErrorMessage(out, ioe.getCause().getMessage());
  } 
{code}

{code:title=DFSOutputStream.DataStreamer#createBlockOutputStream}
if (ie instanceof InvalidEncryptionKeyException && refetchEncryptionKey > 0) {
DFSClient.LOG.info("Will fetch a new encryption key and retry, " 
+ "encryption key was invalid when connecting to "
+ nodes[0] + " : " + ie);
{code}
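
Distilled, that retry shape looks roughly like the following sketch 
(hypothetical and simplified; {{transferBlockToNewDatanode()}} stands in for 
the SASL send that pipeline recovery performs). As described next, the 
recovery path currently has no equivalent handling:

{code}
void transferWithKeyRetry(DFSClient dfsClient) throws IOException {
  int refetchEncryptionKey = 1;
  while (true) {
    try {
      transferBlockToNewDatanode();  // assumed helper wrapping the transfer
      return;
    } catch (InvalidEncryptionKeyException iee) {
      if (refetchEncryptionKey-- <= 0) {
        throw iee;
      }
      // The key may simply be stale on a long-lived client: clear the cached
      // key so the next attempt fetches a fresh one from the NameNode.
      dfsClient.clearDataEncryptionKey();
    }
  }
}
{code}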

However, if the exception is thrown during pipeline recovery, the corresponding 
code does not handle it properly, and the exception spills out to downstream 
applications, such as SOLR, aborting their operations:

{quote}
2016-07-06 12:12:51,992 ERROR org.apache.solr.update.HdfsTransactionLog: 
Exception closing tlog.
org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
Can't re-compute encryption key for nonce, since the required block key 
(keyID=557709482) doesn't exist. Current key: 1350592619
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:1308)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1272)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)
2016-07-06 12:12:51,997 ERROR org.apache.solr.update.CommitTracker: auto commit 
error...:org.apache.solr.common.SolrException: 
org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
Can't re-compute encryption key for nonce, since the required block key 
(keyID=557709482) doesn't exist. Current key: 1350592619
at 
org.apache.solr.update.HdfsTransactionLog.close(HdfsTransactionLog.java:316)
at org.apache.solr.update.TransactionLog.decref(TransactionLog.java:505)
at org.apache.solr.update.UpdateLog.addOldLog(UpdateLog.java:380)
at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:676)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:623)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException

[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-07-11 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu updated HDFS-9890:
--
Attachment: HDFS-9890.HDFS-8707.016.patch

HDFS-9890.HDFS-8707.016.patch changes the thread count back to 1 in 
FileSystemImpl::FileSystemImpl.

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch, 
> HDFS-9890.HDFS-8707.005.patch, HDFS-9890.HDFS-8707.006.patch, 
> HDFS-9890.HDFS-8707.007.patch, HDFS-9890.HDFS-8707.008.patch, 
> HDFS-9890.HDFS-8707.009.patch, HDFS-9890.HDFS-8707.010.patch, 
> HDFS-9890.HDFS-8707.011.patch, HDFS-9890.HDFS-8707.012.patch, 
> HDFS-9890.HDFS-8707.012.patch, HDFS-9890.HDFS-8707.013.patch, 
> HDFS-9890.HDFS-8707.013.patch, HDFS-9890.HDFS-8707.014.patch, 
> HDFS-9890.HDFS-8707.015.patch, HDFS-9890.HDFS-8707.016.patch, 
> hs_err_pid26832.log, hs_err_pid4944.log
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which do a 
> great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly convert error codes returned by 
> network functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10608) Include event for AddBlock in Inotify Event Stream

2016-07-11 Thread churro morales (JIRA)
churro morales created HDFS-10608:
-

 Summary: Include event for AddBlock in Inotify Event Stream
 Key: HDFS-10608
 URL: https://issues.apache.org/jira/browse/HDFS-10608
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: churro morales
Priority: Minor


It would be nice to have an AddBlockEvent in the INotify pipeline.  Based on 
discussions from mailing list:

http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-07-11 Thread Xiaowei Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371124#comment-15371124
 ] 

Xiaowei Zhu commented on HDFS-9890:
---

I found the root cause of the non-deterministic failures in our unit tests. Our 
patch's changes in filesystem.cc change the number of threads from 1 to 2 in 
FileSystemImpl::FileSystemImpl(...), which causes those failures. I verified 
with the latest HDFS-8707 and reproduced the same issue when I increased the 
number of threads. This change was introduced with the original 000.patch and 
is not really related to what this jira is about, so I plan to change the 
thread value back to 1 and file another jira for the newly found issue.

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch, 
> HDFS-9890.HDFS-8707.005.patch, HDFS-9890.HDFS-8707.006.patch, 
> HDFS-9890.HDFS-8707.007.patch, HDFS-9890.HDFS-8707.008.patch, 
> HDFS-9890.HDFS-8707.009.patch, HDFS-9890.HDFS-8707.010.patch, 
> HDFS-9890.HDFS-8707.011.patch, HDFS-9890.HDFS-8707.012.patch, 
> HDFS-9890.HDFS-8707.012.patch, HDFS-9890.HDFS-8707.013.patch, 
> HDFS-9890.HDFS-8707.013.patch, HDFS-9890.HDFS-8707.014.patch, 
> HDFS-9890.HDFS-8707.015.patch, hs_err_pid26832.log, hs_err_pid4944.log
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which do a 
> great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly convert error codes returned by 
> network functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9271) Implement basic NN operations

2016-07-11 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371081#comment-15371081
 ] 

Bob Hansen commented on HDFS-9271:
--

Thanks for all that hard work, [~anatoli.shein].

A few comments:
* In GetBlockLocations(hdfspp.h, filesystem.cc), use offset_t or uint64_t 
rather than long.  It's less ambiguous.
* In getAbsolutePath (hdfs.cc), how about returning optional(string) rather 
than an empty string on error.  It makes the error state explicit and 
explicitly checked.
* Make a new bug to capture supporting ".." semantics
* It appears the majority of hdfs_ext_test.c has been commented out.  Was this 
intentional, or debugging dirt that slipped in?
* Can we add a test for relative paths for all the functions where we added 
them in?
* Can we implement hdfsMove and/or hdfsTruncateFile with just metadata 
operations?
* Move to libhdfspp implementations in hdfs_shim for GetDefaultBlocksize[AtPath]
* Implement hdfsUnbufferFile as a no-op?
* Do we support single-dot relative paths?  e.g. can I call hdfsGetPathInfo(fs, 
".")?  Do we have tests over that?
* Do we have tests that show that libhdfspp's getReadStatistics match libhdfs's 
getReadStatistics?

Minor little nits:
* For the absolute path, I personally prefer abs_path = getAbsolutePath(...) 
rather than abs_path(getAbsolutePath).  They both compile to the same thing 
(see https://en.wikipedia.org/wiki/Return_value_optimization); I think the 
whitespace with the assignment makes the _what_ and the _content_ separation 
cleaner
* Refactor CheckSystemAndHandle to use CheckHandle 
(https://en.wikipedia.org/wiki/Don't_repeat_yourself)



> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Anatoli Shein
> Attachments: HDFS-9271.HDFS-8707.000.patch, 
> HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10607) libhdfs++:hdfs_shim missing some hdfs++ functions

2016-07-11 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10607:
-

 Summary: libhdfs++:hdfs_shim missing some hdfs++ functions
 Key: HDFS-10607
 URL: https://issues.apache.org/jira/browse/HDFS-10607
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


The hdfsConfGetStr, hdfsConfGetInt, hdfsStrFree, hdfsSeek, and hdfsTell 
functions are all calling into the libhdfs implementations, not the libhdfs++ 
implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10570) Netty-all jar should be first in class path while running tests in eclipse

2016-07-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370980#comment-15370980
 ] 

Steve Loughran commented on HDFS-10570:
---

Does this mean that there are conflicting versions of netty in the CP here? If 
so, that needs to be fixed. Changing the order of the CP simply by re-ordering 
declarations in maven is a dangerous approach, as it's an mvn feature that most 
people don't know of and isn't, AFAIK, guaranteed to hold over time.

> Netty-all jar should be first in class path while running tests in eclipse
> --
>
> Key: HDFS-10570
> URL: https://issues.apache.org/jira/browse/HDFS-10570
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Attachments: HDFS-10570-01.patch
>
>
> While debugging tests in eclipse, Cannot access DN http url. 
> Also WebHdfs tests cannot run in eclipse due to classes loading from old 
> version of netty jars instead of netty-all jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9213) Minicluster with Kerberos generates some stacks when checking the ports

2016-07-11 Thread Vijay Srinivasaraghavan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370975#comment-15370975
 ] 

Vijay Srinivasaraghavan commented on HDFS-9213:
---

I was able to move forward by setting the following configurations. There are 
also a couple of flags that you need to enable in MiniDFSCluster to support the 
configurations below.

Hadoop Configurations:
conf.set("dfs.datanode.address", "localhost:1002");
conf.set("dfs.datanode.hostname", "localhost");
conf.set("dfs.datanode.http.address", "localhost:1003");

DFSMiniCluster:
MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(conf);
builder.checkDataNodeAddrConfig(true);
builder.checkDataNodeHostConfig(true);

You also need to grant the java process permission to bind to these 
privileged ports. On an Ubuntu setup, I used the command below to enable it:

setcap 'cap_net_bind_service=+ep' /path/to/java

It looks like, with this setup, we may not need an additional patch? 


> Minicluster with Kerberos generates some stacks when checking the ports
> ---
>
> Key: HDFS-9213
> URL: https://issues.apache.org/jira/browse/HDFS-9213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0-alpha1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
> Attachments: hdfs-9213.v1.patch, hdfs-9213.v1.patch
>
>
> When using the minicluster with kerberos the various checks in 
> SecureDataNodeStarter fail because the ports are not fixed.
> Stacks like this one:
> {quote}
> java.lang.RuntimeException: Unable to bind on specified streaming port in 
> secure context. Needed 0, got 49670
>   at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:108)
> {quote}
> There is already a setting to deactivate this type of check for testing; it 
> could be used here as well



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10594) HDFS-4949 should support recursive cache directives

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370969#comment-15370969
 ] 

Hadoop QA commented on HDFS-10594:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 
191 unchanged - 1 fixed = 193 total (was 192) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client generated 1 new 
+ 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
58s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m  3s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  
org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readCacheDirectiveInfo(XMLUtils$Stanza)
 invokes inefficient Boolean constructor; use Boolean.valueOf(...) instead  At 
FSImageSerialization.java:use Boolean.valueOf(...) instead  At 
FSImageSerialization.java:[line 598] |
| Failed junit tests | 
hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https:/

[jira] [Commented] (HDFS-7622) Erasure Coding: handling file truncation

2016-07-11 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370844#comment-15370844
 ] 

Rakesh R commented on HDFS-7622:


As per the analysis, there are cases of partial block stripes that have to be 
handled, similar to the recovery of a partial stripe (which could be triggered 
by an {{#hflush}} call, HDFS-7661). Since implementing the truncation of 
partial block stripes touches the {{RecoveryTaskStriped#recover()}} part, I 
feel we should revisit this jira once the HDFS-7661 discussion/logic settles 
down.

It would be great if someone could help me push the HDFS-8065 block group 
boundary truncation jira, which I think can be supported now.

> Erasure Coding: handling file truncation
> 
>
> Key: HDFS-7622
> URL: https://issues.apache.org/jira/browse/HDFS-7622
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Rakesh R
>
> This jira will cover the following issues:
> # how to truncate an erasure coded file
> # how to erasure code a file that has been truncated and snapshotted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10594) HDFS-4949 should support recursive cache directives

2016-07-11 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10594:
-
Attachment: HDFS-10594.002.patch

Attaching an initial patch to support recursive {{CacheDirective}}s. Thanks in 
advance for the review!

> HDFS-4949 should support recursive cache directives
> ---
>
> Key: HDFS-10594
> URL: https://issues.apache.org/jira/browse/HDFS-10594
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-10594.001.patch, HDFS-10594.002.patch
>
>
> In {{CacheReplicationMonitor#rescanCacheDirectives}}, it should recursively 
> rescan the path when the inode of the path is a directory. In this code:
> {code}
> } else if (node.isDirectory()) {
>   INodeDirectory dir = node.asDirectory();
>   ReadOnlyList<INode> children = dir
>       .getChildrenList(Snapshot.CURRENT_STATE_ID);
>   for (INode child : children) {
>     if (child.isFile()) {
>       rescanFile(directive, child.asFile());
>     }
>   }
> }
> {code}
> With this logic, some inode files are ignored when a child inode is itself a 
> directory that contains further files; the grandchild files under the path 
> will never be cached.
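
A minimal sketch of the recursive variant (simplified, assuming the 
surrounding {{CacheReplicationMonitor}} context; real code would also need to 
respect snapshots and scan limits):

{code}
private void rescanDir(CacheDirective directive, INodeDirectory dir) {
  for (INode child : dir.getChildrenList(Snapshot.CURRENT_STATE_ID)) {
    if (child.isFile()) {
      rescanFile(directive, child.asFile());
    } else if (child.isDirectory()) {
      rescanDir(directive, child.asDirectory());  // recurse into subdirectories
    }
  }
}
{code}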



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured

2016-07-11 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-10606:
--

 Summary: TrashPolicyDefault supports time of auto clean up can 
configured
 Key: HDFS-10606
 URL: https://issues.apache.org/jira/browse/HDFS-10606
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.7.1
Reporter: He Xiaoqiao


TrashPolicyDefault currently cleans up Trash based on 
[UTC|http://www.worldtimeserver.com/current_time_in_UTC.aspx], and the 
clean-up time is 00:00 UTC. When a large amount of trash data has to be 
auto-cleaned, this blocks the NN for a long time because of the global lock; in 
the most serious situations it can cause some cron job submissions to fail. 
Adding a configuration for the clean-up time would avoid impacting those cron 
jobs at the default time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10548) Remove the long deprecated BlockReaderRemote

2016-07-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370549#comment-15370549
 ] 

Steve Loughran commented on HDFS-10548:
---

I have no opinion; Chris is on vacation this week, so best to hold off until 
he returns and can have his input on this.

> Remove the long deprecated BlockReaderRemote
> 
>
> Key: HDFS-10548
> URL: https://issues.apache.org/jira/browse/HDFS-10548
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10548-v1.patch, HDFS-10548-v2.patch, 
> HDFS-10548-v3.patch
>
>
> To lessen the maintenance burden raised in HDFS-8901, I suggest we remove the 
> {{BlockReaderRemote}} class, which was deprecated a very long time ago. 
> From {{BlockReaderRemote}} header:
> {quote}
>  * @deprecated this is an old implementation that is being left around
>  * in case any issues spring up with the new {@link BlockReaderRemote2}
>  * implementation.
>  * It will be removed in the next release.
> {quote}
> From {{BlockReaderRemote2}} class header:
> {quote}
>  * This is a new implementation introduced in Hadoop 0.23 which
>  * is more efficient and simpler than the older BlockReader
>  * implementation. It should be renamed to BlockReaderRemote
>  * once we are confident in it.
> {quote}
> Going even further, after getting rid of the old class, we could do the 
> rename the comment suggests: BlockReaderRemote2 => BlockReaderRemote.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
The UT TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten passes, but a 
deadlock like HDFS-9874 may still happen in production.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}
Two threads can run the synchronized methods dn.data.removeVolumes and 
data.finalizeBlock concurrently, because dn.data (the mock) and data are not 
the same object (hashes 1903955157 and 1508483764).
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}
So the UT passes.

* Online
When dn.data.removeVolumes runs, the calling thread enters 
FsVolumeImpl.closeAndWait() while holding the dn.data lock, waiting for 
referenceCount() to reach 0; meanwhile another DataXceiver thread may be 
blocked on the dn.data lock while still holding a volume reference. This can 
deadlock, just like HDFS-9874.

* Potential issue in project
Mockito.spy is used in many places across the project; could other tests 
likewise pass while masking deadlocks that occur online?
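
The root cause is easy to reproduce outside HDFS; a self-contained sketch (not 
the actual test code):
{code}
class Counter {
  synchronized void slowOp() throws InterruptedException {
    Thread.sleep(1000);  // hold this instance's monitor
  }
}

final Counter real = new Counter();
final Counter spy = Mockito.spy(real);
// Mockito.spy() builds a *new* instance (a generated subclass with the
// original's state copied in), so spy and real have different monitors.
Thread t = new Thread(new Runnable() {
  public void run() {
    try { spy.slowOp(); } catch (InterruptedException ignored) { }
  }
});
t.start();
real.slowOp();  // runs concurrently with spy.slowOp(): no mutual exclusion
{code}
Any test that swaps a spy into a field while other threads still reach the 
original reference loses the mutual exclusion the production code relies on, 
which is exactly why the test cannot observe the deadlock.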

  was:
The UT: TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten can be ran 
successful, but deadlock like HDFS-9874 maybe happen online.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi da

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
The UT TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten passes, but a 
deadlock like HDFS-9874 may still happen in production.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}
Two threads can run the synchronized methods dn.data.removeVolumes and 
data.finalizeBlock concurrently, because dn.data (the mock) and data are not 
the same object (hashes 1903955157 and 1508483764).
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}
So the UT passes.

* Online
When dn.data.removeVolumes runs, the calling thread enters 
FsVolumeImpl.closeAndWait() while holding the dn.data lock, waiting for 
referenceCount() to reach 0; meanwhile another DataXceiver thread may be 
blocked on the dn.data lock while still holding a volume reference. This can 
deadlock, just like HDFS-9874.

* Potential issue in project
Mockito.spy is used in many places across the project; could other tests 
likewise pass while masking deadlocks that occur online?

  was:
The UT: TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten can be ran 
successful, but deadlock like HDFS-9874 maybe happen online.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = d

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
The UT TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten passes, but a 
deadlock like HDFS-9874 may still happen in production.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}
Two threads can run the synchronized methods dn.data.removeVolumes and 
data.finalizeBlock concurrently, because dn.data (the mock) and data are not 
the same object (hashes 1903955157 and 1508483764).
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}
So the UT passes.

* Online
When dn.data.removeVolumes runs, the calling thread enters 
FsVolumeImpl.closeAndWait() while holding the dn.data lock, waiting for 
referenceCount() to reach 0; meanwhile another DataXceiver thread may be 
blocked on the dn.data lock while still holding a volume reference. This can 
deadlock, just like HDFS-9874.

* Potential issue in project
Mockito.spy is used in many places across the project; could other tests 
likewise pass while masking deadlocks that occur online?

  was:
The UT: TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten can be ran 
successful, but deadlock like HDFS-9874 maybe happen online.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.da

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
The UT TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten passes, but a 
deadlock like HDFS-9874 may still happen in production.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}
Two threads can run the synchronized methods dn.data.removeVolumes and 
data.finalizeBlock concurrently, because dn.data (the mock) and data are not 
the same object.
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}
So the UT passes.

* Online
When dn.data.removeVolumes runs, the calling thread enters 
FsVolumeImpl.closeAndWait() while holding the dn.data lock, waiting for 
referenceCount() to reach 0; meanwhile another DataXceiver thread may be 
blocked on the dn.data lock while still holding a volume reference. This can 
deadlock, just like HDFS-9874.

* Potential issue in project
Mockito.spy is used in many places across the project; could other tests 
likewise pass while masking deadlocks that occur online?

  was:
The UT: TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten can be ran 
successful, but deadlock like HDFS-9874 maybe happen online.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(da

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
The UT TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten passes, but a 
deadlock like HDFS-9874 may still happen in production.
* UT: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}
Two threads can run the synchronized methods dn.data.removeVolumes and 
data.finalizeBlock concurrently, because dn.data (the mock) and data are not 
the same object.
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}
So the UT passes.

* Online
When dn.data.removeVolumes runs, the calling thread enters 
FsVolumeImpl.closeAndWait() while holding the dn.data lock, waiting for 
referenceCount() to reach 0; meanwhile another DataXceiver thread may be 
blocked on the dn.data lock while still holding a volume reference. This can 
deadlock, just like HDFS-9874.

  was:
UT: TestDataNodeHotSwapVolumes.
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.s

[jira] [Commented] (HDFS-10603) Flaky test org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint

2016-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370438#comment-15370438
 ] 

Hadoop QA commented on HDFS-10603:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 14s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817098/HDFS-10603.002.patch |
| JIRA Issue | HDFS-10603 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d41b002fcec3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0fd3980 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16016/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16016/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16016/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Flaky test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> ---
>
> Key: HDFS-10603
> URL: https://issues.apache.org/jira/browse/HDFS-10603
> Project: Hadoop HDFS
>  Issue Type: 

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Summary: Can not synchronized call method of object and 
Mockito.spy(object), So UT:testRemoveVolumeBeingWritten passed but maybe 
deadlock online  (was: Can not synchronized call method of object and 
Mockito.spy(object), So UT:testRemoveVolumeBeingWrittenForDatanode passed but 
maybe deadlock online)

> Can not synchronized call method of object and Mockito.spy(object), So 
> UT:testRemoveVolumeBeingWritten passed but maybe deadlock online
> ---
>
> Key: HDFS-10605
> URL: https://issues.apache.org/jira/browse/HDFS-10605
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2
>Reporter: ade
>  Labels: test
>
> UT: TestDataNodeHotSwapVolumes.
> {code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
> final FsDatasetSpi data = dn.data;
> dn.data = Mockito.spy(data);
> LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
> dn.data.hashCode());
> doAnswer(new Answer() {
>   public Object answer(InvocationOnMock invocation)
>   throws IOException, InterruptedException {
> Thread.sleep(1000);
> // Bypass the argument to FsDatasetImpl#finalizeBlock to verify 
> that
> // the block is not removed, since the volume reference should not
> // be released at this point.
> data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
> return null;
>   }
> }).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
> {code}
> {noformat}
> 2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
> (TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599))
>  - data hash:1903955157; dn.data hash:1508483764
> 2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
> (DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
> dfs.datanode.data.dir to 
> [DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
> 2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
> (Util.java:stringAsURI(56)) - Path 
> /Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
>  should be specified as a URI in configuration files. Please update hdfs 
> configuration.
> 2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
> (DataNode.java:removeVolumes(674)) - Deactivating volumes (clear 
> failure=true): 
> /Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
> 2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
> (FsDatasetImpl.java:removeVolumes(459)) - Removing 
> /Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
>  from FsDataset.
> 2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
> (FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
> hash:1508483764
> 2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
> (BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
> /Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
>  (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
> 2016-07-11 16:16:07,836 INFO  
> [VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
>  datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
> VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
>  DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
> 2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
> blockmanagement.DatanodeDescriptor 
> (DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
> [DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
> DataNode127.0.0.1:63548
> 2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
> blockmanagement.DatanodeDescriptor 
> (DatanodeDescriptor.java:updateStorage(866)) - Adding new storage ID 
> DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 127.0.0.1:63548
> 2016-07-11 16:16:08,845 INFO  [PacketResponder: 
> BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
> type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
> (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
> hash:1903955157
> 2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
> [Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
> impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finali

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWrittenForDatanode passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
UT: TestDataNodeHotSwapVolumes.
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}

{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}

  was:
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}

{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data h

[jira] [Updated] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWrittenForDatanode passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ade updated HDFS-10605:
---
Description: 
{code:title=TestDataNodeHotSwapVolumes.java|borderStyle=solid}
final FsDatasetSpi data = dn.data;
dn.data = Mockito.spy(data);
LOG.info("data hash:" + data.hashCode() + "; dn.data hash:" + 
dn.data.hashCode());
doAnswer(new Answer() {
  public Object answer(InvocationOnMock invocation)
  throws IOException, InterruptedException {
Thread.sleep(1000);
// Bypass the argument to FsDatasetImpl#finalizeBlock to verify that
// the block is not removed, since the volume reference should not
// be released at this point.
data.finalizeBlock((ExtendedBlock) invocation.getArguments()[0]);
return null;
  }
}).when(dn.data).finalizeBlock(any(ExtendedBlock.class));
{code}

{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}

  was:
{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:0

[jira] [Created] (HDFS-10605) Can not synchronized call method of object and Mockito.spy(object), So UT:testRemoveVolumeBeingWrittenForDatanode passed but maybe deadlock online

2016-07-11 Thread ade (JIRA)
ade created HDFS-10605:
--

 Summary: Can not synchronized call method of object and 
Mockito.spy(object), So UT:testRemoveVolumeBeingWrittenForDatanode passed but 
maybe deadlock online
 Key: HDFS-10605
 URL: https://issues.apache.org/jira/browse/HDFS-10605
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.2, 2.7.1, 2.7.0, 2.8.0
Reporter: ade


{noformat}
2016-07-11 16:16:07,788 INFO  [Thread-0] datanode.TestDataNodeHotSwapVolumes 
(TestDataNodeHotSwapVolumes.java:testRemoveVolumeBeingWrittenForDatanode(599)) 
- data hash:1903955157; dn.data hash:1508483764
2016-07-11 16:16:07,801 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:reconfigurePropertyImpl(456)) - Reconfiguring 
dfs.datanode.data.dir to 
[DISK]file:/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2
2016-07-11 16:16:07,810 WARN  [Thread-157] common.Util 
(Util.java:stringAsURI(56)) - Path 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 should be specified as a URI in configuration files. Please update hdfs 
configuration.
2016-07-11 16:16:07,811 INFO  [Thread-157] datanode.DataNode 
(DataNode.java:removeVolumes(674)) - Deactivating volumes (clear failure=true): 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(459)) - Removing 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 from FsDataset.
2016-07-11 16:16:07,836 INFO  [Thread-157] impl.FsDatasetImpl 
(FsDatasetImpl.java:removeVolumes(463)) - removeVolumes of object 
hash:1508483764
2016-07-11 16:16:07,836 INFO  [Thread-157] datanode.BlockScanner 
(BlockScanner.java:removeVolumeScanner(243)) - Removing scanner for volume 
/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1
 (StorageID DS-f4df3404-9f02-470e-b202-75f5a4de29cb)
2016-07-11 16:16:07,836 INFO  
[VolumeScannerThread(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1)]
 datanode.VolumeScanner (VolumeScanner.java:run(630)) - 
VolumeScanner(/Users/ade/workspace/Hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1,
 DS-f4df3404-9f02-470e-b202-75f5a4de29cb) exiting.
2016-07-11 16:16:07,891 INFO  [IPC Server handler 7 on 63546] 
blockmanagement.DatanodeDescriptor 
(DatanodeDescriptor.java:pruneStorageMap(517)) - Removed storage 
[DISK]DS-f4df3404-9f02-470e-b202-75f5a4de29cb:NORMAL:127.0.0.1:63548 from 
DataNode127.0.0.1:63548
2016-07-11 16:16:07,908 INFO  [IPC Server handler 9 on 63546] 
blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateStorage(866)) 
- Adding new storage ID DS-f4df3404-9f02-470e-b202-75f5a4de29cb for DN 
127.0.0.1:63548
2016-07-11 16:16:08,845 INFO  [PacketResponder: 
BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001, 
type=LAST_IN_PIPELINE, downstreams=0:[]] impl.FsDatasetImpl 
(FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of object 
hash:1903955157
2016-07-11 16:16:12,933 INFO  [DataXceiver for client  at /127.0.0.1:63574 
[Receiving block BP-1077872064-127.0.0.1-1468224964600:blk_1073741825_1001]] 
impl.FsDatasetImpl (FsDatasetImpl.java:finalizeBlock(1559)) - finalizeBlock of 
object hash:1903955157
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10604) What about this?Group DNs and add DN groups--named region to HDFS model , use this region to instead of single DN when saving files.

2016-07-11 Thread Doris Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doris Gu updated HDFS-10604:

Description: 
The biggest difference this feature would bring is *storing all blocks that 
belong to the same file in the same region (DN group).*
So the process would be:
1. Configure DN groups, for example:
bq.Region1:dn1,dn2,dn3
bq.Region2:dn4,dn5,dn6
bq.Region3:dn7,dn8,dn9,dn10

2. When a client uploads a file, first check whether the file already has any 
blocks:
bq.i) Yes: assign the new blocks to the DN group that holds the existing blocks.
bq.ii) No: assign the new blocks to a DN group chosen by some policy that 
avoids imbalance.

3. Other related processes, including append, the balancer, etc., would also 
need to be modified.

The benefit we hope for is that when several DNs go down at the same time, the 
number of affected files (those missing all replicas) stays small.
But we are wondering whether this is worth doing, or whether there are problems 
we haven't noticed. (A sketch of what the placement step could look like 
follows below.)
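
To make the proposal concrete, the placement step could look roughly like the 
following. All names here ({{Region}}, {{regionOf}}, {{chooseGroup}}, 
{{regions}}) are illustrative; there is no such API today:
{code}
// Hypothetical helper inside a custom block placement policy.
Region chooseGroup(INodeFile file) {
  BlockInfo[] existing = file.getBlocks();
  if (existing.length > 0) {
    // Rule 1: keep all blocks of one file inside one region.
    return regionOf(existing[0]);
  }
  // Rule 2: new file, so pick the least-loaded region to avoid imbalance.
  Region best = null;
  for (Region r : regions) {
    if (best == null || r.usedSpace() < best.usedSpace()) {
      best = r;
    }
  }
  return best;
}
{code}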

  was:
The biggest difference this feature will bring is *strong* making blocks belong 
to the same file to save in the same region(DN group).*strong*
So the process will be:
1.Config DN groups, for example
bq.Region1:dn1,dn2,dn3
bq.Region2:dn4,dn5,dn6
bq.Region3:dn7,dn8,dn9,dn10

2.Client uploads a file, first analyze whether this file has any existed blocks:
bq.i)Yes:assign new blocks to the DN group where the existed blocks belong to.
bq.ii)No:assign new blocks to a DN group which is chosen by some certain policy 
to avoid imbalance.

3.Other related processes,including append,balancer etc. also need to modify as 
well.   

The benefit we wish is when some DNs are down at the same time, the number of 
affected files(miss all replicas) is small.
But we are wondering if this is worth doing or not, or if there are problems we 
haven't noticed.


> What about this?Group DNs and add DN groups--named region to HDFS model , use 
> this region to instead of single DN when saving files.
> 
>
> Key: HDFS-10604
> URL: https://issues.apache.org/jira/browse/HDFS-10604
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: Doris Gu
>
> The biggest difference this feature would bring is *storing all blocks that 
> belong to the same file in the same region (DN group).*
> So the process would be:
> 1. Configure DN groups, for example:
> bq.Region1:dn1,dn2,dn3
> bq.Region2:dn4,dn5,dn6
> bq.Region3:dn7,dn8,dn9,dn10
> 2. When a client uploads a file, first check whether the file already has 
> any blocks:
> bq.i) Yes: assign the new blocks to the DN group that holds the existing 
> blocks.
> bq.ii) No: assign the new blocks to a DN group chosen by some policy that 
> avoids imbalance.
> 3. Other related processes, including append, the balancer, etc., would also 
> need to be modified.
> The benefit we hope for is that when several DNs go down at the same time, 
> the number of affected files (those missing all replicas) stays small.
> But we are wondering whether this is worth doing, or whether there are 
> problems we haven't noticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10604) What about this?Group DNs and add DN groups--named region to HDFS model , use this region to instead of single DN when saving files.

2016-07-11 Thread Doris Gu (JIRA)
Doris Gu created HDFS-10604:
---

 Summary: What about this?Group DNs and add DN groups--named region 
to HDFS model , use this region to instead of single DN when saving files.
 Key: HDFS-10604
 URL: https://issues.apache.org/jira/browse/HDFS-10604
 Project: Hadoop HDFS
  Issue Type: Wish
Reporter: Doris Gu


The biggest difference this feature would bring is *storing all blocks that 
belong to the same file in the same region (DN group).*
So the process would be:
1. Configure DN groups, for example:
bq.Region1:dn1,dn2,dn3
bq.Region2:dn4,dn5,dn6
bq.Region3:dn7,dn8,dn9,dn10

2. When a client uploads a file, first check whether the file already has any 
blocks:
bq.i) Yes: assign the new blocks to the DN group that holds the existing blocks.
bq.ii) No: assign the new blocks to a DN group chosen by some policy that 
avoids imbalance.

3. Other related processes, including append, the balancer, etc., would also 
need to be modified.

The benefit we hope for is that when several DNs go down at the same time, the 
number of affected files (those missing all replicas) stays small.
But we are wondering whether this is worth doing, or whether there are problems 
we haven't noticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10603) Flaky test org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint

2016-07-11 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10603:
-
Attachment: (was: HDFS-10603.002.patch)

> Flaky test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> ---
>
> Key: HDFS-10603
> URL: https://issues.apache.org/jira/browse/HDFS-10603
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10603.001.patch, HDFS-10603.002.patch
>
>
> Test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> may fail intermittently as follows:
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.386 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> testWithCheckpoint(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
>   Time elapsed: 15.092 sec  <<< ERROR!
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint(TestOpenFilesWithSnapshot.java:94)
> Results :
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 » IO Timed out waiting for 
> Min...
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0
> {code}






[jira] [Issue Comment Deleted] (HDFS-10603) Flaky test org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint

2016-07-11 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10603:
-
Comment: was deleted

(was: I found that the test {{TestOpenFilesWithSnapshot#testOpenFilesWithRename}} 
also fails sometimes. The stack trace:
{code}
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
testOpenFilesWithRename(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
  Time elapsed: 14.069 sec  <<< ERROR!
java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithRename(TestOpenFilesWithSnapshot.java:210)
{code}
I then reviewed the other tests in {{TestOpenFilesWithSnapshot}} and fixed 
them as well. Posting the new patch, pending Jenkins.)

> Flaky test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> ---
>
> Key: HDFS-10603
> URL: https://issues.apache.org/jira/browse/HDFS-10603
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10603.001.patch, HDFS-10603.002.patch
>
>
> Test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> may fail intermittently as
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.386 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> testWithCheckpoint(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
>   Time elapsed: 15.092 sec  <<< ERROR!
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint(TestOpenFilesWithSnapshot.java:94)
> Results :
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 » IO Timed out waiting for 
> Min...
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0
> {code}






[jira] [Updated] (HDFS-10603) Flaky test org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint

2016-07-11 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10603:
-
Attachment: HDFS-10603.002.patch

I found that the test {{TestOpenFilesWithSnapshot#testOpenFilesWithRename}} also 
fails sometimes. The stack trace:
{code}
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
testOpenFilesWithRename(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
  Time elapsed: 14.069 sec  <<< ERROR!
java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithRename(TestOpenFilesWithSnapshot.java:210)
{code}
I then reviewed the other tests in {{TestOpenFilesWithSnapshot}} and fixed 
them as well. Posting the new patch, pending Jenkins.
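
Purely as an illustration (this is not the attached patch), the general 
pattern for hardening such tests is to block until the restarted NameNode is 
actually serving again before the test proceeds, e.g. via 
{{MiniDFSCluster#restartNameNode(true)}} and {{waitClusterUp()}}:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class RestartNameNodeExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      cluster.waitActive();
      // Restart the NameNode and ask MiniDFSCluster to wait until the
      // cluster is active again before returning.
      cluster.restartNameNode(true);
      // Extra guard: block until the cluster reports itself up.
      cluster.waitClusterUp();
    } finally {
      cluster.shutdown();
    }
  }
}
{code}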

> Flaky test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> ---
>
> Key: HDFS-10603
> URL: https://issues.apache.org/jira/browse/HDFS-10603
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10603.001.patch, HDFS-10603.002.patch, 
> HDFS-10603.002.patch
>
>
> Test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> may fail intermittently as
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.386 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> testWithCheckpoint(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
>   Time elapsed: 15.092 sec  <<< ERROR!
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint(TestOpenFilesWithSnapshot.java:94)
> Results :
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 » IO Timed out waiting for 
> Min...
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0
> {code}






[jira] [Updated] (HDFS-10603) Flaky test org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint

2016-07-11 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10603:
-
Attachment: HDFS-10603.002.patch

I found that the test {{TestOpenFilesWithSnapshot#testOpenFilesWithRename}} also 
fails sometimes. The stack trace:
{code}
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
testOpenFilesWithRename(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
  Time elapsed: 14.069 sec  <<< ERROR!
java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithRename(TestOpenFilesWithSnapshot.java:210)
{code}
I then reviewed the other tests in {{TestOpenFilesWithSnapshot}} and fixed 
them as well. Posting the new patch, pending Jenkins.

> Flaky test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> ---
>
> Key: HDFS-10603
> URL: https://issues.apache.org/jira/browse/HDFS-10603
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10603.001.patch, HDFS-10603.002.patch
>
>
> Test 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint
> may fail intermittently as
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.386 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
> testWithCheckpoint(org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot)
>   Time elapsed: 15.092 sec  <<< ERROR!
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1363)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2041)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2011)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testWithCheckpoint(TestOpenFilesWithSnapshot.java:94)
> Results :
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 » IO Timed out waiting for 
> Min...
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0
> {code}


