[jira] [Commented] (HDFS-9267) TestDiskError should get stored replicas through FsDatasetTestUtils.

2015-11-06 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995112#comment-14995112
 ] 

Lei (Eddy) Xu commented on HDFS-9267:
-

Will fix these tests in the next patch.

> TestDiskError should get stored replicas through FsDatasetTestUtils.
> 
>
> Key: HDFS-9267
> URL: https://issues.apache.org/jira/browse/HDFS-9267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9267.00.patch, HDFS-9267.01.patch, 
> HDFS-9267.02.patch, HDFS-9267.03.patch
>
>
> {{TestDiskError#testReplicationError}} scans local directories to verify 
> blocks and metadata files, which leaks the details of the {{FsDataset}} 
> implementation.
> This JIRA will abstract the "scanning" operation to {{FsDatasetTestUtils}}.
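A minimal sketch of what such an abstraction could look like, assuming a
getStoredReplicas-style accessor and a small replica handle; the names below
are illustrative only, not necessarily the API added by the patch:

{code:java}
import java.io.IOException;
import java.util.Iterator;

/** Illustrative handle for a stored replica (assumed, not the real type). */
interface StoredReplica {
  long getBlockId();
  long getNumBytes();
}

/**
 * Sketch of the test-utility abstraction: the test asks the dataset-specific
 * utility for its stored replicas instead of scanning the FsDataset
 * directory layout itself.
 */
interface FsDatasetTestUtilsSketch {
  /** Replicas currently stored for the given block pool id. */
  Iterator<StoredReplica> getStoredReplicas(String bpid) throws IOException;
}
{code}

With something of this shape, {{TestDiskError#testReplicationError}} can iterate 
the returned replicas and verify block/meta state through the handles, so the 
test no longer depends on how a particular {{FsDataset}} lays files out on disk.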



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications

2015-11-06 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9381:
--
Description: 
Currently I noticed that, in the replication flow for striped blocks, we just 
return null if the block already exists in pendingReplications.

{code}
if (block.isStriped()) {
  if (pendingNum > 0) {
// Wait the previous recovery to finish.
return null;
  }
{code}

 Here, if we just return null and neededReplications contains only a few 
blocks (by default, fewer than numLiveNodes * 2), the same blocks can be picked 
again from neededReplications in the next loop, because we are not removing the 
element from neededReplications. Since this replication processing has to take 
the FSNamesystem lock, we may spend time unnecessarily in every loop. 

So my suggestion/improvement is:
 Instead of just returning null, how about incrementing pendingReplications for 
this block and removing it from neededReplications? Another point to consider 
here is that, to add into pendingReplications, we generally need a target, 
which is simply the node to which we issued the replication command. Later, 
after the replication succeeds and the DN reports it, the block is removed from 
pendingReplications in the NN's addBlock path. 

 Since this is a newly picked block from neededReplications, we would not have 
selected a target yet. So which target should be passed to pendingReplications 
if we add this block? One option I am thinking of is to just pass srcNode 
itself as the target for this special condition. If the block is really missed, 
srcNode will not report it, so this block will not be removed from 
pendingReplications; when it times out, it will be considered for replication 
again, and at that time it will find an actual target to replicate to while 
being processed as part of the regular replication flow.


  was:
Currently I noticed that, in the replication flow for striped blocks, we just 
return null if the block already exists in pendingReplications.

{code}
if (block.isStriped()) {
  if (pendingNum > 0) {
// Wait the previous recovery to finish.
return null;
  }
{code}

 Here, if neededReplications contains only a few blocks (by default, fewer than 
numLiveNodes * 2), the same blocks can be picked again from neededReplications 
if we just return null, because we are not removing the element from 
neededReplications. Since this replication processing has to take the 
FSNamesystem lock, we may spend time unnecessarily in every loop. 

So my suggestion/improvement is:
 Instead of just returning null, how about incrementing pendingReplications for 
this block and removing it from neededReplications? Another point to consider 
here is that, to add into pendingReplications, we generally need a target, 
which is the node to which we issued the replication command. Later, after the 
replication succeeds and the DN reports it, the block is removed from 
pendingReplications in the NN's addBlock path. 

 Since this is a newly picked block from neededReplications, we would not have 
selected a target yet. So which target should be passed to pendingReplications 
if we add this block? One option I am thinking of is to just pass srcNode 
itself as the target for this special condition. If the block is really missed, 
srcNode will not report it, so this block will not be removed from 
pendingReplications; when it times out, it will be considered for replication 
and at that time it will find an actual target to replicate to.





 




> When same block came for replication for Striped mode, we can move that block 
> to PendingReplications
> 
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, namenode
>Affects Versions: 3.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> Currently I noticed that, in the replication flow for striped blocks, we 
> just return null if the block already exists in pendingReplications.
> {code}
> if (block.isStriped()) {
>   if (pendingNum > 0) {
> // Wait the previous recovery to finish.
> return null;
>   }
> {code}
>  Here, if we just return null and neededReplications contains only a few 
> blocks (by default, fewer than numLiveNodes * 2), the same blocks can be 
> picked again from neededReplications in the next loop, because we are not 
> removing the element from neededReplications. Since this replication 
> processing has to take the FSNamesystem lock, we may spend time unnecessarily 
> in every loop. 
> So my suggestion/improvement is:
>  Instead of just returning null, how about incrementing pending

[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995073#comment-14995073
 ] 

Hudson commented on HDFS-9379:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1372 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1372/])
HDFS-9379. Make NNThroughputBenchmark support more than 10 datanodes. (arp: rev 
2801b42a7e178ad6a0e6b0f29f22f3571969c530)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.
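As a small, self-contained illustration of both the pitfall and the fix (not 
the benchmark code itself):

{code}
public class XferAddrOrderDemo {
  public static void main(String[] args) {
    String a = "192.168.54.40:9";
    String b = "192.168.54.40:10";

    // Prints a positive value: ":9" sorts after ":10" lexicographically,
    // which breaks the sortedness assumption behind the binary search.
    System.out.println(a.compareTo(b));

    // Deriving the index from the port avoids the ordering assumption
    // entirely (in the benchmark, port == tiny-datanode index + 1).
    int port = Integer.parseInt(b.substring(b.lastIndexOf(':') + 1));
    int datanodeIndex = port - 1;
    System.out.println("datanode index = " + datanodeIndex);  // prints 9
  }
}
{code}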



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9267) TestDiskError should get stored replicas through FsDatasetTestUtils.

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995068#comment-14995068
 ] 

Hadoop QA commented on HDFS-9267:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 5s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 12s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs introduced 1 new FindBugs 
issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 129m 13s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 182m 48s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 2m 27s 
{color} | {color:red} Patch generated 57 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 337m 19s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice$BlockPoolSliceReplicaIterator$DirIterator.next()
 can't throw NoSuchElementException  At BlockPoolSlice.java:At 
BlockPoolSlice.java:[line 456] |
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.server.namenode.TestSecurityTokenEditLog |
|   | hadoop.hdfs.server.datanode.T

[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995062#comment-14995062
 ] 

Hudson commented on HDFS-9379:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #649 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/649/])
HDFS-9379. Make NNThroughputBenchmark support more than 10 datanodes. (arp: rev 
2801b42a7e178ad6a0e6b0f29f22f3571969c530)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java


> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995047#comment-14995047
 ] 

Hudson commented on HDFS-9379:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2579 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2579/])
HDFS-9379. Make NNThroughputBenchmark support more than 10 datanodes. (arp: rev 
2801b42a7e178ad6a0e6b0f29f22f3571969c530)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2261) AOP unit tests are not getting compiled or run

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995038#comment-14995038
 ] 

Hadoop QA commented on HDFS-2261:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 26 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 16s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 28s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 29s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 58s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 13s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 188m 28s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.ha.TestZKFailoverController |
|   | hadoop.metrics2.impl.TestGangliaMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| JDK v1.7.0_79 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | 

[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995032#comment-14995032
 ] 

Hudson commented on HDFS-9379:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #639 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/639/])
HDFS-9379. Make NNThroughputBenchmark support more than 10 datanodes. (arp: rev 
2801b42a7e178ad6a0e6b0f29f22f3571969c530)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java


> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995014#comment-14995014
 ] 

Hudson commented on HDFS-9236:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #579 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/579/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rStates (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can be caused 
> either by faulty DN code or by stale/modified/corrupted files on the DN. When 
> this happens, the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with the correct length) will cause the block to be marked 
> as corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won't be able to recover the lease 
> and close the file because the last block is under-replicated.
> I believe we need a sanity check for block size on both the DN and the NN to 
> prevent such a case from happening.
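A minimal sketch of the kind of guard being argued for, usable by the DN before 
calling commitBlockSynchronization and by the NN before applying the new 
length; this is illustrative only, not the committed patch:

{code:java}
import java.io.IOException;

final class RecoveredLengthCheck {
  private RecoveredLengthCheck() {}

  /** Reject obviously bogus recovered lengths before they are committed. */
  static void checkRecoveredLength(long newLength, String block)
      throws IOException {
    if (newLength < 0 || newLength == Long.MAX_VALUE) {
      throw new IOException("Refusing to commit block synchronization for "
          + block + ": recovered length " + newLength + " is not sane"
          + " (no participating replica matched the best state?)");
    }
  }
}
{code}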



--
This message was sent by Atlassian JIRA
(v6.3

[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995016#comment-14995016
 ] 

Hudson commented on HDFS-9318:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #579 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/579/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.
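For reference, a minimal sketch of the load check involved, with the kind of 
configurable factor the change introduces; the names and the exact formula are 
assumptions, not the code in BlockPlacementPolicyDefault:

{code:java}
/** Illustrative only: skip a node as "too busy" based on a tunable factor. */
final class ConsiderLoadSketch {
  private ConsiderLoadSketch() {}

  /**
   * @param nodeXceiverCount   active transceivers on the candidate node
   * @param avgXceiverCount    cluster-wide average of active transceivers
   * @param considerLoadFactor tunable multiplier (assumed default around 2.0)
   */
  static boolean excludeNodeByLoad(int nodeXceiverCount,
      double avgXceiverCount, double considerLoadFactor) {
    return nodeXceiverCount > considerLoadFactor * avgXceiverCount;
  }
}
{code}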



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995015#comment-14995015
 ] 

Hudson commented on HDFS-6481:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #579 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/579/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.3
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOu

[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994989#comment-14994989
 ] 

Hudson commented on HDFS-9379:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8770 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8770/])
HDFS-9379. Make NNThroughputBenchmark support more than 10 datanodes. (arp: rev 
2801b42a7e178ad6a0e6b0f29f22f3571969c530)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java


> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994975#comment-14994975
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2518/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.3
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994974#comment-14994974
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2518/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rStates (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can be caused 
> either by faulty DN code or by stale/modified/corrupted files on the DN. When 
> this happens, the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with the correct length) will cause the block to be marked 
> as corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won't be able to recover the lease 
> and close the file because the last block is under-replicated.
> I believe we need a sanity check for block size on both the DN and the NN to 
> prevent such a case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994976#comment-14994976
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2518/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9379:

  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.8.0
Target Version/s:   (was: 2.8.0)
  Status: Resolved  (was: Patch Available)

Committed for 2.8.0. Thanks for the contribution [~liuml07].

> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994954#comment-14994954
 ] 

Arpit Agarwal commented on HDFS-9379:
-

Thanks for confirming you tested it manually. I will commit this shortly.

> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in lexicographical order of each 
> datanode's {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will 
> be invalid as the string value of datanode index is not in lexicographical 
> order any more. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will 
> fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2015-11-06 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8905:

Status: Patch Available  (was: Open)

> Refactor DFSInputStream#ReaderStrategy
> --
>
> Key: HDFS-8905
> URL: https://issues.apache.org/jira/browse/HDFS-8905
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch
>
>
> The DFSInputStream#ReaderStrategy family doesn't look very good. This 
> refactors it a little bit so the classes make more sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2015-11-06 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8905:

Status: Open  (was: Patch Available)

> Refactor DFSInputStream#ReaderStrategy
> --
>
> Key: HDFS-8905
> URL: https://issues.apache.org/jira/browse/HDFS-8905
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch
>
>
> The DFSInputStream#ReaderStrategy family doesn't look very good. This 
> refactors it a little bit to make the classes make more sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2015-11-06 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8905:

Fix Version/s: (was: HDFS-7285)

> Refactor DFSInputStream#ReaderStrategy
> --
>
> Key: HDFS-8905
> URL: https://issues.apache.org/jira/browse/HDFS-8905
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch
>
>
> The DFSInputStream#ReaderStrategy family doesn't look very good. This 
> refactors it a little bit to make the classes make more sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994868#comment-14994868
 ] 

Hadoop QA commented on HDFS-9398:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 32s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 7s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
28s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 20s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-11-07 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12771143/HDFS-9398.000.patch |
| JIRA Issue | HDFS-9398 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  
checkstyle  compile  |
| uname | Linux 0a08e6ac7939 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-ee5baeb/precommit/personality/hadoop.sh
 |
| git revision | trunk / bf6aa30 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_60 
/usr/lib

[jira] [Commented] (HDFS-9364) Unnecessary DNS resolution attempts when creating NameNodeProxies

2015-11-06 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994859#comment-14994859
 ] 

Xiao Chen commented on HDFS-9364:
-

Thanks [~zhz], attached patch 4 with the fix.

> Unnecessary DNS resolution attempts when creating NameNodeProxies
> -
>
> Key: HDFS-9364
> URL: https://issues.apache.org/jira/browse/HDFS-9364
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, performance
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9364.001.patch, HDFS-9364.002.patch, 
> HDFS-9364.003.patch, HDFS-9364.004.patch
>
>
> When creating NameNodeProxies, we always try to DNS-resolve namenode URIs. 
> This is unnecessary if the URI is logical, and may be significantly slow if 
> the DNS is having problems. 
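A rough, standalone sketch of the intended behavior (not the actual
NameNodeProxies code; the class and method names are invented, and 8020 is only
assumed as the default NameNode RPC port):
{code:java}
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.Set;

// Resolve only physical host:port authorities; a logical HA nameservice ID is
// handed to the failover proxy provider without any DNS lookup.
class LogicalUriExample {
  static InetSocketAddress resolveIfPhysical(URI nnUri, Set<String> nameserviceIds) {
    String host = nnUri.getHost();
    if (host == null || nameserviceIds.contains(host)) {
      return null;                                            // logical URI: skip DNS resolution
    }
    int port = nnUri.getPort() > 0 ? nnUri.getPort() : 8020;  // assumed default RPC port
    return new InetSocketAddress(host, port);                 // this constructor performs the lookup
  }
}
{code}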



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9364) Unnecessary DNS resolution attempts when creating NameNodeProxies

2015-11-06 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9364:

Attachment: HDFS-9364.004.patch

> Unnecessary DNS resolution attempts when creating NameNodeProxies
> -
>
> Key: HDFS-9364
> URL: https://issues.apache.org/jira/browse/HDFS-9364
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, performance
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9364.001.patch, HDFS-9364.002.patch, 
> HDFS-9364.003.patch, HDFS-9364.004.patch
>
>
> When creating NameNodeProxies, we always try to DNS-resolve namenode URIs. 
> This is unnecessary if the URI is logical, and may be significantly slow if 
> the DNS is having problems. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9399) Ability to disable HDFS browsing via browseDirectory.jsp , make it configurable

2015-11-06 Thread Raghu C Doppalapudi (JIRA)
Raghu C Doppalapudi created HDFS-9399:
-

 Summary: Ability to disable HDFS browsing via browseDirectory.jsp 
, make it configurable
 Key: HDFS-9399
 URL: https://issues.apache.org/jira/browse/HDFS-9399
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Raghu C Doppalapudi
Assignee: Raghu C Doppalapudi
Priority: Minor


Currently there is no config property available in HDFS to disable the file 
browsing capability. Make it configurable.
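A hedged sketch of what such a switch might look like (the property name and
default below are purely hypothetical; no such key exists yet):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical guard the browsing servlet could consult before rendering a
// directory listing.
class BrowseDirectoryGuard {
  static boolean browsingEnabled(Configuration conf) {
    return conf.getBoolean("dfs.web.browse-directory.enabled", true);
  }
}
{code}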



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9394) branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader initialization, because HftpFileSystem is missing.

2015-11-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994814#comment-14994814
 ] 

Mingliang Liu commented on HDFS-9394:
-

Test {{org.apache.hadoop.hdfs.TestRollingUpgradeRollback}} fails in branch-2, 
while all other tests pass locally. Does the failure seem unrelated?

> branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader 
> initialization, because HftpFileSystem is missing.
> 
>
> Key: HDFS-9394
> URL: https://issues.apache.org/jira/browse/HDFS-9394
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9394.000.branch-2.patch
>
>
> On branch-2, hadoop-hdfs-client contains a {{FileSystem}} service descriptor 
> that lists {{HftpFileSystem}} and {{HsftpFileSystem}}.  These classes do not 
> reside in hadoop-hdfs-client.  Instead, they reside in hadoop-hdfs.  If the 
> application has hadoop-hdfs-client.jar on the classpath, but not 
> hadoop-hdfs.jar, then this can cause a {{ServiceConfigurationError}}.
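A minimal standalone illustration of the failure mode (it assumes
hadoop-hdfs-client is on the classpath but hadoop-hdfs is not; the class name is
made up):
{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.fs.FileSystem;

// Iterating the loader throws ServiceConfigurationError as soon as it reaches a
// provider listed in META-INF/services, such as HftpFileSystem, whose class
// cannot be loaded.
public class FileSystemServiceLoaderDemo {
  public static void main(String[] args) {
    for (FileSystem fs : ServiceLoader.load(FileSystem.class)) {
      System.out.println(fs.getClass().getName());
    }
  }
}
{code}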



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-11-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994784#comment-14994784
 ] 

Mingliang Liu commented on HDFS-8971:
-

Thanks for your suggestion, [~szetszwo]. I filed [HDFS-9398] to track the effort 
of reverting the {{ByteArrayManager}} log-message changes.

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8971.000.patch, HDFS-8971.001.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ logger with the _slf4j_ logger. There was existing 
> code in the client package that guarded calls to {{LOG.debug()}} and 
> {{LOG.trace()}}; e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9398:

Attachment: HDFS-9398.000.patch

> Make ByteArraryManager log message in one-line format
> -
>
> Key: HDFS-9398
> URL: https://issues.apache.org/jira/browse/HDFS-9398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9398.000.patch
>
>
> Per discussion in [HDFS-8971], {{ByteArrayManager}} should use one-line log 
> messages. They are much easier to read, especially when multiple threads are 
> logging. The easy fix is to restore the format used before [HDFS-8971].
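A small illustration of the difference (the messages and variable names are
invented for the sketch, not the real {{ByteArrayManager}} output):
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OneLineLogDemo {
  private static final Logger LOG = LoggerFactory.getLogger(OneLineLogDemo.class);

  public static void main(String[] args) {
    int arrayLength = 1024;
    int count = 3;
    // Multi-line message: fragments from different threads interleave in the log.
    LOG.debug("allocate(" + arrayLength + ")\n  count = " + count);
    // One-line message: each event is a single, self-contained log record.
    LOG.debug("allocate({}): count={}", arrayLength, count);
  }
}
{code}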



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9398:

Status: Patch Available  (was: Open)

> Make ByteArraryManager log message in one-line format
> -
>
> Key: HDFS-9398
> URL: https://issues.apache.org/jira/browse/HDFS-9398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9398.000.patch
>
>
> Per discussion in [HDFS-8971], {{ByteArrayManager}} should use one-line log 
> messages. They are much easier to read, especially when multiple threads are 
> logging. The easy fix is to restore the format used before [HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9398:

Description: Per discussion in [HDFS-8971], the {{ByteArrayManager}} should 
use one-line message. It's for sure easy to read, especially in case of 
multiple-threads. The easy fix is to use the old format before [HDFS-8971].  
(was: Per discussion in [HDFS-8971], the {{ByteArrayManager}} should use 
one-line message in ByteArrayManager. It's for sure easy to read, especially in 
case of multiple-threads. The easy fix is to use the old format before 
[HDFS-8971].)

> Make ByteArraryManager log message in one-line format
> -
>
> Key: HDFS-9398
> URL: https://issues.apache.org/jira/browse/HDFS-9398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> Per discussion in [HDFS-8971], {{ByteArrayManager}} should use one-line log 
> messages. They are much easier to read, especially when multiple threads are 
> logging. The easy fix is to restore the format used before [HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-9398:
---

 Summary: Make ByteArraryManager log message in one-line format
 Key: HDFS-9398
 URL: https://issues.apache.org/jira/browse/HDFS-9398
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Mingliang Liu
Assignee: Mingliang Liu


Per discussion in [HDFS-8971], {{ByteArrayManager}} should use one-line log 
messages. They are much easier to read, especially when multiple threads are 
logging. The easy fix is to restore the format used before [HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9398) Make ByteArraryManager log message in one-line format

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9398:

Issue Type: Improvement  (was: Bug)

> Make ByteArraryManager log message in one-line format
> -
>
> Key: HDFS-9398
> URL: https://issues.apache.org/jira/browse/HDFS-9398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> Per discussion in [HDFS-8971], {{ByteArrayManager}} should use one-line log 
> messages. They are much easier to read, especially when multiple threads are 
> logging. The easy fix is to restore the format used before [HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9394) branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader initialization, because HftpFileSystem is missing.

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994771#comment-14994771
 ] 

Hadoop QA commented on HDFS-9394:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
8s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} branch-2 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 54s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in branch-2 has 1 extant 
Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 46s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in branch-2 has 5 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 11s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 10s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 25s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 152m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.TestRollingUpgradeRollback |
|   | hadoop.hdfs.TestDistributedFileSystem

[jira] [Updated] (HDFS-9397) Fix typo for readChecksum() LOG.warn in BlockSender.java

2015-11-06 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-9397:
---
Assignee: Nicole Pazmany

> Fix typo for readChecksum() LOG.warn in BlockSender.java
> 
>
> Key: HDFS-9397
> URL: https://issues.apache.org/jira/browse/HDFS-9397
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Enrique Flores
>Assignee: Nicole Pazmany
>Priority: Trivial
> Attachments: HDFS-9397.patch
>
>
> Typo for the word "verify" found in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L647
>  
> {code}
>   LOG.warn(" Could not read or failed to veirfy checksum for data"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994756#comment-14994756
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2578 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2578/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.3
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOu

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994755#comment-14994755
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2578 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2578/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>      List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible that none of the rStates (reported by DNs with copies of 
> the replica being recovered) match the bestState. This can be caused either by 
> faulty DN code or by stale/modified/corrupted files on the DN. When this 
> happens, the DN ends up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with the correct length) will cause the block to be marked 
> as corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won't be able to recover the lease 
> and close the file because the last block is under-replicated.
> I believe we need a sanity check for block size on both the DN and the NN to 
> prevent such a case from happening.
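A hedged sketch of the NN-side guard suggested above (illustrative only; the
method name is made up and the committed patch may place or implement the check
differently):
{code:java}
// Reject an obviously bogus length before the stored block is updated in
// commitBlockSynchronization, instead of silently accepting Long.MAX_VALUE
// from a faulty recovery.
static void checkSynchronizationLength(long newLength) {
  if (newLength < 0 || newLength == Long.MAX_VALUE) {
    throw new IllegalArgumentException(
        "Rejecting commitBlockSynchronization: suspicious newlength=" + newLength);
  }
}
{code}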



--
This message was sent by Atlassian JIRA
(v6.3

[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994757#comment-14994757
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2578 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2578/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.
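A rough sketch of the kind of relaxation a configurable factor allows
(illustrative only; the method signature is invented and the actual
configuration key, default, and placement-policy code may differ):
{code:java}
// Exclude a candidate only when its transceiver count exceeds the cluster
// average scaled by a configurable factor, rather than the bare average.
static boolean excludeNodeForLoad(double nodeXceiverCount,
                                  double avgXceiverCount,
                                  double considerLoadFactor) {
  return nodeXceiverCount > considerLoadFactor * avgXceiverCount;
}
{code}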



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9258) NN should indicate which nodes are stale

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994749#comment-14994749
 ] 

Hadoop QA commented on HDFS-9258:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 16s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 2 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 451, now 452). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 36s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 52s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 146m 33s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileCreationClient |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.TestFileCreation |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
| JDK v1.7.0_79 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem |

[jira] [Commented] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-11-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994743#comment-14994743
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8971:
---

Sure, please file a JIRA to revert the change.  Thanks!

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8971.000.patch, HDFS-8971.001.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ logger with the _slf4j_ logger. There was existing 
> code in the client package that guarded calls to {{LOG.debug()}} and 
> {{LOG.trace()}}; e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9397) Fix typo for readChecksum() LOG.warn in BlockSender.java

2015-11-06 Thread Enrique Flores (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enrique Flores updated HDFS-9397:
-
Attachment: HDFS-9397.patch

Attaching the proposed fix.

> Fix typo for readChecksum() LOG.warn in BlockSender.java
> 
>
> Key: HDFS-9397
> URL: https://issues.apache.org/jira/browse/HDFS-9397
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Enrique Flores
>Priority: Trivial
> Attachments: HDFS-9397.patch
>
>
> Typo for the word "verify" found in: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L647
>  
> {code}
>   LOG.warn(" Could not read or failed to veirfy checksum for data"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9397) Fix typo for readChecksum() LOG.warn in BlockSender.java

2015-11-06 Thread Enrique Flores (JIRA)
Enrique Flores created HDFS-9397:


 Summary: Fix typo for readChecksum() LOG.warn in BlockSender.java
 Key: HDFS-9397
 URL: https://issues.apache.org/jira/browse/HDFS-9397
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Enrique Flores
Priority: Trivial


Typo for the word "verify" found in: 

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L647
 

{code}
  LOG.warn(" Could not read or failed to veirfy checksum for data"
{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994727#comment-14994727
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #638 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/638/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994726#comment-14994726
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #638 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/638/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>      List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible that none of the rStates (reported by DNs with copies of 
> the replica being recovered) match the bestState. This can be caused either by 
> faulty DN code or by stale/modified/corrupted files on the DN. When this 
> happens, the DN ends up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with the correct length) will cause the block to be marked 
> as corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won't be able to recover the lease 
> and close the file because the last block is under-replicated.
> I believe we need a sanity check for block size on both the DN and the NN to 
> prevent such a case from happening.



--
This message was sent by Atlassian 

[jira] [Commented] (HDFS-7163) WebHdfsFileSystem should retry reads according to the configured retry policy.

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994725#comment-14994725
 ] 

Hadoop QA commented on HDFS-7163:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 5s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 29s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-hdfs-project (total was 58, now 59). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 47s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 48m 41s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 128m 17s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-11-06 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12771091/HDFS-7163.003.patch |
| JIRA Issue | HDFS-7163 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  
checkstyle  compile  |

[jira] [Commented] (HDFS-2261) AOP unit tests are not getting compiled or run

2015-11-06 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994720#comment-14994720
 ] 

Karthik Kambatla commented on HDFS-2261:


+1, pending Jenkins. Thanks for taking this up, [~wheat9]. 

> AOP unit tests are not getting compiled or run 
> ---
>
> Key: HDFS-2261
> URL: https://issues.apache.org/jira/browse/HDFS-2261
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha, 2.0.4-alpha
> Environment: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/834/console
> -compile-fault-inject ant target 
>Reporter: Giridharan Kesavan
>Priority: Minor
> Attachments: HDFS-2261.000.patch, hdfs-2261.patch
>
>
> The tests in src/test/aop are not getting compiled or run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6481:

Fix Version/s: (was: 2.7.2)
   2.7.3

> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.3
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
> 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
> syncer encountered error, will retry. txid=211
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBounds

[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager

2015-11-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994706#comment-14994706
 ] 

Jing Zhao commented on HDFS-9129:
-

The latest patch looks good to me overall. Here're my comments:
# Let's still define {{BlockManagerSafeMode#status}} as a private field, and 
provide a getter/setter if necessary. In this way we can have better control of 
its value. Similarly for {{blockTotal}} and {{blockSafe}}.
# The following two initializations may be wrong: with the patch the safemode 
object is created when constructing BlockManager, before loading the fsimage and 
editlog from disk.
{code}
private final long startTime = monotonicNow();
{code}
{code}
private long lastStatusReport = monotonicNow();
{code}
# {{shouldIncrementallyTrackBlocks}} is actually determined by {{haEnabled}}, 
so it looks like it can be declared final and {{isSafeModeTrackingBlocks}} 
can be simplified.
# {{BlockManagerSafeMode#setBlockTotal}} currently does two things: 1) updating 
the threshold numbers, and 2) triggering a safemode check. We can separate #2 out of 
this method, so that {{activate}} does not need to do an unnecessary check.
# {{reached}} can be renamed to {{reachedTime}}
# In the old safemode semantics, once the NN enters the extension state it never 
goes back to the normal safemode state; it keeps waiting in the extension 
state even if the threshold is no longer met. The current implementation changes 
these semantics. It's better to avoid that change here.
{code}
case EXTENSION:
  if (!areThresholdsMet()) {
    // EXTENSION -> PENDING_THRESHOLD
    status = BMSafeModeStatus.PENDING_THRESHOLD;
  }
{code}
# The following code can be simplified.
{code}
if (status == BMSafeModeStatus.OFF) {
  return;
}
if (!shouldIncrementallyTrackBlocks) {
  return;
}
{code}
# In {{adjustBlockTotals}}, the {{setBlockTotal}} call should be out of the 
synchronized block.
{code}
synchronized (this) {
  ...
  blockSafe += deltaSafe;
  setBlockTotal(blockTotal + deltaTotal);
}
{code}
# Not caused by this patch, but since {{doConsistencyCheck}} is sometimes not 
protected by any lock (e.g., in {{computeDatanodeWork}}), the total number of 
blocks retrieved from blockManager and used by the consistency check can be 
inaccurate. So I think we can replace the AssertionError here with a warning log 
message.
# Let's still name the first parameter of {{incrementSafeBlockCount}} as 
"storageNum".
# In {{decrementSafeBlockCount}}, {{checkSafeMode}} only needs to be called 
the first time the live replica number drops below the safe number. Thus 
{{checkSafeMode}} should be called within the if (a sketch follows after this list).
{code}
  if (blockManager.countNodes(b).liveReplicas() == safeReplication - 1) {
    this.blockSafe--;
  }
  assert blockSafe >= 0;
  checkSafeMode();
{code}
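For illustration only, a minimal sketch of the shape suggested in the last item, 
reusing the field and helper names from the snippet above (a sketch, not the 
actual patch code):
{code:java}
void decrementSafeBlockCount(BlockInfo b) {
  // Only the transition from "safe" to "unsafe" matters: decrement the counter
  // and re-check safemode the first time the live replica count drops below
  // the safe replication threshold.
  if (blockManager.countNodes(b).liveReplicas() == safeReplication - 1) {
    this.blockSafe--;
    assert blockSafe >= 0;
    checkSafeMode();
  }
}
{code}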

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch, HDFS-9129.006.patch, HDFS-9129.007.patch, 
> HDFS-9129.008.patch, HDFS-9129.009.patch, HDFS-9129.010.patch, 
> HDFS-9129.011.patch, HDFS-9129.012.patch, HDFS-9129.013.patch, 
> HDFS-9129.014.patch, HDFS-9129.015.patch, HDFS-9129.016.patch, 
> HDFS-9129.017.patch, HDFS-9129.018.patch, HDFS-9129.019.patch, 
> HDFS-9129.020.patch, HDFS-9129.021.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of the safemode. These fields can moved to the 
> {{BlockManager}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2261) AOP unit tests are not getting compiled or run

2015-11-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994700#comment-14994700
 ] 

Haohui Mai commented on HDFS-2261:
--

Rebased on the latest trunk.

> AOP unit tests are not getting compiled or run 
> ---
>
> Key: HDFS-2261
> URL: https://issues.apache.org/jira/browse/HDFS-2261
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha, 2.0.4-alpha
> Environment: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/834/console
> -compile-fault-inject ant target 
>Reporter: Giridharan Kesavan
>Priority: Minor
> Attachments: HDFS-2261.000.patch, hdfs-2261.patch
>
>
> The tests in src/test/aop are not getting compiled or run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2261) AOP unit tests are not getting compiled or run

2015-11-06 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-2261:
-
Attachment: HDFS-2261.000.patch

> AOP unit tests are not getting compiled or run 
> ---
>
> Key: HDFS-2261
> URL: https://issues.apache.org/jira/browse/HDFS-2261
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha, 2.0.4-alpha
> Environment: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/834/console
> -compile-fault-inject ant target 
>Reporter: Giridharan Kesavan
>Priority: Minor
> Attachments: HDFS-2261.000.patch, hdfs-2261.patch
>
>
> The tests in src/test/aop are not getting compiled or run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2261) AOP unit tests are not getting compiled or run

2015-11-06 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-2261:
-
Status: Patch Available  (was: Open)

> AOP unit tests are not getting compiled or run 
> ---
>
> Key: HDFS-2261
> URL: https://issues.apache.org/jira/browse/HDFS-2261
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.4-alpha, 2.0.0-alpha
> Environment: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/834/console
> -compile-fault-inject ant target 
>Reporter: Giridharan Kesavan
>Priority: Minor
> Attachments: HDFS-2261.000.patch, hdfs-2261.patch
>
>
> The tests in src/test/aop are not getting compiled or run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9117) Config file reader / options classes for libhdfs++

2015-11-06 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994679#comment-14994679
 ] 

Bob Hansen commented on HDFS-9117:
--

[~wheat9] - do you agree that we should be reading in xml streams that follow 
the conventions of the hdfs-config.xml files?  e.g. configuration, property, 
name, value, and final stanzas?  
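For concreteness, a minimal example of that stanza layout (the property name and 
value below are placeholders, not taken from any particular deployment):
{code:xml}
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
    <!-- "final" marks the value as non-overridable by later-loaded resources -->
    <final>true</final>
  </property>
</configuration>
{code}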

bq. The functionality is definitely helpful, but it can be provided as a 
utility helper instead of baking it into the main contract of libhdfs++.

That was my intention in writing this class: a utility helper that would 
encapsulate reading config files from the field and producing a libhdfs++ 
Options object. That's what each version of the patch has done.

I can strip it down to the API you provided, but I wonder what use case it will 
be serving then.

> Config file reader / options classes for libhdfs++
> --
>
> Key: HDFS-9117
> URL: https://issues.apache.org/jira/browse/HDFS-9117
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatability with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9369) Use ctest to run tests for hadoop-hdfs-native-client

2015-11-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994676#comment-14994676
 ] 

Jing Zhao commented on HDFS-9369:
-

The change looks good to me. +1

> Use ctest to run tests for hadoop-hdfs-native-client
> 
>
> Key: HDFS-9369
> URL: https://issues.apache.org/jira/browse/HDFS-9369
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Minor
> Attachments: HDFS-9369.000.patch
>
>
> Currently we write special rules in {{pom.xml}} to run tests in 
> {{hadoop-hdfs-native-client}}. This jira proposes to run these tests using 
> ctest to simplify {{pom.xml}} and improve portability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9267) TestDiskError should get stored replicas through FsDatasetTestUtils.

2015-11-06 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9267:

Attachment: HDFS-9267.03.patch

Hi, [~cmccabe]. 

I updated the patch to provide a {{ReplicaIterator}} class and refactored 
{{BlockPoolSlice}} to use it. The reason for using {{ReplicaIterator}} instead 
of {{java.util.Iterator}} is that it can throw {{IOException}} in {{next()}}.

Could you give some feedback? Thanks a lot.
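As a rough illustration of the shape such an iterator could take (the generic 
parameter and exact method set here are assumptions for illustration, not 
necessarily what the patch does):
{code:java}
import java.io.IOException;

/**
 * Iterator over stored replicas whose methods may fail with I/O errors.
 * java.util.Iterator#next() cannot declare checked exceptions, which is
 * why a dedicated interface is used instead of the standard one.
 */
public interface ReplicaIterator<T> {
  boolean hasNext() throws IOException;
  T next() throws IOException;
}
{code}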

> TestDiskError should get stored replicas through FsDatasetTestUtils.
> 
>
> Key: HDFS-9267
> URL: https://issues.apache.org/jira/browse/HDFS-9267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9267.00.patch, HDFS-9267.01.patch, 
> HDFS-9267.02.patch, HDFS-9267.03.patch
>
>
> {{TestDiskError#testReplicationError}} scans local directories to verify 
> blocks and metadata files, which leaks the details of {{FsDataset}} 
> implementation. 
> This JIRA will abstract the "scanning" operation to {{FsDatasetTestUtils}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9395) getContentSummary is audit logged as success even if failed

2015-11-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994654#comment-14994654
 ] 

Kihwal Lee commented on HDFS-9395:
--

It's by design? HDFS-5163

> getContentSummary is audit logged as success even if failed
> ---
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
>
> Audit logging is in the finally block along with the lock unlocking, so the 
> call is always logged as a success even in cases where a FileNotFoundException 
> is thrown.
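To illustrate the pattern being described, a small standalone sketch with made-up 
helper names (this is not the actual FSNamesystem code):
{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

public class AuditInFinallyDemo {
  // Stub lock/audit helpers so the sketch compiles on its own.
  static void readLock() {}
  static void readUnlock() {}
  static void logAuditEvent(boolean succeeded, String cmd, String src) {
    System.out.println("audit: " + cmd + " " + src + " success=" + succeeded);
  }

  static long getContentSummary(String src) throws IOException {
    readLock();
    try {
      if (!src.startsWith("/tmp")) {
        throw new FileNotFoundException(src);  // failure path
      }
      return 0L;                               // pretend summary
    } finally {
      readUnlock();
      // The audit call sits next to the unlock, so it also runs on the
      // exception path and records the failed call as a success.
      logAuditEvent(true, "contentSummary", src);
    }
  }

  public static void main(String[] args) {
    try {
      getContentSummary("/no/such/file");
    } catch (IOException expected) {
      // The audit line printed above still said success=true.
    }
  }
}
{code}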



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994596#comment-14994596
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1371 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1371/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.2
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream

[jira] [Commented] (HDFS-9117) Config file reader / options classes for libhdfs++

2015-11-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994598#comment-14994598
 ] 

Haohui Mai commented on HDFS-9117:
--

bq. As an example, let's say we are writing a native replacement for the dfs 
tool using the native libhdfs++ codebase (not the libhdfs compatibility layer) 
that can do "-ls" and "-copyFromLocal", etc. To provide Least Astonishment for 
our consumers, they would expect that a properly configured Hadoop node [with 
the HADOOP_HOME pointing to /etc/hadoop-2.9.9 and its config files] could run 
"hdfspp -ls /tmp" and have it automatically find the NN and configure the 
communications parameters correctly to talk to their cluster.

Unfortunately the assumption is broken in many ways -- it is fully 
implementation-defined.  For example, there are issues about whether {{HADOOP_HOME}} 
or {{HADOOP_PREFIX}} should be chosen. Configuration files are only required to 
be on the {{CLASSPATH}}, not necessarily in the {{HADOOP_HOME}} 
directory. Different vendors might have changed their scripts and put the 
configuration in different places. Scripts evolve across versions. We have 
very different scripts between trunk and branch-2.

While it is definitely useful in the libhdfs compatibility layer, I'm doubtful it 
should be added into the core part of the library because of all this complexity.

Therefore I believe that the focus of the library should be providing 
mechanisms to interact with HDFS, not concrete policy (e.g., the location of the 
configuration) on how to interact. We don't yet have any libraries that implement 
the protocols and mechanisms to interact with HDFS (which is the reusable 
part). The policy is highly customized in different environments but can be 
worked around easily (which is the less reusable part).

bq. given this context, do you agree that we need to support libhdfs++ 
compatibility with the hdfs-site.xml files that are already deployed at 
customer sites?

There are two levels of APIs when you talk about libhdfs++ APIs. The core API 
focuses on providing mechanisms to interact with HDFS, such as implementing the 
Hadoop RPC and the DataTransferProtocol. The API that you're referring to might be a 
convenient API for libhdfs++. The functionality is definitely helpful, but it 
can be provided as a utility helper instead of baking it into the main contract 
of libhdfs++.

My suggestion is the following:

1. Focus this jira on the code that parses configuration XML from strings (which is 
the core functionality of parsing configuration); see the sketch after this list. 
It should not contain any file operations.
2. Separate the tasks of searching through paths, reading files, etc. into 
different jiras. For now it makes sense to put them along with the {{libhdfs}} 
compatibility layer. Since it's an implementation detail I believe we can go 
through it quickly. At a later point in time we can promote the code to a 
common library, once we have a proposal for what the libhdfs++ convenience APIs 
should look like.
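For reference, on the Java side this is already a pure in-memory operation; a 
small standalone sketch (the class name and property value are placeholders):
{code:java}
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;

public class ParseXmlFromString {
  public static void main(String[] args) {
    String xml =
        "<configuration>"
        + "<property><name>fs.defaultFS</name>"
        + "<value>hdfs://mycluster</value></property>"
        + "</configuration>";
    // "false" skips loading core-default.xml/core-site.xml from the classpath.
    Configuration conf = new Configuration(false);
    // addResource(InputStream) parses the XML without any file operations.
    conf.addResource(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    System.out.println(conf.get("fs.defaultFS"));  // prints hdfs://mycluster
  }
}
{code}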


> Config file reader / options classes for libhdfs++
> --
>
> Key: HDFS-9117
> URL: https://issues.apache.org/jira/browse/HDFS-9117
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatability with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994597#comment-14994597
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1371 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1371/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994595#comment-14994595
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1371 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1371/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible that none of the rStates (reported by DNs with copies of 
> the replica being recovered) match the bestState. This can either be 
> caused by faulty DN code or by stale/modified/corrupted files on the DN. When this 
> happens the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994589#comment-14994589
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #648 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/648/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994587#comment-14994587
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #648 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/648/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible that none of the rStates (reported by DNs with copies of 
> the replica being recovered) match the bestState. This can either be 
> caused by faulty DN code or by stale/modified/corrupted files on the DN. When this 
> happens the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3

[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994588#comment-14994588
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #648 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/648/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.2
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOu

[jira] [Commented] (HDFS-9364) Unnecessary DNS resolution attempts when creating NameNodeProxies

2015-11-06 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994550#comment-14994550
 ] 

Zhe Zhang commented on HDFS-9364:
-

Thanks for clarifying this, Xiao. I agree with the approach in the 03 patch. So +1 
pending the minor fix.

> Unnecessary DNS resolution attempts when creating NameNodeProxies
> -
>
> Key: HDFS-9364
> URL: https://issues.apache.org/jira/browse/HDFS-9364
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, performance
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9364.001.patch, HDFS-9364.002.patch, 
> HDFS-9364.003.patch
>
>
> When creating NameNodeProxies, we always try to DNS-resolve namenode URIs. 
> This is unnecessary if the URI is logical, and may be significantly slow if 
> the DNS is having problems. 
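To illustrate the distinction, a standalone sketch; the logical-URI test below is 
a deliberate simplification (the real check consults the HA configuration rather 
than just the port):
{code:java}
import java.net.InetSocketAddress;
import java.net.URI;

public class LogicalUriSketch {
  // Simplified stand-in: an authority without a port is treated as a
  // logical (HA) nameservice id rather than a resolvable host.
  static boolean isLogicalUri(URI uri) {
    return uri.getPort() == -1;
  }

  public static void main(String[] args) {
    URI nnUri = URI.create(args.length > 0 ? args[0] : "hdfs://mycluster");
    if (isLogicalUri(nnUri)) {
      // No DNS lookup needed; a failover proxy provider maps the
      // nameservice id to physical namenode addresses later.
      System.out.println("logical nameservice: " + nnUri.getHost());
    } else {
      // Resolving a physical address can block if DNS is slow or broken.
      InetSocketAddress addr =
          new InetSocketAddress(nnUri.getHost(), nnUri.getPort());
      System.out.println("resolved: " + addr + " unresolved=" + addr.isUnresolved());
    }
  }
}
{code}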



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9394) branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader initialization, because HftpFileSystem is missing.

2015-11-06 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9394:

Hadoop Flags: Reviewed

+1 for the patch, pending pre-commit run.  I verified locally that the 
hadoop-hdfs-client tests pass on branch-2 after applying this patch.

> branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader 
> initialization, because HftpFileSystem is missing.
> 
>
> Key: HDFS-9394
> URL: https://issues.apache.org/jira/browse/HDFS-9394
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9394.000.branch-2.patch
>
>
> On branch-2, hadoop-hdfs-client contains a {{FileSystem}} service descriptor 
> that lists {{HftpFileSystem}} and {{HsftpFileSystem}}.  These classes do not 
> reside in hadoop-hdfs-client.  Instead, they reside in hadoop-hdfs.  If the 
> application has hadoop-hdfs-client.jar on the classpath, but not 
> hadoop-hdfs.jar, then this can cause a {{ServiceConfigurationError}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9236:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks [~twu] for the contribution and 
[~walter.k.su] for the review.


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List<BlockRecord> participatingList = new ArrayList<>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible that none of the rStates (reported by DNs with copies of 
> the replica being recovered) match the bestState. This can either be 
> caused by faulty DN code or by stale/modified/corrupted files on the DN. When this 
> happens the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this block could be the last block of the file, if this 
> happens and the client goes away, the NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9117) Config file reader / options classes for libhdfs++

2015-11-06 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994539#comment-14994539
 ] 

Bob Hansen commented on HDFS-9117:
--

bq. 2. Adding search paths and parsing them can be replaced by passing in a 
vector of paths. Parsing the environment variable is specific to the 
compatibility layer of libhdfs.

I agree that we should take a vector of paths rather than parsing a string; 
thanks for that suggestion.  See above comment re: dereferencing HADOOP_HOME.

> Config file reader / options classes for libhdfs++
> --
>
> Key: HDFS-9117
> URL: https://issues.apache.org/jira/browse/HDFS-9117
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatability with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9117) Config file reader / options classes for libhdfs++

2015-11-06 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994534#comment-14994534
 ] 

Bob Hansen commented on HDFS-9117:
--

[~wheat9]: thanks for the feedback and carrying the conversation forward.

The primary use case of the Configuration class (as I see it) is to provide 
compatibility not with libhdfs, but with deployed Hadoop java environments.  

As an example, let's say we are writing a native replacement for the dfs tool 
using the native libhdfs++ codebase (not the libhdfs compatibility layer) that 
can do "-ls" and "-copyFromLocal", etc.  To provide Least Astonishment for our 
consumers, they would expect that a properly configured Hadoop node [with the 
HADOOP_HOME pointing to /etc/hadoop-2.9.9 and its config files] could run 
"hdfspp -ls /tmp" and have it automatically find the NN and configure the 
communications parameters correctly to talk to their cluster.

To fully support that use case, we need to read xml in the currently-deployed 
file format (which specifies that we honor "final" tags where they appear in 
the files), and dereference at least HADOOP_HOME in loading the default files.  
We could force our consumers to do that, but that doesn't seem a kindness for 
code we need to write anyway.  We also need to be able to read the encodings 
that are being used in the field (such as "1M" for buffer sizes).  

If we really think that config-substitution and environmental substitution is 
exceedingly rare in the field, we can defer the work, but I am concerned that 
we will deploy a libhdfs++ application to the field only to find that it can't 
read an early adopter's config file.  That work has already been shuffled off 
to HDFS-9385 so we can revisit it later.

Other use cases may not need to read existing hdfs-site.xml files, which is why 
I think you are wise in have a separation between the Config reader and the 
Options object.

I agree with your concern that the libhdfs++ default Options object will get 
out of sync with the Java defaults, and will happily write a unit test that 
verifies that they stay in sync.

[~wheat9] - given this context, do you agree that we need to support libhdfs++ 
compatibility with the hdfs-site.xml files that are already deployed at 
customer sites?

> Config file reader / options classes for libhdfs++
> --
>
> Key: HDFS-9117
> URL: https://issues.apache.org/jira/browse/HDFS-9117
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatability with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9387) Parse namenodeUri parameter only once in NNThroughputBenchmark$OperationStatsBase#verifyOpArgument()

2015-11-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994501#comment-14994501
 ] 

Mingliang Liu commented on HDFS-9387:
-

I think the failing tests are unrelated.

> Parse namenodeUri parameter only once in 
> NNThroughputBenchmark$OperationStatsBase#verifyOpArgument()
> 
>
> Key: HDFS-9387
> URL: https://issues.apache.org/jira/browse/HDFS-9387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9387.000.patch
>
>
> In {{NNThroughputBenchmark$OperationStatsBase#verifyOpArgument()}}, the   
> {{namenodeUri}} is always parsed from the {{-namenode}} argument. This works just 
> fine if the {{-op}} parameter is not {{all}}, as the single benchmark will 
> need to parse the {{namenodeUri}} from args anyway.
> When the {{-op}} is {{all}}, namely when all sub-benchmarks will run, multiple 
> sub-benchmarks will call the {{verifyOpArgument()}} method. In this case, the 
> first sub-benchmark reads the {{namenode}} argument and removes it from args. 
> The other sub-benchmarks will thereafter read a {{null}} value since the 
> argument is removed. This contradicts the intention of providing {{namenode}} 
> for all sub-benchmarks.
> {code:title=current code}
>   try {
> namenodeUri = StringUtils.popOptionWithArgument("-namenode", args);
>   } catch (IllegalArgumentException iae) {
> printUsage();
>   }
> {code}
> The fix is to parse the {{namenodeUri}}, which is shared by all 
> sub-benchmarks, from the {{-namenode}} argument only once. This follows the 
> convention of parsing other global arguments in 
> {{OperationStatsBase#verifyOpArgument()}}.
> {code:title=simple fix}
>   if (args.indexOf("-namenode") >= 0) {
> try {
>   namenodeUri = StringUtils.popOptionWithArgument("-namenode", args);
> } catch (IllegalArgumentException iae) {
>   printUsage();
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9396) Total files and directories on jmx and web UI on standby is uninitialized

2015-11-06 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9396:
-
Attachment: HDFS-9396.patch

> Total files and directories on jmx and web UI on standby is uninitialized
> -
>
> Key: HDFS-9396
> URL: https://issues.apache.org/jira/browse/HDFS-9396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Blocker
> Attachments: HDFS-9396.patch
>
>
> After HDFS-6763, the quota on the standby namenode is not being updated until 
> it transitions to active. This causes the files and directories count on jmx and 
> the web UI to be uninitialized or not updated. In some cases it shows a negative 
> number.
> This is because of the legacy way of getting the inode count, which has existed 
> since before the creation of the inode table. It relies on the root inode's quota 
> being properly updated.  We can make it simply return the size of the inode table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9396) Total files and directories on jmx and web UI on standby is uninitialized

2015-11-06 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9396:
-
Status: Patch Available  (was: Open)

> Total files and directories on jmx and web UI on standby is uninitialized
> -
>
> Key: HDFS-9396
> URL: https://issues.apache.org/jira/browse/HDFS-9396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Blocker
> Attachments: HDFS-9396.patch
>
>
> After HDFS-6763, the quota on the standby namenode is not being updated until 
> it transitions to active. This causes the files and dir count on jmx and the 
> web UI to be uninitialized or not updated. In some cases it shows a negative 
> number.
> This is because the inode count is still obtained the legacy way, which existed 
> before the creation of the inode table and relies on the root inode's quota 
> being properly updated.  We can make it simply return the size of the inode 
> table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9396) Total files and directories on jmx and web UI on standby is uninitialized

2015-11-06 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned HDFS-9396:


Assignee: Kihwal Lee

> Total files and directories on jmx and web UI on standby is uninitialized
> -
>
> Key: HDFS-9396
> URL: https://issues.apache.org/jira/browse/HDFS-9396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Blocker
> Attachments: HDFS-9396.patch
>
>
> After HDFS-6763, the quota on the standby namenode is not being updated until 
> it transitions to active. This causes the files and dir count on jmx and the 
> web UI to be uninitialized or not updated. In some cases it shows a negative 
> number.
> This is because the inode count is still obtained the legacy way, which existed 
> before the creation of the inode table and relies on the root inode's quota 
> being properly updated.  We can make it simply return the size of the inode 
> table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2015-11-06 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8905:

Attachment: HDFS-8905-v2.patch

> Refactor DFSInputStream#ReaderStrategy
> --
>
> Key: HDFS-8905
> URL: https://issues.apache.org/jira/browse/HDFS-8905
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-7285
>
> Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch
>
>
> The DFSInputStream#ReaderStrategy class family doesn't look very good. This 
> refactors it a little bit so that it makes more sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes

2015-11-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994491#comment-14994491
 ] 

Mingliang Liu commented on HDFS-9379:
-

Thanks for your review [~arpitagarwal].

{quote}
 Did you get a chance to test it manually?
{quote}
Yes, I did test this manually. I ran the test with different combinations of 
arguments, including {{-namenode}}, {{-datanodes}} and {{blocksPerReport}}. If 
{{-datanodes}} is greater than 9, the trunk code runs the benchmark successfully 
with this patch, and fails without it. The failing code is the assertion which 
checks the lexicographical order of datanodes.

{quote}
The unit test TestNNThroughputBenchmark looks inadequate. It passed even when 
I replaced the dnIdx computation with zero.
{quote}
The {{TestNNThroughputBenchmark}} seems to be a driver that runs the benchmark 
rather than a unit test of the benchmark itself, so I did not change it. If we 
make {{dnIdx}} always zero when searching for the datanode index in the 
{{datanodes}} array given the datanode info, the test can still pass because the 
generated block will always be added to the first datanode. The benchmark itself 
allows this, though the test results would be dubious.

{quote}
I looked through the remaining usages of datanodes for any dependencies on 
lexical ordering and didn't find any.
{quote}
That's true. {{BlockReportStats}} is the only use case I found that depends on 
the lexical ordering of the {{datanodes}} array. I ran the other tests and they 
look good when {{-datanodes}} or {{-threads}} is greater than 10.
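
To make the direction concrete, here is a minimal sketch of deriving the 
datanode index directly from the port encoded in the {{xferAddr}} (a 
hypothetical helper for illustration only, not the exact code in the patch):

{code:java}
// Illustrative sketch: the benchmark assigns port = datanode index + 1, so the
// index can be recovered from the address directly, without requiring the
// datanodes array to be in lexicographical order.
private static int getNodeIndex(String xferAddr) {
  int colon = xferAddr.lastIndexOf(':');
  int port = Integer.parseInt(xferAddr.substring(colon + 1));
  return port - 1;
}
{code}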



> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --
>
> Key: HDFS-9379
> URL: https://issues.apache.org/jira/browse/HDFS-9379
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9379.000.patch
>
>
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on 
> the {{datanodes}} array being sorted in the lexicographical order of the 
> datanodes' {{xferAddr}}.
> * There is an assertion of datanode's {{xferAddr}} lexicographical order when 
> filling the {{datanodes}}, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When searching the datanode by {{DatanodeInfo}}, it uses binary search 
> against the {{datanodes}} array, see [the 
> code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187]
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In 
> {{NNThroughputBenchmark}}, the port is simply _the index of the tiny 
> datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes 
> ({{numThreads}}), the assumption that the datanodes' {{xferAddr}} values are in 
> lexicographical order no longer holds, because the string form of the datanode 
> index does not sort lexicographically once it reaches two digits. For example, 
> {code}
> ...
> 192.168.54.40:8
> 192.168.54.40:9
> 192.168.54.40:10
> 192.168.54.40:11
> ...
> {code}
> {{192.168.54.40:9}} is lexicographically greater than {{192.168.54.40:10}}, so 
> the assertion will fail and the binary search won't work.
> The simple fix is to calculate the datanode index by port directly, instead 
> of using binary search.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9396) Total files and directories on jmx and web UI on standby is uninitialized

2015-11-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994475#comment-14994475
 ] 

Kihwal Lee commented on HDFS-9396:
--

{code:java}
  long totalInodes() {
readLock();
try {
  return rootDir.getDirectoryWithQuotaFeature().getSpaceConsumed()
  .getNameSpace();
} finally {
  readUnlock();
}
  }
{code}

It can simply do this without locking:
{code:java}
  return inodeMap.size();
{code}

> Total files and directories on jmx and web UI on standby is uninitialized
> -
>
> Key: HDFS-9396
> URL: https://issues.apache.org/jira/browse/HDFS-9396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Blocker
>
> After HDFS-6763, the quota on the standby namenode is not being updated until 
> it transitions to active. This causes the files and dir count on jmx and the 
> web UI to be uninitialized or not updated. In some cases it shows a negative 
> number.
> This is because the inode count is still obtained the legacy way, which existed 
> before the creation of the inode table and relies on the root inode's quota 
> being properly updated.  We can make it simply return the size of the inode 
> table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9396) Total files and directories on jmx and web UI on standby is uninitialized

2015-11-06 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-9396:


 Summary: Total files and directories on jmx and web UI on standby 
is uninitialized
 Key: HDFS-9396
 URL: https://issues.apache.org/jira/browse/HDFS-9396
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Priority: Blocker


After HDFS-6763, the quota on the standby namenode is not being updated until 
it transitions to active. This causes the files and dir count on jmx and the 
web UI to be uninitialized or not updated. In some cases it shows a negative 
number.

This is because the inode count is still obtained the legacy way, which existed 
before the creation of the inode table and relies on the root inode's quota 
being properly updated.  We can make it simply return the size of the inode 
table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots

2015-11-06 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994469#comment-14994469
 ] 

Harsh J commented on HDFS-8986:
---

This continues to cause a bunch of confusion among our user-base who are still 
reliant on the pre-snapshot feature behaviour, so it would be nice to see it 
implemented.

> Add option to -du to calculate directory space usage excluding snapshots
> 
>
> Key: HDFS-8986
> URL: https://issues.apache.org/jira/browse/HDFS-8986
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Gautam Gopalakrishnan
>Assignee: Jagadesh Kiran N
>
> When running {{hadoop fs -du}} on a snapshotted directory (or one of its 
> children), the report includes space consumed by blocks that are only present 
> in the snapshots. This is confusing for end users.
> {noformat}
> $  hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -createSnapshot /tmp/parent snap1
> Created snapshot /tmp/parent/.snapshot/snap1
> $ hadoop fs -rm -skipTrash /tmp/parent/sub1/*
> ...
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -deleteSnapshot /tmp/parent snap1
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 0  0  /tmp/parent
> 0  0  /tmp/parent/sub1
> {noformat}
> It would be helpful if we had a flag, say -X, to exclude any snapshot-related 
> disk usage in the output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9117) Config file reader / options classes for libhdfs++

2015-11-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994468#comment-14994468
 ] 

Haohui Mai commented on HDFS-9117:
--

Thinking about the patch a little bit more, I believe that there is little 
value in following the original Java implementation. The implementation looks 
overcomplicated.

What are the minimal pieces needed to load the configuration? Does the class 
need to address every one of these concerns?

To me the core part of the patch should be parsing the XML and populating the 
configuration map. Functionality like searching through the filesystem, reading 
environment variables, expanding the configuration, etc., is all optional.

1. Note that the main motivation of the {{Options}} class is to make all the 
configurations standalone and explicit. Users should be able to specify all 
configuration through the {{Options}} object. Default values are fundamental 
parts of the contract of the {{Options}} class. Filling the configuration with 
the default values from the {{*-default.xml}} files creates inconsistencies and 
bugs that are hard to detect. The flip side is that {{Options}} can get out of 
date if someone changes the default value of a configuration, but that can be 
caught effectively by adding a unit test.

2. Adding search paths and parsing them can be replaced by passing in a 
{{vector}} of paths. Parsing the environment variable is specific to the 
compatibility layer of {{libhdfs}}.

3. Some of the functionality might be useful at a later time. Since the code 
that uses that functionality is yet to be written, it is difficult to review and 
justify the appropriate design and implementation. We can revisit some of these 
issues once the code that actually uses the functionality is available.

I propose the following interfaces:

{code}
class Configuration {
public:
  enum Priority {
    kDefault,
    kSpecified,
    kFinal,
  };
  /**
   * Load configurations that are specified in XML format.
   * Each configuration file is associated with a priority.
   * A configuration with higher priority will overwrite the ones with lower
   * priority.
   **/
  int ParseXMLConfiguration(const std::string &xml, Priority priority);

  /**
   * Get the value of the configuration key; return empty if it is unspecified.
   **/
  template <class T>
  Optional<T> get(const std::string &key);

  /**
   * Get the value of the configuration key; return the default value of the
   * configuration if it is unspecified.
   **/
  template <class T>
  T getWithDefault(const std::string &key);
};

{code}

> Config file reader / options classes for libhdfs++
> --
>
> Key: HDFS-9117
> URL: https://issues.apache.org/jira/browse/HDFS-9117
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatability with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9144) Refactor libhdfs into stateful/ephemeral objects

2015-11-06 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994465#comment-14994465
 ] 

Bob Hansen commented on HDFS-9144:
--

Hm.  Very noisy.  Probably want a squashed pull request next time, but it looks 
like GitHub can be used for the review.

> Refactor libhdfs into stateful/ephemeral objects
> 
>
> Key: HDFS-9144
> URL: https://issues.apache.org/jira/browse/HDFS-9144
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS-8707
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9144.HDFS-8707.001.patch, 
> HDFS-9144.HDFS-8707.002.patch
>
>
> In discussion for other efforts, we decided that we should separate several 
> concerns:
> * A posix-like FileSystem/FileHandle object (stream-based, positional reads)
> * An ephemeral ReadOperation object that holds the state for 
> reads-in-progress, which consumes
> * An immutable FileInfo object which holds the block map and file size (and 
> other metadata about the file that we assume will not change over the life of 
> the file)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9144) Refactor libhdfs into stateful/ephemeral objects

2015-11-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994458#comment-14994458
 ] 

ASF GitHub Bot commented on HDFS-9144:
--

GitHub user bobhansen opened a pull request:

https://github.com/apache/hadoop/pull/43

HDFS-9144: libhdfs++ refactoring

Code changes for HDFS-9144 as described in the JIRA.  Removing some 
templates and traits and restructuring the code for more modularity.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bobhansen/hadoop HDFS-9144-merge

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/43.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #43


commit 1fb1ea527c9b5321e6da6c2543859db2ec3eaf7c
Author: Bob Hansen 
Date:   2015-10-22T11:58:41Z

Refactored NameNodeConnection

commit c6cf5175b9c21561bdcbd22be27f50e22a1d3ebd
Author: Bob Hansen 
Date:   2015-10-22T12:01:36Z

Removed fs_ from InputStream

commit 8b8190d334224d8acec9a4bef97d5e0226c1045a
Author: Bob Hansen 
Date:   2015-10-22T13:05:53Z

Moved GetBlockInfo to NN connection

commit 108b54f3079ed21149a59b9222d6d9832ee05d79
Author: Bob Hansen 
Date:   2015-10-22T13:20:56Z

Moved GetBlockLocations to std::function

commit 6d112a17048bcec437701b422209641e56f6196e
Author: Bob Hansen 
Date:   2015-10-22T13:48:02Z

Added comments

commit e57b0ed02e29781f347499f0f3546659870aabab
Author: Bob Hansen 
Date:   2015-10-22T13:52:39Z

Stripped whitespace

commit c9c82125e8c0b742ee3a70d6fdbdedca180cdd4f
Author: Bob Hansen 
Date:   2015-10-27T16:07:33Z

Renamed NameNodeConnection to NameNodeOperations

commit 01499b6027ec771ebf04d4723899ee976b2a6044
Author: Bob Hansen 
Date:   2015-10-27T23:26:26Z

Renamed input_stream and asio_continuation

commit 02c67837fe832e45286a675f1a27fa29e1b80a9a
Author: Bob Hansen 
Date:   2015-10-27T23:30:44Z

Renamed CreatePipeline to Connect

commit 5d28d02e1752be74975647f8dc656776ab9e2cbf
Author: Bob Hansen 
Date:   2015-10-27T23:58:18Z

Rename async_connect to async_request

commit 9d98bf41091c923103cbeeadb5459c3119b50584
Author: Bob Hansen 
Date:   2015-10-28T13:01:38Z

Renamed read_some to read_packet

commit 6ced4a97e297ce0e833db8dbd4b38c91c966d71c
Author: Bob Hansen 
Date:   2015-10-28T13:15:50Z

Renamed async_request to async_request_block

commit f05a771e578969b9b281de4e0c97887f98b0f2cf
Author: Bob Hansen 
Date:   2015-10-28T13:19:09Z

Renamed BlockReader::request to request_block

commit fcf1585bf67f84ef8c0acc72660d2ad250005e3b
Author: Bob Hansen 
Date:   2015-10-28T19:12:39Z

Moved to file_info

commit a3fd975285b25a3eae448e5ac46d0118a14d6610
Author: Bob Hansen 
Date:   2015-10-28T19:16:20Z

Made file_info pointers const

commit 366f488b8e8364eba3f1966b931216d2bf404ae1
Author: Bob Hansen 
Date:   2015-10-28T21:37:46Z

Refactored DataNodeConnection, etc.

commit 418799feb8d12181d9e5bd6b6aa94333bb21e126
Author: Bob Hansen 
Date:   2015-10-29T13:53:46Z

Added shared_ptr to DN_Connection

commit f043e154a261e9ff64f1ead450e3a256ecd023a2
Author: Bob Hansen 
Date:   2015-10-29T15:31:28Z

Moved DNConnection into trait

commit aea859ff34a6768c7df29ec25f1abd2b92835b9e
Author: Bob Hansen 
Date:   2015-10-29T15:32:12Z

Trimmed whitespace

commit 55d7b5dcd92b0fd9d0011e97d8f47e78c3316205
Author: Bob Hansen 
Date:   2015-10-29T17:23:30Z

Re-enabled IS tests

commit 142efabbda38852b431d94096d6cef69f5c96393
Author: Bob Hansen 
Date:   2015-10-29T17:31:05Z

Cleaned up some tests

commit 4bc0f448fe52a762a242428a1331272c9fee3247
Author: Bob Hansen 
Date:   2015-10-29T21:53:57Z

Working on less templates

commit dd16d4fa9f08f55f9d4140219471f002eca5a8ed
Author: Bob Hansen 
Date:   2015-10-29T23:28:01Z

Compiles!

commit 2b14efa8277c66a3e9e0fb67af925501757d39f8
Author: Bob Hansen 
Date:   2015-10-30T20:46:52Z

Fixed DNconnection signature

commit 8d143e789a98431f8cd2cb08db37a0a05f4d9c77
Author: Bob Hansen 
Date:   2015-11-02T16:35:54Z

Fixed segfault in ReadData

commit b6f5454e626c1caa1b76398c9edf220fc1252be9
Author: Bob Hansen 
Date:   2015-11-02T18:36:15Z

Removed BlockReader callback templates

commit 3b5d712b454f5b817c22909bac2f3477a64624fe
Author: Bob Hansen 
Date:   2015-11-02T18:52:16Z

Removed last templates from BlockReader

commit d9b9241f12a957226df7ccacad07d8e1a0d98cca
Author: Bob Hansen 
Date:   2015-11-02T20:56:43Z

Moved entirely over to BlockReader w/out templates

commit 5de0bce35fb52b7a688d3fc4ad02748106fca38e
Author: Bob Hansen 
Date:   2015-11-02T21:06:25Z

Removed unnecessary impls

commit d5baa8784643bdfed454c8a4ba0edb102d73f40a
Author: Bob Hansen 
Date:   2015-11-03T15:00:50Z

Moved DN to its own file




> Refactor libhdfs int

[jira] [Updated] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt

2015-11-06 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9328:
--
Attachment: HDFS-9328.HDFS-8707.001.patch

New patch:
- Left clang-format in.
- Changed the name to CONTRIBUTING.md and added some markdown to make it look 
nicer.
- Added a couple of extra bits about portability to (4) that [~ste...@apache.org] 
suggested.

I'm new to markdown.  Do people typically self-limit the line width or just 
assume the rendering software will handle that?  I'd appreciate any other 
feedback as well.

> Formalize coding standards for libhdfs++ and put them in a README.txt
> -
>
> Key: HDFS-9328
> URL: https://issues.apache.org/jira/browse/HDFS-9328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Blocker
> Attachments: HDFS-9328.HDFS-8707.000.patch, 
> HDFS-9328.HDFS-8707.001.patch
>
>
> We have 2-3 people working on this project full time and hopefully more 
> people will start contributing.  In order to efficiently scale we need a 
> single, easy to find, place where developers can check to make sure they are 
> following the coding standards of this project to both save their time and 
> save the time of people doing code reviews.
> The most practical place to do this seems like a README file in libhdfspp/. 
> The foundation of the standards is google's C++ guide found here: 
> https://google-styleguide.googlecode.com/svn/trunk/cppguide.html
> Any exceptions to google's standards or additional restrictions need to be 
> explicitly enumerated so there is one single point of reference for all 
> libhdfs++ code standards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9395) getContentSummary is audit logged as success even if failed

2015-11-06 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla reassigned HDFS-9395:
-

Assignee: Kuhu Shukla

> getContentSummary is audit logged as success even if failed
> ---
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
>
> Audit logging is in the finally block along with the lock unlocking, so it 
> is always logged as success even in cases where a FileNotFoundException is 
> thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994439#comment-14994439
 ] 

Hudson commented on HDFS-9236:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8769 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8769/])
HDFS-9236. Missing sanity check for block size during block recovery. (yzhang: 
rev b64242c0d2cabd225a8fb7d25fed449d252e4fa1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/ReplicaRecoveryInfo.java


> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
> >  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> > List<BlockRecord> participatingList = new ArrayList<BlockRecord>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens, the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
> >   // XXX block length is updated without any check <<<
> >   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Note that this block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.
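
As an illustration of the kind of guard argued for above (a hedged sketch with 
hypothetical placement and wording, not the committed patch), the NN side could 
reject implausible lengths before applying them:

{code:java}
// Illustrative only: inside commitBlockSynchronization(), refuse a recovery
// length that is negative or the Long.MAX_VALUE sentinel left over when no
// replica matched the best state on the coordinating DN.
if (newlength < 0 || newlength == Long.MAX_VALUE) {
  throw new IOException("commitBlockSynchronization: rejecting suspicious new"
      + " length " + newlength + " for block " + oldBlock);
}
{code}

A matching DN-side check would simply skip calling 
{{commitBlockSynchronization}} when {{participatingList}} is empty and 
{{minLength}} was never updated.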



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994440#comment-14994440
 ] 

Hudson commented on HDFS-9318:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8769 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8769/])
HDFS-9318. considerLoad factor can be improved. Contributed by Kuhu (kihwal: 
rev bf6aa30a156b3c5cac5469014a5989e0dfdc7256)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.
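
A hedged sketch of the general idea behind making the considerLoad threshold 
tunable (illustrative names only; see the committed patch for the actual 
configuration key and code):

{code:java}
// Illustrative only: exclude a node when its xceiver count exceeds a
// configurable multiple of the cluster-average load, instead of a hard-coded
// threshold, so busy but otherwise usable nodes can still be chosen on
// clusters with many nearly full nodes.
boolean excludeNodeByLoad(DatanodeDescriptor node, double avgLoad,
    double considerLoadFactor) {
  return avgLoad > 0 && node.getXceiverCount() > considerLoadFactor * avgLoad;
}
{code}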



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8708) DFSClient should ignore dfs.client.retry.policy.enabled for HA proxies

2015-11-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994405#comment-14994405
 ] 

Kihwal Lee commented on HDFS-8708:
--

bq. The default value is false for HA. I think it's good enough.
I agree. Besides, there are cases where we want this to be on and work with HA. 
E.g. the IP address change detection code in the ipc Client does not work if 
the exception bubbles up to the HA retry logic. It only works when the retry is 
done within the same Client instance.

> DFSClient should ignore dfs.client.retry.policy.enabled for HA proxies
> --
>
> Key: HDFS-8708
> URL: https://issues.apache.org/jira/browse/HDFS-8708
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Brahma Reddy Battula
>Priority: Critical
>
> DFSClient should ignore dfs.client.retry.policy.enabled for HA proxies to 
> ensure fast failover. Otherwise, dfsclient retries the NN which is no longer 
> active and delays the failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9395) getContentSummary is audit logged as success even if failed

2015-11-06 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-9395:


 Summary: getContentSummary is audit logged as success even if 
failed
 Key: HDFS-9395
 URL: https://issues.apache.org/jira/browse/HDFS-9395
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee


Audit logging is in the finally block along with the lock unlocking, so it is 
always logged as success even in cases where a FileNotFoundException is thrown.
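
A minimal sketch of the usual fix pattern for this kind of bug (method and 
helper names below are hypothetical stand-ins, not the actual NameNode code):

{code:java}
// Illustrative only: record whether the operation completed and audit that
// flag from the finally block, so a thrown FileNotFoundException is logged
// as a failure rather than a success.
ContentSummary getContentSummaryAudited(String src) throws IOException {
  boolean success = false;
  readLock();
  try {
    ContentSummary summary = computeContentSummary(src);  // may throw FNFE
    success = true;
    return summary;
  } finally {
    readUnlock();
    logAuditEvent(success, "contentSummary", src);
  }
}
{code}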



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9318:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for working on this, Kuhu.

> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994298#comment-14994298
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #637 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/637/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
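
For context, a hedged sketch of the kind of length check the summary calls for 
(illustrative wording only; see the committed patch for the real code):

{code:java}
// Illustrative only: validate the storageIDs array against the datanodeID
// array before indexing into it, instead of assuming the lengths match.
if (storageIDs == null || storageIDs.length != datanodeID.length) {
  throw new HadoopIllegalArgumentException("Expected " + datanodeID.length
      + " storage IDs but got "
      + (storageIDs == null ? 0 : storageIDs.length));
}
{code}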


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.2
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.

[jira] [Updated] (HDFS-9394) branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader initialization, because HftpFileSystem is missing.

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9394:

Status: Patch Available  (was: Open)

> branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader 
> initialization, because HftpFileSystem is missing.
> 
>
> Key: HDFS-9394
> URL: https://issues.apache.org/jira/browse/HDFS-9394
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9394.000.branch-2.patch
>
>
> On branch-2, hadoop-hdfs-client contains a {{FileSystem}} service descriptor 
> that lists {{HftpFileSystem}} and {{HsftpFileSystem}}.  These classes do not 
> reside in hadoop-hdfs-client.  Instead, they reside in hadoop-hdfs.  If the 
> application has hadoop-hdfs-client.jar on the classpath, but not 
> hadoop-hdfs.jar, then this can cause a {{ServiceConfigurationError}}.
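
To illustrate why a stale service descriptor is fatal (a generic sketch of the 
Java {{ServiceLoader}} mechanism, not the actual Hadoop code path):

{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.fs.FileSystem;

// Illustrative only: FileSystem implementations are discovered through
// META-INF/services descriptors. If a descriptor names HftpFileSystem but
// hadoop-hdfs.jar is not on the classpath, iterating the loader throws
// ServiceConfigurationError even though the caller never asked for hftp.
public class ServiceLoaderSketch {
  public static void main(String[] args) {
    for (FileSystem fs : ServiceLoader.load(FileSystem.class)) {
      System.out.println(fs.getClass().getName());
    }
  }
}
{code}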



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7163) WebHdfsFileSystem should retry reads according to the configured retry policy.

2015-11-06 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated HDFS-7163:
-
Attachment: HDFS-7163.003.patch

Fixed the checkstyle and findbugs warnings. None of the unit tests listed above 
failed in my own build environment.

> WebHdfsFileSystem should retry reads according to the configured retry policy.
> --
>
> Key: HDFS-7163
> URL: https://issues.apache.org/jira/browse/HDFS-7163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0, 2.5.1
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: HDFS-7163.001.patch, HDFS-7163.002.patch, 
> HDFS-7163.003.patch, WebHDFS Read Retry.pdf
>
>
> In the current implementation of WebHdfsFileSystem, opens are retried 
> according to the configured retry policy, but not reads. Therefore, if a 
> connection goes down while data is being read, the read will fail and the 
> read will have to be retried by the client code.
> Also, after a connection has been established, the next read (or seek/read) 
> will fail and the read will have to be restarted by the client code.
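
A purely illustrative sketch of the retry-on-read behavior being requested 
({{currentStream()}}, {{reopenAtCurrentPosition()}} and {{maxAttempts}} are 
hypothetical stand-ins for the stream handling and the configured retry policy 
in the real client):

{code:java}
// Illustrative only: re-issue the read after a connection failure, reopening
// the stream at the current offset, instead of surfacing the IOException to
// the caller on the first failure.
int readWithRetry(byte[] buf, int off, int len, int maxAttempts)
    throws IOException {
  IOException last = null;
  for (int attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return currentStream().read(buf, off, len);
    } catch (IOException e) {
      last = e;
      reopenAtCurrentPosition();  // hypothetical re-connect step
    }
  }
  throw last;
}
{code}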



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-11-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994284#comment-14994284
 ] 

Mingliang Liu commented on HDFS-8971:
-

Thanks for reporting this [~szetszwo]. I totally agree with you that we should 
consider a one-line message in {{ByteArrayManager}}. It is certainly easier to 
read, especially in the case of multiple threads. Perhaps we can simply revert 
the changes in this class? I revisited the patch and the other classes should be 
fine. 

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8971.000.patch, HDFS-8971.001.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ log with _slf4j_ logger. There were existing code 
> in the client package to guard the log when calling {{LOG.debug()}} and 
> {{LOG.trace()}}, e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9318) considerLoad factor can be improved

2015-11-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994281#comment-14994281
 ] 

Kihwal Lee commented on HDFS-9318:
--

+1 lgtm

> considerLoad factor can be improved
> ---
>
> Key: HDFS-9318
> URL: https://issues.apache.org/jira/browse/HDFS-9318
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9318-v1.patch, HDFS-9318-v2.patch
>
>
> Currently considerLoad avoids choosing nodes that are too active, so it helps 
> level the HDFS load across the cluster. Under normal conditions, this is 
> desired. However, when a cluster has a large percentage of nearly full nodes, 
> this can make it difficult to find good targets because the placement policy 
> wants to avoid the full nodes, but considerLoad wants to avoid the busy 
> less-full nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994263#comment-14994263
 ] 

Hudson commented on HDFS-6481:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8768 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8768/])
HDFS-6481. DatanodeManager#getDatanodeStorageInfos() should check the (arp: rev 
0b18e5e8c69b40c9a446fff448d38e0dd10cb45e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.2
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputSt

[jira] [Updated] (HDFS-9258) NN should indicate which nodes are stale

2015-11-06 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9258:
--
Status: Patch Available  (was: In Progress)

> NN should indicate which nodes are stale
> 
>
> Key: HDFS-9258
> URL: https://issues.apache.org/jira/browse/HDFS-9258
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
> Attachments: HDFS-9258-v1.patch
>
>
> Determining why the NN is not coming out of safemode is difficult - is it a 
> bug or pending block reports?  If the number of nodes appears sufficient, but 
> there are missing blocks, it would be nice to know which nodes haven't block 
> reported (stale).  Instead of forcing the NN to leave safemode prematurely, 
> the SE can first force block reports from stale nodes.
> The datanode report and the web ui's node list should contain this 
> information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9236) Missing sanity check for block size during block recovery

2015-11-06 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9236:

Target Version/s: 2.8.0  (was: 2.7.3)

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, 
> HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
> >  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> > List<BlockRecord> participatingList = new ArrayList<BlockRecord>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens, the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
> >   // XXX block length is updated without any check <<<
> >   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Note that this block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9258) NN should indicate which nodes are stale

2015-11-06 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9258:
--
Attachment: HDFS-9258-v1.patch

Added isStale to the jmx info. Added an {{isStale()}} method on DNInfo and 
replaced the old one wherever possible. Also, {{chooseDatanodesForCaching()}} 
was a static method that is called only once, from {{addNewPendingCached()}}, 
which is non-static. Hence {{chooseDatanodesForCaching()}} has been moved to a 
non-static method.
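
A hedged sketch of what surfacing the flag in the JMX node map might look like 
(field and variable names are illustrative, not necessarily those in the patch):

{code:java}
// Illustrative only: include a per-datanode staleness flag in the map that
// backs the NameNode's LiveNodes JMX attribute and the web UI node list.
Map<String, Object> innerinfo = new HashMap<String, Object>();
innerinfo.put("lastContact", getLastContact(node));
innerinfo.put("isStale", node.isStale(staleInterval));
info.put(node.getHostName() + ":" + node.getXferPort(), innerinfo);
{code}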

> NN should indicate which nodes are stale
> 
>
> Key: HDFS-9258
> URL: https://issues.apache.org/jira/browse/HDFS-9258
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
> Attachments: HDFS-9258-v1.patch
>
>
> Determining why the NN is not coming out of safemode is difficult - is it a 
> bug or pending block reports?  If the number of nodes appears sufficient, but 
> there are missing blocks, it would be nice to know which nodes haven't block 
> reported (stale).  Instead of forcing the NN to leave safemode prematurely, 
> the SE can first force block reports from stale nodes.
> The datanode report and the web ui's node list should contain this 
> information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9249) NPE thrown if an IOException is thrown in NameNode.

2015-11-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994235#comment-14994235
 ] 

Yongjun Zhang commented on HDFS-9249:
-

Thanks [~jojochuang] for the new rev, +1 pending jenkins.


> NPE thrown if an IOException is thrown in NameNode.
> -
>
> Key: HDFS-9249
> URL: https://issues.apache.org/jira/browse/HDFS-9249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9249.001.patch, HDFS-9249.002.patch, 
> HDFS-9249.003.patch, HDFS-9249.004.patch, HDFS-9249.005.patch, 
> HDFS-9249.006.patch
>
>
> This issue was found when running test case 
> TestBackupNode.testCheckpointNode, but upon closer look, the problem is not 
> due to the test case.
> Looks like an IOException was thrown in
> try {
>   initializeGenericKeys(conf, nsId, namenodeId);
>   initialize(conf);
>   try {
>     haContext.writeLock();
>     state.prepareToEnterState(haContext);
>     state.enterState(haContext);
>   } finally {
>     haContext.writeUnlock();
>   }
> causing the namenode to stop, but the namesystem was not yet properly 
> instantiated, causing NPE.
> I tried to reproduce locally, but to no avail.
> Because I could not reproduce the bug, and the log does not indicate what 
> caused the IOException, I suggest making this a supportability JIRA to log the 
> exception for future improvement.
> Stacktrace
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:906)
> at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:210)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:827)
> at 
> org.apache.hadoop.hdfs.server.namenode.BackupNode.<init>(BackupNode.java:89)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1474)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.startBackupNode(TestBackupNode.java:102)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpointNode(TestBackupNode.java:130)
> The last few lines of log:
> 2015-10-14 19:45:07,807 INFO namenode.NameNode 
> (NameNode.java:createNameNode(1422)) - createNameNode [-checkpoint]
> 2015-10-14 19:45:07,807 INFO impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:init(158)) - CheckpointNode metrics system started 
> (again)
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(402)) - fs.defaultFS is 
> hdfs://localhost:37835
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(422)) - Clients are to use 
> localhost:37835 to access this namenode/service.
> 2015-10-14 19:45:07,810 INFO hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:shutdown(1708)) - Shutting down the Mini HDFS Cluster
> 2015-10-14 19:45:07,810 INFO namenode.FSNamesystem 
> (FSNamesystem.java:stopActiveServices(1298)) - Stopping services started for 
> active state
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:endCurrentLogSegment(1228)) - Ending log segment 1
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5306)) - NameNodeEditLogRoller was interrupted, exiting
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:printStatistics(703)) - Number of transactions: 3 Total time 
> for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of 
> syncs: 4 SyncTimes(ms): 2 1 
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5373)) - LazyPersistFileScrubber was interrupted, 
> exiting
> 2015-10-14 19:45:07,822 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_inprogress_001
>  -> 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_001-003
> 2015-10-14 19:45:07,835 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_inprogress_001
>  -> 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_001-000
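
The snippet below is a minimal, self-contained Java sketch (not Hadoop code and 
not any of the attached patches) of the masking pattern described above, using a 
hypothetical InitFailureDemo class: initialization fails before the namesystem 
field is set, the subsequent stop() trips the NPE, and only an explicit log of the 
IOException keeps the root cause visible.

{code}
// Self-contained demo of why logging the IOException before cleanup matters:
// the field stays null when initialize() throws, so stop() raises an NPE that
// would otherwise hide the original failure.
import java.io.IOException;

public class InitFailureDemo {
  private Object namesystem;            // stays null because initialize() throws

  InitFailureDemo() throws IOException {
    try {
      initialize();
    } catch (IOException e) {
      // The supportability suggestion: record the root cause before cleanup,
      // so a later NPE in stop() cannot hide it.
      System.err.println("Initialization failed: " + e);
      stop();
      throw e;
    }
  }

  private void initialize() throws IOException {
    throw new IOException("simulated failure during initialization");
  }

  private void stop() {
    // Dereferencing the uninitialized field reproduces the NPE pattern from
    // the reported stack trace.
    System.out.println(namesystem.toString());
  }

  public static void main(String[] args) {
    try {
      new InitFailureDemo();
    } catch (IOException | NullPointerException e) {
      e.printStackTrace();
    }
  }
}
{code}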

[jira] [Updated] (HDFS-9394) branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader initialization, because HftpFileSystem is missing.

2015-11-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9394:

Attachment: HDFS-9394.000.branch-2.patch

Thank you [~cnauroth] for reporting this. As [~wheat9] said, when we separated 
the classes into {{hadoop-hdfs-client}}, we tried to address this in [HDFS-9166]. 
I think the original patch should work just fine, but it was probably not fully 
committed.

Hopefully the fix is simple. Let's see if the v0 patch works.

> branch-2 hadoop-hdfs-client fails during FileSystem ServiceLoader 
> initialization, because HftpFileSystem is missing.
> 
>
> Key: HDFS-9394
> URL: https://issues.apache.org/jira/browse/HDFS-9394
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9394.000.branch-2.patch
>
>
> On branch-2, hadoop-hdfs-client contains a {{FileSystem}} service descriptor 
> that lists {{HftpFileSystem}} and {{HsftpFileSystem}}.  These classes do not 
> reside in hadoop-hdfs-client.  Instead, they reside in hadoop-hdfs.  If the 
> application has hadoop-hdfs-client.jar on the classpath, but not 
> hadoop-hdfs.jar, then this can cause a {{ServiceConfigurationError}}.
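
The snippet below is a minimal, stand-alone Java illustration (not Hadoop code) of 
this failure mode: when a META-INF/services descriptor names a provider class that 
is absent from the classpath, ServiceLoader throws ServiceConfigurationError while 
iterating, before the caller ever asks for that provider. The JDK's CharsetProvider 
service is used here only because it is always available.

{code}
// Minimal illustration of the ServiceLoader failure mode behind this issue.
import java.nio.charset.spi.CharsetProvider;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class ServiceLoaderDemo {
  public static void main(String[] args) {
    try {
      for (CharsetProvider provider : ServiceLoader.load(CharsetProvider.class)) {
        System.out.println("Loaded provider: " + provider.getClass().getName());
      }
    } catch (ServiceConfigurationError e) {
      // This is what applications hit when hadoop-hdfs-client's FileSystem
      // descriptor lists HftpFileSystem but hadoop-hdfs.jar is not present.
      System.err.println("A provider named in META-INF/services is missing: " + e);
    }
  }
}
{code}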



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt

2015-11-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994229#comment-14994229
 ] 

Steve Loughran commented on HDFS-9328:
--

I think it's Power. Nobody owns up to Itanium, as nobody has the power budget to 
build a rack with enough nodes for 3x redundancy to work as a storage 
mechanism.

> Formalize coding standards for libhdfs++ and put them in a README.txt
> -
>
> Key: HDFS-9328
> URL: https://issues.apache.org/jira/browse/HDFS-9328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Blocker
> Attachments: HDFS-9328.HDFS-8707.000.patch
>
>
> We have 2-3 people working on this project full time and hopefully more 
> people will start contributing.  In order to efficiently scale we need a 
> single, easy to find, place where developers can check to make sure they are 
> following the coding standards of this project to both save their time and 
> save the time of people doing code reviews.
> The most practical place to do this seems like a README file in libhdfspp/. 
> The foundation of the standards is google's C++ guide found here: 
> https://google-styleguide.googlecode.com/svn/trunk/cppguide.html
> Any exceptions to google's standards or additional restrictions need to be 
> explicitly enumerated so there is one single point of reference for all 
> libhdfs++ code standards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-9258) NN should indicate which nodes are stale

2015-11-06 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-9258 started by Kuhu Shukla.
-
> NN should indicate which nodes are stale
> 
>
> Key: HDFS-9258
> URL: https://issues.apache.org/jira/browse/HDFS-9258
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>
> Determining why the NN is not coming out of safemode is difficult - is it a 
> bug or pending block reports?  If the number of nodes appears sufficient, but 
> there are missing blocks, it would be nice to know which nodes haven't block 
> reported (stale).  Instead of forcing the NN to leave safemode prematurely, 
> the SE can first force block reports from stale nodes.
> The datanode report and the web ui's node list should contain this 
> information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt

2015-11-06 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994193#comment-14994193
 ] 

James Clampffer commented on HDFS-9328:
---

Good idea on the markdown.

I'd really like this to be a complete set of rules, to avoid new-rule surprises 
down the road. I used short circuit as an example because I happened to know 
it would be an exception. There are plenty of other places where I could see 
adding that sort of thing if I were only concerned about x86-64.

I'd hate for someone to work really hard on a patch that makes some really cool 
but platform-specific optimizations, only to have the idea shot down during 
code review.



> Formalize coding standards for libhdfs++ and put them in a README.txt
> -
>
> Key: HDFS-9328
> URL: https://issues.apache.org/jira/browse/HDFS-9328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Blocker
> Attachments: HDFS-9328.HDFS-8707.000.patch
>
>
> We have 2-3 people working on this project full time and hopefully more 
> people will start contributing.  In order to efficiently scale we need a 
> single, easy to find, place where developers can check to make sure they are 
> following the coding standards of this project to both save their time and 
> save the time of people doing code reviews.
> The most practical place to do this seems like a README file in libhdfspp/. 
> The foundation of the standards is google's C++ guide found here: 
> https://google-styleguide.googlecode.com/svn/trunk/cppguide.html
> Any exceptions to google's standards or additional restrictions need to be 
> explicitly enumerated so there is one single point of reference for all 
> libhdfs++ code standards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt

2015-11-06 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994179#comment-14994179
 ] 

James Clampffer commented on HDFS-9328:
---

I'll change it to markdown as you and [~wheat9] suggested.

Good idea about alignment/endianness. I'll try to get this running on an ARM 
machine in big-endian mode and see if anything shakes out of the existing code.

Out of curiosity, what architectures are people running Hadoop/HDFS on that 
can't do unaligned accesses? Itanium or SPARC?
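
The toy Java example below (not libhdfs++ code) shows the portability hazard the 
alignment/endianness point is about: decoding the same bytes with the host's 
native byte order versus an explicit big-endian (network) order gives different 
results on little-endian machines, which is why portable client code should state 
the byte order it expects.

{code}
// Stand-alone demonstration of why byte order must be stated explicitly.
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndiannessDemo {
  public static void main(String[] args) {
    byte[] wire = {0x00, 0x00, 0x00, 0x2A};   // the value 42, big-endian

    int bigEndian = ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    int hostOrder = ByteBuffer.wrap(wire).order(ByteOrder.nativeOrder()).getInt();

    // On a little-endian host the two decodes disagree (42 vs. 704643072);
    // code that relies on native order silently breaks across platforms.
    System.out.println("big-endian decode:   " + bigEndian);
    System.out.println("native-order decode: " + hostOrder);
  }
}
{code}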



> Formalize coding standards for libhdfs++ and put them in a README.txt
> -
>
> Key: HDFS-9328
> URL: https://issues.apache.org/jira/browse/HDFS-9328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Blocker
> Attachments: HDFS-9328.HDFS-8707.000.patch
>
>
> We have 2-3 people working on this project full time and hopefully more 
> people will start contributing.  In order to efficiently scale we need a 
> single, easy to find, place where developers can check to make sure they are 
> following the coding standards of this project to both save their time and 
> save the time of people doing code reviews.
> The most practical place to do this seems like a README file in libhdfspp/. 
> The foundation of the standards is google's C++ guide found here: 
> https://google-styleguide.googlecode.com/svn/trunk/cppguide.html
> Any exceptions to google's standards or additional restrictions need to be 
> explicitly enumerated so there is one single point of reference for all 
> libhdfs++ code standards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-11-06 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6481:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.2
   Status: Resolved  (was: Patch Available)

Committed this patch. It's a low-risk change, so I committed it to branch-2.7 
as well.

Thanks for diagnosing and fixing this [~szetszwo].

> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.7.2
>
> Attachments: h6481_20151105.patch, hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java

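The snippet below is a minimal, stand-alone Java sketch of the defensive check the 
summary asks for (it is not the committed h6481 patch): validate that the 
storage-ID array lines up with the datanode list before indexing into it, so 
callers get an explanatory error instead of the ArrayIndexOutOfBoundsException in 
the trace above.

{code}
// Sketch of the length check suggested by the summary, with hypothetical names.
public class StorageIdCheckDemo {

  static void checkStorageIDs(String[] datanodeIDs, String[] storageIDs) {
    if (storageIDs.length != datanodeIDs.length) {
      throw new IllegalArgumentException("Expected " + datanodeIDs.length
          + " storage IDs but got " + storageIDs.length);
    }
  }

  public static void main(String[] args) {
    String[] datanodeIDs = {"dn-1", "dn-2"};
    String[] storageIDs = {};          // the reported case: an empty array
    try {
      checkStorageIDs(datanodeIDs, storageIDs);
    } catch (IllegalArgumentException e) {
      // With the check in place the failure is self-explanatory rather than
      // "ArrayIndexOutOfBoundsException: 0".
      System.err.println(e.getMessage());
    }
  }
}
{code}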