[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2016-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242483#comment-15242483
 ] 

Kai Zheng commented on HDFS-7859:
-

bq. IIUC, persistence is more necessary when we support custom schemas, isn't 
it?
I think you're right. For the builtin schemas and policies, IIRC, there was a 
consideration that we still need to persist them to handle software upgrades 
(since the builtin ones may change).
bq. Do we have any plan to implement HDFS-7337?
I think many of the considerations originally targeted by that issue have 
already been implemented elsewhere, so the only thing left is custom codec and 
schema support. I don't think there is a strong requirement for this feature, 
but perhaps we can implement it in phase II.

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Xinwei Qin 
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
> HDFS-7859.001.patch, HDFS-7859.002.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2016-04-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242464#comment-15242464
 ] 

Rakesh R commented on HDFS-7859:


The proposed patch in this JIRA handles saving and loading the schema in the 
fsimage/editlog. IIUC, persistence is more necessary when we support custom 
schemas, isn't it? I can see we are still discussing ways to support HDFS-7337 
and have not yet reached a common agreement. Do we have any plan to implement 
HDFS-7337?

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Xinwei Qin 
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
> HDFS-7859.001.patch, HDFS-7859.002.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.





[jira] [Created] (HDFS-10297) Increase default balance bandwidth and concurrent moves

2016-04-14 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10297:
-

 Summary: Increase default balance bandwidth and concurrent moves
 Key: HDFS-10297
 URL: https://issues.apache.org/jira/browse/HDFS-10297
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Minor


Adjust the default values to better support current customer host and network 
configurations.

Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} from 
1 MB/s to 10 MB/s. This applies to the DN. 10 MB/s is about 10% of a GbE 
network.

Increase the default for property {{dfs.datanode.balance.max.concurrent.moves}} 
from 5 to 50. This applies to both the DN and the Balancer. For comparison, the 
default number of DN receiver threads is 4096 and the default number of 
Balancer mover threads is 1000.
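If applied, the proposed defaults would correspond to settings like the 
following in hdfs-site.xml (illustrative only; the values shown here express 
the proposal above, with the bandwidth given in bytes per second):

```xml
<!-- hdfs-site.xml: sketch of the proposed new defaults -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <!-- 10 MB/s = 10485760 bytes/s; read by the DataNode -->
  <value>10485760</value>
</property>
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <!-- read by both the DataNode and the Balancer -->
  <value>50</value>
</property>
```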





[jira] [Commented] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Takashi Ohnishi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242399#comment-15242399
 ] 

Takashi Ohnishi commented on HDFS-10216:


Thank you [~jingzhao] for committing !

Thank you [~jzhuge] for helpful reviewing !!

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
> Fix For: 2.9.0
>
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.
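The exception comes from {{java.net.URI}} itself: when a scheme and authority 
are present, the path must start with a slash, and the stack trace shows a 
"./d1/..."-style path being rejected. A minimal standalone repro of that JDK 
behavior (class and host names here are illustrative, not from the patch):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    // Returns true when URI construction fails with the same error distcp hit:
    // "Relative path in absolute URI".
    static boolean failsAsRelative(String scheme, String authority, String path) {
        try {
            new URI(scheme, authority, path, null, null);
            return false;
        } catch (URISyntaxException e) {
            return "Relative path in absolute URI".equals(e.getReason());
        }
    }

    public static void main(String[] args) {
        // Mirrors the stack trace: a relative "./d1/.snapshot/s2" path fails.
        System.out.println(failsAsRelative("hdfs", "example.com:8020", "./d1/.snapshot/s2"));
        // An absolute path like "/user/systest/d1" is accepted.
        System.out.println(failsAsRelative("hdfs", "example.com:8020", "/user/systest/d1"));
    }
}
```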





[jira] [Updated] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9820:

Description: 
A common use scenario (scenario 1): 

# create snapshot sx in clusterX, 
# do some experiments in clusterX, which creates some files, 
# throw away the changed files and go back to sx.

In another scenario (scenario 2), there is a production cluster and a backup 
cluster; we periodically sync data from the production cluster to the backup 
cluster with distcp. 

The cluster in scenario 1 could be the backup cluster in scenario 2.

For scenario 1:

HDFS-4167 intends to restore HDFS to the most recent snapshot, but it involves 
some complexity and challenges. Until that jira is implemented, we rely on 
distcp to copy from the snapshot to the current state. However, the performance 
of this operation can be very bad because we have to go through all files even 
if only a few files changed.

For scenario 2:

HDFS-7535 improved distcp performance by avoiding copying files that changed 
name since last backup.

On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
from a source to a target cluster by copying only the files changed since the 
last backup. It works by using snapshot diff to find all changed files and 
copying only those files.

See 
https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/

This jira proposes a variation of HDFS-8828: find the files changed in the 
target cluster since the last snapshot sx, and copy them from snapshot sx of 
either the source or the target cluster, to restore the target cluster's 
current state to sx. 

Specifically,

If a file/dir is:

- renamed, rename it back
- created in the target cluster, delete it
- modified, add it to the copy list

Then run distcp with the copy list, copying from the source cluster's 
corresponding snapshot. This could be a new command line switch -rdiff in 
distcp.

As a native restore feature, HDFS-4167 would still be ideal to have. However, 
HDFS-9820 will hopefully be easier to implement in the meantime, before 
HDFS-4167 is in place.
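The per-entry handling above can be sketched as follows (hypothetical types 
and names; the real DistCp snapshot-diff code uses different classes):

```java
import java.util.ArrayList;
import java.util.List;

public class RdiffSketch {
    // Hypothetical diff-entry kinds, standing in for the snapshot-diff report.
    enum Kind { RENAME, CREATE, MODIFY }

    static class DiffEntry {
        final Kind kind;
        final String path;
        DiffEntry(Kind kind, String path) { this.kind = kind; this.path = path; }
    }

    // Partition diff entries into the three actions described above:
    // rename back, delete created files, and build the copy list.
    static List<String> buildCopyList(List<DiffEntry> diff,
                                      List<String> renameBack,
                                      List<String> delete) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : diff) {
            switch (e.kind) {
                case RENAME: renameBack.add(e.path); break;
                case CREATE: delete.add(e.path);     break;
                case MODIFY: copyList.add(e.path);   break;
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        List<DiffEntry> diff = new ArrayList<>();
        diff.add(new DiffEntry(Kind.CREATE, "/d1/new"));
        diff.add(new DiffEntry(Kind.MODIFY, "/d1/changed"));
        List<String> renameBack = new ArrayList<>();
        List<String> delete = new ArrayList<>();
        // Only the modified file ends up on the distcp copy list.
        System.out.println(buildCopyList(diff, renameBack, delete));
        System.out.println(delete);
    }
}
```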


  was:
HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
some complexity and challenges. 

HDFS-7535 improved distcp performance by avoiding copying files that changed 
name since last backup.

On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
from source to target cluster, by only copying changed files since last backup. 
The way it works is use snapshot diff to find out all files changed, and copy 
the changed files only.

See 
https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/

This jira is to propose a variation of HDFS-8828, to find out the files changed 
in target cluster since last snapshot sx, and copy these from the source 
target's same snapshot sx, to restore target cluster to sx.

If a file/dir is

- renamed, rename it back
- created in target cluster, delete it
- modified, put it to the copy list
- run distcp with the copy list, copy from the source cluster's corresponding 
snapshot

This could be a new command line switch -rdiff in distcp.

HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 would 
hopefully be easier to implement.




> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> A common use scenario (scenario 1): 
> # create snapshot sx in clusterX, 
> # do some experiments in clusterX, which creates some files, 
> # throw away the changed files and go back to sx.
> In another scenario (scenario 2), there is a production cluster and a backup 
> cluster; we periodically sync data from the production cluster to the backup 
> cluster with distcp. 
> The cluster in scenario 1 could be the backup cluster in scenario 2.
> For scenario 1:
> HDFS-4167 intends to restore HDFS to the most recent snapshot, but it 
> involves some complexity and challenges. Until that jira is implemented, we 
> rely on distcp to copy from the snapshot to the current state. However, the 
> performance of this operation can be very bad because we have to go through 
> all files even if only a few files changed.
> For scenario 2:
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since 

[jira] [Commented] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242299#comment-15242299
 ] 

Hadoop QA commented on HDFS-10224:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
1s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 1s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 7s 
{color} | {color:red} root: patch generated 10 new + 143 unchanged - 2 fixed = 
153 total (was 145) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 54s 
{color} | {color:red} hadoop-common-project/hadoop-common generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 4m 49s 
{color} | {color:red} hadoop-common-project_hadoop-common-jdk1.8.0_77 with JDK 
v1.8.0_77 generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 8m 29s 
{color} | {color:red} hadoop-common-project_hadoop-common-jdk1.7.0_95 with JDK 
v1.7.0_95 generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 7s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 53s 

[jira] [Commented] (HDFS-10258) Erasure Coding: support small cluster whose #DataNode < # (Blocks in a BlockGroup)

2016-04-14 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242287#comment-15242287
 ] 

Li Bo commented on HDFS-10258:
--

Thanks, Kai, for the idea. I will try to find a solution with the lowest cost.

> Erasure Coding: support small cluster whose #DataNode < # (Blocks in a 
> BlockGroup)
> --
>
> Key: HDFS-10258
> URL: https://issues.apache.org/jira/browse/HDFS-10258
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
>
> Currently EC does not support small clusters where the number of DataNodes 
> is smaller than the number of blocks in a block group. This sub-task will 
> solve that problem.
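For concreteness, with the default RS-6-3 policy a full block group spans 
6 data + 3 parity = 9 blocks, so any cluster with fewer than 9 DataNodes hits 
this limitation. A trivial sketch of the size check (hypothetical helper, not 
actual NameNode code):

```java
public class EcPlacementCheck {
    // For an RS(k, m) policy, a full block group needs k data + m parity
    // blocks, each ideally on a distinct DataNode.
    static boolean clusterTooSmall(int dataNodes, int dataBlocks, int parityBlocks) {
        return dataNodes < dataBlocks + parityBlocks;
    }

    public static void main(String[] args) {
        // With RS-6-3, a 5-node cluster cannot host a full block group.
        System.out.println(clusterTooSmall(5, 6, 3));
        // A 12-node cluster can.
        System.out.println(clusterTooSmall(12, 6, 3));
    }
}
```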





[jira] [Updated] (HDFS-9940) Balancer should not use property name dfs.datanode.balance.max.concurrent.moves

2016-04-14 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-9940:
-
Summary: Balancer should not use property name 
dfs.datanode.balance.max.concurrent.moves  (was: Balancer should not use 
property name )

> Balancer should not use property name 
> dfs.datanode.balance.max.concurrent.moves
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Updated] (HDFS-9940) Balancer should not use property name

2016-04-14 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-9940:
-
Summary: Balancer should not use property name   (was: Rename 
dfs.balancer.max.concurrent.moves to avoid confusion)

> Balancer should not use property name 
> --
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242208#comment-15242208
 ] 

Mingliang Liu commented on HDFS-10284:
--

I see no related failing tests.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/





[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242203#comment-15242203
 ] 

Hadoop QA commented on HDFS-10284:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 13s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 164m 14s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| JDK v1.8.0_77 Timed out junit tests | 
org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
|   | 

[jira] [Commented] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242189#comment-15242189
 ] 

Mingliang Liu commented on HDFS-10293:
--

I don't see related failing tests.

Still, I'm surprised that we have so many intermittently failing tests. I have 
not seen a pre-commit that passes all UT for a while.

> StripedFileTestUtil#readAll flaky
> -
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10293.000.patch
>
>
> The flaky test helper method causes several unit tests to fail 
> intermittently. 
> For example, the 
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
>  timed out in a recent run (see 
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
>  which can be easily reproduced locally.
> Debugging the code suggests that the helper method is stuck in an infinite 
> loop. We need a fix to make the test robust.
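One common way to make such a read helper robust is to bound the loop and fail 
fast instead of spinning forever. A sketch under that assumption (hypothetical 
code, not the actual StripedFileTestUtil fix):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BoundedReadAll {
    // Reads into buf until EOF or the buffer is full, but gives up after
    // maxIterations reads instead of looping forever.
    static int readAll(InputStream in, byte[] buf, int maxIterations) {
        int total = 0;
        try {
            for (int i = 0; i < maxIterations; i++) {
                if (total == buf.length) {
                    return total; // buffer full
                }
                int n = in.read(buf, total, buf.length - total);
                if (n < 0) {
                    return total; // EOF reached
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        throw new IllegalStateException(
            "readAll did not reach EOF after " + maxIterations + " reads");
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        // Reads 4 bytes, then sees EOF on the next iteration.
        int n = readAll(new ByteArrayInputStream(new byte[]{1, 2, 3, 4}), buf, 100);
        System.out.println(n); // 4
    }
}
```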





[jira] [Commented] (HDFS-10289) Balancer configures DNs directly

2016-04-14 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242183#comment-15242183
 ] 

John Zhuge commented on HDFS-10289:
---

Split into 2 tasks:
* HDFS-10294: Balancer configure DN properties still with NN APIs
* HDFS-10295: Switch DN APIs and only config DNs involved in the balancing

HDFS-10295 is a further enhancement, so it is conceivable that the simpler 
HDFS-10294 will be accepted first.

> Balancer configures DNs directly
> 
>
> Key: HDFS-10289
> URL: https://issues.apache.org/jira/browse/HDFS-10289
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
>
> Balancer directly configures the 2 balance-related properties 
> (bandwidthPerSec and concurrentMoves) on the DNs involved.
> Details:
> * Before each balancing iteration, set the properties on all DNs involved in 
> the current iteration.
> * The DN property changes will not survive restart.
> * Balancer gets the property values from command line or its config file.
> * Need new DN APIs to query and set the 2 properties.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin 
> -setBalancerBandwidth}} to configure every DN in the cluster.
> Pros:
> * Improve ease of use because all configuration is done in one place, the 
> Balancer. We have seen many customers forget to set concurrentMoves properly 
> since it is required on both the DN and the Balancer.
> * Support new DNs added between iterations
> * Handle DN restarts between iterations
> * May be able to dynamically adjust the thresholds in different iterations, 
> though it is unclear how useful that would be.
> Cons:
> * New DN property API
> * A malicious or misconfigured balancer may overwhelm DNs, though {{hdfs 
> dfsadmin -setBalancerBandwidth}} has the same issue, and the Balancer can 
> only be run by an admin.
> Questions:
> * Can we create {{BalancerConcurrentMovesCommand}} similar to 
> {{BalancerBandwidthCommand}}? Can Balancer use them directly without going 
> through NN?
> One proposal to implement HDFS-7466 calls for an API to query DN properties. 
> The DN Conf Servlet returns all config properties; it returns neither an 
> individual property nor the value set by {{hdfs dfsadmin 
> -setBalancerBandwidth}}.





[jira] [Created] (HDFS-10295) Balancer configures DN properties directly on-demand

2016-04-14 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10295:
-

 Summary: Balancer configures DN properties directly on-demand
 Key: HDFS-10295
 URL: https://issues.apache.org/jira/browse/HDFS-10295
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: John Zhuge
Assignee: John Zhuge


This is a further enhancement to HDFS-10294. Instead of using NN APIs, use new 
DN APIs to query and set necessary properties on the DNs involved.

Details:
* Before each balancing iteration, set the properties on all DNs involved in 
the current iteration.
* Need new DN APIs to query and set the balancing properties.





[jira] [Created] (HDFS-10296) FileContext.getDelegationTokens() fails to obtain KMS delegation token

2016-04-14 Thread Andreas Neumann (JIRA)
Andreas Neumann created HDFS-10296:
--

 Summary: FileContext.getDelegationTokens() fails to obtain KMS 
delegation token
 Key: HDFS-10296
 URL: https://issues.apache.org/jira/browse/HDFS-10296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 2.6.0
 Environment: CDH 5.6 with a Java KMS
Reporter: Andreas Neumann


This little program demonstrates the problem: With FileSystem, we can get both 
the HDFS and the kms-dt token, whereas with FileContext, we can only obtain the 
HDFS delegation token. 

{code}
// Imports added for a self-contained example; generics restored
// (the angle brackets appear to have been stripped in transit).
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SimpleTest {

  public static void main(String[] args) throws IOException {
    YarnConfiguration hConf = new YarnConfiguration();
    String renewer = "renewer";

    FileContext fc = FileContext.getFileContext(hConf);
    List<Token<?>> tokens = fc.getDelegationTokens(new Path("/"), renewer);
    for (Token<?> token : tokens) {
      System.out.println("Token from FC: " + token);
    }

    FileSystem fs = FileSystem.get(hConf);
    for (Token<?> token : fs.addDelegationTokens(renewer, new Credentials())) {
      System.out.println("Token from FS: " + token);
    }
  }
}
{code}
Sample output (host/user name x'ed out):
{noformat}
Token from FC: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:xxx, Ident: 
(HDFS_DELEGATION_TOKEN token 49 for xxx)
Token from FS: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:xxx, Ident: 
(HDFS_DELEGATION_TOKEN token 50 for xxx)
Token from FS: Kind: kms-dt, Service: xx.xx.xx.xx:16000, Ident: 00 04 63 64 61 
70 07 72 65 6e 65 77 65 72 00 8a 01 54 16 96 c2 95 8a 01 54 3a a3 46 95 0e 02
{noformat}
Apparently FileContext does not return the KMS token. 





[jira] [Updated] (HDFS-10294) Balancer configures DN properties

2016-04-14 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10294:
--
Priority: Major  (was: Critical)

> Balancer configures DN properties
> -
>
> Key: HDFS-10294
> URL: https://issues.apache.org/jira/browse/HDFS-10294
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>
> Balancer configures the 2 balance-related properties (bandwidthPerSec and 
> concurrentMoves) using NN API {{get/setBalancerBandwidth}} and the new 
> {{get/setBalancerConcurrentMoves}}.
> Details:
> * Upon the start of the balancer, set the DN properties.
> * Use NN API to query and set the 2 properties. There might be a slight delay 
> for the property changes to be propagated to all DNs.
> * The DN property changes will not survive restart.
> * Balancer gets the property values from command line or its config file.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin 
> -setBalancerBandwidth}} to configure every DN in the cluster.





[jira] [Commented] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242159#comment-15242159
 ] 

Hadoop QA commented on HDFS-10293:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
1s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 132m 15s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 111m 30s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
36s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 275m 2s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | 

[jira] [Created] (HDFS-10294) Balancer configures DN properties

2016-04-14 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10294:
-

 Summary: Balancer configures DN properties
 Key: HDFS-10294
 URL: https://issues.apache.org/jira/browse/HDFS-10294
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: balancer & mover
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Critical


Balancer configures the 2 balance-related properties (bandwidthPerSec and 
concurrentMoves) using NN API {{get/setBalancerBandwidth}} and the new 
{{get/setBalancerConcurrentMoves}}.

Details:
* Upon the start of the balancer, set the DN properties.
* Use NN API to query and set the 2 properties. There might be a slight delay 
for the property changes to be propagated to all DNs.
* The DN property changes will not survive restart.
* Balancer gets the property values from command line or its config file.
* No need to edit the config file on each DN or run {{hdfs dfsadmin 
-setBalancerBandwidth}} to configure every DN in the cluster.





[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242150#comment-15242150
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Hi [~jingzhao],

Thanks for proposing offline discussion, I was thinking about the same:-) Just 
shared contact info.

Because of the similarity to HDFS-7535/HDFS-8828, the change can indeed be 
small (I have tried). In the latest patch, some changes try to address the 
asymmetric output (HDFS-10263) by always going with the forward snapshot diff; 
some other changes are intended to reorganize the code for better readability.

For completeness, it would be appreciated if you could respond to the comments 
I made in my prior update.

Thanks.



> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in target cluster since last snapshot sx, and copy these from the 
> source target's same snapshot sx, to restore target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.
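The restore steps in the description amount to replaying the reverse snapshot diff against the target. A pseudocode sketch of the proposal (the {{-rdiff}} switch is only proposed here, and the helper names are hypothetical):

{code}
# Restore the target cluster to its snapshot sx, using the source
# cluster's copy of sx as the data source.
diff = target.snapshotDiff(currentState, sx)      # reverse direction
for entry in diff:
    if entry is RENAME:  rename the path back to its name in sx
    if entry is CREATE:  delete the path from the target
    if entry is MODIFY:  add the path to the copy list
distcp(copyList, from = source cluster's snapshot sx, to = target)
{code}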





[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242139#comment-15242139
 ] 

Masatake Iwasaki commented on HDFS-9905:


[~jojochuang], can you update the patch to address comments above?

> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know details.





[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242127#comment-15242127
 ] 

Masatake Iwasaki commented on HDFS-9905:


bq. I don't know what would cause SocketTimeoutException to give a null message 
instead of the expected Read timed out.

Though the underlying implementations of 
[PlainSocketImpl|http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/34c594b52b73/src/solaris/native/java/net/PlainSocketImpl.c]
 and 
[SocketInputStream|http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/34c594b52b73/src/solaris/native/java/net/SocketInputStream.c]
 throw SocketTimeoutException with the expected message, a SocketTimeoutException 
without a message could be thrown by {{SocksSocketImpl#remainingMillis}} before 
reaching those code paths, if the connect timeout is set to a very small value.
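The two message variants can be seen directly from the exception class itself. A minimal, self-contained sketch (this is not the WebHDFS code; the class and variable names are illustrative):

```java
import java.net.SocketTimeoutException;

public class TimeoutMessageDemo {
    public static void main(String[] args) {
        // Message set explicitly, as the socket connect/read paths do.
        SocketTimeoutException withMessage =
            new SocketTimeoutException("connect timed out");
        // The no-arg constructor leaves the detail message null, matching
        // what a caller that rethrows without a message can surface.
        SocketTimeoutException withoutMessage = new SocketTimeoutException();

        System.out.println(withMessage.getMessage());    // prints: connect timed out
        System.out.println(withoutMessage.getMessage()); // prints: null
    }
}
```

This is why a test asserting on {{getMessage()}} must guard against null rather than assume the "connect timed out" text is always present.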

I'm +1 on the fix of {{WebHdfsFileSystem#AbstractRunner#runWithRetry}} 
suggested by [~eepayne] in addition to 001.


> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know details.





[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242112#comment-15242112
 ] 

Jing Zhao commented on HDFS-9820:
-

[~yzhangal], I have a very good understanding of HDFS-10263, but I'm not 
sure you understand my point about why this issue can be solved in a much 
easier way... Please let me know if you want an offline discussion.

> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in target cluster since last snapshot sx, and copy these from the 
> source target's same snapshot sx, to restore target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.





[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-14 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242094#comment-15242094
 ] 

Xiaoyu Yao commented on HDFS-10207:
---

One additional comment on the patch: backoff reconfiguration is added for all 
of the client/service/lifeline ports. This is not necessary, as we never want 
to back off RPC requests on the service and lifeline ports. We only need to 
support reconfiguring backoff for the client RPC port. 
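For illustration, the static way to enable backoff today keys the property by port number in the NameNode configuration. A sketch assuming the common default client RPC port 8020 (the port value is an assumption, not from this thread):

{code:xml}
<!-- Sketch: ipc.#port#.backoff.enable keyed by the client RPC port.
     8020 is an assumed default; substitute your cluster's port. -->
<property>
  <name>ipc.8020.backoff.enable</name>
  <value>true</value>
</property>
{code}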

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.





[jira] [Commented] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-14 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242061#comment-15242061
 ] 

Xiaobing Zhou commented on HDFS-10224:
--

v001 is posted. FutureX is renamed to AsyncX and unit tests are added. 
[~szetszwo], thank you for the review.

> Implement an asynchronous DistributedFileSystem
> ---
>
> Key: HDFS-10224
> URL: https://issues.apache.org/jira/browse/HDFS-10224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10224-HDFS-9924.000.patch, 
> HDFS-10224-HDFS-9924.001.patch, HDFS-10224-and-HADOOP-12909.000.patch
>
>
> This is proposed to implement an asynchronous DistributedFileSystem based on 
> AsyncFileSystem APIs in HADOOP-12910. In addition, rename is implemented as 
> well.





[jira] [Updated] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-14 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10224:
-
Attachment: HDFS-10224-HDFS-9924.001.patch

> Implement an asynchronous DistributedFileSystem
> ---
>
> Key: HDFS-10224
> URL: https://issues.apache.org/jira/browse/HDFS-10224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10224-HDFS-9924.000.patch, 
> HDFS-10224-HDFS-9924.001.patch, HDFS-10224-and-HADOOP-12909.000.patch
>
>
> This is proposed to implement an asynchronous DistributedFileSystem based on 
> AsyncFileSystem APIs in HADOOP-12910. In addition, rename is implemented as 
> well.





[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242030#comment-15242030
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Many thanks [~jingzhao]. Good discussion!

1.
{quote}
No. This is incorrect. We allow {{distcp -diff s1 .}}. "s2" can be done after 
the copy. See TestDistCpSync#testSyncWithCurrent as an example.
{quote}
If some changes are made while distcp is running, or afterwards but before s2 
is created, then what gets copied is not exactly the content of s2. Right?

2.
{quote}
This assumption must be verified before the new distcp. Currently we do a 
snapshot diff report on target (between from and ".") to check. This check 
cannot be dropped as in your current patch.
{quote}
I certainly agree that we should do the checking. I emphasized assumptions I 
and II in my last comment. However, since the checking can only be done at the 
beginning of distcp, if some changes are made before s2 is created, they will 
be missed by the checking. So I think we need to document that no change should 
be made while we do this operation.

3.
{quote}
I mean "" or "." should never be used as the fromState in distcp -diff command, 
otherwise we have no way to verify there is no change happening on target. So 
we actually should use "s2" here.
{quote}
Then in the case HDFS-9820 tries to solve, are you suggesting creating a 
snapshot s2 first (for the sake of doing the check), before reverting back to 
s1? The issue described in #2 above also applies.

4.
{quote}
This is also wrong. In command line "." is the alias of the current state.
{quote}
I saw distcp was using {{""}}; maybe we should change it to consistently use {{"."}}.

5.
{quote}
For any modification/creation happening under a renamed directory, the diff 
report always uses the paths before the rename (as reported by HDFS-10263). 
prepareDiffList changes these paths to new paths after the rename, but when 
applying the reverse diff, we do not need to do this.
{quote}
Renaming x in s1 to y in s2 means that x is the original name before the 
rename, as reported by snapshotDiff(s1, s2), where s1 is fromSS and s2 is toSS.
When we look at the reversion, the rename operation becomes renaming y in s2 
to x in s1, so y should be the original name before the rename, as I expect to 
see in the report of snapshotDiff(s2, s1), where s2 is fromSS and s1 is toSS.

However, snapshotDiff(s2, s1) still uses the name in s1 as the original name 
(x in this case; I really expect it to be y), though it does change the order 
of the operands compared with snapshotDiff(s1, s2). This is the issue I 
reported in HDFS-10263; you can see some examples there.

Basically I expect snapshotDiff(fromSS, toSS) to use the names in fromSS. In 
the reversion case, that is the "." state. This is the symmetry I was 
referring to.
Does this explanation make sense?

Thanks again!


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in target cluster since last snapshot sx, and copy these from the 
> source target's same snapshot sx, to restore target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.





[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241943#comment-15241943
 ] 

Jing Zhao commented on HDFS-9820:
-

bq. One small correction: before we do the incremental copy, we create a 
snapshot s2 on source cluster first
No. This is incorrect. We allow {{distcp -diff s1 .}}. "s2" can be done after 
the copy. See TestDistCpSync#testSyncWithCurrent as an example.

bq. We assume that no changes have been made at target cluster after s1 was 
created before we do incremental copy in this case (assumption I)
This assumption must be verified before the new distcp. Currently we do a 
snapshot diff report on target (between {{from}} and ".") to check. This check 
cannot be dropped as in your current patch.

bq. Do you mean if "" ever appear as one parameter of -diff
I mean "" or "." should *never* be used as the {{fromState}} in {{distcp 
-diff}} command, otherwise we have no way to verify there is no change 
happening on target. So we actually should use "s2" here.

bq. Because "" is just an alias of current state "snapshot"
This is also wrong. In command line "." is the alias of the current state.

bq. -diff   is what I feel more intuitive
bq. But if this is what you prefer, we can relax the order requirement, and let 
"" means revert operation. Would you please confirm?
What I mean is: we should always use "-diff  ", but instead 
of using ".", we should use "s2". No change is necessary on {{DistCpOptions}}.

bq. bypass DistCpSync#prepareDiffList
For any modification/creation happening under a renamed directory, the diff 
report always uses the paths before the rename (as reported by HDFS-10263). 
{{prepareDiffList}} changes these paths to new paths after the rename, but when 
applying the reverse diff, we do not need to do this.


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in target cluster since last snapshot sx, and copy these from the 
> source target's same snapshot sx, to restore target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.





[jira] [Commented] (HDFS-10283) o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241932#comment-15241932
 ] 

Mingliang Liu commented on HDFS-10283:
--

Failing tests are not related, as the changes are only for the test case 
{{o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending}},
 which passed in the pre-commit run. I also ran it locally ~10 times and it 
was good.

> o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending
>  fails intermittently
> --
>
> Key: HDFS-10283
> URL: https://issues.apache.org/jira/browse/HDFS-10283
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10283.000.patch
>
>
> The test fails with exception as following: 
> {code}
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}





[jira] [Updated] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10284:
-
Attachment: HDFS-10284.001.patch

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241905#comment-15241905
 ] 

Hudson commented on HDFS-10281:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9616 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9616/])
HDFS-10281. TestPendingCorruptDnMessages fails intermittently. (kihwal: rev 
b9c9d03591a49be31f3fbc738d01a31700bfdbc4)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPendingCorruptDnMessages.java


> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241897#comment-15241897
 ] 

Mingliang Liu commented on HDFS-10284:
--

I found that {{BlockManagerSafeMode$SafeModeMonitor#canLeave}} does not check 
{{namesystem#inTransitionToActive()}}, while it should. Following the fix in 
[HDFS-10192], I think we should add this check to prevent the {{smmthread}} 
from calling {{leaveSafeMode()}}.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241868#comment-15241868
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Thanks a lot [~jingzhao]!

My thoughts to share:

1.
{quote}
Let's say we first have snapshot s1 on both the source and target (and the source 
and the target have been synced). Then we make some changes on the source, do a 
forward incremental distcp copy to apply the changes to the target. Based on 
our assumption, before the next incremental copy, we will create a snapshot s2 
on both the source and the target.
{quote}
This is HDFS-7535/HDFS-8828. One small correction: before we do the incremental 
copy, we first create a snapshot s2 on the source cluster, compute the snapshot 
diff between s1 and s2, apply this diff to the target cluster, and finally 
create s2 on the target cluster. In this case we assume that no changes have 
been made on the target cluster after s1 was created and before the incremental 
copy runs (*assumption I*).

2. Do you mean that if {{""}} ever appears as one parameter of {{-diff}}, then 
it's a revert operation, and otherwise it's a forward operation?
 
In theory, we could copy incremental changes from the source cluster to the 
destination cluster without creating a new snapshot (s2 in our example). Say, 
after s1 is made in the source cluster, sync-ed to the target cluster, and also 
created in the target cluster, we could interpret

{{distcp -diff s1 "" source target}}

as incrementally copying the changes made after s1 in the source cluster to the 
target, right? That works because {{""}} is just an alias for the current-state 
"snapshot".

I personally feel it's more intuitive to rely on the parameter order, and let 
{{-diff s1 s2}} mean the forward change from s1 to s2, and {{-diff s2 s1}} mean 
the revert change from s2 back to s1. Say a cluster is already at state s2: 
doing {{-diff s1 s2}} would be a no-op, while doing {{-diff s2 s1}} means going 
back to s1. In other words, {{-diff  }} is what I find more intuitive.
 
But if this is what you prefer, we can relax the order requirement and let 
{{""}} mean a revert operation. Would you please confirm? 

And would you please let me know whether my comment #1 in my previous reply 
makes sense to you?

3. I don't quite follow what you meant by "bypass DistCpSync#prepareDiffList". 
Some more details would help.

Many thanks.


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828: find out the files changed 
> in the target cluster since its last snapshot sx, and copy these from the 
> source cluster's same snapshot sx, to restore the target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241859#comment-15241859
 ] 

Mingliang Liu commented on HDFS-10281:
--

Thanks for the review and commit, [~kihwal].

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10281:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-2 and branch-2.8. Thanks for fixing this, 
[~liuml07].

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241844#comment-15241844
 ] 

Eric Payne commented on HDFS-9905:
--

bq.  java.net.SocksSocketImpl is possible to throw SocketTimeoutException with 
null message. We seem not to be able to expect that SocketTimeoutException 
always contains message such as "Read timed out" or "connect timed out".
bq. Use GenericTestUtils.assertExceptionContains instead of Assert.assertEquals 
so that if the string doesn't match, it logs the exception.

Thanks, [~iwasakims] and [~jojochuang] for your work on this issue. I don't 
know what would cause {{SocketTimeoutException}} to give a null message instead 
of the expected {{Read timed out}}. However, your point about the original 
stack trace being lost is a very good one:
bq. the exception object was reinterpreted in the exception handling, so the 
original stack trace was lost.

In {{WebHdfsFileSystem#AbstractRunner#runWithRetry}}, the code that recreates 
the exception with the node name should also propagate the stack trace:
{code}
  ioe = ioe.getClass().getConstructor(String.class)
.newInstance(node + ": " + ioe.getMessage());
{code}
Should be:
{code}
  IOException newIoe =
  ioe.getClass().getConstructor(String.class)
.newInstance(node + ": " + ioe.getMessage());
  newIoe.setStackTrace(ioe.getStackTrace());
  ioe = newIoe;
{code}
I can open a separate JIRA for this if you want.
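As a standalone sketch of this fix (the {{rewrap}} helper, class name, and node 
string below are illustrative, not actual Hadoop code), re-creating the 
exception via reflection and then copying the stack trace preserves the 
original failure site:
{code}
import java.io.IOException;

public class RewrapDemo {
    // Re-create an IOException subtype with a node-name prefix while keeping
    // the original stack trace (mirrors the suggested runWithRetry change).
    static IOException rewrap(IOException ioe, String node) throws Exception {
        IOException newIoe = ioe.getClass()
            .getConstructor(String.class)
            .newInstance(node + ": " + ioe.getMessage());
        newIoe.setStackTrace(ioe.getStackTrace());  // propagate original trace
        return newIoe;
    }

    public static void main(String[] args) throws Exception {
        IOException orig = new java.net.SocketTimeoutException("Read timed out");
        IOException wrapped = rewrap(orig, "dn1.example.com");
        // Message gains the node prefix; the trace still points at the
        // original throw site rather than the rewrap call.
        System.out.println(wrapped.getMessage());
        System.out.println(wrapped.getStackTrace()[0]
            .equals(orig.getStackTrace()[0]));
    }
}
{code}
This only works for exception classes that expose a {{(String)}} constructor, 
which is the same assumption the existing {{runWithRetry}} code already makes.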

> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241836#comment-15241836
 ] 

Kihwal Lee commented on HDFS-10281:
---

The test failures don't seem to be related, as this patch only touched one test 
which didn't fail here.
+ lgtm

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241815#comment-15241815
 ] 

Mingliang Liu commented on HDFS-10292:
--

A space between "block" and {{last}} would be better, but it's fine without it.

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for better debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241811#comment-15241811
 ] 

Jing Zhao commented on HDFS-10293:
--

+1 pending Jenkins.

> StripedFileTestUtil#readAll flaky
> -
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10293.000.patch
>
>
> The flaky test helper method causes several UT tests to fail intermittently. 
> For example, the 
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
>  timed out in a recent run (see 
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
>  which can be easily reproduced locally.
> Debugging the code, chances are that the helper method is stuck in an 
> infinite loop. We need a fix to make the test robust.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241808#comment-15241808
 ] 

Hadoop QA commented on HDFS-10281:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 32s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 22s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
35s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 119m 55s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| JDK v1.8.0_77 Timed out junit tests | 
org.apache.hadoop.hdfs.TestDFSClientRetries |
|   | org.apache.hadoop.hdfs.TestLeaseRecovery |
|   | 

[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241806#comment-15241806
 ] 

Mingliang Liu commented on HDFS-10284:
--

[~brahmareddy], would you review this for me? Thanks.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241802#comment-15241802
 ] 

Mingliang Liu commented on HDFS-10281:
--

Thanks for studying the pre-commit build issue. Let's wait for the test result. 
I ran it ~10 times locally and it was good.

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241781#comment-15241781
 ] 

Kihwal Lee commented on HDFS-10281:
---

https://builds.apache.org/job/PreCommit-HDFS-Build/15143/console
It did try to build the first time but failed.
{noformat}
error: pathspec 'trunk' did not match any file(s) known to git.
ERROR: git checkout --force trunk is failing
{noformat}

This one seems to be working okay..
https://builds.apache.org/job/PreCommit-HDFS-Build/15166/console

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT test, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} failed intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241778#comment-15241778
 ] 

Hudson commented on HDFS-10292:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9615 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9615/])
HDFS-10292. Add block id when client got Unable to close file exception. 
(kihwal: rev 2c155afe2736a5571bbb3bdfb2fe6f9709227229)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java


> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for better debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10292:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed it to trunk, branch-2 and branch-2.8. Thanks for the patch, 
[~brahmareddy].

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for better debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241761#comment-15241761
 ] 

Kihwal Lee commented on HDFS-10292:
---

+1 lgtm. I will commit it shortly.

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for better debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10293:
-
Attachment: HDFS-10293.000.patch

The code is as follows:


{code}
  static int readAll(FSDataInputStream in, byte[] buf) throws IOException {
int readLen = 0;
int ret;
while ((ret = in.read(buf, readLen, buf.length - readLen)) >= 0 &&
readLen <= buf.length) {
  readLen += ret;
}
return readLen;
  }
{code}

If {{readLen}} equals {{buf.length}}, then {{buf.length - readLen}} will 
be zero, and {{in.read()}} will simply return zero without reading from the 
stream. In this case, no exception is thrown, and the code is stuck in the 
while-loop.

One possible fix is to tighten the condition to {{(ret = in.read(buf, readLen, 
buf.length - readLen)) > 0 && readLen < buf.length}}. A probably better fix is 
to use {{IOUtils.readFully()}}, which throws an IOException if it reads a 
premature EOF from the input stream; see the v0 patch.
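The tightened loop can be sketched outside Hadoop with a plain {{InputStream}} as a simplified stand-in for {{FSDataInputStream}} (class and method names here are illustrative, not the actual patch):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAllFix {
    // Stop once the buffer is full, and treat ret <= 0 as end of stream,
    // so a full buffer can no longer spin forever on zero-byte reads.
    static int readAll(InputStream in, byte[] buf) throws IOException {
        int readLen = 0;
        int ret;
        while (readLen < buf.length &&
               (ret = in.read(buf, readLen, buf.length - readLen)) > 0) {
            readLen += ret;
        }
        return readLen;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4};
        byte[] buf = new byte[data.length];
        // With the original condition, this exact case (buffer length equal to
        // stream length) looped forever; now it terminates after filling the buffer.
        int n = readAll(new ByteArrayInputStream(data), buf);
        System.out.println("read " + n + " bytes");
    }
}
```

{{IOUtils.readFully()}} additionally fails loudly on a short read, which is usually the better behavior in a test helper.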

> StripedFileTestUtil#readAll flaky
> -
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10293.000.patch
>
>
> The flaky test helper method causes several unit tests to fail intermittently. 
> For example, the 
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
>  timed out in a recent run (see 
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
>  which can be easily reproduced locally.
> Debugging the code, chances are that the helper method is stuck in an 
> infinite loop. We need a fix to make the test robust.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10293:
-
Status: Patch Available  (was: Open)

> StripedFileTestUtil#readAll flaky
> -
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10293.000.patch
>
>
> The flaky test helper method causes several unit tests to fail intermittently. 
> For example, the 
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
>  timed out in a recent run (see 
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
>  which can be easily reproduced locally.
> Debugging the code, chances are that the helper method is stuck in an 
> infinite loop. We need a fix to make the test robust.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10293:
-
Issue Type: Sub-task  (was: Bug)
Parent: HDFS-8031

> StripedFileTestUtil#readAll flaky
> -
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> The flaky test helper method causes several unit tests to fail intermittently. 
> For example, the 
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
>  timed out in a recent run (see 
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
>  which can be easily reproduced locally.
> Debugging the code, chances are that the helper method is stuck in an 
> infinite loop. We need a fix to make the test robust.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10293) StripedFileTestUtil#readAll flaky

2016-04-14 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-10293:


 Summary: StripedFileTestUtil#readAll flaky
 Key: HDFS-10293
 URL: https://issues.apache.org/jira/browse/HDFS-10293
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding, test
Affects Versions: 3.0.0
Reporter: Mingliang Liu
Assignee: Mingliang Liu


The flaky test helper method causes several unit tests to fail intermittently. For 
example, the 
{{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
 timed out in a recent run (see 
[exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
 which can be easily reproduced locally.

Debugging the code, chances are that the helper method is stuck in an 
infinite loop. We need a fix to make the test robust.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-14 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241716#comment-15241716
 ] 

Xiaoyu Yao commented on HDFS-10207:
---

[~xiaobingo], can you rebase the patch onto trunk, as it no longer applies? 
Also check whether testNameNodeGetReconfigurableProperties needs to be updated for the 
new reconfigurable property. Thanks!

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10263) Provide symmetric entries in reversed snapshot diff report

2016-04-14 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241707#comment-15241707
 ] 

John Zhuge commented on HDFS-10263:
---

Thanks [~yzhangal] for the great report. {{SnapshotDiffReport}} API should be 
stable and well-documented, not only for {{distcp -diff}}, but also for other 
polyglot applications, in order for it to be better adopted. It may be a good 
starting point to beef up {{TestSnapshotDiffReport}} with a complete set of 
behavior-based unit tests.

> Provide symmetric entries in reversed snapshot diff report
> --
>
> Key: HDFS-10263
> URL: https://issues.apache.org/jira/browse/HDFS-10263
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, snapshots
>Reporter: Yongjun Zhang
>Assignee: Jing Zhao
>
> Steps to reproduce:
> 1. Take a snapshot s1 at:
> {code}
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup   1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/foo
> -rw-r--r--   1 yzhang supergroup   1024 2016-04-05 14:48 /target/foo/f1
> {code}
> 2. Make the following change:
> {code}
>   private int changeData7(Path dir) throws Exception {
> final Path foo = new Path(dir, "foo");
> final Path foo2 = new Path(dir, "foo2");
> final Path foo_f1 = new Path(foo, "f1");
> final Path foo2_f2 = new Path(foo2, "f2");
> final Path foo2_f1 = new Path(foo2, "f1");
> final Path foo_d1 = new Path(foo, "d1");
> final Path foo_d1_f3 = new Path(foo_d1, "f3");
> int numDeletedAndModified = 0;
> dfs.rename(foo, foo2);
> dfs.delete(foo2_f1, true);
> 
> DFSTestUtil.createFile(dfs, foo_f1, BLOCK_SIZE, DATA_NUM, 0L);
> DFSTestUtil.appendFile(dfs, foo_f1, (int) BLOCK_SIZE);
> dfs.rename(foo_f1, foo2_f2);
> numDeletedAndModified += 1; // "M ./foo"
> DFSTestUtil.createFile(dfs, foo_d1_f3, BLOCK_SIZE, DATA_NUM, 0L);
> return numDeletedAndModified;
>   }
> {code}
> that results in
> {code}
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup   1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/foo
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/foo/d1
> -rw-r--r--   1 yzhang supergroup   1024 2016-04-05 14:48 /target/foo/d1/f3
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/foo2
> -rw-r--r--   1 yzhang supergroup   2048 2016-04-05 14:48 /target/foo2/f2
> {code}
> 3. take snapshot s2 here
> 4. Do the following to revert the change done in step 2
> {code}
>  private int revertChangeData7(Path dir) throws Exception {
> final Path foo = new Path(dir, "foo");
> final Path foo2 = new Path(dir, "foo2");
> final Path foo_f1 = new Path(foo, "f1");
> final Path foo2_f2 = new Path(foo2, "f2");
> final Path foo2_f1 = new Path(foo2, "f1");
> final Path foo_d1 = new Path(foo, "d1");
> final Path foo_d1_f3 = new Path(foo_d1, "f3");
> int numDeletedAndModified = 0;
> 
> dfs.delete(foo_d1, true);
> dfs.rename(foo2_f2, foo_f1);
> 
> dfs.delete(foo, true);
> 
> DFSTestUtil.createFile(dfs, foo2_f1, BLOCK_SIZE, DATA_NUM, 0L);
> DFSTestUtil.appendFile(dfs, foo2_f1, (int) BLOCK_SIZE);
> dfs.rename(foo2,  foo);
> 
> return numDeletedAndModified;
>   }
> {code}
> that get the following results:
> {code}
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup   1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup  0 2016-04-05 14:48 /target/foo
> -rw-r--r--   1 yzhang supergroup   2048 2016-04-05 14:48 /target/foo/f1
> {code}
> 4. Take snapshot s3 here.
> Below is the different snapshots
> {code}
> s1-s2: Difference between snapshot s1 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo
> + ./foo/f2
> - ./foo/f1
> s2-s1: Difference between snapshot s2 and snapshot s1 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo
> - ./foo/f2
> + ./foo/f1
> s2-s3: Difference between snapshot s2 and snapshot s3 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo2
> + ./foo2/f1
> - ./foo2/f2
> s3-s2: Difference between snapshot s3 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo2
> - ./foo2/f1
> + ./foo2/f2
> {code}
> The s2-s1 diff is supposed to be the same as s2-s3, because the change 
> from s2 to s3 is an exact reversion of the change from s1 to s2. We can see 
> 

[jira] [Commented] (HDFS-7499) Add NFSv4 + Kerberos / client authentication support

2016-04-14 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241677#comment-15241677
 ] 

John Zhuge commented on HDFS-7499:
--

Wonder whether we can add an HDFS based layout type for 
[pNFS|http://tools.ietf.org/html/rfc5661#section-12], similar to [Object-Based 
Parallel NFS (pNFS) Operations|http://tools.ietf.org/html/rfc5664]. The storage 
protocol can be a C/C++ based {{DFSClient}}.

> Add NFSv4 + Kerberos / client authentication support
> 
>
> Key: HDFS-7499
> URL: https://issues.apache.org/jira/browse/HDFS-7499
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.4.0
> Environment: HDP2.1
>Reporter: Hari Sekhon
>
> We have a requirement for secure file share access to HDFS on a kerberized 
> cluster.
> This is spun off from HDFS-7488 where adding Kerberos to the front end client 
> was considered, I believe this would require NFSv4 support?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10286) Fix TestDFSAdmin#testNameNodeGetReconfigurableProperties

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241668#comment-15241668
 ] 

Hudson commented on HDFS-10286:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9613 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9613/])
HDFS-10286. Fix TestDFSAdmin#testNameNodeGetReconfigurableProperties. (xyao: 
rev 809226752dd109e16956038017dece16ada6ee0f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java


> Fix TestDFSAdmin#testNameNodeGetReconfigurableProperties
> 
>
> Key: HDFS-10286
> URL: https://issues.apache.org/jira/browse/HDFS-10286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Fix For: 2.9.0
>
> Attachments: HDFS-10286.000.patch
>
>
> HDFS-10209 introduced a new reconfigurable property, which requires an 
> update to the validation in 
> TestDFSAdmin#testNameNodeGetReconfigurableProperties. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10286) Fix TestDFSAdmin#testNameNodeGetReconfigurableProperties

2016-04-14 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10286:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Thanks [~xiaobingo] for the contribution. I committed the patch to trunk and 
branch-2.

> Fix TestDFSAdmin#testNameNodeGetReconfigurableProperties
> 
>
> Key: HDFS-10286
> URL: https://issues.apache.org/jira/browse/HDFS-10286
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Fix For: 2.9.0
>
> Attachments: HDFS-10286.000.patch
>
>
> HDFS-10209 introduced a new reconfigurable property, which requires an 
> update to the validation in 
> TestDFSAdmin#testNameNodeGetReconfigurableProperties. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241643#comment-15241643
 ] 

Jing Zhao commented on HDFS-9820:
-

Let's say we first have snapshot s1 on both the source and target (and the source 
and the target have been synced). Then we make some changes on the source, do a 
forward incremental distcp copy to apply the changes to the target. Based on 
our assumption, before the next incremental copy, we will create a snapshot s2 
on both the source and the target.

Let's say at this time you want to restore the target back to s1. Theoretically 
we only need to do "distcp -diff s2 s1", where s2 is the {{from}} snapshot and 
s1 is the {{to}} snapshot. Note there is no diff between the current states and 
s2 on the target. Only after verifying this can we continue the incremental 
distcp. Because of the lack of HDFS-10263, we need to make slight changes when 
applying the reversed diff, i.e., to bypass {{DistCpSync#prepareDiffList}}. 
This requires the distcp tool to understand s2 is actually after s1, and we can 
call {{getListing}} against {{path-of-snapshottable-dir/.snapshot}} to achieve 
this.

We should allow users to pass in two snapshots in any order. The only 
restriction here is the {{from}} snapshot must also exist in the target cluster 
and there is no difference between this snapshot and the current status in the 
target cluster. 

> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexities and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira is to propose a variation of HDFS-8828, to find out the files 
> changed in the target cluster since the last snapshot sx, and copy these files from 
> the source cluster's same snapshot sx, to restore the target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.
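The per-file restore rules above can be sketched as a small planning step. This is a hedged illustration only: {{Diff}}, its fields, the {{Kind}} values, and the action strings are hypothetical stand-ins, not actual distcp classes.

```java
import java.util.ArrayList;
import java.util.List;

public class RestorePlan {
    enum Kind { RENAMED, CREATED, MODIFIED }

    // Hypothetical diff entry: what changed on the target since snapshot sx.
    record Diff(Kind kind, String path, String oldPath) {}

    // Translate target-side changes into restore actions per the rules above:
    // renamed -> rename back, created -> delete, modified -> copy from source sx.
    static List<String> plan(List<Diff> diffs) {
        List<String> actions = new ArrayList<>();
        for (Diff d : diffs) {
            switch (d.kind()) {
                case RENAMED  -> actions.add("rename " + d.path() + " back to " + d.oldPath());
                case CREATED  -> actions.add("delete " + d.path());
                case MODIFIED -> actions.add("copy " + d.path() + " from source snapshot sx");
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        List<String> actions = plan(List.of(
            new Diff(Kind.RENAMED, "/d/foo2", "/d/foo"),
            new Diff(Kind.CREATED, "/d/new", null),
            new Diff(Kind.MODIFIED, "/d/bar", null)));
        actions.forEach(System.out::println);
    }
}
```

The actual tool would then run the copies as a normal distcp job against the source cluster's snapshot.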



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241634#comment-15241634
 ] 

Mingliang Liu commented on HDFS-10287:
--

This will be a good improvement and users of {{MiniDFSCluster}} will be 
grateful. If backward compatibility is a concern and {{close()}} is idempotent 
(seems true), implementing {{Closeable}} can be an alternative.
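The intended pattern, sketched with a toy stand-in (the real {{MiniDFSCluster}} is not shown; {{FakeCluster}} and its fields are illustrative only):

```java
// Toy stand-in for a cluster whose close() delegates to shutdown() idempotently.
class FakeCluster implements AutoCloseable {
    boolean running = true;

    void shutdown() { running = false; }

    @Override
    public void close() {
        // Idempotent: a second close() is a no-op, which eases Closeable compatibility.
        if (running) {
            shutdown();
        }
    }
}

public class TryWithResourcesDemo {
    public static void main(String[] args) {
        FakeCluster observed;
        try (FakeCluster cluster = new FakeCluster()) {
            observed = cluster;
            // ... a test body would use the running cluster here ...
        }
        // close() ran automatically when the try block exited.
        System.out.println("still running after try: " + observed.running);
    }
}
```

With this shape, test code no longer needs a finally block just to shut the cluster down.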

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be 
> backported to Hadoop version prior to 2.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241630#comment-15241630
 ] 

Hudson commented on HDFS-10216:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9612 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9612/])
HDFS-10216. Distcp -diff throws exception when handling relative path. (jing9: 
rev 404f57f328b00a42ec8b952ad08cd7a80207c7f2)
* 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSync.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/SimpleCopyListing.java


> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
> Fix For: 2.9.0
>
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241629#comment-15241629
 ] 

Hudson commented on HDFS-10280:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9612 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9612/])
HDFS-10280. Document new dfsadmin command -evictWriters. Contributed by 
(kihwal: rev c970f1d00525e4273075cff7406dcbd71305abd5)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md


> Document new dfsadmin command -evictWriters
> ---
>
> Key: HDFS-10280
> URL: https://issues.apache.org/jira/browse/HDFS-10280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
> Attachments: HDFS-10280.001.patch
>
>
> HDFS-9945 added a new dfsadmin command -evictWriters, which is great.
> However I noticed typing {{dfs dfsadmin}} does not show a command line help 
> summary. It is shown only when I type {{dfs dfsadmin  -help}}.
> Also, it would be great to document it in {{HDFS Commands Guide}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10281) o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10281:
-
Attachment: HDFS-10281.001.patch

Thanks [~jnp] for reviewing.

Re-uploading the patch to trigger Jenkins; no changes in the patch.

> o.a.h.hdfs.server.namenode.ha.TestPendingCorruptDnMessages fails 
> intermittently
> ---
>
> Key: HDFS-10281
> URL: https://issues.apache.org/jira/browse/HDFS-10281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10281.000.patch, HDFS-10281.001.patch
>
>
> In our daily UT runs, we found that 
> {{TestPendingCorruptDnMessages#testChangedStorageId}} fails intermittently; 
> see the following information:
> *Error Message*
> expected:<1> but was:<0>
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.getRegisteredDatanodeUid(TestPendingCorruptDnMessages.java:124)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages.testChangedStorageId(TestPendingCorruptDnMessages.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241612#comment-15241612
 ] 

Masatake Iwasaki commented on HDFS-9905:


... s/TestWebHdfsTimeouts#runWithRetry/WebHdfsFileSystem#runWithRetry/

> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10280:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-2.8. Thanks for reporting and fixing 
this, [~jojochuang].

> Document new dfsadmin command -evictWriters
> ---
>
> Key: HDFS-10280
> URL: https://issues.apache.org/jira/browse/HDFS-10280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
> Attachments: HDFS-10280.001.patch
>
>
> HDFS-9945 added a new dfsadmin command -evictWriters, which is great.
> However I noticed typing {{dfs dfsadmin}} does not show a command line help 
> summary. It is shown only when I type {{dfs dfsadmin  -help}}.
> Also, it would be great to document it in {{HDFS Commands Guide}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241602#comment-15241602
 ] 

Masatake Iwasaki commented on HDFS-9905:


s/TestWebHdfsTimeouts#runWithRetry/WebHdfsTimeouts#runWithRetry/

> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241584#comment-15241584
 ] 

Kihwal Lee commented on HDFS-10280:
---

+1 lgtm

> Document new dfsadmin command -evictWriters
> ---
>
> Key: HDFS-10280
> URL: https://issues.apache.org/jira/browse/HDFS-10280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10280.001.patch
>
>
> HDFS-9945 added a new dfsadmin command -evictWriters, which is great.
> However I noticed typing {{dfs dfsadmin}} does not show a command line help 
> summary. It is shown only when I type {{dfs dfsadmin  -help}}.
> Also, it would be great to document it in {{HDFS Commands Guide}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241574#comment-15241574
 ] 

Kihwal Lee commented on HDFS-9905:
--

[~eepayne], do you by any chance have any idea about these test failures?

> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10216:
-
Priority: Major  (was: Critical)

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
> Fix For: 2.9.0
>
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.<init>(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.
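The failure above can be reproduced with the JDK alone: {{org.apache.hadoop.fs.Path.initialize}} hands the path to a multi-argument {{java.net.URI}} constructor, which rejects a relative path whenever a scheme is present. A minimal sketch of that behavior (the class name here is illustrative, not from the DistCp code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    public static void main(String[] args) {
        try {
            // Mirrors what Path.initialize does: with a scheme ("hdfs")
            // present, a relative path ("./d1/...") is rejected by
            // URI.checkPath before parsing even begins.
            new URI("hdfs", "host:8020", "./d1/.snapshot/s2", null, null);
            System.out.println("parsed");
        } catch (URISyntaxException e) {
            // prints "Relative path in absolute URI"
            System.out.println(e.getReason());
        }
    }
}
```

This is why the absolute-path invocation succeeds while the relative one fails: once the snapshot diff logic re-prefixes the scheme and authority, the remaining path must start with "/".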



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10216:
-
  Resolution: Fixed
   Fix Version/s: 2.9.0
Target Version/s:   (was: 2.8.0)
  Status: Resolved  (was: Patch Available)

+1. I've committed this to trunk and branch-2. Thanks for the fix, [~bwtakacy]. 
Thanks for reporting the issue and reviewing the patch, [~jzhuge]. 

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.<init>(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10216:
-
Affects Version/s: (was: 2.6.0)
   2.8.0

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.<init>(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241569#comment-15241569
 ] 

Masatake Iwasaki commented on HDFS-9905:


{code}
  private static final int SHORT_SOCKET_TIMEOUT = 5;
{code}

I had to decrease the value of SHORT_SOCKET_TIMEOUT to reproduce the issue 
within a few tries in my environment. Maybe just increasing the value to 20 or 
30 is enough to avoid the issue even on heavily loaded build servers.


> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241567#comment-15241567
 ] 

Mingliang Liu commented on HDFS-10284:
--

Failing tests are not related.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9905) TestWebHdfsTimeouts fails occasionally

2016-04-14 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241555#comment-15241555
 ] 

Masatake Iwasaki commented on HDFS-9905:


Hmm... the stack trace printed by {{GenericTestUtils.assertExceptionContains}} 
does not show the root cause, because {{TestWebHdfsTimeouts#runWithRetry}} 
recreates the SocketTimeoutException in order to add the host address to the 
message.

I got the following stack trace by commenting out the exception-recreating part 
of {{TestWebHdfsTimeouts#runWithRetry}}. 
[java.net.SocksSocketImpl|http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/34c594b52b73/src/share/classes/java/net/SocksSocketImpl.java#l103]
 can throw a SocketTimeoutException with a null message, so it seems we cannot 
expect that a SocketTimeoutException always contains a message such as 
"Read timed out" or "connect timed out".

{noformat}
java.lang.AssertionError: Expected to find ': Read timed out' but got 
unexpected exception:java.net.SocketTimeoutException
at java.net.SocksSocketImpl.remainingMillis(SocksSocketImpl.java:111)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at 
sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
at 
sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
at 
sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
at 
sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:684)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:637)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:709)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:555)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:586)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:582)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1466)
at 
org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts.testAuthUrlReadTimeout(TestWebHdfsTimeouts.java:198)
{noformat}
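Given that, any diagnostic the test builds should tolerate a null detail 
message and assert on the exception type instead. A minimal, hypothetical 
sketch (the helper name is illustrative, not from the patch):

```java
import java.net.SocketTimeoutException;

public class TimeoutMessageSketch {
    // Hypothetical helper: build a diagnostic string that survives the
    // null detail message a SocketTimeoutException may carry (e.g. one
    // thrown by SocksSocketImpl.remainingMillis).
    static String describe(SocketTimeoutException e, String host) {
        String msg = e.getMessage();
        return host + ": " + (msg != null ? msg : "timed out (no detail message)");
    }

    public static void main(String[] args) {
        System.out.println(describe(new SocketTimeoutException(), "127.0.0.1:50070"));
        System.out.println(describe(new SocketTimeoutException("connect timed out"),
            "127.0.0.1:50070"));
    }
}
```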


> TestWebHdfsTimeouts fails occasionally
> --
>
> Key: HDFS-9905
> URL: https://issues.apache.org/jira/browse/HDFS-9905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9905.001.patch
>
>
> When checking for a timeout, it does get {{SocketTimeoutException}}, but the 
> message sometimes does not contain "connect timed out". Since the original 
> exception is not logged, we do not know the details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10282) The VolumeScanner should warn about replica files which are misplaced

2016-04-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241515#comment-15241515
 ] 

Colin Patrick McCabe commented on HDFS-10282:
-

Thanks, [~kihwal].

> The VolumeScanner should warn about replica files which are misplaced
> -
>
> Key: HDFS-10282
> URL: https://issues.apache.org/jira/browse/HDFS-10282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-10282.001.patch, HDFS-10282.002.patch
>
>
> The VolumeScanner should warn about replica files which are misplaced



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241514#comment-15241514
 ] 

Wei-Chiu Chuang commented on HDFS-10280:


I had to leave one checkstyle violation in place: the existing code already 
violates the rule, and my change only re-indents it. We could fix the existing 
code as well, but I see little value in that.

> Document new dfsadmin command -evictWriters
> ---
>
> Key: HDFS-10280
> URL: https://issues.apache.org/jira/browse/HDFS-10280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10280.001.patch
>
>
> HDFS-9945 added a new dfsadmin command -evictWriters, which is great.
> However, I noticed that typing {{dfs dfsadmin}} does not show a command-line 
> help summary for it; the summary appears only when I type {{dfs dfsadmin -help}}.
> Also, it would be great to document it in {{HDFS Commands Guide}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-04-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241455#comment-15241455
 ] 

Arpit Agarwal commented on HDFS-9349:
-

Look for commit ddfe677. 

> Support reconfiguring fs.protected.directories without NN restart
> -
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 2.9.0
>
> Attachments: HDFS-9349-HDFS-9000.003.patch, 
> HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, 
> HDFS-9349-HDFS-9000.006.patch, HDFS-9349-HDFS-9000.007.patch, 
> HDFS-9349-HDFS-9000.008.patch, HDFS-9349.001.patch, HDFS-9349.002.patch
>
>
> This is to reconfigure
> {code}
> fs.protected.directories
> {code}
> without restarting NN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10292:

Affects Version/s: 2.7.2

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception; 
> having the block id available makes debugging easier.
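For illustration, the improvement could be as small as interpolating the last 
block's id into the message. A hypothetical sketch, not the actual DFSClient 
code (the class and method names are invented):

```java
import java.io.IOException;

public class CloseErrorSketch {
    // Hypothetical: build the "Unable to close file" exception with the
    // last block's id included, so the DN logs for that block can be
    // located directly from the client-side error.
    static IOException unableToClose(String src, long blockId) {
        return new IOException("Unable to close file " + src
            + " (last block: blk_" + blockId + ")");
    }

    public static void main(String[] args) {
        System.out.println(unableToClose("/user/foo/data", 1073741825L).getMessage());
    }
}
```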



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241426#comment-15241426
 ] 

Hadoop QA commented on HDFS-10280:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
228 unchanged - 0 fixed = 229 total (was 228) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 59s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 29s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 184m 53s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.fs.TestWebHdfsFileContextMainOperations |
|   | hadoop.hdfs.tools.TestDFSAdmin |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| JDK 

[jira] [Commented] (HDFS-10289) Balancer configures DNs directly

2016-04-14 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241396#comment-15241396
 ] 

John Zhuge commented on HDFS-10289:
---

Thanks [~anu], I will take a look.

> Balancer configures DNs directly
> 
>
> Key: HDFS-10289
> URL: https://issues.apache.org/jira/browse/HDFS-10289
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
>
> Balancer directly configures the 2 balance-related properties 
> (bandwidthPerSec and concurrentMoves) on the DNs involved.
> Details:
> * Before each balancing iteration, set the properties on all DNs involved in 
> the current iteration.
> * The DN property changes will not survive restart.
> * Balancer gets the property values from command line or its config file.
> * Need new DN APIs to query and set the 2 properties.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin 
> -setBalancerBandwidth}} to configure every DN in the cluster.
> Pros:
> * Improve ease of use because all configurations are done at one place, the 
> balancer. We have seen many customers forget to set concurrentMoves properly, 
> since it must be configured on both the DN and the Balancer.
> * Support new DNs added between iterations
> * Handle DN restarts between iterations
> * May be able to dynamically adjust the thresholds in different iterations. 
> Don't know how useful though.
> Cons:
> * New DN property API
> * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin 
> -setBalancerBandwidth}} has the same issue. Also Balancer can only be run by 
> admin.
> Questions:
> * Can we create {{BalancerConcurrentMovesCommand}} similar to 
> {{BalancerBandwidthCommand}}? Can Balancer use them directly without going 
> through NN?
> One proposal to implement HDFS-7466 calls for an API to query DN properties. 
> The DN Conf Servlet returns all config properties; it does not return an 
> individual property, nor the value set by {{hdfs dfsadmin 
> -setBalancerBandwidth}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10216:
--
Hadoop Flags: Reviewed

+1 LGTM

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.<init>(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure the problem only exists with option {{-diff}}. 
> Trying to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241378#comment-15241378
 ] 

Hadoop QA commented on HDFS-10292:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 39s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 24s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
32s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 46s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798748/HDFS-10292.patch |
| JIRA Issue | HDFS-10292 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 349fbc280626 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-10289) Balancer configures DNs directly

2016-04-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241375#comment-15241375
 ] 

Anu Engineer commented on HDFS-10289:
-

As part of the diskBalancer work we have added an API that might be useful for 
you. There is a DN RPC in the HDFS-1312 branch which allows you to query generic 
properties from the DataNode. Please look at HDFS-9647 if you are interested. If 
you find this API useful, you are most welcome to build this feature on the 
HDFS-1312 branch.

Unfortunately, the API is named getDiskBalancerSetting (with 
DiskBalancerSettingRequestProto). You might want to rename that call to 
getDatanodeSetting, or something to that effect, to make it generic.
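To make the rename suggestion above concrete, here is a minimal sketch of what a generic datanode-settings call could look like. The interface name, method names, and in-memory implementation are illustrative assumptions for this thread, not the actual HDFS-1312 API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a generic datanode-settings call; names are
// illustrative assumptions, not the actual HDFS-1312 RPC definitions.
public class DatanodeSettingsSketch {
    interface DatanodeSettings {
        /** Query a named runtime property, e.g. the balancer bandwidth. */
        String getDatanodeSetting(String key);
        /** Update a runtime property; the change does not survive restart. */
        void setDatanodeSetting(String key, String value);
    }

    /** Trivial in-memory stand-in for the DN-side RPC handler. */
    static class InMemorySettings implements DatanodeSettings {
        private final Map<String, String> props = new HashMap<>();
        public String getDatanodeSetting(String key) { return props.get(key); }
        public void setDatanodeSetting(String key, String value) { props.put(key, value); }
    }

    public static void main(String[] args) {
        DatanodeSettings dn = new InMemorySettings();
        // A balancer could push its two settings before each iteration.
        dn.setDatanodeSetting("dfs.datanode.balance.bandwidthPerSec", "10485760");
        dn.setDatanodeSetting("dfs.datanode.balance.max.concurrent.moves", "50");
        System.out.println(dn.getDatanodeSetting("dfs.datanode.balance.max.concurrent.moves"));
    }
}
```

With such a generic call, both balancer-related properties (and future ones) could be queried and set through one RPC instead of a disk-balancer-specific one.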


> Balancer configures DNs directly
> 
>
> Key: HDFS-10289
> URL: https://issues.apache.org/jira/browse/HDFS-10289
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
>
> Balancer directly configures the 2 balance-related properties 
> (bandwidthPerSec and concurrentMoves) on the DNs involved.
> Details:
> * Before each balancing iteration, set the properties on all DNs involved in 
> the current iteration.
> * The DN property changes will not survive restart.
> * Balancer gets the property values from command line or its config file.
> * Need new DN APIs to query and set the 2 properties.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin 
> -setBalancerBandwidth}} to configure every DN in the cluster.
> Pros:
> * Improve ease of use because all configurations are done at one place, the 
> balancer. We saw many customers often forgot to set concurrentMoves properly 
> since it is required on both DN and Balancer.
> * Support new DNs added between iterations
> * Handle DN restarts between iterations
> * May be able to dynamically adjust the thresholds in different iterations. 
> Don't know how useful though.
> Cons:
> * New DN property API
> * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin 
> -setBalancerBandwidth}} has the same issue. Also Balancer can only be run by 
> admin.
> Questions:
> * Can we create {{BalancerConcurrentMovesCommand}} similar to 
> {{BalancerBandwidthCommand}}? Can Balancer use them directly without going 
> through NN?
> One proposal to implement HDFS-7466 calls for an API to query DN properties. 
> DN Conf Servlet returns all config properties. It does not return individual 
> property and it does not return the value set by {{hdfs dfsadmin 
> -setBalancerBandwidth}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot

2016-04-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241365#comment-15241365
 ] 

Yongjun Zhang commented on HDFS-9820:
-

Hi [~jingzhao],

Thanks a lot for your review and comments!

Here are my replies, in the same order as your questions; I hope they make sense 
to you:

# Without the HDFS-10263 fix, internally I always use the forward snapshot diff 
and do the transformation from there. I'm not sure whether your first question 
suggests we keep using the reverse diff that doesn't have the HDFS-10263 fix, 
and translate the result to be symmetric with the forward snapshot diff (the 
same thing HDFS-10263 would have achieved). If so, because the result would 
still need another (existing) transformation, as we currently do, that would 
cause the complexity I referred to in HDFS-10263.
# We now use {{-diff "" }} at the command line to get the same behavior as 
{{-rdiff }} in the last patch rev. Due to the lack of HDFS-10263, I swapped the 
source and target internally (and added the {{useRdiff}} flag to indicate the 
swap), and always use the forward snapshot diff. 
# It seems you mean we should allow the user to pass the snapshot names in 
either order, {{-diff s1 s2}} or {{-diff s2 s1}}, and let the program order s1 
and s2. What I was thinking was that we need the order the user passed to 
indicate whether we are doing a forward diff (HDFS-8828) or a reverse diff 
(HDFS-9820); thus {{-diff s1 s2}} and {{-diff s2 s1}} mean different things to 
me. I may have misunderstood you, though.
 
In addition, after HDFS-10263 is in place, we can make the implementation more 
symmetric (HDFS-8828 vs HDFS-9820).

Thanks much.
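The restore transformation this issue describes (rename renamed files back, delete created files, and re-copy modified files from the source cluster's snapshot) can be sketched as follows. The types and names here are illustrative only, not distcp's actual classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the restore planning described in this issue: given the
// changes on the target cluster since snapshot sx, undo renames and creations
// locally and collect the modified paths that must be re-copied from the
// source cluster's snapshot sx. Types are illustrative, not distcp code.
public class RestorePlanSketch {
    enum ChangeType { RENAMED, CREATED, MODIFIED }

    static class DiffEntry {
        final ChangeType type;
        final String path;     // current path on the target
        final String oldPath;  // path at snapshot sx (renames only)
        DiffEntry(ChangeType type, String path, String oldPath) {
            this.type = type; this.path = path; this.oldPath = oldPath;
        }
    }

    /** Returns the copy list; rename-backs and deletions are appended to the
     *  supplied lists so the caller can apply them before copying. */
    static List<String> planRestore(List<DiffEntry> diff,
                                    List<String> renameBack,
                                    List<String> delete) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : diff) {
            switch (e.type) {
                case RENAMED:  renameBack.add(e.path + " -> " + e.oldPath); break;
                case CREATED:  delete.add(e.path); break;
                case MODIFIED: copyList.add(e.path); break;
            }
        }
        return copyList;
    }
}
```

The resulting copy list is what a {{-rdiff}}-style run would hand to distcp, with the source cluster's snapshot sx as the copy source.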


> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch, 
> HDFS-9820.003.patch, HDFS-9820.004.patch
>
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are 
> some complexity and challenges. 
> HDFS-7535 improved distcp performance by avoiding copying files that changed 
> name since last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data 
> from source to target cluster, by only copying changed files since last 
> backup. The way it works is use snapshot diff to find out all files changed, 
> and copy the changed files only.
> See 
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira proposes a variation of HDFS-8828: find the files changed in the 
> target cluster since its last snapshot sx, and copy them from the source 
> cluster's same snapshot sx, to restore the target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in target cluster, delete it
> - modified, put it to the copy list
> - run distcp with the copy list, copy from the source cluster's corresponding 
> snapshot
> This could be a new command line switch -rdiff in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 
> would hopefully be easier to implement.





[jira] [Updated] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10292:

Priority: Minor  (was: Major)

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Minor
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for debugging purposes.





[jira] [Updated] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10292:

Status: Patch Available  (was: Open)

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for debugging purposes.





[jira] [Updated] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10292:

Attachment: HDFS-10292.patch

Uploaded the patch, kindly review.

> Add block id when client got Unable to close file exception
> ---
>
> Key: HDFS-10292
> URL: https://issues.apache.org/jira/browse/HDFS-10292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10292.patch
>
>
> Add the block id when the client gets an "Unable to close file" exception. 
> It's good to have the block id for debugging purposes.





[jira] [Created] (HDFS-10292) Add block id when client got Unable to close file exception

2016-04-14 Thread Brahma Reddy Battula (JIRA)
Brahma Reddy Battula created HDFS-10292:
---

 Summary: Add block id when client got Unable to close file 
exception
 Key: HDFS-10292
 URL: https://issues.apache.org/jira/browse/HDFS-10292
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula


Add the block id when the client gets an "Unable to close file" exception. It's 
good to have the block id for debugging purposes.





[jira] [Assigned] (HDFS-9271) Implement basic NN operations

2016-04-14 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer reassigned HDFS-9271:
-

Assignee: James Clampffer  (was: Aliaksei Sandryhaila)

> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner
> * fsync





[jira] [Commented] (HDFS-10282) The VolumeScanner should warn about replica files which are misplaced

2016-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241136#comment-15241136
 ] 

Hudson commented on HDFS-10282:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9611 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9611/])
HDFS-10282. The VolumeScanner should warn about replica files which are 
(kihwal: rev 0d1c1152f1ce2706f92109bfbdff0d62e98e6797)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/VolumeScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java


> The VolumeScanner should warn about replica files which are misplaced
> -
>
> Key: HDFS-10282
> URL: https://issues.apache.org/jira/browse/HDFS-10282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-10282.001.patch, HDFS-10282.002.patch
>
>
> The VolumeScanner should warn about replica files which are misplaced





[jira] [Commented] (HDFS-10280) Document new dfsadmin command -evictWriters

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241110#comment-15241110
 ] 

Kihwal Lee commented on HDFS-10280:
---

Kicked the build manually.

> Document new dfsadmin command -evictWriters
> ---
>
> Key: HDFS-10280
> URL: https://issues.apache.org/jira/browse/HDFS-10280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10280.001.patch
>
>
> HDFS-9945 added a new dfsadmin command -evictWriters, which is great.
> However I noticed typing {{dfs dfsadmin}} does not show a command line help 
> summary. It is shown only when I type {{dfs dfsadmin  -help}}.
> Also, it would be great to document it in {{HDFS Commands Guide}}.





[jira] [Updated] (HDFS-10282) The VolumeScanner should warn about replica files which are misplaced

2016-04-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10282:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

> The VolumeScanner should warn about replica files which are misplaced
> -
>
> Key: HDFS-10282
> URL: https://issues.apache.org/jira/browse/HDFS-10282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-10282.001.patch, HDFS-10282.002.patch
>
>
> The VolumeScanner should warn about replica files which are misplaced





[jira] [Commented] (HDFS-10282) The VolumeScanner should warn about replica files which are misplaced

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241100#comment-15241100
 ] 

Kihwal Lee commented on HDFS-10282:
---

Committed this to trunk and branch-2.

> The VolumeScanner should warn about replica files which are misplaced
> -
>
> Key: HDFS-10282
> URL: https://issues.apache.org/jira/browse/HDFS-10282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-10282.001.patch, HDFS-10282.002.patch
>
>
> The VolumeScanner should warn about replica files which are misplaced





[jira] [Commented] (HDFS-10282) The VolumeScanner should warn about replica files which are misplaced

2016-04-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241087#comment-15241087
 ] 

Kihwal Lee commented on HDFS-10282:
---

+1

> The VolumeScanner should warn about replica files which are misplaced
> -
>
> Key: HDFS-10282
> URL: https://issues.apache.org/jira/browse/HDFS-10282
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-10282.001.patch, HDFS-10282.002.patch
>
>
> The VolumeScanner should warn about replica files which are misplaced





[jira] [Commented] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241080#comment-15241080
 ] 

Hadoop QA commented on HDFS-10216:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 16s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 39s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 4s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798718/HDFS-10216.4.patch |
| JIRA Issue | HDFS-10216 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux b748750ed614 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / df18b6e9 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_77 

[jira] [Commented] (HDFS-10258) Erasure Coding: support small cluster whose #DataNode < # (Blocks in a BlockGroup)

2016-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241069#comment-15241069
 ] 

Kai Zheng commented on HDFS-10258:
--

Thanks [~libo-intel] for the consideration. It's good to support smaller 
clusters, because that lets people play with erasure coding on only a few 
nodes. However, I'm not sure it's worth changing much logic to allow #DataNode 
< #(Blocks in a BlockGroup) (if we fortunately don't have to, fine), because it 
may not make much sense. Instead, we can consider leveraging the 3+2 schema and 
policy.
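As a concrete illustration of the constraint in question (an assumption-level sketch, not HDFS placement code): a policy with d data blocks and p parity blocks needs at least d + p datanodes to put every block of a group on a distinct node, which is why a 3+2 policy fits a 5-node cluster while a 6+3 policy does not.

```java
// Illustrative check for the constraint discussed above; not actual HDFS
// placement code. A policy with d data and p parity blocks needs at least
// d + p datanodes to place each block of a group on a distinct node.
public class EcClusterFit {
    public static boolean canPlaceBlockGroup(int dataNodes, int dataBlocks, int parityBlocks) {
        return dataNodes >= dataBlocks + parityBlocks;
    }

    public static void main(String[] args) {
        System.out.println(canPlaceBlockGroup(5, 6, 3)); // 6+3 on 5 nodes
        System.out.println(canPlaceBlockGroup(5, 3, 2)); // 3+2 on 5 nodes
    }
}
```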


> Erasure Coding: support small cluster whose #DataNode < # (Blocks in a 
> BlockGroup)
> --
>
> Key: HDFS-10258
> URL: https://issues.apache.org/jira/browse/HDFS-10258
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
>
> Currently EC does not support small clusters whose datanode count is smaller 
> than the number of blocks in a block group. This sub-task will solve this 
> problem.





[jira] [Commented] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241066#comment-15241066
 ] 

Steve Loughran commented on HDFS-10291:
---



1. The test does a short-circuit read of a 13-byte file: 
{{doTestShortCircuitRead(false, size=13, readOffset=0)}}
2. It creates 13 bytes of data and saves it: {{byte[] fileData = 
AppendTestUtil.randomBytes(seed, size)}}
3. The destination buffer is created for this:
{code}
byte[] actual = new byte[expected.length-readOffset];
{code}
4. It does some small reads:
{code}
//Read a small number of bytes first.
int nread = stm.read(actual, 0, 3);
nread += stm.read(actual, nread, 2);
//Read across chunk boundary
nread += stm.read(actual, nread, 517);// ** HERE **
{code}

The exception is being raised because the code asks to read 517 bytes into a 
buffer only 13 bytes long. This breaks InputStream's rules: you can't ask for 
more than you have space for. The API spec says that clearly; what HADOOP-12994 
added was the checking for too large a length or negative offsets.

I think this is a bug in the test. Whatever it is trying to do, it shouldn't be 
trying to do it on such a small buffer.

What's interesting, though, is what you find when you delve into the code: the 
block reader logic doesn't look at the length of the read at all. That is, it 
appears to fill the entire byte array passed in, from the offset supplied, 
stopping at the end of the buffer or the file, whichever comes first.

That is something other (i.e. production) code could be relying on. It 
shouldn't, as such code will break against any FS other than HDFS, but there is 
a risk that it might.

What to do?

# Retain the checks and fix the test.
# Log a warning and shrink the len parameter when it is passed down. People 
shouldn't be doing this, but HDFS would reluctantly allow it.
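A minimal sketch of the kind of bounds check described here (modeled on the validatePositionedReadArgs behavior discussed above, but an illustration, not the actual Hadoop source):

```java
// Illustrative version of the destination-buffer bounds check that the
// failing test trips; not the actual Hadoop implementation.
public class ReadArgsCheck {
    public static void validateReadArgs(byte[] buffer, int offset, int length) {
        if (offset < 0 || length < 0) {
            throw new IndexOutOfBoundsException("Negative offset or length");
        }
        if (length > buffer.length - offset) {
            // The failing test hits this case: a 517-byte read into a
            // 13-byte destination buffer.
            throw new IndexOutOfBoundsException(
                "Requested more bytes than destination buffer size");
        }
    }

    public static void main(String[] args) {
        byte[] actual = new byte[13];
        validateReadArgs(actual, 0, 3);  // fine: fits in the buffer
        try {
            validateReadArgs(actual, 5, 517);  // mirrors the test's third read
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under option 2 above, the check would instead clamp length to buffer.length - offset and log, rather than throw.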


> TestShortCircuitLocalRead failing
> -
>
> Key: HDFS-10291
> URL: https://issues.apache.org/jira/browse/HDFS-10291
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> {{TestShortCircuitLocalRead}} failing as length of read is considered off end 
> of buffer. There's an off-by-one error somewhere in the test or the new 
> validation code





[jira] [Commented] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241038#comment-15241038
 ] 

Steve Loughran commented on HDFS-10291:
---

{code}
java.lang.IndexOutOfBoundsException: Requested more bytes than destination 
buffer size
at 
org.apache.hadoop.fs.FSInputStream.validatePositionedReadArgs(FSInputStream.java:107)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:975)
at java.io.DataInputStream.read(DataInputStream.java:149)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.checkFileContent(TestShortCircuitLocalRead.java:157)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitReadImpl(TestShortCircuitLocalRead.java:286)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitRead(TestShortCircuitLocalRead.java:241)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.testSmallFileLocalRead(TestShortCircuitLocalRead.java:308)
{code}

> TestShortCircuitLocalRead failing
> -
>
> Key: HDFS-10291
> URL: https://issues.apache.org/jira/browse/HDFS-10291
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> {{TestShortCircuitLocalRead}} failing as length of read is considered off end 
> of buffer. There's an off-by-one error somewhere in the test or the new 
> validation code





[jira] [Created] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-14 Thread Steve Loughran (JIRA)
Steve Loughran created HDFS-10291:
-

 Summary: TestShortCircuitLocalRead failing
 Key: HDFS-10291
 URL: https://issues.apache.org/jira/browse/HDFS-10291
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.8.0
Reporter: Steve Loughran
Assignee: Steve Loughran


{{TestShortCircuitLocalRead}} failing as length of read is considered off end 
of buffer. There's an off-by-one error somewhere in the test or the new 
validation code





[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Takashi Ohnishi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HDFS-10216:
---
Status: Patch Available  (was: Open)

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure whether the problem exists only with the 
> {{-diff}} option. Trying to reproduce.
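The {{URISyntaxException}} quoted above is thrown by {{java.net.URI}} itself: a URI that carries a scheme must have an absolute path. A minimal standalone sketch of that failure mode (hypothetical host and paths, not actual DistCp code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    /**
     * Returns the reason string of the URISyntaxException raised when the
     * given scheme/authority/path combination is rejected, or null if the
     * URI is accepted.
     */
    static String failureMessage(String scheme, String authority, String path) {
        try {
            // Same multi-argument constructor that Path.initialize() uses.
            new URI(scheme, authority, path, null, null);
            return null; // accepted: path was absolute
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // A relative path plus a scheme reproduces the error in the stack trace.
        System.out.println(failureMessage("hdfs", "host:8020", "./d1/.snapshot/s2"));
        // An absolute path is accepted.
        System.out.println(failureMessage("hdfs", "host:8020", "/user/systest/d1"));
    }
}
```

This is consistent with the observations in the description: the absolute-path invocation succeeds, while the relative-path one fails once scheme and authority are attached to the relative snapshot path.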



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Takashi Ohnishi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HDFS-10216:
---
Attachment: HDFS-10216.4.patch

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch, HDFS-10216.4.patch
>
>
> Got this exception when running {{distcp -diff}} with relative paths:
> {code}
> $ hadoop distcp -update -diff s1 s2 d1 d2
> 16/03/25 09:45:40 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], 
> targetPath=d2, targetPathExists=true, preserveRawXattrs=false, 
> filtersFile='null'}
> 16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at 
> jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
> 16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at org.apache.hadoop.fs.Path.initialize(Path.java:206)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:197)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
>   at java.net.URI.checkPath(URI.java:1804)
>   at java.net.URI.<init>(URI.java:752)
>   at org.apache.hadoop.fs.Path.initialize(Path.java:203)
>   ... 11 more
> {code}
> But these commands worked:
> * Absolute path: {{hadoop distcp -update -diff s1 s2 /user/systest/d1 
> /user/systest/d2}}
> * No {{-diff}}: {{hadoop distcp -update d1 d2}}
> However, everything was fine when I ran {{hadoop distcp -update -diff s1 s2 
> d1 d2}} again. I am not sure whether the problem exists only with the 
> {{-diff}} option. Trying to reproduce.





[jira] [Commented] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Takashi Ohnishi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241012#comment-15241012
 ] 

Takashi Ohnishi commented on HDFS-10216:


{quote}
* Line 709 is too long
* Line 710-711 can be merged into 1 line: new DistCp(...).execute()
{quote}

All right, I have fixed them in the v4 patch, and added verification of the distcp result.



> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch





[jira] [Updated] (HDFS-10216) distcp -diff relative path exception

2016-04-14 Thread Takashi Ohnishi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HDFS-10216:
---
Status: Open  (was: Patch Available)

> distcp -diff relative path exception
> 
>
> Key: HDFS-10216
> URL: https://issues.apache.org/jira/browse/HDFS-10216
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: HDFS-10216.1.patch, HDFS-10216.2.patch, 
> HDFS-10216.3.patch





[jira] [Commented] (HDFS-9412) getBlocks occupies FSLock and takes too long to complete

2016-04-14 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240996#comment-15240996
 ] 

Walter Su commented on HDFS-9412:
-

{{TestBalancer}} passes locally. +1 for the last patch.

> getBlocks occupies FSLock and takes too long to complete
> 
>
> Key: HDFS-9412
> URL: https://issues.apache.org/jira/browse/HDFS-9412
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: He Tianyi
>Assignee: He Tianyi
> Attachments: HDFS-9412..patch, HDFS-9412.0001.patch, 
> HDFS-9412.0002.patch
>
>
> {{getBlocks}} in {{NameNodeRpcServer}} acquires a read lock and may then take a 
> long time to complete (possibly several seconds, if the number of blocks is too 
> large). 
> During this period, other threads attempting to acquire the write lock will wait. 
> In an extreme case, RPC handlers are occupied by one reader thread calling 
> {{getBlocks}} while all other threads wait for the write lock, and the RPC 
> server appears hung. Unfortunately, this tends to happen in a heavily loaded 
> cluster, since read operations come and go quickly (they do not need to wait), 
> leaving write operations waiting.
> It looks like we can optimize this the way DN block reports were optimized, by 
> splitting the operation into smaller sub-operations and letting other threads do 
> their work between sub-operations. The whole result is still returned at once, 
> though (one difference from DN block reports). 
> I am not sure whether this will work. Any better idea?
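The chunked approach proposed above can be sketched roughly as follows (a minimal illustration with hypothetical names, not actual NameNode code): hold the read lock only for one batch at a time, release it between batches so writers can interleave, and still return the whole result at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch: splitting a long read-locked scan into sub-operations. */
public class ChunkedScan {
    // Fair mode so waiting writers are not starved by a stream of readers.
    private final ReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final List<String> blocks = new ArrayList<>();

    public ChunkedScan(int n) {
        for (int i = 0; i < n; i++) blocks.add("blk_" + i);
    }

    /** Collects all entries, but holds the read lock only per batch. */
    public List<String> getAllInBatches(int batchSize) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        while (true) {
            lock.readLock().lock();
            try {
                int end = Math.min(pos + batchSize, blocks.size());
                result.addAll(blocks.subList(pos, end));
                pos = end;
                if (pos >= blocks.size()) {
                    return result; // whole result is still returned at once
                }
            } finally {
                // Writers may acquire the lock between batches.
                lock.readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        ChunkedScan scan = new ChunkedScan(10);
        System.out.println(scan.getAllInBatches(3).size()); // prints 10
    }
}
```

The trade-off, as noted in the description, is that the scan may observe a state that changed between batches, which is why the question of correctness is left open here.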




