[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning

2016-05-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10381:
--
Attachment: HDFS-10381.001.patch

Please review patch 001:
* Change 2 logs from info to warn in {{DataStreamer.nextBlockOutputStream}}
* Change 1 log from info to warn in 
{{StripedDataStreamer.nextBlockOutputStream}}

Couldn't find any suitable unit test to change or run.
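
For reference, a minimal sketch of what patch 001 presumably looks like (mirroring the snippet quoted below; the {{StripedDataStreamer}} change is analogous):

{code:java}
// Sketch: the two info-level logs on the abandon/exclude path in
// DataStreamer#nextBlockOutputStream become warnings, since losing a
// datanode is worth operator attention.
success = createBlockOutputStream(nodes, storageTypes, 0L, false);
if (!success) {
  LOG.warn("Abandoning " + block);
  dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
      dfsClient.clientName);
  block = null;
  final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
  LOG.warn("Excluding datanode " + badNode);
  excludedNodes.put(badNode, badNode);
}
{code}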

> DataStreamer DataNode exclusion log message should be warning
> -
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10381.001.patch
>
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}






[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning

2016-05-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10381:
--
Status: Patch Available  (was: In Progress)

> DataStreamer DataNode exclusion log message should be warning
> -
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-10381.001.patch
>
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}






[jira] [Commented] (HDFS-9559) Add haadmin command to get HA state of all the namenodes

2016-05-09 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277653#comment-15277653
 ] 

Surendra Singh Lilhore commented on HDFS-9559:
--

[~eddyxu], could you please review?

> Add haadmin command to get HA state of all the namenodes
> 
>
> Key: HDFS-9559
> URL: https://issues.apache.org/jira/browse/HDFS-9559
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9559.01.patch
>
>
> Currently we have one command to get the state of a namenode:
> {code}
> ./hdfs haadmin -getServiceState <serviceId>
> {code}
> It would be good to have a command that gives the state of all the namenodes.
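
For illustration, a hypothetical invocation and output (the option name and hostnames are assumptions, not a final design):

{code}
$ ./hdfs haadmin -getAllServiceState
nn1.example.com:8020                               active
nn2.example.com:8020                               standby
{code}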






[jira] [Commented] (HDFS-10303) DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.

2016-05-09 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277650#comment-15277650
 ] 

Surendra Singh Lilhore commented on HDFS-10303:
---

The checkstyle warning and the failed test case are unrelated.

Please review.

bq. org.apache.hadoop.hdfs.TestHFlush.testHFlushInterrupted

This test case is fixed by HDFS-2043.

> DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.
> -
>
> Key: HDFS-10303
> URL: https://issues.apache.org/jira/browse/HDFS-10303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-10303-001.patch
>
>
> The packet acknowledgement duration should be calculated based on the packet 
> send time.
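
A minimal sketch of the intended calculation (variable names assumed; not the actual patch):

{code:java}
// Record the send time when the packet goes onto the wire...
long sendTimeNanos = System.nanoTime();
// ...and in ResponseProcessor, measure the ack duration against that
// timestamp, not against when the processor started waiting for the ack.
long ackDurationNanos = System.nanoTime() - sendTimeNanos;
{code}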






[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem

2016-05-09 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277640#comment-15277640
 ] 

Xiaobing Zhou commented on HDFS-10346:
--

Thank you for the review. I posted patch v004. It:
1. called fixRelativePart and added javadoc for the return value.
2. moved test preparation out of the async call test.
3. added two tests, testConservativeConcurrentAsyncAPI and 
testAggressiveConcurrentAsyncAPI, which limit the max number of async calls to 
100 and to an effectively unlimited value (a limit large enough never to be 
hit), respectively (see the usage sketch below).
4. removed testPermissionChecking, which did much the same thing as 
testAggressiveConcurrentAsyncAPI in patch v003.
5. added a rename test to both testConservativeConcurrentAsyncAPI and 
testAggressiveConcurrentAsyncAPI, but assumed a specific order/dependency of 
rename, setPermission and setOwner; otherwise the test would have to deal with 
a swamp of non-deterministic results due to the dependencies between async 
calls.
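
A usage sketch of what the concurrent tests exercise (accessor and method shapes assumed from the HDFS-9924 proposal, where each async call immediately returns a Java Future):

{code:java}
// Fire many async calls without blocking, then wait for all of them.
AsyncDistributedFileSystem adfs = dfs.getAsyncDistributedFileSystem();
List<Future<Void>> futures = new ArrayList<>();
for (int i = 0; i < count; i++) {
  Path p = new Path("/test/dir" + i);
  futures.add(adfs.setPermission(p, new FsPermission((short) 0755)));
  futures.add(adfs.setOwner(p, "user1", "group1"));
}
for (Future<Void> f : futures) {
  f.get();  // surfaces any failure from the corresponding async call
}
{code}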


> Implement asynchronous setPermission/setOwner for DistributedFileSystem
> ---
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch, 
> HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch, 
> HDFS-10346-HDFS-9924.004.patch
>
>
> This is proposed to implement an asynchronous setPermission and setOwner.






[jira] [Updated] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem

2016-05-09 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10346:
-
Attachment: HDFS-10346-HDFS-9924.004.patch

> Implement asynchronous setPermission/setOwner for DistributedFileSystem
> ---
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch, 
> HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch, 
> HDFS-10346-HDFS-9924.004.patch
>
>
> This is proposed to implement an asynchronous setPermission and setOwner.






[jira] [Commented] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning

2016-05-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277626#comment-15277626
 ] 

John Zhuge commented on HDFS-10381:
---

Thanks [~liuml07]. Fixed.

> DataStreamer DataNode exclusion log message should be warning
> -
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}






[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning

2016-05-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10381:
--
Component/s: (was: datanode)
 hdfs-client

> DataStreamer DataNode exclusion log message should be warning
> -
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}






[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning

2016-05-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10381:
--
Summary: DataStreamer DataNode exclusion log message should be warning  
(was: DataNode exclusion log message should be a warning)

> DataStreamer DataNode exclusion log message should be warning
> -
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}






[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277597#comment-15277597
 ] 

John Zhuge commented on HDFS-10287:
---

Thanks [~boky01]. Let's leave the unit test alone to minimize noise in this 
patch; we can always file a separate jira to clean it up. The unit test 
failures seem unrelated.

+1 LGTM.

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: Andras Bokor
>Priority: Trivial
> Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be 
> backported to Hadoop versions prior to 2.7.
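
A minimal sketch of the intended usage ({{close()}} is assumed to delegate to the existing {{shutdown()}}):

{code:java}
// With MiniDFSCluster implementing AutoCloseable, try-with-resources
// shuts the cluster down even when the test body throws.
try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build()) {
  cluster.waitActive();
  // ... test body ...
}  // cluster.close() runs automatically here
{code}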






[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-09 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Status: Patch Available  (was: In Progress)

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, 
> HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, 
> HDFS-8449-v10.patch, HDFS-8449-v11.patch
>
>
> This subtask tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.






[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-09 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Status: In Progress  (was: Patch Available)

Toggling the patch status to re-trigger the test run.

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, 
> HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, 
> HDFS-8449-v10.patch, HDFS-8449-v11.patch
>
>
> This subtask tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.






[jira] [Created] (HDFS-10383) Safely close resources in DFSTestUtil

2016-05-09 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-10383:


 Summary: Safely close resources in DFSTestUtil
 Key: HDFS-10383
 URL: https://issues.apache.org/jira/browse/HDFS-10383
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Mingliang Liu
Assignee: Mingliang Liu
Priority: Minor


There are a few methods in {{DFSTestUtil}} that do not close their resources 
safely or elegantly. We can use the try-with-resources statement to address 
this problem.

Specifically, as {{DFSTestUtil}} is widely used in tests, we need to preserve 
any exception thrown during the processing of the resource while still 
guaranteeing that the resource is eventually closed. For example, the current 
implementation of {{DFSTestUtil#createFile()}} closes the FSDataOutputStream in 
the {{finally}} block, and if the internal {{DFSOutputStream#close()}} throws 
an exception, which it often does, the exception thrown during processing is 
lost. See this [test 
failure|https://builds.apache.org/job/PreCommit-HADOOP-Build/9320/testReport/org.apache.hadoop.hdfs/TestAsyncDFSRename/testAggressiveConcurrentAsyncRenameWithOverwrite/],
 where we had to guess at the root cause.

With try-with-resources, we can close the resources safely, and the exceptions 
thrown both during processing and during closing remain available (the closing 
exception is suppressed).
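
A minimal sketch of the pattern (simplified; not the actual {{DFSTestUtil#createFile()}} signature):

{code:java}
// If out.write(...) throws and out.close() then also throws, the write
// exception propagates and the close exception is attached to it as a
// suppressed exception -- neither is silently lost.
try (FSDataOutputStream out = fs.create(path)) {
  out.write(data);
}
{code}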







[jira] [Updated] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode

2016-05-09 Thread ramtin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramtin updated HDFS-10382:
--
Status: Patch Available  (was: Open)

> In WebHDFS numeric usernames do not work with DataNode
> --
>
> Key: HDFS-10382
> URL: https://issues.apache.org/jira/browse/HDFS-10382
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: ramtin
>Assignee: ramtin
>
> Operations like {code:java}curl -i 
> -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?user.name=0123&op=OPEN"{code} that 
> are directed to the DataNode fail because the DataNode does not read the 
> configured domain pattern.






[jira] [Commented] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode

2016-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277390#comment-15277390
 ] 

ASF GitHub Bot commented on HDFS-10382:
---

GitHub user ramtinb opened a pull request:

https://github.com/apache/hadoop/pull/94

HDFS-10382 In WebHDFS numeric usernames do not work with DataNode

In WebHDFS, a cat operation involves two sequential HTTP requests: the first 
is handled by the NN and the second by the DN.
Unlike the NN, the DN does not use the configured domain pattern!
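
For context, the pattern in question is presumably the WebHDFS username pattern introduced by HDFS-4983; a hedged sketch of the relevant setting (property name assumed):

{code:java}
// Assumed configuration knob: relaxing the default pattern
// "^[A-Za-z_][A-Za-z0-9._-]*[$]?$" lets numeric names like "0123" pass.
Configuration conf = new Configuration();
conf.set("dfs.webhdfs.user.provider.user.pattern",
    "^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");
{code}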

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ramtinb/hadoop feature/HDFS-10382

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/94.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #94


commit 6cbb8f60702b8c3211e52e71681fc8d981fe0525
Author: Ramtin Boustani 
Date:   2016-05-10T00:05:05Z

HDFS-10382 In WebHDFS numeric usernames do not work with DataNode




> In WebHDFS numeric usernames do not work with DataNode
> --
>
> Key: HDFS-10382
> URL: https://issues.apache.org/jira/browse/HDFS-10382
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: ramtin
>Assignee: ramtin
>
> Operations like {code:java}curl -i 
> -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?user.name=0123&op=OPEN"{code} that 
> are directed to the DataNode fail because the DataNode does not read the 
> configured domain pattern.






[jira] [Commented] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277292#comment-15277292
 ] 

Hadoop QA commented on HDFS-10360:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 11s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 52s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 55s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 26s 
{color} | {color:red} root: patch generated 6 new + 142 unchanged - 1 fixed = 
148 total (was 143) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 17s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m 20s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 33s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 34s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 40s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}

[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277267#comment-15277267
 ] 

Hudson commented on HDFS-10372:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9737 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9737/])
HDFS-10372. Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume. 
(kihwal: rev b9e5a32fa14b727b44118ec7f43fb95de05a7c2c)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java


> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.7.3
>
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo, but now I realize that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method 
> returns the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.
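
A sketch of the resulting test fix (assertion shape assumed): match on the xfer address directly instead of on {{toString()}}:

{code:java}
// DatanodeInfo#getXferAddr() is stable across the storage-decorated
// subclasses, unlike toString(), so assert on it directly.
DatanodeInfo info = lb.getLocations()[0];
Assert.assertTrue("expected xfer address in the exception message",
    ioe.getMessage().contains(info.getXferAddr()));
{code}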






[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277254#comment-15277254
 ] 

Hadoop QA commented on HDFS-10287:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 
new + 208 unchanged - 3 fixed = 208 total (was 211) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 6s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 46s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 32s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
|   | hadoop.hdfs.TestCrcCorruption |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803063/HDFS-10287.02.patch |
| JIRA Issue | HDFS-10287 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8c6cd1036e86 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT W

[jira] [Comment Edited] (HDFS-10378) setOwner throws AccessControlException with wrong message

2016-05-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276994#comment-15276994
 ] 

Yongjun Zhang edited comment on HDFS-10378 at 5/9/16 10:41 PM:
---

Hi [~jzhuge],

Thanks for reporting the issue and the patch. A couple of comments:
1. Would you please provide a single patch file that contains both the fix and 
the unit test, instead of multiple files?
2. {{setOwner}} expects the user to be a superuser or to belong to the 
superuser group. The call to {{fsd.checkOwner(pc, iip);}} appears to be a no-op 
for this kind of user. Would you please look into whether we really need the 
call {{fsd.checkOwner(pc, iip);}} at all in the {{setOwner}} method? Is being a 
superuser both a necessary and sufficient condition for this method?

Thanks.






was (Author: yzhangal):
HI [~jzhuge],

Thanks for reporting the issue and the patch. Couple of comments:
1. would you please provide single patch file that contains both the fix and 
unit test, instead of multiple files?
2. {{setOwner}} expects user to be  superuser or belonging to superuser group. 
The call to {{fsd.checkOwner(pc, iip);}} appears to be non-op for this kind of 
user. Would you please look into whether we really need the call   
{{fsd.checkOwner(pc, iip);}} at all in {{setOwner}} method? Is being superuser 
both necessary and sufficient condition for this method?

Thanks.





> setOwner throws AccessControlException with wrong message
> -
>
> Key: HDFS-10378
> URL: https://issues.apache.org/jira/browse/HDFS-10378
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch
>
>
> Calling {{setOwner}} as a non-super user does trigger 
> {{AccessControlException}}, however, the message "Permission denied. 
> user=user1967821757 is not the owner of inode=child" is wrong. Expect this 
> message: "Non-super user cannot change owner".
> Output of patched unit test {{TestPermission.testFilePermission}}:
> {noformat}
> 2016-05-06 16:45:44,915 [main] INFO  security.TestPermission 
> (TestPermission.java:testFilePermission(280)) - GOOD: got 
> org.apache.hadoop.security.AccessControlException: Permission denied. 
> user=user1967821757 is not the owner of inode=child1
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417)
> {noformat}
> Will upload the unit test patch shortly.
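
A minimal sketch of the kind of fix implied above (check order assumed; not necessarily the final patch):

{code:java}
// In FSDirAttrOp#setOwner, test the superuser requirement first so a
// non-super caller gets the intended message instead of the generic
// owner-check message raised by fsd.checkOwner(pc, iip).
if (!pc.isSuperUser()) {
  throw new AccessControlException("Non-super user cannot change owner");
}
{code}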






[jira] [Created] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode

2016-05-09 Thread ramtin (JIRA)
ramtin created HDFS-10382:
-

 Summary: In WebHDFS numeric usernames do not work with DataNode
 Key: HDFS-10382
 URL: https://issues.apache.org/jira/browse/HDFS-10382
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: ramtin
Assignee: ramtin


Operations like {code:java}curl -i 
-L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?user.name=0123&op=OPEN"{code} that 
are directed to the DataNode fail because the DataNode does not read the 
configured domain pattern.






[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Attachment: HDFS-10360.003.patch

Forgot to rebase my patch... submitted a rebased patch.

> DataNode may format directory and lose blocks if current/VERSION is missing
> ---
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch, HDFS-10360.002.patch, 
> HDFS-10360.003.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, DataNode may format the storage directory even though _block files 
> are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the data node.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. 
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545
> However, block files may still exist, and in my opinion, we should do 
> everything possible to retain the block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it's not formatted (see the sketch after this list). Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move {{current/}} directory. Also, log whatever is being 
> renamed/moved.
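
A sketch of suggestion #1 (names from {{Storage#analyzeStorage}}; exact shape assumed):

{code:java}
// Refuse to report NOT_FORMATTED when current/ still holds data: a
// missing VERSION alongside surviving block files is an inconsistent
// state, not a fresh directory.
File currentDir = sd.getCurrentDir();
String[] contents = currentDir.list();
if (contents != null && contents.length > 0) {
  throw new InconsistentFSStateException(sd.getRoot(),
      "VERSION is missing but current/ is not empty; refusing to format");
}
return StorageState.NOT_FORMATTED;
{code}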






[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access

2016-05-09 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277091#comment-15277091
 ] 

Zhe Zhang commented on HDFS-9924:
-

Echoing Andrew that a design doc should be added. In particular, does the scope 
include read/write, or only metadata operations as shown in the current subtask 
list? Thanks.

> [umbrella] Asynchronous HDFS Access
> ---
>
> Key: HDFS-9924
> URL: https://issues.apache.org/jira/browse/HDFS-9924
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.






[jira] [Commented] (HDFS-10378) setOwner throws AccessControlException with wrong message

2016-05-09 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277080#comment-15277080
 ] 

John Zhuge commented on HDFS-10378:
---

Thanks [~yzhangal]. HDFS-10378.001.patch does include the unit test from 
HDFS-10378-unit.patch. I will look into No. 2.

> setOwner throws AccessControlException with wrong message
> -
>
> Key: HDFS-10378
> URL: https://issues.apache.org/jira/browse/HDFS-10378
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch
>
>
> Calling {{setOwner}} as a non-super user does trigger 
> {{AccessControlException}}, however, the message "Permission denied. 
> user=user1967821757 is not the owner of inode=child" is wrong. Expect this 
> message: "Non-super user cannot change owner".
> Output of patched unit test {{TestPermission.testFilePermission}}:
> {noformat}
> 2016-05-06 16:45:44,915 [main] INFO  security.TestPermission 
> (TestPermission.java:testFilePermission(280)) - GOOD: got 
> org.apache.hadoop.security.AccessControlException: Permission denied. 
> user=user1967821757 is not the owner of inode=child1
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417)
> {noformat}
> Will upload the unit test patch shortly.






[jira] [Updated] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type

2016-05-09 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9902:

Release Note: Reserved space can be configured independently for different 
storage types on clusters with heterogeneous storage. The 
'dfs.datanode.du.reserved' property name can be suffixed with a storage type 
(i.e. one of ssd, disk, archival or ram_disk); e.g. reserved space for RAM_DISK 
storage can be configured using the property 
'dfs.datanode.du.reserved.ram_disk'. If no storage-type-specific reservation 
is configured, the value specified by 'dfs.datanode.du.reserved' is used for 
all volumes.
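
For example (values are illustrative only):

{code:java}
// Per-storage-type override: disks keep a large default reserve while
// RAM_DISK (tmpfs) volumes reserve far less.
Configuration conf = new Configuration();
conf.setLong("dfs.datanode.du.reserved", 10L * 1024 * 1024 * 1024);     // 10 GB
conf.setLong("dfs.datanode.du.reserved.ram_disk", 100L * 1024 * 1024);  // 100 MB
{code}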

> Support different values of dfs.datanode.du.reserved per storage type
> -
>
> Key: HDFS-9902
> URL: https://issues.apache.org/jira/browse/HDFS-9902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Pan Yuxuan
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, 
> HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch
>
>
> Hadoop now supports different storage types (DISK, SSD, ARCHIVE and 
> RAM_DISK), but they share one configuration, dfs.datanode.du.reserved.
> The DISK size may be several TB while the RAM_DISK size may be only several 
> tens of GB.
> The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same 
> DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of RAM_DISK 
> capacity. 
> Since the usage of RAM_DISK can be 100%, I don't want the 
> dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs.
> So can we add a new configuration for RAM_DISK, or just skip this 
> configuration for RAM_DISK?






[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access

2016-05-09 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277047#comment-15277047
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9924:
---

> Can we do this work on a branch? ...

I think we don't need a branch for the moment, since this feature does not 
affect the other components. It mainly adds new code without changing the 
existing code much.

> Could someone post a design doc with the motivations, proposed API, and 
> discussion? ...

The motivations can be found in the description of this JIRA. For the proposed 
API, it makes more sense to discuss it in HADOOP-12910. We could post a design 
doc if it helps the discussion.

Thanks!

> [umbrella] Asynchronous HDFS Access
> ---
>
> Key: HDFS-9924
> URL: https://issues.apache.org/jira/browse/HDFS-9924
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.






[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem

2016-05-09 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277031#comment-15277031
 ] 

Tsz Wo Nicholas Sze commented on HDFS-10346:


Thanks Xiaobing! Some comments on the patch.
- In AsyncDistributedFileSystem,
-* Need to call fixRelativePart.
-* javadoc should describe return value.
- testPermissionChecking should first create all files/directories 
synchronously, and then call setPermission/setOwner asynchronously in a 
separated loop.
- Indeed, testAggressiveConcurrentAsyncAPI is a better test.  Why don't we just 
keep it and remove testPermissionChecking?
- I suggest adding one more test that mixes rename/setPermission/setOwner 
together.


> Implement asynchronous setPermission/setOwner for DistributedFileSystem
> ---
>
> Key: HDFS-10346
> URL: https://issues.apache.org/jira/browse/HDFS-10346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, hdfs-client
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10346-HDFS-9924.000.patch, 
> HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch
>
>
> This is proposed to implement an asynchronous setPermission and setOwner.






[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277030#comment-15277030
 ] 

Rushabh S Shah commented on HDFS-10372:
---

Thanks [~kihwal] for reviews and committing. 
Thanks [~xiaochen] and [~iwasakims] for reviews.

> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.7.3
>
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo, but now I realize that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method 
> returns the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.






[jira] [Updated] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10372:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3
   Status: Resolved  (was: Patch Available)

I've committed this to trunk through branch-2.7. Thanks for fixing this, 
[~shahrs87]. Thanks for valuable reviews, [~xiaochen] and [~iwasakims].

> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.7.3
>
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo, but now I realize that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method 
> returns the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.






[jira] [Comment Edited] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277007#comment-15277007
 ] 

Kihwal Lee edited comment on HDFS-10372 at 5/9/16 8:58 PM:
---

[~iwasakims] has already +1'ed it, meaning the suggested change is not strictly 
necessary.
I am committing this as is.


was (Author: kihwal):
[~iwasakims] has already +1'ed it, meaning the suggested change is strictly 
necessary.
I am committing this as is.

> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo but now I realized that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} returns 
> the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returned an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type

2016-05-09 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9902:

Fix Version/s: 2.8.0

> Support different values of dfs.datanode.du.reserved per storage type
> -
>
> Key: HDFS-9902
> URL: https://issues.apache.org/jira/browse/HDFS-9902
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Pan Yuxuan
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, 
> HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch
>
>
> Hadoop now supports different storage types for DISK, SSD, ARCHIVE and 
> RAM_DISK, but they share one configuration, dfs.datanode.du.reserved.
> The DISK size may be several TB while the RAM_DISK size may be only several 
> tens of GB.
> The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same 
> DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of 
> RAM_DISK capacity.
> Since the usage of RAM_DISK can be 100%, I don't want the 
> dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs.
> So can we make a new configuration for RAM_DISK, or just skip this 
> configuration for RAM_DISK?
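A minimal sketch of what a per-storage-type override could look like; the 
suffixed property name is an assumption based on this proposal, not an 
existing documented key:
{code}
// Hedged sketch: a per-storage-type reservation keyed by a type suffix.
Configuration conf = new HdfsConfiguration();
// global default: 10 GB reservation, sized for spinning disks
conf.setLong("dfs.datanode.du.reserved", 10L * 1024 * 1024 * 1024);
// hypothetical override: no reservation for tmpfs-backed RAM_DISK
conf.setLong("dfs.datanode.du.reserved.ram_disk", 0L);
{code}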



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277007#comment-15277007
 ] 

Kihwal Lee commented on HDFS-10372:
---

[~iwasakims] has already +1'ed it, meaning the suggested change is strictly 
necessary.
I am committing this as is.

> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo but now I realized that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} returns 
> the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returned an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10378) setOwner throws AccessControlException with wrong message

2016-05-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276994#comment-15276994
 ] 

Yongjun Zhang commented on HDFS-10378:
--

HI [~jzhuge],

Thanks for reporting the issue and the patch. A couple of comments:
1. Would you please provide a single patch file that contains both the fix and 
the unit test, instead of multiple files?
2. {{setOwner}} expects the user to be a superuser or belong to the superuser 
group. The call to {{fsd.checkOwner(pc, iip);}} appears to be a no-op for this 
kind of user. Would you please look into whether we really need the call 
{{fsd.checkOwner(pc, iip);}} at all in the {{setOwner}} method? Is being a 
superuser both a necessary and sufficient condition for this method?

Thanks.
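
For context, a minimal sketch of the check order in question; the method shape 
and names are assumed for illustration (inferred from the stack trace in the 
description below), not copied from the actual {{FSDirAttrOp}}:
{code}
// Sketch: if only a superuser may ever change ownership, the owner check
// that runs first is effectively a no-op for every caller that can succeed.
static void setOwner(FSDirectory fsd, FSPermissionChecker pc,
    INodesInPath iip, String user, String group) throws IOException {
  if (fsd.isPermissionEnabled()) {
    fsd.checkOwner(pc, iip);  // throws for non-owners before the superuser check
    if (!pc.isSuperUser()) {
      // the clearer message this JIRA is asking for
      throw new AccessControlException("Non-super user cannot change owner");
    }
  }
  // ... apply the new owner/group and record the audit event ...
}
{code}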





> setOwner throws AccessControlException with wrong message
> -
>
> Key: HDFS-10378
> URL: https://issues.apache.org/jira/browse/HDFS-10378
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch
>
>
> Calling {{setOwner}} as a non-super user does trigger 
> {{AccessControlException}}; however, the message "Permission denied. 
> user=user1967821757 is not the owner of inode=child" is wrong. Expect this 
> message instead: "Non-super user cannot change owner".
> Output of patched unit test {{TestPermission.testFilePermission}}:
> {noformat}
> 2016-05-06 16:45:44,915 [main] INFO  security.TestPermission 
> (TestPermission.java:testFilePermission(280)) - GOOD: got 
> org.apache.hadoop.security.AccessControlException: Permission denied. 
> user=user1967821757 is not the owner of inode=child1
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417)
> {noformat}
> Will upload the unit test patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10260) TestFsDatasetImpl#testCleanShutdownOfVolume often fails

2016-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276911#comment-15276911
 ] 

Rushabh S Shah commented on HDFS-10260:
---

bq. FYI: it came up on my build
By my machine, I meant my Mac.
I found the root cause for the failure.
Please see HDFS-10372 for more details.


> TestFsDatasetImpl#testCleanShutdownOfVolume often fails
> ---
>
> Key: HDFS-10260
> URL: https://issues.apache.org/jira/browse/HDFS-10260
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, test
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Fix For: 2.8.0
>
> Attachments: HDFS-10260-v1.patch, HDFS-10260.patch
>
>
> This test failure occurs in upstream Jenkins. Looking at the test code, I 
> think it should be improved to capture the root cause of failure:
> E.g. change {{Thread.sleep(1000)}} to {{GenericTestUtils.waitFor}} and use 
> {{GenericTestUtils.assertExceptionContains}} to replace 
> {code}
> Assert.assertTrue(ioe.getMessage().contains(info.toString()));
> {code}
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/1062/testReport/junit/org.apache.hadoop.hdfs.server.datanode.fsdataset.impl/TestFsDatasetImpl/testCleanShutdownOfVolume/
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testCleanShutdownOfVolume(TestFsDatasetImpl.java:683)
> Standard Error
> Exception in thread "DataNode: 
> [[[DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/,
>  
> [DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]]
>   heartbeating to localhost/127.0.0.1:35113" java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockReports(FsDatasetImpl.java:1714)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdownBlockPool(FsDatasetImpl.java:2591)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1479)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:411)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:494)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:749)
>   at java.lang.Thread.run(Thread.java:744)
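
A hedged sketch of the rewrite suggested in the description above; only 
{{waitFor}} and {{assertExceptionContains}} come from the suggestion, while the 
failed-volume condition, the timeouts, and the locals {{dataNode}}, {{out}} and 
{{info}} are assumptions about the surrounding test:
{code}
// Replace the fixed Thread.sleep(1000) with a bounded poll.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    // hypothetical condition: the bad volume has been detected and removed
    return dataNode.getFSDataset().getNumFailedVolumes() > 0;
  }
}, 100, 10000); // check every 100 ms, time out after 10 s

// Assert with a helper that prints the full exception message on mismatch
// instead of the bare "AssertionError: null" seen in the Jenkins report.
try {
  out.close();
  Assert.fail("expected close() to fail after the volume went bad");
} catch (IOException ioe) {
  GenericTestUtils.assertExceptionContains(info.toString(), ioe);
}
{code}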



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276902#comment-15276902
 ] 

Rushabh S Shah commented on HDFS-10372:
---

bq. it's up to you to decide whether to improve it as Masatake Iwasaki 
suggested.
I don't think this is required.
I edited the test on my machine to create a file after one of the volumes went 
bad. It was able to create the file and I can see the block on the datanode's 
good volume, but I don't see any value in adding this to the patch.
[~iwasakims]: Let me know if you still want me to create a new file. I will 
edit my patch.


> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo but now I realized that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} returns 
> the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returned an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-05-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276872#comment-15276872
 ] 

Andrew Wang commented on HDFS-6489:
---

I guess this means that the available space block placement policy is also 
broken then :( Agree that we need a methodical rework.

I'm +0.5 on this patch: it looks good to me, but I would need to spend more 
time understanding the existing accounting before I could +1. If another 
reviewer more familiar with this area can also review, that'd be appreciated.

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch, 
> HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space 
> on each block write operation. This is correct in most scenarios (creating a 
> new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M). Then I try to 
> append to it very frequently, but each time I append only 10 bytes;
> then on each append, DFS used will be increased by the length of the 
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+); assume the block size is 32M (half of the 
> default value). Then the DFS used will be increased by 1000*32M = 32G on each 
> append to the files, but actually I only write 10K bytes; this will cause the 
> datanode to report insufficient disk space on data write.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276857#comment-15276857
 ] 

Hadoop QA commented on HDFS-10380:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 25s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
21s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 46s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 50s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 9s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 11s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 45s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803024/HDFS-10380.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-10380 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 4d6709fd8099 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d187112 |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15398/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15398/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++:  Get rid of socket template parameter in RpcConnection
> -
>
> Key: HDFS-10380
> URL: https://issues.apache.org/jira/browse/HDFS-10380
> 

[jira] [Updated] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10287:

Attachment: HDFS-10287.02.patch

The test failure seems unrelated; another JIRA, HDFS-10260, was reported 
regarding this failure.
Just as a double check, triggering the build again with [^HDFS-10287.02.patch]

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: Andras Bokor
>Priority: Trivial
> Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be 
> backported to Hadoop version prior to 2.7.
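
A sketch of the test pattern this enables, assuming {{close()}} simply 
delegates to the existing {{shutdown()}}:
{code}
Configuration conf = new HdfsConfiguration();
// try-with-resources guarantees shutdown even if the test body throws
try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build()) {
  cluster.waitActive();
  FileSystem fs = cluster.getFileSystem();
  // ... exercise the cluster ...
} // cluster.close() runs here, replacing the usual finally { cluster.shutdown(); }
{code}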



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10260) TestFsDatasetImpl#testCleanShutdownOfVolume often fails

2016-05-09 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276863#comment-15276863
 ] 

Andras Bokor commented on HDFS-10260:
-

bq. I am not able to make it fail on my machine. It never failed on our 
internal Jenkins server also.
FYI: it came up on my 
[build|https://builds.apache.org/job/PreCommit-HDFS-Build/15397/testReport/]

> TestFsDatasetImpl#testCleanShutdownOfVolume often fails
> ---
>
> Key: HDFS-10260
> URL: https://issues.apache.org/jira/browse/HDFS-10260
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, test
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Fix For: 2.8.0
>
> Attachments: HDFS-10260-v1.patch, HDFS-10260.patch
>
>
> This test failure occurs in upstream Jenkins. Looking at the test code, I 
> think it should be improved to capture the root cause of failure:
> E.g. change {{Thread.sleep(1000)}} to {{GenericTestUtils.waitFor}} and use 
> {{GenericTestUtils.assertExceptionContains}} to replace 
> {code}
> Assert.assertTrue(ioe.getMessage().contains(info.toString()));
> {code}
> https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/1062/testReport/junit/org.apache.hadoop.hdfs.server.datanode.fsdataset.impl/TestFsDatasetImpl/testCleanShutdownOfVolume/
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testCleanShutdownOfVolume(TestFsDatasetImpl.java:683)
> Standard Error
> Exception in thread "DataNode: 
> [[[DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/,
>  
> [DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]]
>   heartbeating to localhost/127.0.0.1:35113" java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockReports(FsDatasetImpl.java:1714)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdownBlockPool(FsDatasetImpl.java:2591)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1479)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:411)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:494)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:749)
>   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9545) DiskBalancer : Add Plan Command

2016-05-09 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9545:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> DiskBalancer : Add Plan Command
> ---
>
> Key: HDFS-9545
> URL: https://issues.apache.org/jira/browse/HDFS-9545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9545-HDFS-1312.001.patch, 
> HDFS-9545-HDFS-1312.002.patch
>
>
> Allows the user to create a plan and persist it. This is useful if users want 
> to evaluate the actions of the disk balancer before running the balancing job



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9545) DiskBalancer : Add Plan Command

2016-05-09 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276834#comment-15276834
 ] 

Anu Engineer commented on HDFS-9545:


[~eddyxu] Thanks for the code review. I have committed this to the feature 
branch.

> DiskBalancer : Add Plan Command
> ---
>
> Key: HDFS-9545
> URL: https://issues.apache.org/jira/browse/HDFS-9545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9545-HDFS-1312.001.patch, 
> HDFS-9545-HDFS-1312.002.patch
>
>
> Allows the user to create a plan and persist it. This is useful if users want 
> to evaluate the actions of the disk balancer before running the balancing job



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10381) DataNode exclusion log message should be a warning

2016-05-09 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276825#comment-15276825
 ] 

Mingliang Liu commented on HDFS-10381:
--

Is this for {{DataStreamer}}? Perhaps we can set the component field to 
{{hdfs-client}}.

> DataNode exclusion log message should be a warning
> --
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing

2016-05-09 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10360:
---
Attachment: HDFS-10360.002.patch

Rev02: add a log to record which files are being deleted during a DataNode 
reformat operation. There's no need to add extra JMX reporting code: as long 
as an IOException is thrown while adding the volume, it is reported to JMX by 
the existing code.

The test failures seem unrelated; I don't see these failures in my tree.

> DataNode may format directory and lose blocks if current/VERSION is missing
> ---
>
> Key: HDFS-10360
> URL: https://issues.apache.org/jira/browse/HDFS-10360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10360.001.patch, HDFS-10360.002.patch
>
>
> Under certain circumstances, if the current/VERSION of a storage directory is 
> missing, the DataNode may format the storage directory even though _block 
> files are not missing_.
> This is very easy to reproduce. Simply launch an HDFS cluster and create some 
> files. Delete current/VERSION, and restart the datanode.
> After the restart, the data node will format the directory and remove all 
> existing block files:
> {noformat}
> 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Lock on /data/dfs/dn/in_use.lock acquired by nodename 
> 5...@weichiu-dn-2.vpc.cloudera.com
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Storage directory /data/dfs/dn is not formatted for 
> BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Locking is disabled for 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Block pool storage directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted 
> for BP-787466439-172
> .26.24.43-1462305406642
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting ...
> 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: 
> Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory 
> /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
> {noformat}
> The bug is: DataNode assumes that if none of {{current/VERSION}}, 
> {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and 
> {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing 
> important to HDFS and decides to format it. 
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545
> However, block files may still exist, and in my opinion, we should do 
> everything possible to retain the block files.
> I have two suggestions:
> # check if the {{current/}} directory is empty. If not, throw an 
> InconsistentFSStateException in {{Storage#analyzeStorage}} instead of 
> assuming it is not formatted (see the sketch after this list). Or,
> # In {{Storage#clearDirectory}}, before it formats the storage directory, 
> rename or move the {{current/}} directory. Also, log whatever is being 
> renamed/moved.
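
A minimal sketch of suggestion #1; {{storageDir}} stands in for the storage 
directory's root, and exactly where the check belongs inside 
{{Storage#analyzeStorage}} is left open:
{code}
// Refuse to treat a directory that still holds data as unformatted
// just because current/VERSION disappeared.
File current = new File(storageDir, "current");
String[] contents = current.list();
if (contents != null && contents.length > 0) {
  // block files may survive; fail fast instead of formatting them away
  throw new InconsistentFSStateException(storageDir,
      "current directory is not empty although VERSION is missing");
}
{code}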



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276794#comment-15276794
 ] 

Rushabh S Shah commented on HDFS-10372:
---

bq. The test expected that the message in the exception on out.close() contains 
the name of the failed volume (to which the replica was written) but it 
contained only info about the live volume (data2).
When the client asked for the locations of the first block, the namenode 
selected a datanode with an arbitrary storage info within that datanode.
Refer to the {{DataStreamer.locateFollowingBlock(DatanodeInfo[] excludedNodes)}} 
method for more details.
When the client starts writing to the datanode, the datanode selects a volume 
according to the RoundRobinVolumeChoosingPolicy, and it can select a storage 
different from what the namenode has stored in its triplets.
When the datanode sends an IBR (with RECEIVING_BLOCK), the namenode replaces 
the storage info in its triplets with the storage info the datanode reported.
But this change in storage info is not propagated back to the client, so the 
client still has stale storage info.
When the client tried to close the file, the datanode threw an exception (since 
the volume had gone bad), but because the client had stale storage info, it 
saved the exception with the old storage info.
This is why the test was flaky in the first place.
On my machine, the test finishes within 2 seconds, so the datanode didn't send 
any IBR and the storage info was not changed on the namenode.
But on the Jenkins build machines, the test ran for more than 8 seconds, which 
gave the datanode ample time to send an IBR.
[~iwasakims]: I hope this answers your question.
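
To make the consequence for the assertion concrete, a hedged sketch; the 
string shapes are illustrative, and {{cluster}}, {{file}} and {{ioe}} are 
assumed test locals:
{code}
// DatanodeInfo#toString() is just the transfer address, e.g. "127.0.0.1:50010".
// The storage-aware subclass returned by getLocations() also carries the
// storage the *client* last saw, which goes stale once an IBR updates the
// namenode's notion of the storage.
DatanodeInfo node = cluster.getFileSystem().getClient()
    .getLocatedBlocks(file.toString(), 0).get(0).getLocations()[0];

// Fragile: depends on which storage the client happens to have cached.
//   Assert.assertTrue(ioe.getMessage().contains(node.toString()));

// Stable: the transfer address identifies the datanode regardless of storage.
Assert.assertTrue(ioe.getMessage().contains(node.getXferAddr()));
{code}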



> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing.
> Now I think I know the root cause of failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo but now I realized that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} returns 
> the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returned an array of 
> DatanodeStorageInfo, its toString() implementation includes the storage info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-10381) DataNode exclusion log message should be a warning

2016-05-09 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-10381 started by John Zhuge.
-
> DataNode exclusion log message should be a warning
> --
>
> Key: HDFS-10381
> URL: https://issues.apache.org/jira/browse/HDFS-10381
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> When adding a DN to {{excludedNodes}}, it should log a warning message 
> instead of info.
> {code}
>   success = createBlockOutputStream(nodes, storageTypes, 0L, false);
>   if (!success) {
> LOG.info("Abandoning " + block);
> dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
> dfsClient.clientName);
> block = null;
> final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
> LOG.info("Excluding datanode " + badNode);
> excludedNodes.put(badNode, badNode);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10381) DataNode exclusion log message should be a warning

2016-05-09 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10381:
-

 Summary: DataNode exclusion log message should be a warning
 Key: HDFS-10381
 URL: https://issues.apache.org/jira/browse/HDFS-10381
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Minor


When adding a DN to {{excludedNodes}}, it should log a warning message instead 
of info.
{code}
  success = createBlockOutputStream(nodes, storageTypes, 0L, false);

  if (!success) {
LOG.info("Abandoning " + block);
dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
dfsClient.clientName);
block = null;
final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
LOG.info("Excluding datanode " + badNode);
excludedNodes.put(badNode, badNode);
  }
{code}
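
A sketch of the proposed severity change on the same code path; only the log 
levels move, everything else is unchanged:
{code}
  if (!success) {
    // warn: the client is abandoning a block after a failed pipeline setup
    LOG.warn("Abandoning " + block);
    dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
        dfsClient.clientName);
    block = null;
    final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
    // warn: this datanode will be skipped for subsequent block allocations
    LOG.warn("Excluding datanode " + badNode);
    excludedNodes.put(badNode, badNode);
  }
{code}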



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276767#comment-15276767
 ] 

Hadoop QA commented on HDFS-10287:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 
new + 208 unchanged - 3 fixed = 208 total (was 211) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 37s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 58s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 136m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12801035/HDFS-10287.01.patch |
| JIRA Issue | HDFS-10287 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5cdb611217f9 3.13.0-36-lowlatency #63-Ubu

[jira] [Updated] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection

2016-05-09 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10380:
---
Status: Patch Available  (was: Open)

> libhdfs++:  Get rid of socket template parameter in RpcConnection
> -
>
> Key: HDFS-10380
> URL: https://issues.apache.org/jira/browse/HDFS-10380
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10380.HDFS-8707.000.patch
>
>
> RpcConnection is always templated on asio::ip::tcp::socket except in 
> rpc_engine_test.cc.  My understanding is that the original reason for using 
> this as a template parameter was to be able to use trait injection in gmock 
> tests.
> This is useful for testing but makes debugging a lot trickier due to the 
> extra stuff that shows up on the stack.  Heavily templated code also tends to 
> produce very unhelpful compile errors, e.g. when a missing semicolon in a 
> templated class turns into pages of errors coming out of everything that 
> depended on the instantiation.
> This sort of work was already accomplished elsewhere by HDFS-9144; it looks 
> like RpcConnection was one of the few places where the templates were left in 
> place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection

2016-05-09 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10380:
---
Attachment: HDFS-10380.HDFS-8707.000.patch

First quick pass:
-got rid of the RpcConnection template parameter
-typedef'ed asio::ip::tcp::socket as rpc_socket_t; it could just be socket_t.
-renamed RpcConnectionImpl::next_layer_ to RpcConnectionImpl::socket_ to make 
the code a bit more self-explanatory.

todo:
-I was in a hurry to get this building, so I just declared all of the methods 
in rpc_connection.h as inline. I will move them to rpc_connection.cc.
-rpc_engine_test (the test the template was there for) needs a lot of work to 
be able to run with this change. It's fairly dense and has nearly no comments, 
so it might be easier to just rewrite it.

> libhdfs++:  Get rid of socket template parameter in RpcConnection
> -
>
> Key: HDFS-10380
> URL: https://issues.apache.org/jira/browse/HDFS-10380
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10380.HDFS-8707.000.patch
>
>
> RpcConnection is always templated on asio::ip::tcp::socket except in 
> rpc_engine_test.cc.  My understanding is that the original reason for using 
> this as a template parameter was to be able to use trait injection in gmock 
> tests.
> This is useful for testing but makes debugging a lot trickier due to the 
> extra stuff that shows up on the stack.  Heavily templated code also tends to 
> produce very unhelpful compile errors, e.g. when a missing semicolon in a 
> templated class turns into pages of errors coming out of everything that 
> depended on the instantiation.
> This sort of work was already accomplished elsewhere by HDFS-9144; it looks 
> like RpcConnection was one of the few places where the templates were left in 
> place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection

2016-05-09 Thread James Clampffer (JIRA)
James Clampffer created HDFS-10380:
--

 Summary: libhdfs++:  Get rid of socket template parameter in 
RpcConnection
 Key: HDFS-10380
 URL: https://issues.apache.org/jira/browse/HDFS-10380
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


RpcConnection is always templated on asio::ip::tcp::socket except in 
rpc_engine_test.cc.  My understanding is that the original reason for using 
this as a template parameter was to be able to use trait injection in gmock 
tests.

This is useful for testing but makes debugging a lot trickier due to the extra 
stuff that shows up on the stack.  Heavily templated code also tends to 
produce very unhelpful compile errors, e.g. when a missing semicolon in a 
templated class turns into pages of errors coming out of everything that 
depended on the instantiation.

This sort of work was already accomplished elsewhere by HDFS-9144; it looks 
like RpcConnection was one of the few places where the templates were left in 
place.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7212) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2016-05-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276614#comment-15276614
 ] 

Chris Nauroth commented on HDFS-7212:
-

Hello [~rajeshhadoop].  Based on these symptoms, you'll probably be interested 
in multiple bug fixes in short-circuit read that went in after Apache Hadoop 
2.4.0: HADOOP-11333, HADOOP-11604, HADOOP-11648, HADOOP-11802 and HDFS-8429.

> FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
> very long time
> -
>
> Key: HDFS-7212
> URL: https://issues.apache.org/jira/browse/HDFS-7212
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
> Environment: PROD
>Reporter: Istvan Szukacs
>
> There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM 
> and rendering the service unusable, missing heartbeats and stopping data 
> access. The threads look like this:
> {code}
> 3415 (state = BLOCKED)
> - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may 
> be imprecise)
> - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=186 (Compiled frame)
> - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=834 (Interpreted frame)
> - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=867 (Interpreted frame)
> - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, 
> line=1197 (Interpreted frame)
> - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, 
> line=214 (Compiled frame)
> - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled 
> frame)
> - 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket,
>  org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 
> (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String,
>  org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted 
> frame)
> - 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String)
>  @bci=212, line=413 (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream)
>  @bci=13, line=172 (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op)
>  @bci=149, line=92 (Compiled frame)
> - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 
> (Compiled frame)
> - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame)
> {code}
> Has anybody seen this before?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10322) DomainSocket error lead to more and more DataNode thread waiting

2016-05-09 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10322:
-
Summary: DomainSocket error lead to more and more DataNode thread waiting   
(was: DomianSocket error lead to more and more DataNode thread waiting )

> DomainSocket error lead to more and more DataNode thread waiting 
> -
>
> Key: HDFS-10322
> URL: https://issues.apache.org/jira/browse/HDFS-10322
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: ChenFolin
> Fix For: 2.6.4
>
>
> When short-circuit read is enabled and a DomainSocket broken-pipe error 
> happens, the DataNode will produce more and more waiting threads.
> It is similar to bug HADOOP-11802, but I do not think they are the same 
> problem, because the DomainSocket thread is in the RUNNABLE state.
> stack log:
> "DataXceiver for client unix:/var/run/hadoop-hdfs/dn.50010 [Waiting for 
> operation #1]" daemon prio=10 tid=0x0278e000 nid=0x2bc6 waiting on 
> condition [0x7f2d6e4a5000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00061c493500> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:316)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:394)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:93)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
>   at java.lang.Thread.run(Thread.java:745)
> =DomainSocketWatcher
> "Thread-759187" daemon prio=10 tid=0x0219c800 nid=0x8c56 runnable 
> [0x7f2dbe4cb000]
>java.lang.Thread.State: RUNNABLE
>   at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native Method)
>   at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52)
>   at 
> org.apache.hadoop.net.unix.DomainSocketWatcher$1.run(DomainSocketWatcher.java:474)
>   at java.lang.Thread.run(Thread.java:745)
> ===datanode error log
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
> datanode-:50010:DataXceiver error processing REQUEST_SHORT_CIRCUIT_SHM 
> operation src: unix:/var/run/hadoop-hdfs/dn.50010 dst: 
> java.net.SocketException: write(2) error: Broken pipe
> at org.apache.hadoop.net.unix.DomainSocket.writeArray0(Native Method)
> at org.apache.hadoop.net.unix.DomainSocket.access$300(DomainSocket.java:45)
> at 
> org.apache.hadoop.net.unix.DomainSocket$DomainOutputStream.write(DomainSocket.java:601)
> at 
> com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
> at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
> at 
> com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:91)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.sendShmSuccessResponse(DataXceiver.java:371)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:409)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:178)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:93)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
> at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-05-09 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-6489:
---
Attachment: HDFS-6489.007.patch

Andrew! Thanks for the review. Here's a patch with the changes you suggested. I 
am sticking with the conditional for RBW for now because 1. it's the common 
case, and 2. RUR doesn't have {{originalBytesReserved}}.

Even with this, {{dfsUsed}} and {{numBlocks}} counting is all messed up. E.g. 
[FsDatasetImpl.removeOldBlock|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L2886]
 calls {{decDfsUsedAndNumBlocks}} twice (so even though {{dfsUsed}} is 
correctly decremented, {{numBlocks}} is not). [~brahmareddy], [~vinayrpet], am 
I reading this right?

To really get this sorted out, we should probably have a unit test framework 
that verifies DN-side accounting is correct across several different operations: 
block creation, appends, block transfer (e.g. from one storage to another), 
etc. A rough sketch of the kind of check I mean follows below. Unfortunately, I 
don't think I'll have the cycles for that. Sorry :(
Arpit has seemed amenable to [removing 
reservations|https://issues.apache.org/jira/browse/HDFS-9530?focusedCommentId=15248968&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15248968]
 if there is some other alternative. I think we should disable the rejection of 
writes by DNs based on reservations until we can be sure that our accounting is 
correct. Just my $0.02
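To make the framework idea concrete, here is a rough sketch of the kind of 
per-operation check it could run for the append case. Illustrative only: the 
single-DN setup, the 1MB slack threshold, and the method name are assumptions, 
not anything in the patch.
{code}
import static org.junit.Assert.assertTrue;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.server.datanode.fsdataset.FsDatasetSpi;

// Sketch: an append of 10 bytes should grow dfsUsed by roughly 10 bytes,
// not by a whole block length.
public void checkDfsUsedAfterSmallAppend() throws Exception {
  Configuration conf = new HdfsConfiguration();
  MiniDFSCluster cluster =
      new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
  try {
    DistributedFileSystem fs = cluster.getFileSystem();
    Path file = new Path("/accounting");
    DFSTestUtil.createFile(fs, file, 1024, (short) 1, 0L);

    FsDatasetSpi<?> dataset = cluster.getDataNodes().get(0).getFSDataset();
    long before = dataset.getDfsUsed();

    FSDataOutputStream out = fs.append(file);
    out.write(new byte[10]);
    out.close();

    long delta = dataset.getDfsUsed() - before;
    // The delta must stay far below a block; 1MB of slack for metadata.
    assertTrue("dfsUsed grew by " + delta, delta < 1024 * 1024);
  } finally {
    cluster.shutdown();
  }
}
{code}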

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch, 
> HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java
>
>
> The current implementation of the Datanode increases the DFS used space on 
> each block write operation. This is correct in the common scenario (creating 
> a new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M). Then I append to 
> it very frequently, but each time I append only 10 bytes.
> Then on each append, DFS used is increased by the length of the block (60M), 
> not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+). Assuming the block size is 32M (half of the 
> default value), DFS used is increased by 1000 * 32M = 32G on each round of 
> appends, even though I actually write only 10K bytes; this causes the 
> datanode to report insufficient disk space on data write.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor reassigned HDFS-10287:
---

Assignee: Andras Bokor  (was: John Zhuge)

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: Andras Bokor
>Priority: Trivial
> Attachments: HDFS-10287.01.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be 
> backported to Hadoop version prior to 2.7.
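For illustration, the intended usage would look roughly like this. This is a 
minimal sketch that assumes {{close()}} simply delegates to {{shutdown()}}, as 
proposed here:
{code}
Configuration conf = new HdfsConfiguration();
try (MiniDFSCluster cluster =
         new MiniDFSCluster.Builder(conf).numDataNodes(1).build()) {
  FileSystem fs = cluster.getFileSystem();
  // ... exercise the cluster; no explicit finally { cluster.shutdown(); }
} // close() runs here even if the test body throws
{code}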



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10287:

Status: Patch Available  (was: Open)

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
> Attachments: HDFS-10287.01.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be 
> backported to Hadoop version prior to 2.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable

2016-05-09 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276434#comment-15276434
 ] 

Andras Bokor commented on HDFS-10287:
-

I see your point about the tests.
These tests were added with HDFS-2209. Before that patch, {{MiniDFSCluster}} 
used only the {{test.build.data}} system property to determine the base 
directory; now the base directory can also be set through the config object. 
Based on the comments on HDFS-2209, creating two instances in the same JVM was 
not possible before the patch (though it is not 100% clear to me why; it seems 
to me that setting the system property between the two cluster initializations 
should work). It seems to me {{testDualCluster}} is a proof of concept that the 
new feature works. But indeed, the first test proves that {{MiniDFSCluster}} 
honors the {{hdfs.minidfs.basedir}} property.
What is your suggestion? Leave it as is, or modify/remove?
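For reference, a sketch of the two mechanisms discussed above (paths are made 
up):
{code}
// Before HDFS-2209: the base directory came only from a system property
// shared by the whole JVM.
System.setProperty("test.build.data", "/tmp/cluster-one");

// After HDFS-2209: it can also be set per cluster through the config
// object, which is what lets two clusters coexist in one JVM.
Configuration conf = new HdfsConfiguration();
conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, "/tmp/cluster-two");
MiniDFSCluster second = new MiniDFSCluster.Builder(conf).build();
{code}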

> MiniDFSCluster should implement AutoCloseable
> -
>
> Key: HDFS-10287
> URL: https://issues.apache.org/jira/browse/HDFS-10287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
> Attachments: HDFS-10287.01.patch
>
>
> {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support 
> [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html].
>  It will make test code a little cleaner and more reliable.
> Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be 
> backported to Hadoop version prior to 2.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276409#comment-15276409
 ] 

Hadoop QA commented on HDFS-10333:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 3s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 54m 45s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 136m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.hdfs.shortcircuit.TestShortCircuitCache |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802974/HDFS-10333.001.patch |
| JIRA Issue | HDFS-10333 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux beb0d668ac24 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git re

[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume

2016-05-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276350#comment-15276350
 ] 

Kihwal Lee commented on HDFS-10372:
---

[~shahrs87], it's up to you to decide whether to improve it as [~iwasakims] 
suggested.

> Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
> ---
>
> Key: HDFS-10372
> URL: https://issues.apache.org/jira/browse/HDFS-10372
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-10372.patch
>
>
> TestFsDatasetImpl#testCleanShutdownOfVolume fails very often.
> We added more debug information in HDFS-10260 to find out why this test is 
> failing. Now I think I know the root cause of the failure.
> I thought that {{LocatedBlock#getLocations()}} returns an array of 
> DatanodeInfo, but now I realize that it returns an array of 
> DatanodeStorageInfo (which is a subclass of DatanodeInfo).
> In the test, I intended to check whether the exception contains the xfer 
> address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method 
> returns the xfer address, I checked whether the exception contains 
> {{DatanodeInfo#toString}} or not.
> But since {{LocatedBlock#getLocations()}} returned an array of 
> DatanodeStorageInfo, its toString() implementation also includes the storage 
> info, so the match fails.
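A sketch of a more robust check along those lines: compare against the xfer 
address directly instead of the whole {{toString()}}. The local variable names 
are assumptions about the test, not its actual code:
{code}
// Match on the xfer address itself, so the extra storage info in the
// located block's toString() no longer breaks the comparison.
DatanodeInfo[] locations = locatedBlock.getLocations();
try {
  out.hflush(); // expected to fail after the volume is removed
  Assert.fail("Expected the write to fail");
} catch (IOException ioe) {
  GenericTestUtils.assertExceptionContains(locations[0].getXferAddr(), ioe);
}
{code}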



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276285#comment-15276285
 ] 

Yiqun Lin commented on HDFS-10333:
--

Sorry, let me correct my earlier statement:
{quote}
The test will failed once the socket io timeout happens
{quote}
This should instead read: the test has a higher chance of failing when socket 
I/O timeouts happen frequently over a period of time.

> Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
> ---
>
> Key: HDFS-10333
> URL: https://issues.apache.org/jira/browse/HDFS-10333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10333.001.patch
>
>
> Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25):
> {code}
> --
>  T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestFileAppend
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 3.674 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166)
> at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
> at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> However, when I run with Java1.7, the test is sometimes successful, and it 
> sometimes fails with 
> {code}
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 9.099 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> The failure of this test is intermittent, but it fails pretty often.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin reassigned HDFS-10333:


Assignee: Yiqun Lin

> Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
> ---
>
> Key: HDFS-10333
> URL: https://issues.apache.org/jira/browse/HDFS-10333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10333.001.patch
>
>
> Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25):
> {code}
> --
>  T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestFileAppend
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 3.674 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166)
> at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
> at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> However, when I run with Java1.7, the test is sometimes successful, and it 
> sometimes fails with 
> {code}
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 9.099 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> The failure of this test is intermittent, but it fails pretty often.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10333:
-
Status: Patch Available  (was: Open)

> Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
> ---
>
> Key: HDFS-10333
> URL: https://issues.apache.org/jira/browse/HDFS-10333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10333.001.patch
>
>
> Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25):
> {code}
> --
>  T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestFileAppend
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 3.674 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166)
> at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
> at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> However, when I run with Java1.7, the test is sometimes successful, and it 
> sometimes fails with 
> {code}
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 9.099 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> The failure of this test is intermittent, but it fails pretty often.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10333:
-
Attachment: HDFS-10333.001.patch

> Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
> ---
>
> Key: HDFS-10333
> URL: https://issues.apache.org/jira/browse/HDFS-10333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yiqun Lin
> Attachments: HDFS-10333.001.patch
>
>
> Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25):
> {code}
> --
>  T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestFileAppend
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 3.674 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166)
> at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
> at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
> at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> However, when I run with Java1.7, the test is sometimes successful, and it 
> sometimes fails with 
> {code}
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 9.099 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599)
> {code}
> The failure of this test is intermittent, but it fails pretty often.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk

2016-05-09 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276253#comment-15276253
 ] 

Yiqun Lin commented on HDFS-10333:
--

I ran this test and other similar tests in {{TestFileAppend}} locally. I found 
that connection-refused errors happened sometimes due to socket I/O timeouts. 
Here are the logs:
{code}
2016-05-09 17:33:49,828 [DataXceiver for client 
DFSClient_NONMAPREDUCE_140749666_1 at /127.0.0.1:58040 [Receiving block 
BP-2032095287-127.0.0.1-1462786332089:blk_1073741827_1003]] ERROR 
datanode.DataNode (DataXceiver.java:run(316)) - 127.0.0.1:58021:DataXceiver 
error processing WRITE_BLOCK operation  src: /127.0.0.1:58040 dst: 
/127.0.0.1:58021
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:746)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:171)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:105)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:289)
at java.lang.Thread.run(Thread.java:745)
{code}
This then causes an IOException, and the first target node (actually meaning 
the datanode that failed during connection setup) is set as a bad node in the 
response:
{code}
} catch (IOException e) {
  if (isClient) {
BlockOpResponseProto.newBuilder()
  .setStatus(ERROR)
   // NB: Unconditionally using the xfer addr w/o hostname
  .setFirstBadLink(targets[0].getXferAddr())
  .build()
  .writeDelimitedTo(replyOut);
replyOut.flush();
  }
{code}
This node index is then recorded in {{errorState}}, and the node is marked as a 
bad node. The code will not return false here:
{code}
if (!errorState.hasDatanodeError() && !shouldHandleExternalError()) {
  return false;
}
{code}
Instead, it continues on to {{setupPipelineForAppendOrRecovery}} in 
{{processDatanodeOrExternalError}}, and finally it prints the message "no more 
good datanodes being available to replace a bad datanode on the existing 
pipeline".

So we should disable the property 
{{dfs.client.block.write.replace-datanode-on-failure.enable}}; then there is no 
need to set {{dfs.client.block.write.replace-datanode-on-failure.policy}} here. 
Otherwise the test will fail once a socket I/O timeout happens. In addition, 
the similar test {{TestFileAppend#testMultiAppend2}} has already disabled this 
(see the sketch below).

I will post a patch for this later; thanks for reviewing.
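A sketch of the configuration change I have in mind, mirroring what 
{{testMultiAppend2}} already does:
{code}
// With replacement disabled, a transient connection-setup failure no longer
// escalates into "no more good datanodes being available to try".
Configuration conf = new HdfsConfiguration();
conf.setBoolean(
    "dfs.client.block.write.replace-datanode-on-failure.enable", false);
// The ...replace-datanode-on-failure.policy key is only consulted when
// replacement is enabled, so it need not be set here.
MiniDFSCluster cluster =
    new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
{code}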

> Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
> ---
>
> Key: HDFS-10333
> URL: https://issues.apache.org/jira/browse/HDFS-10333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>
> Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25):
> {code}
> --
>  T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestFileAppend
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend
> testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend)  Time elapsed: 
> 3.674 sec  <<< ERROR!
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166)
> at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232)
> at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatano

[jira] [Commented] (HDFS-9803) Proactively refresh ShortCircuitCache entries to avoid latency spikes

2016-05-09 Thread Jery ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276176#comment-15276176
 ] 

Jery ma commented on HDFS-9803:
---

Let's say, for example, that the application does not read all the way through 
to the tenth block within the block access token expiration. At that point, the 
client would recover internally by issuing another RPC to the NameNode to fetch 
a fresh block access token with a new expiration. Then, the read would continue 
as normal.
In fact, the regionserver doesn't fetch a fresh token.

> Proactively refresh ShortCircuitCache entries to avoid latency spikes
> -
>
> Key: HDFS-9803
> URL: https://issues.apache.org/jira/browse/HDFS-9803
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Nick Dimiduk
>
> My region server logs are flooded with messages like 
> "SecretManager$InvalidToken: access control error while attempting to set up 
> short-circuit access to  ... is expired". These logs 
> correspond to responseTooSlow WARNings from the region server.
> {noformat}
> 2016-01-19 22:10:14,432 INFO  
> [B.defaultRpcServer.handler=4,queue=1,port=16020] 
> shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load 
> 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception.
> org.apache.hadoop.security.token.SecretManager$InvalidToken: access control 
> error while attempting to set up short-circuit access to  token 
> with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, 
> userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, 
> blockId=1074037633, access modes=[READ]) is expired.
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
> ...
> {noformat}
> A potential solution could be to have a background thread that makes a best 
> effort to proactively refresh tokens in the cache before they expire, so as 
> to minimize latency impact on the critical path.
> Thanks to [~cnauroth] for providing an explanation and suggesting a solution 
> over on the [user 
> list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E].
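To make the proposal concrete, a background refresher could look roughly like 
the sketch below. {{CachedReplica}}, {{snapshot()}}, {{tokenExpiry()}}, 
{{setToken()}} and {{refreshToken()}} are hypothetical placeholders, not 
existing {{ShortCircuitCache}} APIs; the real cache internals would dictate the 
actual shape.
{code}
// Best-effort periodic refresh of block access tokens for cached replicas,
// run off the critical path so reads never hit an expired token.
ScheduledExecutorService refresher =
    Executors.newSingleThreadScheduledExecutor();
refresher.scheduleWithFixedDelay(new Runnable() {
  @Override
  public void run() {
    long horizon = Time.now() + refreshWindowMs;
    for (CachedReplica replica : cache.snapshot()) { // hypothetical API
      if (replica.tokenExpiry() < horizon) {
        try {
          // One extra NameNode RPC now, instead of an InvalidToken
          // failure on the next read.
          replica.setToken(refreshToken(replica.block()));
        } catch (IOException e) {
          LOG.warn("Best-effort token refresh failed for " + replica, e);
        }
      }
    }
  }
}, 0, refreshIntervalMs, TimeUnit.MILLISECONDS);
{code}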



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-09 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Attachment: HDFS-8449-v11.patch

v11 fixes the checkstyle problems.

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, 
> HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, 
> HDFS-8449-v10.patch, HDFS-8449-v11.patch
>
>
> This subtask tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks, and successful tasks.
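For context, a minimal sketch of what such counters look like with the 
metrics2 library; the field names are illustrative, not necessarily the ones 
in the patch:
{code}
@Metrics(about = "EC recovery task metrics", context = "dfs")
class ECWorkerMetrics {
  @Metric("Total EC recovery tasks") MutableCounterLong ecTasksTotal;
  @Metric("Failed EC recovery tasks") MutableCounterLong ecTasksFailed;
  @Metric("Successful EC recovery tasks") MutableCounterLong ecTasksSucceeded;

  // Called once per finished recovery task.
  void onTaskFinished(boolean success) {
    ecTasksTotal.incr();
    if (success) {
      ecTasksSucceeded.incr();
    } else {
      ecTasksFailed.incr();
    }
  }
}
{code}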



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-05-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276084#comment-15276084
 ] 

Hadoop QA commented on HDFS-8449:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 17 new + 
97 unchanged - 3 fixed = 114 total (was 100) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 53s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 19s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 152m 40s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestAsyncDFSRename |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802921/HDFS-8449-v10.patch |
| JIRA Issue | HDFS-8449 |
| Optional Tests |  as

[jira] [Commented] (HDFS-7212) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2016-05-09 Thread Rajesh Chandramohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276037#comment-15276037
 ] 

Rajesh Chandramohan commented on HDFS-7212:
---

Hi,

We have a similar issue where DN sockets get stuck (Hadoop version 2.4.0).
===
# lsof /var/run/hadoop-hdfs/dn | wc -l
6876

jstack shows:
Thread 5295: (state = BLOCKED) 
- sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may 
be imprecise) 
- java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
line=186 (Compiled frame) 
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() 
@bci=42, line=2043 (Compiled frame) 
- 
org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket,
 org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=99, line=316 
(Interpreted frame) 
- 
org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String,
 org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted 
frame) 
- 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String)
 @bci=212, line=413 (Interpreted frame) 
- 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream)
 @bci=13, line=172 (Interpreted frame) 
- 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op)
 @bci=149, line=92 (Compiled frame) 
- org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 
(Compiled frame) 
- java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)

Almost 6k threads are in this state, and after a while the DN hangs. Do we have 
details regarding this issue? Is any patch available for it? It looks like a 
similar kind of issue.

> FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
> very long time
> -
>
> Key: HDFS-7212
> URL: https://issues.apache.org/jira/browse/HDFS-7212
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
> Environment: PROD
>Reporter: Istvan Szukacs
>
> There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM 
> and rendering the service unusable, missing heartbeats and stopping data 
> access. The threads look like this:
> {code}
> 3415 (state = BLOCKED)
> - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may 
> be imprecise)
> - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=186 (Compiled frame)
> - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=834 (Interpreted frame)
> - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=867 (Interpreted frame)
> - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, 
> line=1197 (Interpreted frame)
> - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, 
> line=214 (Compiled frame)
> - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled 
> frame)
> - 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket,
>  org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 
> (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String,
>  org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted 
> frame)
> - 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String)
>  @bci=212, line=413 (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream)
>  @bci=13, line=172 (Interpreted frame)
> - 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op)
>  @bci=149, line=92 (Compiled frame)
> - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 
> (Compiled frame)
> - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame)
> {code}
> Has anybody seen this before?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org