[jira] [Commented] (HDFS-7419) Improve error messages for DataNode hot swap drive feature

2014-11-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222753#comment-14222753
 ] 

Hadoop QA commented on HDFS-7419:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683262/HDFS-7419.003.patch
  against trunk revision a4df9ee.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.security.ssl.TestReloadingX509TrustManager

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8815//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8815//console

This message is automatically generated.

> Improve error messages for DataNode hot swap drive feature
> --
>
> Key: HDFS-7419
> URL: https://issues.apache.org/jira/browse/HDFS-7419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7419.000.patch, HDFS-7419.001.patch, 
> HDFS-7419.002.patch, HDFS-7419.003.patch
>
>
> When the DataNode fails to add a volume, it adds a failure message to 
> {{errorMessageBuilder}} in {{DataNode#refreshVolumes}}. However, the detailed 
> error messages are not logged in the DataNode's log; they are only returned to 
> clients. 
> This JIRA makes the {{DataNode}} report the detailed failures in its log.
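
For illustration, a minimal sketch of the requested behavior, assuming an SLF4J-style 
logger and a hypothetical {{addVolume}} helper (this is a sketch, not the actual patch): 
each per-volume failure is appended to the builder returned to the client and also 
logged on the DataNode.
{code}
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class RefreshVolumesSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(RefreshVolumesSketch.class);

  // Collects per-volume failure messages for the client and also logs them.
  String refreshVolumes(Iterable<String> newVolumes) {
    StringBuilder errorMessageBuilder = new StringBuilder();
    for (String location : newVolumes) {
      try {
        addVolume(location);
      } catch (IOException e) {
        String msg = "Failed to add volume " + location + ": " + e.getMessage();
        errorMessageBuilder.append(msg).append('\n'); // reported back to the client
        LOG.error(msg, e);                            // now also in the DataNode log
      }
    }
    return errorMessageBuilder.toString();
  }

  // Hypothetical stand-in for the real volume-adding logic.
  private void addVolume(String location) throws IOException {
    throw new IOException("directory is not writable");
  }
}
{code}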



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu reassigned HDFS-7392:


Assignee: Yi Liu

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> -
>
> Key: HDFS-7392
> URL: https://issues.apache.org/jira/browse/HDFS-7392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Frantisek Vacek
>Assignee: Yi Liu
>Priority: Critical
> Attachments: 1.png, 2.png
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
> and lasts forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.example.com:8020/someDir/someFile.txt) should point 
> to a valid IP address, but with no NameNode service running on it.
> 2) There should be at least 2 IP addresses for such a URI. See the output below:
> {quote}
> [~/proj/quickbox]$ nslookup share.example.com
> Server: 127.0.1.1
> Address:127.0.1.1#53
> share.example.com canonical name = 
> internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com.
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.223
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even if the address didn't actually change, see img. 1) 
> and the timeoutFailures counter is reset to 0 (see img. 2). The 
> maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
> attempt is repeated forever.
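
For illustration, the retry pattern described above can be sketched as follows (the class 
and method names are simplified assumptions, not the actual {{org.apache.hadoop.ipc.Client}} 
code): if the failure counter is reset every time the resolved address appears to change, 
the retry limit is never reached.
{code}
class RetryLoopSketch {
  private int timeoutFailures = 0;
  private final int maxRetriesOnSocketTimeouts = 45;

  // Pretend DNS resolution yields a "different" address on every attempt, which
  // can happen when the hostname maps to several IPs behind a load balancer.
  private boolean updateAddress() {
    return true;
  }

  void setupConnection() throws java.net.SocketTimeoutException {
    while (true) {
      try {
        connect();                     // placeholder for the real socket connect
        return;
      } catch (java.net.SocketTimeoutException e) {
        if (updateAddress()) {
          timeoutFailures = 0;         // counter reset: the limit below is never hit
        }
        if (++timeoutFailures >= maxRetriesOnSocketTimeouts) {
          throw e;                     // unreachable while updateAddress() keeps returning true
        }
      }
    }
  }

  private void connect() throws java.net.SocketTimeoutException {
    throw new java.net.SocketTimeoutException("no NameNode listening");
  }
}
{code}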



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7392:
-
Priority: Major  (was: Critical)

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> -
>
> Key: HDFS-7392
> URL: https://issues.apache.org/jira/browse/HDFS-7392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Frantisek Vacek
>Assignee: Yi Liu
> Attachments: 1.png, 2.png
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
> and lasts forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.example.com:8020/someDir/someFile.txt) should point 
> to a valid IP address, but with no NameNode service running on it.
> 2) There should be at least 2 IP addresses for such a URI. See the output below:
> {quote}
> [~/proj/quickbox]$ nslookup share.example.com
> Server: 127.0.1.1
> Address:127.0.1.1#53
> share.example.com canonical name = 
> internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com.
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.223
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even if the address didn't actually change, see img. 1) 
> and the timeoutFailures counter is reset to 0 (see img. 2). The 
> maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
> attempt is repeated forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Attachment: HDFS-7431.001.patch

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7431.001.patch
>
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown and we write this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Status: Patch Available  (was: Open)

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7431.001.patch
>
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown and we write this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222737#comment-14222737
 ] 

Yongjun Zhang commented on HDFS-4882:
-

Hi [~cmccabe] and  [~vinayrpet],

Thanks for your comments.

{quote}
 can you comment on whether you have also observed this bug?
{quote}
Yes, I did observe a similar infinite loop, and by studying the code, I 
concluded that the case I was looking at has exactly the same root cause as the 
one reported here. Please see the details described in my earlier comment at 
https://issues.apache.org/jira/browse/HDFS-4882?focusedCommentId=14213992&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14213992 
and the several comments after it.

In short, when the penultimate block is COMMITTED and the last block is 
COMPLETE, the following block of code will be executed
{code}
switch (lastBlockState) {
case COMPLETE:
  assert false : "Already checked that the last block is incomplete";
  break;
{code}
and control returns to the LeaseManager without releasing the corresponding lease, 
which stays as the first element in {{sortedLeases}}. The LeaseManager keeps 
examining the first entry in {{sortedLeases}} again and again, while holding the 
FSNamesystem#writeLock, thus causing the infinite loop.

{quote}
Yes, you are right. Even though I don't see the possibility of an infinite loop in 
the existing trunk code, the changes made in the patch look pretty cool.
{quote}
See above for the explanation of the infinite loop in the existing code.

{quote}
Yes, lets continue this discussion in HDFS-7342
{quote}
In HDFS-7342, Ravi worked out a test case to demonstrate the problem and I 
suggested a solution. Thanks in advance for your review and comments there; 
hope we can converge on a solution soon. Avoiding the infinite loop is only 
part of the complete solution; the other part is getting the lease released, 
which is what HDFS-7342 tries to address.

Thanks.
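
For illustration, a minimal sketch of the loop shape described above (identifiers are 
simplified from the real {{LeaseManager}}; this is not the actual code): if the release 
call returns without removing the head of {{sortedLeases}}, every iteration examines the 
same lease while holding the write lock.
{code}
import java.util.TreeSet;

class CheckLeasesSketch {
  static class Lease implements Comparable<Lease> {
    final String holder;
    Lease(String holder) { this.holder = holder; }
    boolean expiredHardLimit() { return true; }   // assume the hard limit has passed
    public int compareTo(Lease o) { return holder.compareTo(o.holder); }
  }

  private final TreeSet<Lease> sortedLeases = new TreeSet<>();

  // Returns true only if the lease was fully released and can be removed.
  private boolean internalReleaseLease(Lease lease) {
    return false;   // e.g. last block COMPLETE while the penultimate one is COMMITTED
  }

  void checkLeases() {
    while (!sortedLeases.isEmpty() && sortedLeases.first().expiredHardLimit()) {
      Lease oldest = sortedLeases.first();
      if (internalReleaseLease(oldest)) {
        sortedLeases.remove(oldest);
      }
      // When the release is a no-op, 'oldest' stays at the head of sortedLeases
      // and the next iteration examines it again: the loop never makes progress.
    }
  }
}
{code}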


> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.3.patch, HDFS-4882.4.patch, HDFS-4882.5.patch, 
> HDFS-4882.6.patch, HDFS-4882.7.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 are down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client continues writing the last block, and tries to close the file after 
> writing all the data
> 8. the NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time the NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager realizes that and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, and this triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd line log is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7427) [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only

2014-11-23 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7427:
---
Description: 
Scenario:

Start the cluster in secure mode and enable HTTPS only
Run the fetchImage command 

 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage /srv/image
No GC_PROFILE is given. Defaults to medium.
fetchImage: FileSystem file:/// is not an HDFS file system
Usage: java DFSAdmin [-fetchImage ]

{code}
public int fetchImage(final String[] argv, final int idx) throws IOException {
  Configuration conf = getConf();
  final URL infoServer = DFSUtil.getInfoServer(
      HAUtil.getAddressOfActive(getDFS()), conf,
      DFSUtil.getHttpClientScheme(conf)).toURL();
  SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
      TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
          new File(argv[idx]));
      return null;
    }
  });
  return 0;
}

{code}

  was:
Scenario:

Start cluster in securemode and enable only HTTPS
Run fectchimage command 

 *
 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage file:///srv/Bigdata* 
No GC_PROFILE is given. Defaults to medium.
log4j:WARN Failed to set property [maxBackupIndex] to value "". 
OutPut : 123456
{color:red}fetchImage: Unable to download to any storage directory{color}


> [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only
> --
>
> Key: HDFS-7427
> URL: https://issues.apache.org/jira/browse/HDFS-7427
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Priority: Critical
>
> Scenario:
> Start the cluster in secure mode and enable HTTPS only
> Run the fetchImage command 
>  [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage /srv/image
> No GC_PROFILE is given. Defaults to medium.
> fetchImage: FileSystem file:/// is not an HDFS file system
> Usage: java DFSAdmin [-fetchImage ]
> {code}
> public int fetchImage(final String[] argv, final int idx) throws IOException {
>   Configuration conf = getConf();
>   final URL infoServer = DFSUtil.getInfoServer(
>       HAUtil.getAddressOfActive(getDFS()), conf,
>       DFSUtil.getHttpClientScheme(conf)).toURL();
>   SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
>     @Override
>     public Void run() throws Exception {
>       TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
>           new File(argv[idx]));
>       return null;
>     }
>   });
>   return 0;
> }
> {code}
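
For illustration, a minimal sketch of the client-side scheme resolution this scenario 
exercises (the configuration key and value are the standard HDFS ones; the rest is 
illustrative and not part of the reported code path):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSUtil;

class HttpsOnlySchemeSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed cluster setup for this scenario: HTTPS only.
    conf.set("dfs.http.policy", "HTTPS_ONLY");

    // With HTTPS_ONLY the scheme is expected to resolve to "https", so
    // -fetchImage has to reach the active NameNode over its HTTPS address.
    System.out.println(DFSUtil.getHttpClientScheme(conf));
  }
}
{code}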



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7427) [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only

2014-11-23 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7427:
---
Description: 
Scenario:

Start the cluster in secure mode and enable HTTPS only
Run the fetchImage command 

 *
 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage file:///srv/Bigdata* 
No GC_PROFILE is given. Defaults to medium.
log4j:WARN Failed to set property [maxBackupIndex] to value "". 
OutPut : 123456
{color:red}fetchImage: Unable to download to any storage directory{color}

  was:
Scenario:

Start cluster in securemode and enable only HTTPS
Run fectchimage command 

 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage /srv/image
No GC_PROFILE is given. Defaults to medium.
fetchImage: FileSystem file:/// is not an HDFS file system
Usage: java DFSAdmin [-fetchImage ]

{code}
public int fetchImage(final String[] argv, final int idx) throws IOException {
Configuration conf = getConf();
final URL infoServer = DFSUtil.getInfoServer(
HAUtil.getAddressOfActive(getDFS()), conf,
DFSUtil.getHttpClientScheme(conf)).toURL();
SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction() {
  @Override
  public Void run() throws Exception {
TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
new File(argv[idx]));
return null;
  }
});
return 0;
  }

{code}


> [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only
> --
>
> Key: HDFS-7427
> URL: https://issues.apache.org/jira/browse/HDFS-7427
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Priority: Critical
>
> Scenario:
> Start the cluster in secure mode and enable HTTPS only
> Run the fetchImage command 
>  *
>  [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage file:///srv/Bigdata* 
> No GC_PROFILE is given. Defaults to medium.
> log4j:WARN Failed to set property [maxBackupIndex] to value "". 
> OutPut : 123456
> {color:red}fetchImage: Unable to download to any storage directory{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7403) Inaccurate javadoc of BlockUCState#COMPLETE state

2014-11-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222735#comment-14222735
 ] 

Yongjun Zhang commented on HDFS-7403:
-

Thanks a lot Yi!


> Inaccurate javadoc of  BlockUCState#COMPLETE state
> --
>
> Key: HDFS-7403
> URL: https://issues.apache.org/jira/browse/HDFS-7403
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7403.001.patch, HDFS-7403.002.patch, 
> HDFS-7403.003.patch
>
>
> The current javadoc says 
> {code}
>   /**
>    * States, which a block can go through while it is under construction.
>    */
>   static public enum BlockUCState {
>     /**
>      * Block construction completed.
>      * The block has at least one {@link ReplicaState#FINALIZED} replica,
>      * and is not going to be modified.
>      */
>     COMPLETE,
> {code}
> However, COMPLETE blocks are those that have reached the minimal replication 
> "dfs.namenode.replication.min", which could be different from one.
>  
> Creating this jira to fix the javadoc.
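
A possible corrected wording, sketched only from the description above (the committed 
patch may phrase it differently):
{code}
/**
 * Block construction completed.
 * The block has reached the minimal replication configured by
 * "dfs.namenode.replication.min" (which may be greater than one)
 * with {@link ReplicaState#FINALIZED} replicas, and is not going
 * to be modified.
 */
COMPLETE,
{code}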



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222731#comment-14222731
 ] 

Yi Liu edited comment on HDFS-7431 at 11/24/14 7:06 AM:


So there are two cases for {{InvalidMagicNumberException}}:
* failed handshake for encryption
* failed handshake for data transfer protection (only available after 2.6.0)

A simple way is to change the log message to something like:
{quote}
Failed to read expected handshake from client at  Perhaps the client is 
running an older version of Hadoop which does not support the correct handshake.
{quote}

Or we can distinguish these two kinds of failure and give different log 
messages.
What's your suggestion, [~cnauroth]? 


was (Author: hitliuyi):
So there are two cases for {{InvalidMagicNumberException}}:
*. failed handshake for encryption.
*. failed handshake for data transfer protection (only available after 2.6.0)

A simply way is change the log message to something like:
{quote}
Failed to read expected handshake from client at  Perhaps the client is 
running an older version of Hadoop which does not support the correct handshake.
{quote}

Or we can distinguish these two kinds of failure and give different log 
messages.
What's your suggestion, [~cnauroth]? 

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown and we write this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222731#comment-14222731
 ] 

Yi Liu commented on HDFS-7431:
--

So there are two cases for {{InvalidMagicNumberException}}:
*. failed handshake for encryption.
*. failed handshake for data transfer protection (only available after 2.6.0)

A simply way is change the log message to something like:
{quote}
Failed to read expected handshake from client at  Perhaps the client is 
running an older version of Hadoop which does not support the correct handshake.
{quote}

Or we can distinguish these two kinds of failure and give different log 
messages.
What's your suggestion, [~cnauroth]? 
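
For illustration, a rough sketch of the second option (distinguishing the two failures); 
the configuration keys are the standard HDFS names, while the method and message text 
are assumptions rather than the committed change:
{code}
import org.apache.hadoop.conf.Configuration;

class HandshakeMessageSketch {
  static String handshakeFailureMessage(Configuration conf, String clientAddr) {
    boolean encrypt = conf.getBoolean("dfs.encrypt.data.transfer", false);
    String protection = conf.get("dfs.data.transfer.protection", "");
    String kind = encrypt ? "encryption"
        : !protection.isEmpty() ? "SASL data transfer protection"
        : "data transfer";
    return "Failed to read expected " + kind + " handshake from client at "
        + clientAddr + ". Perhaps the client is running an older version of "
        + "Hadoop which does not support " + kind + ".";
  }
}
{code}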

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown and we write this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Description: 
In secure mode, HDFS now supports DataNodes that don't require root or jsvc 
if {{dfs.data.transfer.protection}} is configured.

The log message for {{InvalidMagicNumberException}} misses one case: 
when the DataNodes run on an unprivileged port and 
{{dfs.data.transfer.protection}} is configured to {{authentication}} but 
{{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
and when an older dfs client is used, {{InvalidMagicNumberException}} is 
thrown and we write this log:
{quote}
Failed to read expected encryption handshake from client at  Perhaps the 
client is running an older version of Hadoop which does not support encryption
{quote}

Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
client. I got the above log message, even though I had not 
configured encryption.

  was:
For security mode, HDFS now supports that Datanodes don't require root or jsvc 
if {{dfs.data.transfer.protection}} is configured.

Log message for {{InvalidMagicNumberException}}, we miss one case: 
when the datanodes run on unprivileged port and 
{{dfs.data.transfer.protection}} is configured to {{authentication}} but 
{{dfs.encrypt.data.transfer}} is not configured. SASL handshake is required and 
a low version dfs client is used, then {{InvalidMagicNumberException}} is 
thrown with log:
{quote}
Failed to read expected encryption handshake from client at  Perhaps the 
client is running an older version of Hadoop which does not support encryption
{quote}

Recently I run HDFS built on trunk and security is enabled, but the client is 
2.5.1 version. Then I got the above log message, but actually I have not 
configured encryption.


> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown and we write this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Description: 
In secure mode, HDFS now supports DataNodes that don't require root or jsvc 
if {{dfs.data.transfer.protection}} is configured.

The log message for {{InvalidMagicNumberException}} misses one case: 
when the DataNodes run on an unprivileged port and 
{{dfs.data.transfer.protection}} is configured to {{authentication}} but 
{{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
and when an older dfs client is used, {{InvalidMagicNumberException}} is 
thrown with this log:
{quote}
Failed to read expected encryption handshake from client at  Perhaps the 
client is running an older version of Hadoop which does not support encryption
{quote}

Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
client. I got the above log message, even though I had not 
configured encryption.

  was:
For security mode, HDFS now supports that Datanodes don't require root or jsvc 
if {{dfs.data.transfer.protection}} is configured.

Log message for {{InvalidMagicNumberException}}, we miss one case: 
when the datanodes run on unprivileged port and 
{{dfs.data.transfer.protection}} is configured to {{authentication}} but 
{{dfs.encrypt.data.transfer}} is not configured. SASL handshake is required. 
But a low version dfs client is used, then {{InvalidMagicNumberException}} is 
thrown with log:
{quote}
Failed to read expected encryption handshake from client at  Perhaps the 
client is running an older version of Hadoop which does not support encryption
{quote}

Recently I run HDFS built on trunk and security is enabled, but the client is 
2.5.1 version. Then I got the above log message, but actually I have not 
configured encryption.


> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required, 
> and when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown with this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Priority: Minor  (was: Major)
Target Version/s: 2.7.0

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required. 
> But when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown with this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7431:
-
Target Version/s:   (was: 2.7.0)

> log message for InvalidMagicNumberException may be incorrect
> 
>
> Key: HDFS-7431
> URL: https://issues.apache.org/jira/browse/HDFS-7431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Yi Liu
>Assignee: Yi Liu
>
> In secure mode, HDFS now supports DataNodes that don't require root or 
> jsvc if {{dfs.data.transfer.protection}} is configured.
> The log message for {{InvalidMagicNumberException}} misses one case: 
> when the DataNodes run on an unprivileged port and 
> {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
> {{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required. 
> But when an older dfs client is used, {{InvalidMagicNumberException}} is 
> thrown with this log:
> {quote}
> Failed to read expected encryption handshake from client at  Perhaps the 
> client is running an older version of Hadoop which does not support encryption
> {quote}
> Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
> client. I got the above log message, even though I had not 
> configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-23 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7431:


 Summary: log message for InvalidMagicNumberException may be 
incorrect
 Key: HDFS-7431
 URL: https://issues.apache.org/jira/browse/HDFS-7431
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu


In secure mode, HDFS now supports DataNodes that don't require root or jsvc 
if {{dfs.data.transfer.protection}} is configured.

The log message for {{InvalidMagicNumberException}} misses one case: 
when the DataNodes run on an unprivileged port and 
{{dfs.data.transfer.protection}} is configured to {{authentication}} but 
{{dfs.encrypt.data.transfer}} is not configured. A SASL handshake is required. 
But when an older dfs client is used, {{InvalidMagicNumberException}} is 
thrown with this log:
{quote}
Failed to read expected encryption handshake from client at  Perhaps the 
client is running an older version of Hadoop which does not support encryption
{quote}

Recently I ran HDFS built from trunk with security enabled, but with a 2.5.1 
client. I got the above log message, even though I had not 
configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7342) Lease Recovery doesn't happen some times

2014-11-23 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222707#comment-14222707
 ] 

Vinayakumar B commented on HDFS-7342:
-

I found the reason why [~raviprak] is facing the problem and I am not. 

You mentioned the affected version as 2.0.0-alpha, which helped.
The possibility of making the last block COMPLETE while the penultimate block is 
still in the COMMITTED state existed before HDFS-5558, which was fixed in 
2.3.0, though the problem seen at that time was not an infinite loop but a 
crash of the lease monitor thread.

After that fix, the last block cannot be in the COMPLETE state when other blocks 
are not COMPLETE, and the infinite loop never occurs.

I hope this clears up the confusion. I think there is no change required in 
this Jira; in that case, can this be closed as a Duplicate?

Hi  [~yzhangal], any thoughts?

> Lease Recovery doesn't happen some times
> 
>
> Key: HDFS-7342
> URL: https://issues.apache.org/jira/browse/HDFS-7342
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7342.1.patch, HDFS-7342.2.patch
>
>
> In some cases, the LeaseManager tries to recover a lease, but is not able to. 
> HDFS-4882 describes one possibility of that. We should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7403) Inaccurate javadoc of BlockUCState#COMPLETE state

2014-11-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222685#comment-14222685
 ] 

Hudson commented on HDFS-7403:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6592 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6592/])
HDFS-7403. Inaccurate javadoc of  BlockUCState#COMPLETE state. (Yongjun Zhang 
via yliu) (yliu: rev 555fa2d9d0dbb3bf2b209a953eb07a59bbfe3197)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Inaccurate javadoc of  BlockUCState#COMPLETE state
> --
>
> Key: HDFS-7403
> URL: https://issues.apache.org/jira/browse/HDFS-7403
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7403.001.patch, HDFS-7403.002.patch, 
> HDFS-7403.003.patch
>
>
> The current javadoc says 
> {code}
>   /**
>    * States, which a block can go through while it is under construction.
>    */
>   static public enum BlockUCState {
>     /**
>      * Block construction completed.
>      * The block has at least one {@link ReplicaState#FINALIZED} replica,
>      * and is not going to be modified.
>      */
>     COMPLETE,
> {code}
> However, COMPLETE blocks are those that have reached the minimal replication 
> "dfs.namenode.replication.min", which could be different from one.
>  
> Creating this jira to fix the javadoc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7403) Inaccurate javadoc of BlockUCState#COMPLETE state

2014-11-23 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7403:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks Yongjun for the contribution.

> Inaccurate javadoc of  BlockUCState#COMPLETE state
> --
>
> Key: HDFS-7403
> URL: https://issues.apache.org/jira/browse/HDFS-7403
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-7403.001.patch, HDFS-7403.002.patch, 
> HDFS-7403.003.patch
>
>
> The current javadoc says 
> {code}
>   /**
>    * States, which a block can go through while it is under construction.
>    */
>   static public enum BlockUCState {
>     /**
>      * Block construction completed.
>      * The block has at least one {@link ReplicaState#FINALIZED} replica,
>      * and is not going to be modified.
>      */
>     COMPLETE,
> {code}
> However, COMPLETE blocks are those that have reached the minimal replication 
> "dfs.namenode.replication.min", which could be different from one.
>  
> Creating this jira to fix the javadoc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-23 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222680#comment-14222680
 ] 

Vinayakumar B commented on HDFS-4882:
-

bq. This change makes the code more robust because it avoids going into an 
infinite loop in the case where the lease is not removed from 
LeaseManager#leases during the loop body. The change doesn't harm anything... 
things are just as efficient as before, and in the unlikely case that we can't 
remove the lease, we log a warning message so we are aware of the problem
Yes, you are right. Even though I don't see the possibility of an infinite loop in 
the existing trunk code, the changes made in the patch look pretty cool.

I am +1 for the change.

{quote}Why don't we continue the discussion about the sequence of operations 
that could trigger this over on HDFS-7342? And commit this in the meantime to 
fix the immediate problem for Ravi Prakash. I am +1, any objections to 
committing this tomorrow?
Also, Yongjun Zhang, can you comment on whether you have also observed this 
bug? Vinayakumar seems to be questioning whether this loop can occur, but I 
thought you had seen the LeaseManager thread loop in the field... I apologize 
if I'm putting words in your mouth, though.{quote}
Yes, lets continue this discussion in HDFS-7342

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.3.patch, HDFS-4882.4.patch, HDFS-4882.5.patch, 
> HDFS-4882.6.patch, HDFS-4882.7.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 are down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client continues writing the last block, and tries to close the file after 
> writing all the data
> 8. the NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time the NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager realizes that and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, and this triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd line log is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7429) DomainSocketWatcher.doPoll0 stuck

2014-11-23 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated HDFS-7429:
---
Attachment: 11241025
11241023
11241021

Uploaded more stack trace files.

> DomainSocketWatcher.doPoll0 stuck
> -
>
> Key: HDFS-7429
> URL: https://issues.apache.org/jira/browse/HDFS-7429
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhaoyunjiong
> Attachments: 11241021, 11241023, 11241025
>
>
> I found that some of our DataNodes will hit "exceeds the limit of concurrent 
> xcievers"; the limit is 4K.
> After checking the stacks, I suspect that DomainSocketWatcher.doPoll0 is stuck:
> {quote}
> "DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
> #1]" daemon prio=10 tid=0x7f55c5576000 nid=0x385d waiting on condition 
> [0x7f558d5d4000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x000740df9c90> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:286)
> at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> --
> "DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
> #1]" daemon prio=10 tid=0x7f55c5575000 nid=0x37b3 runnable 
> [0x7f558d3d2000]
>java.lang.Thread.State: RUNNABLE
> at org.apache.hadoop.net.unix.DomainSocket.writeArray0(Native Method)
> at 
> org.apache.hadoop.net.unix.DomainSocket.access$300(DomainSocket.java:45)
> at 
> org.apache.hadoop.net.unix.DomainSocket$DomainOutputStream.write(DomainSocket.java:589)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.kick(DomainSocketWatcher.java:350)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:303)
> at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> "DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
> #1]" daemon prio=10 tid=0x7f55c5574000 nid=0x377a waiting on condition 
> [0x7f558d7d6000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x000740df9cb0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:306)
> at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.

[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222673#comment-14222673
 ] 

Lars Hofhansl commented on HDFS-6735:
-

s/since we never get into that if block if we coming from a called 
synchronized/since we *only* get into that if block if we coming from a caller 
synchronized/


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735.txt
>
>
> In the current DFSInputStream implementation, there are a couple of coarse-grained 
> locks in the read/pread path, and this has become an HBase read latency pain point. In 
> HDFS-6698, I made a minor patch against the first encountered lock, around 
> getFileLength; indeed, after reading the code and testing, there are still other 
> locks we could improve.
> In this jira, I'll make a patch against the other locks, plus a simple test case 
> to show the issue and the improved result.
> This is important for HBase, since in the current HFile read path we 
> issue all read()/pread() requests on the same DFSInputStream for one HFile. 
> (A multi-stream solution is another story I had planned to do, but it will 
> probably take more time than I expected.)
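
For illustration, a minimal, self-contained sketch of the kind of lock split being 
discussed (field and method names are assumptions, not the actual DFSInputStream change): 
position-independent metadata gets its own lock, so a positional pread() does not have to 
wait for a stateful read() that holds the stream lock.
{code}
class LockSplitSketch {
  private final Object infoLock = new Object();
  private long fileLength = 1024;   // metadata, guarded by infoLock
  private long pos;                 // seek+read state, guarded by 'this'

  long getFileLength() {
    synchronized (infoLock) {       // cheap metadata access; no stream lock needed
      return fileLength;
    }
  }

  // Stateful path: holds the stream lock for the duration of the read.
  synchronized int read(byte[] buf) {
    pos += buf.length;
    return buf.length;
  }

  // Positional path: only needs the metadata lock, so it is not blocked
  // by a concurrent read() that holds 'this'.
  int pread(long position, byte[] buf) {
    return position >= getFileLength() ? -1 : buf.length;
  }
}
{code}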



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v5.txt

Looked through the findbugs warning for DFSInputStream:
* indeed, currentNode was wrongly synchronized (it was so even before the patch). 
In getCurrentDataNode I had added synchronized(infoLock), but it should just be 
synchronized, as currentNode is seek+read state.
* added a synchronized block in getBlockAt around access to pos, blockEnd, 
currentLocatedBlock. As explained in comment that is not needed, since we never 
get into that if block if we coming from a called synchronized on . But 
if that is so the extra synchronized won't hurt and it should make findbugs 
happy. 


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735.txt
>
>
> In the current DFSInputStream implementation, there are a couple of coarse-grained 
> locks in the read/pread path, and this has become an HBase read latency pain point. In 
> HDFS-6698, I made a minor patch against the first encountered lock, around 
> getFileLength; indeed, after reading the code and testing, there are still other 
> locks we could improve.
> In this jira, I'll make a patch against the other locks, plus a simple test case 
> to show the issue and the improved result.
> This is important for HBase, since in the current HFile read path we 
> issue all read()/pread() requests on the same DFSInputStream for one HFile. 
> (A multi-stream solution is another story I had planned to do, but it will 
> probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222659#comment-14222659
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683267/HDFS-7430.002.patch
  against trunk revision a4df9ee.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8816//console

This message is automatically generated.

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-7430.002.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7430:
---
Attachment: (was: HDFS-7430.001.patch)

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
> Attachments: HDFS-7430.002.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7430:
---
Assignee: Colin Patrick McCabe
  Status: Patch Available  (was: Open)

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-7430.002.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7430:
---
Attachment: HDFS-7430.002.patch

remove debug message

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
> Attachments: HDFS-7430.002.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7430:
---
Attachment: HDFS-7430.001.patch

* added {{dfs.datanode.scan.period.hours}} to {{hdfs-default.xml}} (it wasn't 
there before)

* This patch adds a new configuration key, 
{{dfs.block.scanner.volume.bytes.per.second}}, which is the maximum number of 
bytes per second we should scan on each volume.  It defaults to 1MB/s per disk. 
 Previously, the maximum rate was a hard-coded 8MB/s for the DN as a whole 
(i.e. NOT per disk).

* Moved {{TestDatanodeBlockScanner#changeReplicaLength}} to 
{{DFSTestUtil#changeReplicaLength}}

* Instead of writing out a "verification log entry" for each replica it scans, 
the scanner now keeps track of a "cursor" which records the last block scanned 
in the block pool slice on the volume.  (So a volume with 3 block pool slices 
may have 3 cursors.)  The cursor is saved to a file every few minutes if it has 
changed.  The {{BlockIterator}} interface in {{FsVolumeSpi}} implements these 
cursors.

* Use one thread per disk.  This avoids situations where a slow or stuck disk 
can effectively stop the BlockScanner from making any progress.  It also allows 
us to scale effectively (e.g. on high-density nodes with 20 drives).  A rough 
sketch of this per-volume loop is included below.

* Added methods to get block iterators to {{FSDatasetSpi}}; removed 
{{RollingLogs}} methods from {{FSDatasetSpi}}.
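
For reviewers, here is a rough sketch of the per-volume scan loop described 
above.  It is illustrative only, not the patch itself: {{BlockIteratorSketch}}, 
the 5-minute save interval, and the sleep-based throttle are made-up stand-ins 
for the real {{FsVolumeSpi.BlockIterator}} machinery.

{code}
// Sketch only: one such loop runs per volume, throttled to a configured byte rate,
// and periodically persists its cursor so a restart can resume where it left off.
class VolumeScannerSketch implements Runnable {
  private final long bytesPerSecond;      // e.g. dfs.block.scanner.volume.bytes.per.second
  private final BlockIteratorSketch iter; // stand-in for the per-block-pool-slice cursor

  VolumeScannerSketch(long bytesPerSecond, BlockIteratorSketch iter) {
    this.bytesPerSecond = bytesPerSecond;
    this.iter = iter;
  }

  @Override
  public void run() {
    long lastSave = System.currentTimeMillis();
    while (!Thread.currentThread().isInterrupted()) {
      long bytesScanned = iter.scanNextBlock();   // verify one replica, return bytes read
      throttle(bytesScanned);                     // keep this volume near bytesPerSecond
      long now = System.currentTimeMillis();
      if (now - lastSave > 5 * 60 * 1000L) {      // save the cursor every few minutes
        iter.saveCursor();
        lastSave = now;
      }
    }
  }

  private void throttle(long bytesScanned) {
    try {
      // sleep for roughly the time this many bytes "costs" at the configured rate
      Thread.sleep((bytesScanned * 1000L) / Math.max(1L, bytesPerSecond));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  interface BlockIteratorSketch {
    long scanNextBlock();
    void saveCursor();
  }
}
{code}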

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
> Attachments: HDFS-7430.001.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7430:
---
Attachment: memory.png

Here is a screenshot of how much DN heap the BlockScanner uses with about 500k 
replicas.  It is using slightly more than half of the DataNode heap, roughly 
50 MB.

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> -
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
> Attachments: memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-7430:
--

 Summary: Refactor the BlockScanner to use O(1) memory and use 
multiple threads
 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe


We should update the BlockScanner to use a constant amount of memory by keeping 
track of what block was scanned last, rather than by tracking the scan status 
of all blocks in memory.  Also, instead of having just one thread, we should 
have a verification thread per hard disk (or other volume), scanning at a 
configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7419) Improve error messages for DataNode hot swap drive feature

2014-11-23 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7419:

Attachment: HDFS-7419.003.patch

[~cmccabe] Thanks for the reviews! I have updated the patch to address your 
comments.



> Improve error messages for DataNode hot swap drive feature
> --
>
> Key: HDFS-7419
> URL: https://issues.apache.org/jira/browse/HDFS-7419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7419.000.patch, HDFS-7419.001.patch, 
> HDFS-7419.002.patch, HDFS-7419.003.patch
>
>
> When the DataNode fails to add a volume, it adds one failure message to 
> {{errorMessageBuilder}} in {{DataNode#refreshVolumes}}. However, the detailed 
> error messages are not logged in the DataNode's log; they are only emitted to 
> clients. 
> This JIRA makes {{DataNode}} report the detailed failures in its log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222629#comment-14222629
 ] 

Colin Patrick McCabe commented on HDFS-4882:


Hi [~vinayrpet],

This change makes the code more robust because it avoids going into an infinite 
loop in the case where the lease is not removed from {{LeaseManager#leases}} 
during the loop body.  The change doesn't harm anything... things are just as 
efficient as before, and in the unlikely case that we can't remove the lease, 
we log a warning message so we are aware of the problem.
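
To make the pattern concrete, here is a sketch of the kind of guard I mean.  It 
uses simplified types (a {{TreeSet<String>}}) and made-up helper names; it is 
not the actual {{LeaseManager#checkLeases}} code.

{code}
// Sketch only: if the loop body fails to remove the element we are looking at,
// log a warning and bail out instead of spinning on it forever.
import java.util.TreeSet;

public class CheckLeasesSketch {
  public static void checkLeases(TreeSet<String> sortedLeases) {
    while (!sortedLeases.isEmpty()) {
      String oldest = sortedLeases.first();
      releaseLease(sortedLeases, oldest);  // normally removes "oldest" from the set
      if (!sortedLeases.isEmpty() && sortedLeases.first().equals(oldest)) {
        // No progress was made; warn so the problem is visible, then stop.
        System.err.println("WARN: unable to release lease " + oldest
            + "; will retry on the next check instead of looping.");
        return;
      }
    }
  }

  // Stand-in for the real release path; a real failure would leave the set unchanged.
  private static void releaseLease(TreeSet<String> leases, String lease) {
    leases.remove(lease);
  }
}
{code}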

Why don't we continue the discussion about the sequence of operations that 
could trigger this over on HDFS-7342, and commit this in the meantime to fix 
the immediate problem for [~raviprak]?  I am +1; any objections to committing 
this tomorrow?

Also, [~yzhangal], can you comment on whether you have also observed this bug?  
Vinayakumar seems to be questioning whether this loop can occur, but I thought 
you had seen the LeaseManager thread loop in the field... I apologize if I'm 
putting words in your mouth, though.

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.3.patch, HDFS-4882.4.patch, HDFS-4882.5.patch, 
> HDFS-4882.6.patch, HDFS-4882.7.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 are down
> 6. client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. client continues writing the last block, and tries to close the file after 
> writing all the data
> 8. NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. LeaseManager realizes that and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, and this triggers the lease manager's 
> infinite loop, which prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the third log line is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7429) DomainSocketWatcher.doPoll0 stuck

2014-11-23 Thread zhaoyunjiong (JIRA)
zhaoyunjiong created HDFS-7429:
--

 Summary: DomainSocketWatcher.doPoll0 stuck
 Key: HDFS-7429
 URL: https://issues.apache.org/jira/browse/HDFS-7429
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhaoyunjiong


I found that some of our DataNodes run into "exceeds the limit of concurrent 
xciever" errors; the limit is 4K.

After checking the stacks, I suspect that DomainSocketWatcher.doPoll0 is stuck:
{quote}
"DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
#1]" daemon prio=10 tid=0x7f55c5576000 nid=0x385d waiting on condition 
[0x7f558d5d4000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x000740df9c90> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:286)
at 
org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
--
"DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
#1]" daemon prio=10 tid=0x7f55c5575000 nid=0x37b3 runnable 
[0x7f558d3d2000]
   java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.net.unix.DomainSocket.writeArray0(Native Method)
at 
org.apache.hadoop.net.unix.DomainSocket.access$300(DomainSocket.java:45)
at 
org.apache.hadoop.net.unix.DomainSocket$DomainOutputStream.write(DomainSocket.java:589)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.kick(DomainSocketWatcher.java:350)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:303)
at 
org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)

"DataXceiver for client unix:/var/run/hadoop-hdfs/dn [Waiting for operation 
#1]" daemon prio=10 tid=0x7f55c5574000 nid=0x377a waiting on condition 
[0x7f558d7d6000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x000740df9cb0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:306)
at 
org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:283)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:413)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:172)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:92)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
at java.lang.Thread.run(Thread.java:745)
 

"Thread-163852" daemon prio=10 tid=0x7f55c811c800 nid=0x6757 runnable 
[0x7f55aef6e000]
   java.lang.Thread.State: RUNNABLE 
at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native Method)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.access$800(DomainSocketWatcher.java:52)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher$1.run(DomainSocketWatcher.java:457)
at java.lang.Thre

[jira] [Commented] (HDFS-7419) Improve error messages for DataNode hot swap drive feature

2014-11-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222616#comment-14222616
 ] 

Colin Patrick McCabe commented on HDFS-7419:


{code}
@@ -559,10 +559,11 @@ public IOException call() {
 if (ioe != null) {
   errorMessageBuilder.append(String.format("FAILED TO ADD: %s: 
%s\n",
   volume.toString(), ioe.getMessage()));
+  LOG.error("Failed to add volume: " + volume, ioe);
 } else {
   effectiveVolumes.add(volume.toString());
+  LOG.info("Storage directory is loaded: " + volume.toString());
 }
-LOG.info("Storage directory is loaded: " + volume.toString());
{code}

The log messages seem a bit inconsistent.  It seems like the success message 
should be "Successfully added volume: " to match the failure one, which talks 
about volumes rather than storage directories.

Also, do we need the explicit toString() here?  The failure case doesn't have 
it, and string concatenation calls it automatically anyway.
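
To be concrete, something along these lines is what I have in mind (an 
illustrative fragment mirroring the diff above, not a drop-in patch):

{code}
// Illustrative only: symmetric wording and no explicit toString().
if (ioe != null) {
  errorMessageBuilder.append(String.format("FAILED TO ADD: %s: %s\n",
      volume, ioe.getMessage()));
  LOG.error("Failed to add volume: " + volume, ioe);
} else {
  effectiveVolumes.add(volume.toString());
  LOG.info("Successfully added volume: " + volume);  // matches the failure wording
}
{code}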

+1 once that's addressed.

> Improve error messages for DataNode hot swap drive feature
> --
>
> Key: HDFS-7419
> URL: https://issues.apache.org/jira/browse/HDFS-7419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7419.000.patch, HDFS-7419.001.patch, 
> HDFS-7419.002.patch
>
>
> When the DataNode fails to add a volume, it adds one failure message to 
> {{errorMessageBuilder}} in {{DataNode#refreshVolumes}}. However, the detailed 
> error messages are not logged in the DataNode's log; they are only emitted to 
> clients. 
> This JIRA makes {{DataNode}} report the detailed failures in its log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7428) Include path in the XML output of oiv

2014-11-23 Thread Gautam Gopalakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gautam Gopalakrishnan updated HDFS-7428:

Description: When generating the XML output using {noformat}hadoop oiv -p 
XML{noformat} the path of a file is not printed, just the file name. While the 
complete path can be derived by parsing INodeDirectorySection and INodeSection 
and their children, it would be convenient to have the "absolute path" present 
directly like {noformat}INodeSection->inode->path{noformat}  (was: When 
generating the XML output using "hadoop oiv -p XML", the path of a file is not 
printed, just the file name. While the complete path can be derived by parsing 
INodeDirectorySection and INodeSection and their children, it would be 
convenient to have the "absolute path" present directly like 
INodeSection->inode->path)
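
As a workaround until this is implemented, the derivation can be done outside 
oiv.  Here is a hypothetical sketch that assumes the two sections have already 
been parsed into id maps; the maps and the helper class are not produced by oiv 
itself.

{code}
// Hypothetical sketch: rebuild an absolute path from already-parsed oiv XML output.
// childToParent maps inode id -> parent directory id (from INodeDirectorySection);
// idToName maps inode id -> name (from INodeSection).
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

public class OivPathSketch {
  public static String absolutePath(long inodeId,
                                    Map<Long, Long> childToParent,
                                    Map<Long, String> idToName) {
    Deque<String> parts = new ArrayDeque<>();
    Long current = inodeId;
    while (current != null && idToName.containsKey(current)) {
      String name = idToName.get(current);
      if (!name.isEmpty()) {        // skip empty names (e.g. the root inode)
        parts.addFirst(name);
      }
      current = childToParent.get(current);   // walk up toward the root
    }
    return "/" + String.join("/", parts);
  }
}
{code}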

> Include path in the XML output of oiv
> -
>
> Key: HDFS-7428
> URL: https://issues.apache.org/jira/browse/HDFS-7428
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.5.1
>Reporter: Gautam Gopalakrishnan
>Priority: Minor
>
> When generating the XML output using {noformat}hadoop oiv -p XML{noformat} 
> the path of a file is not printed, just the file name. While the complete 
> path can be derived by parsing INodeDirectorySection and INodeSection and 
> their children, it would be convenient to have the "absolute path" present 
> directly like {noformat}INodeSection->inode->path{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7428) Include path in the XML output of oiv

2014-11-23 Thread Gautam Gopalakrishnan (JIRA)
Gautam Gopalakrishnan created HDFS-7428:
---

 Summary: Include path in the XML output of oiv
 Key: HDFS-7428
 URL: https://issues.apache.org/jira/browse/HDFS-7428
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.5.1
Reporter: Gautam Gopalakrishnan
Priority: Minor


When generating the XML output using "hadoop oiv -p XML", the path of a file is 
not printed, just the file name. While the complete path can be derived by 
parsing INodeDirectorySection and INodeSection and their children, it would be 
convenient to have the "absolute path" present directly like 
INodeSection->inode->path



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location

2014-11-23 Thread Joseph Prosser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Prosser updated HDFS-7312:
-
Attachment: HDFS-7312.007.patch

Made all recommended changes.

> Update DistCp v1 to optionally not use tmp location
> ---
>
> Key: HDFS-7312
> URL: https://issues.apache.org/jira/browse/HDFS-7312
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.5.1
>Reporter: Joseph Prosser
>Assignee: Joseph Prosser
>Priority: Minor
> Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, 
> HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, 
> HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> DistCp v1 currently copies files to a tmp location and then renames that to 
> the specified destination.  This can cause performance issues on filesystems 
> such as S3.  A -skiptmp flag will be added to bypass this step and copy 
> directly to the destination.  This feature mirrors a similar one added to 
> HBase ExportSnapshot 
> [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222544#comment-14222544
 ] 

Hadoop QA commented on HDFS-6735:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683239/HDFS-6735-v4.txt
  against trunk revision a4df9ee.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8814//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8814//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8814//console

This message is automatically generated.

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735.txt
>
>
> In the current DFSInputStream implementation there are a couple of coarse-grained locks in 
> the read/pread path, and this has become an HBase read latency pain point. In 
> HDFS-6698 I made a minor patch against the first lock encountered, around 
> getFileLength; after reading more code and testing, it turns out there are still other 
> locks we could improve.
> In this JIRA I'll make a patch against the other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase, since in the current HFile read path we 
> issue all read()/pread() requests on the same DFSInputStream for one HFile. 
> (A multi-stream solution is another story I have planned, but it will probably 
> take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v4.txt

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735.txt
>
>
> In the current DFSInputStream implementation there are a couple of coarse-grained locks in 
> the read/pread path, and this has become an HBase read latency pain point. In 
> HDFS-6698 I made a minor patch against the first lock encountered, around 
> getFileLength; after reading more code and testing, it turns out there are still other 
> locks we could improve.
> In this JIRA I'll make a patch against the other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase, since in the current HFile read path we 
> issue all read()/pread() requests on the same DFSInputStream for one HFile. 
> (A multi-stream solution is another story I have planned, but it will probably 
> take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: (was: HDFS-6735-v4.txt)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In the current DFSInputStream implementation there are a couple of coarse-grained locks in 
> the read/pread path, and this has become an HBase read latency pain point. In 
> HDFS-6698 I made a minor patch against the first lock encountered, around 
> getFileLength; after reading more code and testing, it turns out there are still other 
> locks we could improve.
> In this JIRA I'll make a patch against the other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase, since in the current HFile read path we 
> issue all read()/pread() requests on the same DFSInputStream for one HFile. 
> (A multi-stream solution is another story I have planned, but it will probably 
> take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v4.txt

New patch:
* added synchronized back to tryZeroCopyRead
* renamed sharedLock to infoLock
* this time I did all the indentation correctly - harder to review, but this 
should be committable as is
* surrounded every reference to cachingStrategy with synchronized(infoLock) 
{...}, removed volatile

Looking at this again, we can do better about safe publication of immutable 
state and avoid some of the locks.
For example, FileEncryptionInfo and CachingStrategy are already immutable and 
can be handled 100% safely by just a volatile reference; most of the 
LocatedBlocks state is also immutable, and for those parts we can avoid the 
locks as well.

Immutable state is easier to reason about and more efficient.
(volatile still imposes read and write memory fences, but that is cheaper than 
synchronized.) We can do that later :)
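
As a sketch of what I mean by safe publication of immutable state (illustrative 
class and field names, not the actual DFSInputStream fields):

{code}
// Illustrative only: an immutable value object published through a volatile field.
// Readers never lock; writers replace the whole object atomically.
public class CachingStrategyHolder {
  // Immutable value: all fields final, no setters.
  public static final class Strategy {
    final boolean dropBehind;
    final long readahead;
    Strategy(boolean dropBehind, long readahead) {
      this.dropBehind = dropBehind;
      this.readahead = readahead;
    }
  }

  // volatile supplies the read/write fences; cheaper than synchronized for this pattern.
  private volatile Strategy strategy = new Strategy(false, 4 * 1024 * 1024);

  Strategy current() {
    return strategy;               // plain volatile read, no lock
  }

  void update(boolean dropBehind, long readahead) {
    strategy = new Strategy(dropBehind, readahead);  // publish a new immutable snapshot
  }
}
{code}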


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7427) [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only

2014-11-23 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7427:
---
Description: 
Scenario:

Start the cluster in secure mode and enable HTTPS only
Run the fetchImage command 

 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage /srv/image
No GC_PROFILE is given. Defaults to medium.
fetchImage: FileSystem file:/// is not an HDFS file system
Usage: java DFSAdmin [-fetchImage <local directory>]

{code}
public int fetchImage(final String[] argv, final int idx) throws IOException {
  Configuration conf = getConf();
  final URL infoServer = DFSUtil.getInfoServer(
      HAUtil.getAddressOfActive(getDFS()), conf,
      DFSUtil.getHttpClientScheme(conf)).toURL();
  SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
      TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
          new File(argv[idx]));
      return null;
    }
  });
  return 0;
}

{code}

  was:
Scenario:

Start the cluster in secure mode and enable HTTPS only
Run the fetchImage command 

 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage hdfs://10.**.**:25000/
No GC_PROFILE is given. Defaults to medium.
fetchImage: FileSystem file:/// is not an HDFS file system
Usage: java DFSAdmin [-fetchImage <local directory>]

{code}
public int fetchImage(final String[] argv, final int idx) throws IOException {
  Configuration conf = getConf();
  final URL infoServer = DFSUtil.getInfoServer(
      HAUtil.getAddressOfActive(getDFS()), conf,
      DFSUtil.getHttpClientScheme(conf)).toURL();
  SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
      TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
          new File(argv[idx]));
      return null;
    }
  });
  return 0;
}

{code}


> [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only
> --
>
> Key: HDFS-7427
> URL: https://issues.apache.org/jira/browse/HDFS-7427
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Priority: Critical
>
> Scenario:
> Start the cluster in secure mode and enable HTTPS only
> Run the fetchImage command 
>  [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage /srv/image
> No GC_PROFILE is given. Defaults to medium.
> fetchImage: FileSystem file:/// is not an HDFS file system
> Usage: java DFSAdmin [-fetchImage <local directory>]
> {code}
> public int fetchImage(final String[] argv, final int idx) throws IOException {
>   Configuration conf = getConf();
>   final URL infoServer = DFSUtil.getInfoServer(
>       HAUtil.getAddressOfActive(getDFS()), conf,
>       DFSUtil.getHttpClientScheme(conf)).toURL();
>   SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
>     @Override
>     public Void run() throws Exception {
>       TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
>           new File(argv[idx]));
>       return null;
>     }
>   });
>   return 0;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7427) [ HTTPS Only ] Fetchimage will not work when we enable cluster with HTTPS only

2014-11-23 Thread Brahma Reddy Battula (JIRA)
Brahma Reddy Battula created HDFS-7427:
--

 Summary: [ HTTPS Only ] Fetchimage will not work when we enable 
cluster with HTTPS only
 Key: HDFS-7427
 URL: https://issues.apache.org/jira/browse/HDFS-7427
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Priority: Critical


Scenario:

Start the cluster in secure mode and enable HTTPS only
Run the fetchImage command 

 [omm@linux158 bin]$ ./hdfs dfsadmin -fetchImage hdfs://10.**.**:25000/
No GC_PROFILE is given. Defaults to medium.
fetchImage: FileSystem file:/// is not an HDFS file system
Usage: java DFSAdmin [-fetchImage <local directory>]

{code}
public int fetchImage(final String[] argv, final int idx) throws IOException {
  Configuration conf = getConf();
  final URL infoServer = DFSUtil.getInfoServer(
      HAUtil.getAddressOfActive(getDFS()), conf,
      DFSUtil.getHttpClientScheme(conf)).toURL();
  SecurityUtil.doAsCurrentUser(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
      TransferFsImage.downloadMostRecentImageToDirectory(infoServer,
          new File(argv[idx]));
      return null;
    }
  });
  return 0;
}

{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)