[jira] [Updated] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc

2014-05-28 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-2856:


Status: Patch Available  (was: Open)

> Fix block protocol so that Datanodes don't require root or jsvc
> ---
>
> Key: HDFS-2856
> URL: https://issues.apache.org/jira/browse/HDFS-2856
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, security
>Reporter: Owen O'Malley
>Assignee: Chris Nauroth
> Attachments: Datanode-Security-Design.pdf, 
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, 
> HDFS-2856.1.patch, HDFS-2856.prototype.patch
>
>
> Since we send the block tokens unencrypted to the datanode, we currently 
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and 
> the client sends an HMAC of the nonce back instead of the block token, it 
> won't reveal any secrets. Thus, we wouldn't require a secure port and would 
> not require root or jsvc.
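
A minimal sketch of the challenge-response idea described above, assuming a 
shared block access token secret (the names and algorithm choice here are 
illustrative, not the actual DataTransferProtocol code):

{code}
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
  public static void main(String[] args) throws Exception {
    byte[] sharedSecret = "block-token-secret".getBytes("UTF-8"); // illustrative

    // DataNode side: generate a random nonce and send it on the connection.
    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);

    // Client side: answer with HMAC(secret, nonce) instead of the raw token.
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
    byte[] answer = mac.doFinal(nonce);

    // DataNode side: recompute and compare; the secret never crosses the wire.
    Mac verify = Mac.getInstance("HmacSHA256");
    verify.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
    System.out.println("authenticated: "
        + MessageDigest.isEqual(answer, verify.doFinal(nonce)));
  }
}
{code}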



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc

2014-05-28 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-2856:


Attachment: HDFS-2856.1.patch

I'm uploading a patch that implements the ideas described in the past several 
comments.  I'm still working on more tests and several TODOs, but any 
feedback at this point is welcome.  Pinging [~owen.omalley], [~lmccay], [~jnp] 
and [~atm] for potential feedback.

It's a big patch.  I did a lot of refactoring to avoid code duplication between 
the general-purpose SASL flow and our existing specialized encrypted SASL flow. 
 If this is too cumbersome to review at once, then I can split some of the 
refactorings into separate patches on request.

Summary of changes:
* {{DataTransferEncryptor}}: I deleted this class.  The code has been 
refactored into various new classes in a new 
{{org.apache.hadoop.hdfs.protocol.datatransfer.sasl}} sub-package.  The 
presence of the word "encrypt" in this class name would have been potentially 
misleading, because we're now allowing DataTransferProtocol to support a 
quality of protection different from auth-conf.
* {{SaslDataTransferClient}}: This class now implements the client side of SASL 
negotiation, whether using the general-purpose SASL handshake or our existing 
specialized encrypted handshake.  This class is called by the HDFS client and 
also by the DataNode when acting as a client to another DataNode.  The logic 
for deciding whether or not to do a SASL handshake, and if so which kind of 
handshake, has become somewhat complex.  By encapsulating it behind this class, 
we avoid repeating that logic at multiple points in the rest of the code.
* {{SaslDataTransferServer}}: This class now implements the server side of SASL 
negotiation.  This is only called by the DataNode when receiving new 
connections.  Similar to the above, this is a single point for encapsulating 
the logic of deciding which SASL handshake to use.
* {{DataTransferSaslUtil}}: This contains various helper functions needed by 
the SASL classes.
* Various classes of the HDFS client and the DataNode have mechanical changes 
to wire in the new SASL classes and call them.
* {{DataNode#checkSecureConfig}}: This is a new method for checking whether the 
DataNode is starting in an acceptable secure configuration, either via 
privileged ports or configuring SASL.
* hdfs-default.xml: I added documentation of the new properties for configuring 
SASL on DataTransferProtocol (see the configuration sketch just after this list).
* {{TestSaslDataTransfer}}: This is a new test that runs an embedded KDC, 
starts a secured cluster and demonstrates that a client can request any of the 
3 QOPs.
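
As a quick illustration of that configuration, here is a minimal, hypothetical 
client-side sketch; I'm assuming the new property is named 
{{dfs.data.transfer.protection}} and takes a comma-separated list of SASL QOPs:

{code}
import org.apache.hadoop.conf.Configuration;

public class SaslDataTransferConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed property: which QOPs to offer for DataTransferProtocol,
    // any of "authentication", "integrity", "privacy".
    conf.set("dfs.data.transfer.protection", "privacy");
    System.out.println(conf.get("dfs.data.transfer.protection"));
  }
}
{code}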

Here are a few discussion points I'd like to bring up:
* Our discussion up to this point has focused on the privileged port for 
DataTransferProtocol.  There is also the HTTP port to consider.  My thinking on 
this is that use of the new SASL configuration on a non-privileged port is only 
acceptable if the configuration also uses SPNEGO for HTTP authentication.  If 
it were using token-based auth, then we'd be back to the same problem of sending 
secret block access tokens to an unauthenticated process.  (See TODO comment in 
{{DataNode#checkSecureConfig}}.)  My understanding is that SPNEGO establishes 
mutual authentication, so checking for this ought to work fine.  I'd love if 
someone could confirm that independently.
* Previously, I mentioned renegotiating SASL between multiple block operations. 
 On further reflection, I no longer think this is necessary.  The initial SASL 
handshake establishes authentication of the server.  For subsequent operations 
on the same connection/underlying socket, I expect authentication of the remote 
process wouldn't change.  The privileged port check was intended to protect 
against an attacker binding to the data transfer port after a DataNode process 
stops.  For an existing previously authenticated socket, we know that it's 
still connected to the same process, so I don't think we need to renegotiate 
SASL.  Thoughts?


> Fix block protocol so that Datanodes don't require root or jsvc
> ---
>
> Key: HDFS-2856
> URL: https://issues.apache.org/jira/browse/HDFS-2856
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, security
>Reporter: Owen O'Malley
>Assignee: Chris Nauroth
> Attachments: Datanode-Security-Design.pdf, 
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, 
> HDFS-2856.1.patch, HDFS-2856.prototype.patch
>
>
> Since we send the block tokens unencrypted to the datanode, we currently 
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and 
> the client sends an HMAC of the nonce back instead of the block token, it 
> won't reveal any secrets. Thus, we wouldn't require a secure port and would 
> not require root or jsvc.

[jira] [Assigned] (HDFS-6462) NFS: fsstat request fails with the secure hdfs

2014-05-28 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li reassigned HDFS-6462:


Assignee: Brandon Li

> NFS: fsstat request fails with the secure hdfs
> --
>
> Key: HDFS-6462
> URL: https://issues.apache.org/jira/browse/HDFS-6462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>Assignee: Brandon Li
>
> Fsstat fails in secure environment with below error.
> Steps to reproduce:
> 1) Create user named UserB and UserA
> 2) Create group named GroupB
> 3) Add root and UserB users to GroupB
> Make sure UserA is not in GroupB
> 4) Set below properties
> {noformat}
> ===
> hdfs-site.xml
> ===
> <property>
>   <name>dfs.nfs.keytab.file</name>
>   <value>/tmp/keytab/UserA.keytab</value>
> </property>
> <property>
>   <name>dfs.nfs.kerberos.principal</name>
>   <value>us...@example.com</value>
> </property>
> ==
> core-site.xml
> ==
> <property>
>   <name>hadoop.proxyuser.UserA.groups</name>
>   <value>GroupB</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.UserA.hosts</name>
>   <value>*</value>
> </property>
> {noformat}
> 5) start nfs server as UserA
> 6) mount nfs as root user
> 7) run below command 
> {noformat}
> [root@host1 ~]# df /tmp/tmp_mnt/
> df: `/tmp/tmp_mnt/': Input/output error
> df: no file systems processed
> {noformat}
> NFS Logs complains as below
> {noformat}
> 2014-05-29 00:09:13,698 DEBUG nfs3.RpcProgramNfs3 
> (RpcProgramNfs3.java:fsstat(1654)) - NFS FSSTAT fileId: 16385
> 2014-05-29 00:09:13,706 WARN  ipc.Client (Client.java:run(672)) - Exception 
> encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2014-05-29 00:09:13,710 WARN  nfs3.RpcProgramNfs3 
> (RpcProgramNfs3.java:fsstat(1681)) - Exception
> java.io.IOException: Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: "host1/0.0.0.0"; 
> destination host is: "host1":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy14.getFsStats(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
> at com.sun.proxy.$Proxy14.getFsStats(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:554)
> at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2165)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsstat(RpcProgramNfs3.java:1659)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1961)
> at 
> org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:162)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281)
> at 
> org.apache.hadoop.oncrpc.RpcUtil$RpcMessageParserStage.messageReceived(RpcUtil.java:132)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)

[jira] [Commented] (HDFS-6375) Listing extended attributes with the search permission

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012048#comment-14012048
 ] 

Hadoop QA commented on HDFS-6375:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647231/HDFS-6375.10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1278 javac 
compiler warnings (more than the trunk's current 1277 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7001//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7001//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7001//console

This message is automatically generated.

> Listing extended attributes with the search permission
> --
>
> Key: HDFS-6375
> URL: https://issues.apache.org/jira/browse/HDFS-6375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Charles Lamb
> Attachments: HDFS-6375.1.patch, HDFS-6375.10.patch, 
> HDFS-6375.2.patch, HDFS-6375.3.patch, HDFS-6375.4.patch, HDFS-6375.5.patch, 
> HDFS-6375.6.patch, HDFS-6375.7.patch, HDFS-6375.8.patch, HDFS-6375.9.patch
>
>
> From the attr(5) manpage:
> {noformat}
>Users with search access to a file or directory may retrieve a list  of
>attribute names defined for that file or directory.
> {noformat}
> This is like doing {{getfattr}} without the {{-d}} flag, which we currently 
> don't support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS

2014-05-28 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012033#comment-14012033
 ] 

Liang Xie commented on HDFS-6450:
-

Will dive into it in a couple of days.

> Support non-positional hedged reads in HDFS
> ---
>
> Key: HDFS-6450
> URL: https://issues.apache.org/jira/browse/HDFS-6450
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Liang Xie
>
> HDFS-5776 added support for hedged positional reads.  We should also support 
> hedged non-positional reads (a.k.a. regular reads).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6450) Support non-positional hedged reads in HDFS

2014-05-28 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie reassigned HDFS-6450:
---

Assignee: Liang Xie

> Support non-positional hedged reads in HDFS
> ---
>
> Key: HDFS-6450
> URL: https://issues.apache.org/jira/browse/HDFS-6450
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Liang Xie
>
> HDFS-5776 added support for hedged positional reads.  We should also support 
> hedged non-positional reads (a.k.a. regular reads).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6463) Incorrect permission can be created after setting ACLs

2014-05-28 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012013#comment-14012013
 ] 

Chris Nauroth commented on HDFS-6463:
-

Hello, [~atm] and [~szehon].  I reviewed the test case, and it appears to be 
asserting incorrect behavior here:

{code}
  assertEquals("rwxr-xr-x", permission.toString());
{code}

I'd actually expect to see "rwxrwxr-x".  The POSIX ACL model defines the 
concept of the "group class" consisting of the traditional group entry, all 
named user entries, and all named group entries.  By default, the mask entry is 
set to the union of permissions for all entries in the group class.  The mask 
entry is then reported as the group permissions to all APIs/applications that 
are unaware of ACLs, such as ls.  This is an intentional design choice made by 
the POSIX ACL model to deal with the discrepancy that some legacy applications 
inevitably have an incomplete view of ACLs.  More details on this design choice 
are documented here:

http://users.suse.com/~agruen/acl/linux-acls/online/

In this test case, the ACL entries consist of a group entry with read-execute 
permissions, a named user entry with read-write permissions, and a named group 
entry with read-write permissions.  Taking the union of all of those, we have 
read-write-execute.  The ACL does not explicitly set its own mask entry, so it 
defaults to that union: rwx.
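
To make that arithmetic concrete, here is a minimal sketch of the union using 
Hadoop's {{FsAction}} (illustrative code, not the NameNode implementation):

{code}
import org.apache.hadoop.fs.permission.FsAction;

public class GroupClassMaskSketch {
  public static void main(String[] args) {
    FsAction group = FsAction.READ_EXECUTE;    // group::r-x
    FsAction namedUser = FsAction.READ_WRITE;  // user:foo:rw-
    FsAction namedGroup = FsAction.READ_WRITE; // group:foo:rw-
    // The default mask is the union over the group class.
    FsAction mask = group.or(namedUser).or(namedGroup);
    System.out.println(mask.SYMBOL); // prints rwx
  }
}
{code}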

To verify that this is expected behavior, I executed the same test case 
manually using Linux setfacl.  See below for a transcript.  As I expected, the 
resulting stat/ls shows 775 for the permissions, not 755 as asserted in this 
test case.

I'd like to resolve this as Not a Problem, but let me know if you have any 
other questions.

{code}
[cnauroth@ubuntu:pts/0] acltest 

> mkdir foo -m 755

[cnauroth@ubuntu:pts/0] acltest 

> setfacl --set user::rwx,group::r-x,other::r-x,user:foo:rw-,group:foo:rw- foo

[cnauroth@ubuntu:pts/0] acltest 

> stat foo
  File: `foo'
  Size: 4096        Blocks: 8          IO Block: 4096   directory
Device: 801h/2049d  Inode: 9791        Links: 2
Access: (0775/drwxrwxr-x)  Uid: ( 1000/cnauroth)   Gid: ( 1000/cnauroth)
Access: 2014-05-28 19:57:23.549889726 -0700
Modify: 2014-05-28 19:57:23.549889726 -0700
Change: 2014-05-28 19:58:57.840704104 -0700
 Birth: -

[cnauroth@ubuntu:pts/0] acltest 

> getfacl foo
# file: foo
# owner: cnauroth
# group: cnauroth
user::rwx
user:foo:rw-
group::r-x
group:foo:rw-
mask::rwx
other::r-x

[cnauroth@ubuntu:pts/0] acltest 

> ls -lrt
drwxrwxr-x+ 2 cnauroth 4.0K May 28 19:57 foo/
{code}


> Incorrect permission can be created after setting ACLs
> --
>
> Key: HDFS-6463
> URL: https://issues.apache.org/jira/browse/HDFS-6463
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
> Attachments: HDFS-6463.patch
>
>
> When setting ACLs for a file or directory, it's possible for the resulting 
> FsPermission object's group entry to be set incorrectly, in particular it 
> will be set to the mask entry. More details in the first comment of this JIRA.
> Thanks to [~szehon] for identifying this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-05-28 Thread Hangjun Ye (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012008#comment-14012008
 ] 

Hangjun Ye commented on HDFS-6382:
--

Thanks Chris and Colin for your valuable comments. I'd like to address your 
concern about the "security" problem.

First, our scenario is as follows:
We have a Hadoop cluster shared by multiple teams for their storage and 
computation requirements, and "we" are the dev/support team that ensures the 
functionality and availability of the cluster. The cluster is security-enabled 
to ensure every team can only access the files that it should. So every team is 
a common user of the cluster and "we" own the superuser.

Currently several teams have the requirement to clean up files based on a TTL 
policy. Obviously they could run cron jobs to do that by themselves, but that 
would mean many duplicated jobs, so we'd rather have a mechanism that lets them 
specify/implement their policy easily.

One approach, as you suggested, is that we implement a separate cleanup 
platform: users submit their policy to this platform, and we perform the real 
cleanup on HDFS on behalf of users (as a superuser or other powerful user). But 
the separate platform would have to implement an 
authentication/authorization mechanism to make sure the user is who they claim 
to be and has the permission (authentication is a must; authorization might be 
optional, but it would be better to have it). That duplicates work the NameNode 
already does with Kerberos/ACLs.

If it's implemented inside the NameNode, we could leverage the NameNode's 
authentication/authorization mechanism. For example, we provide a {{./bin/hdfs 
dfs -setttl}} command (just like {{-setrep}}). Users could specify their policy 
with it, and the NameNode would persist it somewhere, maybe as an attribute of 
the file like the replication factor. The mechanism implemented inside the 
NameNode would (maybe periodically) execute all policies specified by users, 
and it could safely do so as a superuser, because authentication/authorization 
were already done when users set their policies on the NameNode.
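
Just to illustrate the user-facing side (everything here is hypothetical: the 
attribute name, the encoding, and the persistence mechanism are all open 
questions), the policy could be stored much like an extended attribute:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetTtlSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical: persist a 30-day TTL on a directory; a NameNode-side
    // executor would later scan for and delete expired entries as superuser.
    fs.setXAttr(new Path("/backup/logs"), "user.ttl", "30d".getBytes("UTF-8"));
  }
}
{code}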

To address several detailed concerns you raised:
* "buggy or malicious code": The proposed concept (actually Haohui proposed) 
should be pretty similar to HBase's coprocessor 
(http://hbase.apache.org/book.html#cp), it's a plug-in or extension of NameNode 
and most likely enabled at deployment time. A common user can't submit it, the 
cluster owner could do. So the code is not arbitrary and the quality/safety 
could be guaranteed.

* "Who exactly is the effective user running the delete, and how do we manage 
their login and file permission enforcement": the extension is run as 
superuser/system, a specific extension implementation could do any permission 
enforcement if needed. For the "TTL-based cleanup policy executor", no 
permission enforcement is needed at this stage as authentication/authorization 
have been done when user set policy.

I think the idea proposed by Haohui is to have an extensible mechanism in the 
NameNode to run jobs that depend intensively on namespace data, while keeping 
the specific job's code as decoupled from the NameNode's core code as possible. 
Certainly it's not easy, as Chris pointed out several problems like HA and 
concurrency, but it may be worth thinking through.

> HDFS File/Directory TTL
> ---
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on hdfs for some time and then delete these files 
> automatically. For example, we keep only 1 day's logs on local disk due to 
> limited disk space, but we need to keep about 1 month's logs in order to 
> debug program bugs, so we keep all the logs on hdfs and delete logs which are 
> older than 1 month. This is a typical scenario for HDFS TTL. So here we 
> propose that hdfs support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after 
> the TTL is expired
> 3. If a TTL is set on a directory, the child files and directories will be 
> deleted automatically after the TTL is expired
> 4. The child file/directory's TTL configuration should override its parent 
> directory's
> 5. A global configuration is needed to configure that whether the deleted 
> files/directories should go to the trash or not
> 6. A global configuration is needed to configure that whether a directory 
> with TTL should be deleted when it is emptied by TTL mechanism or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011984#comment-14011984
 ] 

Hadoop QA commented on HDFS-6422:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12646119/HDFS-6422.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHDFSXAttr

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7000//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7000//console

This message is automatically generated.

> getfattr in CLI doesn't throw exception or return non-0 return code when 
> xattr doesn't exist
> 
>
> Key: HDFS-6422
> URL: https://issues.apache.org/jira/browse/HDFS-6422
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch
>
>
> If you do
> hdfs dfs -getfattr -n user.blah /foo
> and user.blah doesn't exist, the command prints
> # file: /foo
> and a 0 return code.
> It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011981#comment-14011981
 ] 

Hadoop QA commented on HDFS-6461:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647234/HDFS-6461.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.datanode.TestBPOfferService
  org.apache.hadoop.hdfs.TestDistributedFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6999//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6999//console

This message is automatically generated.

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.3.patch, HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 
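
For reference, the pattern in question is simply this (a minimal sketch, not 
the patch itself):

{code}
import org.apache.hadoop.util.Time;

public class MonotonicDurationSketch {
  public static void main(String[] args) throws InterruptedException {
    long start = Time.monotonicNow(); // unaffected by wall-clock adjustments
    Thread.sleep(100);                // stand-in for the work being timed
    long durationMs = Time.monotonicNow() - start;
    System.out.println("took " + durationMs + " ms");
  }
}
{code}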



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-05-28 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011956#comment-14011956
 ] 

Yi Liu commented on HDFS-6422:
--

Thanks Andrew and Charles. I created HDFS-6464 to "add xattr.names parameter 
for WebHDFS getXAttrs," and will fix it in the next two days. Then you can continue.

> getfattr in CLI doesn't throw exception or return non-0 return code when 
> xattr doesn't exist
> 
>
> Key: HDFS-6422
> URL: https://issues.apache.org/jira/browse/HDFS-6422
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch
>
>
> If you do
> hdfs dfs -getfattr -n user.blah /foo
> and user.blah doesn't exist, the command prints
> # file: /foo
> and a 0 return code.
> It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6463) Incorrect permission can be created after setting ACLs

2014-05-28 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6463:
-

Attachment: HDFS-6463.patch

I'm attaching a test case which demonstrates the issue. I believe the trouble 
is in this method in {{AclStorage}}:

{code}
  /**
   * Creates the new FsPermission for an inode that is receiving an extended
   * ACL, based on its access ACL entries.  For a correctly sorted ACL, the
   * first entry is the owner and the last 2 entries are the mask and other
   * entries respectively.  Also preserve sticky bit and toggle ACL bit on.
   *
   * @param accessEntries List<AclEntry> access ACL entries
   * @param existingPerm FsPermission existing permissions
   * @return FsPermission new permissions
   */
  private static FsPermission createFsPermissionForExtendedAcl(
      List<AclEntry> accessEntries, FsPermission existingPerm) {
    return new FsPermission(accessEntries.get(0).getPermission(),
        accessEntries.get(accessEntries.size() - 2).getPermission(),
        accessEntries.get(accessEntries.size() - 1).getPermission(),
        existingPerm.getStickyBit());
  }
{code}

While the comment seems to be correct that the mask and other entries are the 
last two entries in a correctly-sorted list, I believe the bug is that we 
should not be using the mask entry of the ACL at all, and instead should be 
using the group base entry to create the new {{FsPermission}}.
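
In other words, something along these lines (an untested sketch of the 
suggestion, not the actual patch):

{code}
  // Hypothetical fix: use the unnamed group entry's permission instead of
  // the mask entry when building the new FsPermission.
  private static FsPermission createFsPermissionForExtendedAcl(
      List<AclEntry> accessEntries, FsPermission existingPerm) {
    FsAction groupPerm =
        accessEntries.get(accessEntries.size() - 2).getPermission();
    for (AclEntry entry : accessEntries) {
      if (entry.getType() == AclEntryType.GROUP && entry.getName() == null) {
        groupPerm = entry.getPermission(); // the group base entry
        break;
      }
    }
    return new FsPermission(accessEntries.get(0).getPermission(), groupPerm,
        accessEntries.get(accessEntries.size() - 1).getPermission(),
        existingPerm.getStickyBit());
  }
{code}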

> Incorrect permission can be created after setting ACLs
> --
>
> Key: HDFS-6463
> URL: https://issues.apache.org/jira/browse/HDFS-6463
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
> Attachments: HDFS-6463.patch
>
>
> When setting ACLs for a file or directory, it's possible for the resulting 
> FsPermission object's group entry to be set incorrectly, in particular it 
> will be set to the mask entry. More details in the first comment of this JIRA.
> Thanks to [~szehon] for identifying this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6463) Incorrect permission can be created after setting ACLs

2014-05-28 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated HDFS-6463:
-

Description: 
When setting ACLs for a file or directory, it's possible for the resulting 
FsPermission object's group entry to be set incorrectly, in particular it will 
be set to the mask entry. More details in the first comment of this JIRA.

Thanks to [~szehon] for identifying this issue.

  was:
When setting ACLs for a file or directory, it's possible for the resulting 
FsPermission object's group entry to be set incorrectly, in particular it will 
be set to the mask entry. More details in the first comment of this JIRA.

Thanks to Szehon Ho for identifying this issue.


> Incorrect permission can be created after setting ACLs
> --
>
> Key: HDFS-6463
> URL: https://issues.apache.org/jira/browse/HDFS-6463
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
> Attachments: HDFS-6463.patch
>
>
> When setting ACLs for a file or directory, it's possible for the resulting 
> FsPermission object's group entry to be set incorrectly, in particular it 
> will be set to the mask entry. More details in the first comment of this JIRA.
> Thanks to [~szehon] for identifying this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6464) Add xattr.names parameter for WebHDFS getXAttrs.

2014-05-28 Thread Yi Liu (JIRA)
Yi Liu created HDFS-6464:


 Summary: Add xattr.names parameter for WebHDFS getXAttrs.
 Key: HDFS-6464
 URL: https://issues.apache.org/jira/browse/HDFS-6464
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0


For WebHDFS getXAttrs through names, right now the entire list is passed to the 
client side and then filtered, which is not the best choice since it's 
inefficient and precludes us from doing server-side smarts on par with the Java 
APIs. 
Furthermore, if some xattrs don't exist, the server side should return an error.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6463) Incorrect permission can be created after setting ACLs

2014-05-28 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-6463:


 Summary: Incorrect permission can be created after setting ACLs
 Key: HDFS-6463
 URL: https://issues.apache.org/jira/browse/HDFS-6463
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Aaron T. Myers


When setting ACLs for a file or directory, it's possible for the resulting 
FsPermission object's group entry to be set incorrectly, in particular it will 
be set to the mask entry. More details in the first comment of this JIRA.

Thanks to Szehon Ho for identifying this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conficts

2014-05-28 Thread Zesheng Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011937#comment-14011937
 ] 

Zesheng Wu commented on HDFS-6442:
--

Thanks [~arpitagarwal].

> Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port 
> conficts
> --
>
> Key: HDFS-6442
> URL: https://issues.apache.org/jira/browse/HDFS-6442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: HDFS-6442.1.patch, HDFS-6442.patch
>
>
> TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 
> to set up the mini-cluster; this may occasionally result in test failures 
> when running tests with -Pparallel-tests. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking

2014-05-28 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011932#comment-14011932
 ] 

Yi Liu commented on HDFS-6395:
--

Thanks Andrew. Actually I finished part of it last week, so please let me 
continue.  Thanks again for volunteering.

> Assorted improvements to xattr limit checking
> -
>
> Key: HDFS-6395
> URL: https://issues.apache.org/jira/browse/HDFS-6395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Yi Liu
>
> It'd be nice to print messages during fsimage and editlog loading if we hit 
> either the # of xattrs per inode or the xattr size limits.
> We should also consider making the # of xattrs limit only apply to the user 
> namespace, or to each namespace separately, to prevent users from locking out 
> access to other namespaces.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6403) Add metrics for log warnings reported by HADOOP-9618

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011928#comment-14011928
 ] 

Hadoop QA commented on HDFS-6403:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647150/HDFS-6403.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6997//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6997//console

This message is automatically generated.

> Add metrics for log warnings reported by HADOOP-9618
> 
>
> Key: HDFS-6403
> URL: https://issues.apache.org/jira/browse/HDFS-6403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6403.001.patch, HDFS-6403.002.patch
>
>
> HADOOP-9618 logs warnings when there are long GC pauses. If this is exposed 
> as a metric, then they can be monitored.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011923#comment-14011923
 ] 

Hadoop QA commented on HDFS-6447:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12646988/HDFS-6447.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6998//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6998//console

This message is automatically generated.

> balancer should timestamp the completion message
> 
>
> Key: HDFS-6447
> URL: https://issues.apache.org/jira/browse/HDFS-6447
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Reporter: Allen Wittenauer
>Assignee: Juan Yu
>Priority: Trivial
>  Labels: newbie
> Attachments: HDFS-6447.002.patch, HDFS-6447.patch.001
>
>
> When the balancer finishes, it doesn't report the time it finished.  It 
> should do this so that users have a better sense of how long it took to 
> complete.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011919#comment-14011919
 ] 

Hadoop QA commented on HDFS-6461:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647228/HDFS-6461.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6996//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6996//console

This message is automatically generated.

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.3.patch, HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6392) Wire crypto streams for encrypted files in DFSClient

2014-05-28 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6392:
---

Attachment: HDFS-6392.1.patch

Taking [~andrew.wang]'s advice, I'm submitting this set of intermediate diffs 
for review. These wire the Crypto{Input/Output}Streams into the HDFS client; 
initialize the KeyProvider in the Namenode; carry the key/iv through the 
relevant client/namenode protocols; add a unit test for the streams.

Since this is only going on the branch, it shouldn't matter too much that all 
accesses to files are guaranteed to be encrypted or that the key/iv is 
hardwired everywhere.
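
For context, the wrapping idea is essentially this (a JDK-only sketch; the 
branch uses its own Crypto{Input/Output}Stream classes, so the names here are 
illustrative):

{code}
import java.io.InputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CryptoWrapSketch {
  // Wrap a raw stream so that reads are transparently decrypted with key/IV.
  static InputStream wrap(InputStream in, byte[] key, byte[] iv)
      throws Exception {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
        new IvParameterSpec(iv));
    return new CipherInputStream(in, cipher);
  }
}
{code}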


>  Wire crypto streams for encrypted files in DFSClient
> -
>
> Key: HDFS-6392
> URL: https://issues.apache.org/jira/browse/HDFS-6392
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFS-6392.1.patch
>
>
> When the DFS client gets a key material and IV for a file being 
> opened/created, it should wrap the stream with a crypto stream initialized 
> with the key material and IV.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6462) NFS: fsstat request fails with the secure hdfs

2014-05-28 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated HDFS-6462:
-

Description: 
Fsstat fails in secure environment with below error.

Steps to reproduce:
1) Create user named UserB and UserA
2) Create group named GroupB
3) Add root and UserB users to GroupB
Make sure UserA is not in GroupB
4) Set below properties
{noformat}
===
hdfs-site.xml
===
<property>
  <name>dfs.nfs.keytab.file</name>
  <value>/tmp/keytab/UserA.keytab</value>
</property>
<property>
  <name>dfs.nfs.kerberos.principal</name>
  <value>us...@example.com</value>
</property>
==
core-site.xml
==
<property>
  <name>hadoop.proxyuser.UserA.groups</name>
  <value>GroupB</value>
</property>
<property>
  <name>hadoop.proxyuser.UserA.hosts</name>
  <value>*</value>
</property>
{noformat}
5) start nfs server as UserA
6) mount nfs as root user
7) run below command 
{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}

NFS Logs complains as below
{noformat}
2014-05-29 00:09:13,698 DEBUG nfs3.RpcProgramNfs3 
(RpcProgramNfs3.java:fsstat(1654)) - NFS FSSTAT fileId: 16385
2014-05-29 00:09:13,706 WARN  ipc.Client (Client.java:run(672)) - Exception 
encountered while connecting to the server : javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)]
2014-05-29 00:09:13,710 WARN  nfs3.RpcProgramNfs3 
(RpcProgramNfs3.java:fsstat(1681)) - Exception
java.io.IOException: Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: "host1/0.0.0.0"; destination host is: 
"host1":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy14.getFsStats(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy14.getFsStats(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:554)
at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2165)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.fsstat(RpcProgramNfs3.java:1659)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.handleInternal(RpcProgramNfs3.java:1961)
at 
org.apache.hadoop.oncrpc.RpcProgram.messageReceived(RpcProgram.java:162)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:281)
at 
org.apache.hadoop.oncrpc.RpcUtil$RpcMessageParserStage.messageReceived(RpcUtil.java:132)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
at 
org.jboss.netty.channel.Channels

[jira] [Updated] (HDFS-6462) NFS: fsstat request fails with the secure hdfs

2014-05-28 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated HDFS-6462:
-

Description: 
Fsstat fails in secure environment with below error.

Steps to reproduce:
1) Create user named UserB and UserA
2) Create group named GroupB
3) Add root and UserB users to GroupB
Make sure UserA is not in GroupB
4) Set below properties
{noformat}
===
hdfs-site.xml
===
<property>
  <name>dfs.nfs.keytab.file</name>
  <value>/tmp/keytab/UserA.keytab</value>
</property>
<property>
  <name>dfs.nfs.kerberos.principal</name>
  <value>us...@example.com</value>
</property>
==
core-site.xml
==
<property>
  <name>hadoop.proxyuser.UserA.groups</name>
  <value>GroupB</value>
</property>
<property>
  <name>hadoop.proxyuser.UserA.hosts</name>
  <value>*</value>
</property>
{noformat}
5) start nfs server as UserA
6) mount nfs as root user
7) run below command 
{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}

  was:
Fsstat fails in secure environment with below error.

Steps to reproduce:
1) Create user named UserB. 
2) Create group named GroupB
3) Add root and UserB users to GroupB
4) Set below properties
{noformat}
===
hdfs-site.xml
===
<property>
  <name>dfs.nfs.keytab.file</name>
  <value>/tmp/keytab/UserA.keytab</value>
</property>
<property>
  <name>dfs.nfs.kerberos.principal</name>
  <value>us...@example.com</value>
</property>
==
core-site.xml
==
<property>
  <name>hadoop.proxyuser.UserA.groups</name>
  <value>GroupB</value>
</property>
<property>
  <name>hadoop.proxyuser.UserA.hosts</name>
  <value>*</value>
</property>
{noformat}
5) start nfs server as UserA
6) mount nfs as root user
7) run below command 
{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}


> NFS: fsstat request fails with the secure hdfs
> --
>
> Key: HDFS-6462
> URL: https://issues.apache.org/jira/browse/HDFS-6462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> Fsstat fails in secure environment with below error.
> Steps to reproduce:
> 1) Create user named UserB and UserA
> 2) Create group named GroupB
> 3) Add root and UserB users to GroupB
> Make sure UserA is not in GroupB
> 4) Set below properties
> {noformat}
> ===
> hdfs-site.xml
> ===
> <property>
>   <name>dfs.nfs.keytab.file</name>
>   <value>/tmp/keytab/UserA.keytab</value>
> </property>
> <property>
>   <name>dfs.nfs.kerberos.principal</name>
>   <value>us...@example.com</value>
> </property>
> ==
> core-site.xml
> ==
> <property>
>   <name>hadoop.proxyuser.UserA.groups</name>
>   <value>GroupB</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.UserA.hosts</name>
>   <value>*</value>
> </property>
> {noformat}
> 5) start nfs server as UserA
> 6) mount nfs as root user
> 7) run below command 
> {noformat}
> [root@host1 ~]# df /tmp/tmp_mnt/
> df: `/tmp/tmp_mnt/': Input/output error
> df: no file systems processed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6462) NFS: fsstat request fails with the secure hdfs

2014-05-28 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated HDFS-6462:
-

Description: 
Fsstat fails in secure environment with below error.

Steps to reproduce:
1) Create user named UserB. 
2) Create group named GroupB
3) Add root and UserB users to GroupB
4) Set below properties
{noformat}
===
hdfs-site.xml
===
<property>
  <name>dfs.nfs.keytab.file</name>
  <value>/tmp/keytab/UserA.keytab</value>
</property>
<property>
  <name>dfs.nfs.kerberos.principal</name>
  <value>us...@example.com</value>
</property>
==
core-site.xml
==
<property>
  <name>hadoop.proxyuser.UserA.groups</name>
  <value>GroupB</value>
</property>
<property>
  <name>hadoop.proxyuser.UserA.hosts</name>
  <value>*</value>
</property>
{noformat}
5) start nfs server as UserA
6) mount nfs as root user
7) run below command 
{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}

  was:
Fsstat fails in secure environment with below error.

{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}


> NFS: fsstat request fails with the secure hdfs
> --
>
> Key: HDFS-6462
> URL: https://issues.apache.org/jira/browse/HDFS-6462
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> Fsstat fails in secure environment with below error.
> Steps to reproduce:
> 1) Create user named UserB. 
> 2) Create group named GroupB
> 3) Add root and UserB users to GroupB
> 4) Set below properties
> {noformat}
> ===
> hdfs-site.xml
> ===
> <property>
>   <name>dfs.nfs.keytab.file</name>
>   <value>/tmp/keytab/UserA.keytab</value>
> </property>
> <property>
>   <name>dfs.nfs.kerberos.principal</name>
>   <value>us...@example.com</value>
> </property>
> ==
> core-site.xml
> ==
> <property>
>   <name>hadoop.proxyuser.UserA.groups</name>
>   <value>GroupB</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.UserA.hosts</name>
>   <value>*</value>
> </property>
> {noformat}
> 5) start nfs server as UserA
> 6) mount nfs as root user
> 7) run below command 
> {noformat}
> [root@host1 ~]# df /tmp/tmp_mnt/
> df: `/tmp/tmp_mnt/': Input/output error
> df: no file systems processed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6462) NFS: fsstat request fails with the secure hdfs

2014-05-28 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-6462:


 Summary: NFS: fsstat request fails with the secure hdfs
 Key: HDFS-6462
 URL: https://issues.apache.org/jira/browse/HDFS-6462
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora


Fsstat fails in secure environment with below error.

{noformat}
[root@host1 ~]# df /tmp/tmp_mnt/
df: `/tmp/tmp_mnt/': Input/output error
df: no file systems processed
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011732#comment-14011732
 ] 

James Thomas commented on HDFS-6461:


Thanks, made the line length change and uploaded a fresh patch.

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.3.patch, HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Attachment: HDFS-6461.3.patch

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.3.patch, HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6375) Listing extended attributes with the search permission

2014-05-28 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6375:
---

Attachment: HDFS-6375.10.patch

Yes, I see what you mean. I think the attached patch takes care of this. I have 
to confess that I'm not really happy about the cast in JsonUtil#toXAttrNames, so 
if you see a better way of doing that, let me know.

> Listing extended attributes with the search permission
> --
>
> Key: HDFS-6375
> URL: https://issues.apache.org/jira/browse/HDFS-6375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Charles Lamb
> Attachments: HDFS-6375.1.patch, HDFS-6375.10.patch, 
> HDFS-6375.2.patch, HDFS-6375.3.patch, HDFS-6375.4.patch, HDFS-6375.5.patch, 
> HDFS-6375.6.patch, HDFS-6375.7.patch, HDFS-6375.8.patch, HDFS-6375.9.patch
>
>
> From the attr(5) manpage:
> {noformat}
>Users with search access to a file or directory may retrieve a list  of
>attribute names defined for that file or directory.
> {noformat}
> This is like doing {{getfattr}} without the {{-d}} flag, which we currently 
> don't support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011712#comment-14011712
 ] 

Andrew Wang commented on HDFS-6422:
---

I talked with Charles about this, and we're blocked until the WebHDFS XAttr 
methods also throw an exception in the case where the names are not present. 
Right now the entire list is passed to the client side and then filtered, which 
is not the best choice since it's inefficient and precludes us from doing 
server-side smarts on par with the Java APIs. We can either fix that here, or 
do it in another JIRA first.
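
To make that concrete, a rough sketch of the server-side behavior being 
described (helper and variable names here are hypothetical, not the actual 
NameNode code):
{noformat}
// Sketch: resolve each requested xattr on the NameNode and fail fast when a
// name is missing, instead of shipping the whole list back for client-side
// filtering. findByName() is a hypothetical helper.
List<XAttr> result = new ArrayList<XAttr>();
for (XAttr requested : requestedXAttrs) {
  XAttr found = findByName(storedXAttrs, requested.getName());
  if (found == null) {
    throw new IOException("At least one of the attributes provided was not found.");
  }
  result.add(found);
}
return result;
{noformat}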

> getfattr in CLI doesn't throw exception or return non-0 return code when 
> xattr doesn't exist
> 
>
> Key: HDFS-6422
> URL: https://issues.apache.org/jira/browse/HDFS-6422
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch
>
>
> If you do
> hdfs dfs -getfattr -n user.blah /foo
> and user.blah doesn't exist, the command prints
> # file: /foo
> and a 0 return code.
> It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011704#comment-14011704
 ] 

Andrew Wang commented on HDFS-6461:
---

Hey, one more really nitty comment: we like to keep lines to a max of 80 
characters. I use this Eclipse formatter (works in IDEA too) to do it for me: 
https://github.com/cloudera/blog-eclipse/blob/master/hadoop-format.xml

Another minor comment, we also typically name our patches something like 
"HDFS-6461.2.patch" with a version number so it's easier to download the right 
one. Thanks James!

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011699#comment-14011699
 ] 

James Thomas commented on HDFS-6461:


Thanks for the review, updated the patch.

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Attachment: HDFS-6461.patch

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch, HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Description: The system time can move around for various reasons, so we 
convert to monotonicNow to ensure greater accuracy when computing a duration in 
this method. 

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>
> The system time can move around for various reasons, so we convert to 
> monotonicNow to ensure greater accuracy when computing a duration in this 
> method. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011688#comment-14011688
 ] 

Andrew Wang commented on HDFS-6461:
---

Hey James, nice find here. One review comment though, I think we should also 
switch the now -> monotonicNow lower down where the duration is actually 
computed, else this will be wonky. Looks fine besides that though, thanks for 
the contribution!
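
For anyone following along, the resulting pattern is simply this (sketch; 
{{doShutdownWork}} is a stand-in, not the actual method):
{noformat}
// Both ends of a duration measurement must come from the same monotonic
// clock; mixing Time.now() with Time.monotonicNow() yields a meaningless delta.
long start = Time.monotonicNow();               // was: Time.now()
doShutdownWork();                               // stand-in for the shutdown steps
long durationMs = Time.monotonicNow() - start;  // immune to wall-clock resets
{noformat}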

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011688#comment-14011688
 ] 

Andrew Wang edited comment on HDFS-6461 at 5/28/14 10:13 PM:
-

Hey James, nice find here. One review comment though, I think we should also 
switch the now -> monotonicNow lower down where the duration is actually 
computed, else this will be wonky. Looks fine besides that though, thanks for 
the contribution!


was (Author: andrew.wang):
Hey James, nice find here. One review comment though, I think we should also 
switch the now -> monotonicNow lower down where the duration is actually 
computed, else this will be a wonky. Looks fine besides that though, thanks for 
the contribution!

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Status: Open  (was: Patch Available)

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Status: Patch Available  (was: Open)

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Attachment: HDFS-6461.patch

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
> Attachments: HDFS-6461.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6461:
---

Status: Patch Available  (was: Open)

> Time.now -> Time.monotonicNow in DataNode.java#shutDown
> ---
>
> Key: HDFS-6461
> URL: https://issues.apache.org/jira/browse/HDFS-6461
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: James Thomas
>Assignee: James Thomas
>Priority: Trivial
> Fix For: 2.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6109) let sync_file_range() system call run in background

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011663#comment-14011663
 ] 

Andrew Wang commented on HDFS-6109:
---

Maybe we can reuse the FsDatasetAsyncDiskService available in FsDatasetImpl? 
Right now it's used to do async deletes, but it seems apt for this sort of 
thing too. It would also keep us from overloading a single volume with 
different types of I/O work.
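
To sketch the idea (the executor call and the fd/offset variables are 
illustrative, not necessarily the exact FsDatasetAsyncDiskService API):
{noformat}
// Sketch: run sync_file_range on the volume's own thread pool so a slow disk
// only backs up its own queue instead of stalling the BlockReceiver write path.
asyncDiskService.execute(volume.getCurrentDir(), new Runnable() {
  @Override
  public void run() {
    NativeIO.POSIX.syncFileRangeIfPossible(fd, offset, nbytes,
        NativeIO.POSIX.SYNC_FILE_RANGE_WRITE);
  }
});
{noformat}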

> let sync_file_range() system call run in background
> ---
>
> Key: HDFS-6109
> URL: https://issues.apache.org/jira/browse/HDFS-6109
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6109-v2.txt, HDFS-6109.txt
>
>
> Though we pass SYNC_FILE_RANGE_WRITE to sync_file_range to make it as 
> asynchronous as possible, it can still block, e.g. when the OS I/O request 
> queue is full.
> Since we use sync_file_range only in a page-cache advisory role :) and it 
> doesn't decide or guarantee real durability, it would be nice if we could run 
> it in the background. At least my test logs showed a few sync_file_range calls 
> still costing tens of ms or more. Because they happen in the critical write 
> path (the BlockReceiver class), an application above, like HBase, will "hang" 
> for tens of ms as well while syncing its HLog.
> Generally speaking, the patch may not improve things much, but it's better 
> than before, right? :)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011651#comment-14011651
 ] 

Hudson commented on HDFS-6453:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5618 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5618/])
HDFS-6453. Use Time#monotonicNow to avoid system clock reset. Contributed by 
Liang Xie. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598144)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java


> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6453-v2.txt, HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6461) Time.now -> Time.monotonicNow in DataNode.java#shutDown

2014-05-28 Thread James Thomas (JIRA)
James Thomas created HDFS-6461:
--

 Summary: Time.now -> Time.monotonicNow in DataNode.java#shutDown
 Key: HDFS-6461
 URL: https://issues.apache.org/jira/browse/HDFS-6461
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
Priority: Trivial
 Fix For: 2.5.0






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found

2014-05-28 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011639#comment-14011639
 ] 

Yongjun Zhang commented on HDFS-6268:
-

Thanks Andrew, I created HDFS-6460 for this and I will work on it a bit later 
as you suggested.


> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> --
>
> Key: HDFS-6268
> URL: https://issues.apache.org/jira/browse/HDFS-6268
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads when reading past a block 
> boundary were all hitting this node, meaning massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-05-28 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-6460:
---

 Summary: To ignore stale/decommissioned nodes in 
NetworkTopology#pseudoSortByDistance
 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor


Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
improve the sorting result and save a bit of runtime.






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011619#comment-14011619
 ] 

Andrew Wang commented on HDFS-6395:
---

Hey Yi, I'd like to take a hack at this if you're busy with other things. Is it 
okay if I assign it to myself?

> Assorted improvements to xattr limit checking
> -
>
> Key: HDFS-6395
> URL: https://issues.apache.org/jira/browse/HDFS-6395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Yi Liu
>
> It'd be nice to print messages during fsimage and editlog loading if we hit 
> either the # of xattrs per inode or the xattr size limits.
> We should also consider making the # of xattrs limit only apply to the user 
> namespace, or to each namespace separately, to prevent users from locking out 
> access to other namespaces.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6453:
--

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks for your contribution Liang Xie!

> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6453-v2.txt, HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011608#comment-14011608
 ] 

Andrew Wang commented on HDFS-6453:
---

+1 LGTM, will commit shortly

> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6453-v2.txt, HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011557#comment-14011557
 ] 

Andrew Wang commented on HDFS-6268:
---

I think I understood originally, so maybe *I* was unclear :) I agree that your 
proposal will save a bit of work, and I'm fine if you want to file it as a new 
JIRA and even work on it yourself. Thanks Yongjun.

> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> --
>
> Key: HDFS-6268
> URL: https://issues.apache.org/jira/browse/HDFS-6268
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads when reading past a block 
> boundary were all hitting this node, meaning massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found

2014-05-28 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011501#comment-14011501
 ] 

Yongjun Zhang commented on HDFS-6268:
-

Hi Andrew,

Sorry I didn't make it clear earlier: my intention was to exclude the stale 
nodes from the distance sort by passing the activeLength (stale nodes sit 
beyond this length after the decom/stale sort) as a parameter to the distance 
sort, so that it only sorts the active nodes. As I stated, it certainly works 
for me if we move this to another jira. Thanks.
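
To illustrate, a rough sketch with a hypothetical helper (not the actual 
NetworkTopology method signature):
{noformat}
// Sketch: sort only the active prefix by network distance. Entries in
// [activeLength, nodes.length) are the stale/decommissioned tail that the
// earlier sort already pushed back, so they need no further work.
static void sortActiveByDistance(final NetworkTopology topology,
    final Node reader, Node[] nodes, int activeLength) {
  Arrays.sort(nodes, 0, activeLength, new Comparator<Node>() {
    @Override
    public int compare(Node a, Node b) {
      return topology.getDistance(reader, a) - topology.getDistance(reader, b);
    }
  });
}
{noformat}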


> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> --
>
> Key: HDFS-6268
> URL: https://issues.apache.org/jira/browse/HDFS-6268
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads when reading past a block 
> boundary were all hitting this node, meaning massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6418) Regression: DFS_NAMENODE_USER_NAME_KEY missing in trunk

2014-05-28 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011486#comment-14011486
 ] 

Steve Loughran commented on HDFS-6418:
--

-there's no other place in the hdfs codebase that defines the properties for 
hdfs as constant strings...anyone who doesn't want to cut and paste values is 
going to link to this. 

Which is preferable?
# people cutting and pasting strings like {{"dfs.replication"}}
# people importing constants defined in the hadoop source, as is done via 
{{YarnConfiguration}} and {{CommonConfigurationKeysPublic}}?

I may have been unusual in that I tried to use the in-source constants. And I 
may have (unintentionally) used them despite them being annotated private -but 
when you do YARN code you end up treating that as a mild hint anyway. 

Options
# do nothing, I fix my code to inline the constant in my own constants class. I 
repeat this for any other imports in my code, as I can no longer be confident 
that they will remain there. Anyone else who uses the constant finds their code 
breaks. 
# Add a deprecated definition of the old name, using the new name as its 
reference. 
# action #2, then extract a stable set of constants into a HDFSPublicKeys class 
for others to use, make this a superclass of the private keys, and encourage 
people to use these constants in future. 

Now -how are static strings imported into other classes by the compiler? Copied 
or linked? If copied, code that imports the old definitions will not fail at 
runtime -only when recompiled. Which would reduce the damage somewhat.

> Regression: DFS_NAMENODE_USER_NAME_KEY missing in trunk
> ---
>
> Key: HDFS-6418
> URL: https://issues.apache.org/jira/browse/HDFS-6418
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Steve Loughran
>
> Code i have that compiles against HADOOP 2.4 doesn't build against trunk as 
> someone took away {{DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY}} -apparently in 
> HDFS-6181.
> I know the name was obsolete, but anyone who has compiled code using that 
> reference -rather than cutting and pasting in the string- is going to find 
> their code doesn't work.
> More subtly: that will lead to a link exception trying to run that code on a 
> 2.5+  cluster.
> This is a regression: the old names need to go back in, even if they refer to 
> the new names and are marked as deprecated



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6458) NFS: stale NFS file handle Error for previous mount point

2014-05-28 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011438#comment-14011438
 ] 

Brandon Li commented on HDFS-6458:
--

Most likely, the NFS client's attribute cache is messed up by an empty 
attribute being returned in some NFS response along with the error status.

> NFS: stale NFS file handle Error for previous mount point
> -
>
> Key: HDFS-6458
> URL: https://issues.apache.org/jira/browse/HDFS-6458
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> Steps to reproduce:
> 1) Set dfs.nfs.exports.allowed.hosts =  rw
> 2) mount nfs on 
> 3) Set dfs.nfs.exports.allowed.hosts =  rw
> 4) mount nfs on 
> Try to access NFS mount point at Gateway. Can't access mount point from 
> Gateway. 
> {noformat}
> bash: ls /tmp/tmp_mnt 
> ls: cannot access /tmp/tmp_mnt: Stale NFS file handle
> {noformat}
> Expected: Mount_point from previous config should be accessible if it is not 
> unmounted before config change.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6459) Add function to refresh export table for NFS gateway

2014-05-28 Thread Brandon Li (JIRA)
Brandon Li created HDFS-6459:


 Summary: Add function to refresh export table for NFS gateway 
 Key: HDFS-6459
 URL: https://issues.apache.org/jira/browse/HDFS-6459
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Reporter: Brandon Li


Currently the NFS gateway has to be restarted to pick up export table 
configuration changes. 
This JIRA tracks the effort to provide a way to refresh the export table 
without rebooting the NFS gateway.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6456) NFS: NFS server should throw error for invalid entry in dfs.nfs.exports.allowed.hosts

2014-05-28 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011433#comment-14011433
 ] 

Brandon Li edited comment on HDFS-6456 at 5/28/14 6:38 PM:
---

We should log an error to indicate that there is an invalid export table entry. 
A real user can run the "showmount" command to find the valid exports.

Even with no valid export, shutting down may not be desirable, given we may 
want to introduce a feature in the future to refresh the export table without 
restarting the NFS gateway (HDFS-6459). 


was (Author: brandonli):
We should log the error to indicate there is invalid export table entry. Real 
user can do "showmount" command to find the valid export.

Even with no valid export, shutdown may not be desirable given we may want to 
introduce a new feature in the future to refresh export table without 
restarting NFS gateway. 

> NFS: NFS server should throw error for invalid entry in 
> dfs.nfs.exports.allowed.hosts
> -
>
> Key: HDFS-6456
> URL: https://issues.apache.org/jira/browse/HDFS-6456
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> Pass an invalid entry in dfs.nfs.exports.allowed.hosts, using '-' as the 
> separator between hostname and access permission:
> {noformat}
> <property>
>   <name>dfs.nfs.exports.allowed.hosts</name>
>   <value>host1-rw</value>
> </property>
> {noformat}
> This misconfiguration is not detected by the NFS server. It does not print 
> any error message, and the host passed in this configuration is not able to 
> mount NFS. In conclusion, no node can mount NFS with this value. A format 
> check is required for this property: if the value does not follow the format, 
> an error should be thrown.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6456) NFS: NFS server should throw error for invalid entry in dfs.nfs.exports.allowed.hosts

2014-05-28 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011433#comment-14011433
 ] 

Brandon Li commented on HDFS-6456:
--

We should log an error to indicate that there is an invalid export table entry. 
A real user can run the "showmount" command to find the valid exports.

Even with no valid export, shutting down may not be desirable, given we may 
want to introduce a feature in the future to refresh the export table without 
restarting the NFS gateway. 

> NFS: NFS server should throw error for invalid entry in 
> dfs.nfs.exports.allowed.hosts
> -
>
> Key: HDFS-6456
> URL: https://issues.apache.org/jira/browse/HDFS-6456
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> Pass an invalid entry in dfs.nfs.exports.allowed.hosts, using '-' as the 
> separator between hostname and access permission:
> {noformat}
> <property>
>   <name>dfs.nfs.exports.allowed.hosts</name>
>   <value>host1-rw</value>
> </property>
> {noformat}
> This misconfiguration is not detected by the NFS server. It does not print 
> any error message, and the host passed in this configuration is not able to 
> mount NFS. In conclusion, no node can mount NFS with this value. A format 
> check is required for this property: if the value does not follow the format, 
> an error should be thrown.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011431#comment-14011431
 ] 

Hudson commented on HDFS-6442:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5616 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5616/])
HDFS-6442. Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by 
port conflicts. (Contributed by Zesheng Wu) (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598078)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogAutoroll.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java


> Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port 
> conflicts
> --
>
> Key: HDFS-6442
> URL: https://issues.apache.org/jira/browse/HDFS-6442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: HDFS-6442.1.patch, HDFS-6442.patch
>
>
> TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 
> to set up the mini-cluster, which may result in occasional test failures when 
> running tests with -Pparallel-tests. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6448) BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011430#comment-14011430
 ] 

Hudson commented on HDFS-6448:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5616 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5616/])
HDFS-6448. BlockReaderLocalLegacy should set socket timeout based on 
conf.socketTimeout (liangxie via cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598079)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java


> BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout
> 
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the dfs client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the attached 
> patch. Most of the credit should go to [~liushaohui].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6455) NFS: Exception should be added in NFS log for invalid separator in allowed.hosts

2014-05-28 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011423#comment-14011423
 ] 

Brandon Li commented on HDFS-6455:
--

The code throws IllegalArgumentException, so the error is printed into the 
xxx.out file instead of the xxx.log file.
Here we should throw a non-runtime (checked) exception, since the wrong 
configuration is a user error. 
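
A minimal sketch of that direction, with a hypothetical validation helper:
{noformat}
// Sketch: write the misconfiguration to the NFS log before failing, and use a
// checked exception since a bad export entry is a user error, not a program
// bug. isValidExportEntry() is a hypothetical helper.
if (!isValidExportEntry(line)) {
  LOG.error("Invalid export entry in dfs.nfs.exports.allowed.hosts: '" + line
      + "'");
  throw new IOException("Incorrectly formatted line '" + line + "'");
}
{noformat}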

> NFS: Exception should be added in NFS log for invalid separator in 
> allowed.hosts
> 
>
> Key: HDFS-6455
> URL: https://issues.apache.org/jira/browse/HDFS-6455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Yesha Vora
>
> The error for an invalid separator in the dfs.nfs.exports.allowed.hosts 
> property should be written to the NFS log file instead of the nfs.out file.
> Steps to reproduce:
> 1. Pass an invalid separator in dfs.nfs.exports.allowed.hosts
> {noformat}
> <property>
>   <name>dfs.nfs.exports.allowed.hosts</name>
>   <value>host1 ro:host2 rw</value>
> </property>
> {noformat}
> 2. Restart the NFS server. The NFS server fails to start and prints the 
> exception to the console.
> {noformat}
> [hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o 
> UserKnownHostsFile=/dev/null host1 "sudo su - -c 
> \"/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\" hdfs"
> starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
> formatted line 'host1 ro:host2 rw'
>   at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
>   at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
>   at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
>   at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
>   at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
>   at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
> {noformat}
> The NFS log does not print any error message; the server directly shuts down. 
> {noformat}
> STARTUP_MSG:   java = 1.6.0_31
> /
> 2014-05-27 18:47:13,972 INFO  nfs3.Nfs3Base (SignalLogger.java:register(91)) 
> - registered UNIX signal handlers for [TERM, HUP, INT]
> 2014-05-27 18:47:14,169 INFO  nfs3.IdUserGroup 
> (IdUserGroup.java:updateMapInternal(159)) - Updated user map size:259
> 2014-05-27 18:47:14,179 INFO  nfs3.IdUserGroup 
> (IdUserGroup.java:updateMapInternal(159)) - Updated group map size:73
> 2014-05-27 18:47:14,192 INFO  nfs3.Nfs3Base (StringUtils.java:run(640)) - 
> SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down Nfs3 at 
> {noformat}
> The nfs.out file has the exception:
> {noformat}
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
> formatted line 'host1 ro:host2 rw'
> at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
> at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
> at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
> ulimit -a for user hdfs
> core file size  (blocks, -c) 409600
> data seg size   (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size   (blocks, -f) unlimited
> pending signals (-i) 188893
> max locked memory   (kbytes, -l) unlimited
> max memory size (kbytes, -m) unlimited
> open files  (-n) 32768
> pipe size(512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority  (-r) 0
> stack size  (kbytes, -s) 10240
> cpu time   (seconds, -t) unlimited
> max user processes  (-u) 65536
> virtual memory  (kbytes, -v) unlimited
> file locks  (-x) unlimited
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6458) NFS: stale NFS file handle Error for previous mount point

2014-05-28 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-6458:


 Summary: NFS: stale NFS file handle Error for previous mount point
 Key: HDFS-6458
 URL: https://issues.apache.org/jira/browse/HDFS-6458
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora


Steps to reproduce:
1) Set dfs.nfs.exports.allowed.hosts =  rw
2) mount nfs on 
3) Set dfs.nfs.exports.allowed.hosts =  rw
4) mount nfs on 

Try to access NFS mount point at Gateway. Can't access mount point from 
Gateway. 
{noformat}
bash: ls /tmp/tmp_mnt 
ls: cannot access /tmp/tmp_mnt: Stale NFS file handle
{noformat}

Expected: Mount_point from previous config should be accessible if it is not 
unmounted before config change.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6457) Maintain a list of all the Encryption Zones in the file system

2014-05-28 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb reassigned HDFS-6457:
--

Assignee: Charles Lamb

> Maintain a list of all the Encryption Zones in the file system
> --
>
> Key: HDFS-6457
> URL: https://issues.apache.org/jira/browse/HDFS-6457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>
> We need to maintain a list of all encryption zones in the file system so that 
> we can ask questions about what EZ a path belongs to, if any, and let the 
> admin know all the EZs in the system.
> [~andrew.wang] Why not just have a sorted structure with pointers to all the 
> roots of the EZs? We can populate it during metadata loading on startup, and 
> keep it updated during runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6375) Listing extended attributes with the search permission

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011407#comment-14011407
 ] 

Andrew Wang commented on HDFS-6375:
---

Upon looking closer, you're totally right. I think re-using the existing XAttr 
serialization isn't going to work so well for WebHDFS either, because of the 
direct REST API; I was okay with the funny handling where we ignore the values 
for the Java clients, since that's an internal implementation detail, but here 
the on-the-wire format is public. My bad.

Based on this, I think we should serialize a JSON array of just the string 
names on the NN side, and then pull it out and return a plain List in 
WebHDFSFileSystem. This way, REST clients get a nice format (no need to parse 
the array of maps).

Does this sound reasonable to you?
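
Concretely, the REST response body could then be as simple as (shape 
illustrative, not a committed format):
{noformat}
{"XAttrNames":["user.a1","user.a2"]}
{noformat}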

> Listing extended attributes with the search permission
> --
>
> Key: HDFS-6375
> URL: https://issues.apache.org/jira/browse/HDFS-6375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Charles Lamb
> Attachments: HDFS-6375.1.patch, HDFS-6375.2.patch, HDFS-6375.3.patch, 
> HDFS-6375.4.patch, HDFS-6375.5.patch, HDFS-6375.6.patch, HDFS-6375.7.patch, 
> HDFS-6375.8.patch, HDFS-6375.9.patch
>
>
> From the attr(5) manpage:
> {noformat}
>Users with search access to a file or directory may retrieve a list  of
>attribute names defined for that file or directory.
> {noformat}
> This is like doing {{getfattr}} without the {{-d}} flag, which we currently 
> don't support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6457) Maintain a list of all the Encryption Zones in the file system

2014-05-28 Thread Charles Lamb (JIRA)
Charles Lamb created HDFS-6457:
--

 Summary: Maintain a list of all the Encryption Zones in the file 
system
 Key: HDFS-6457
 URL: https://issues.apache.org/jira/browse/HDFS-6457
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, security
Reporter: Charles Lamb


We need to maintain a list of all encryption zones in the file system so that 
we can ask questions about what EZ a path belongs to, if any, and let the admin 
know all the EZs in the system.

[~andrew.wang] Why not just have a sorted structure with pointers to all the 
roots of the EZs? We can populate it during metadata loading on startup, and 
keep it updated during runtime.
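
A rough sketch of one such structure, using hypothetical names 
({{EncryptionZone}} here is a placeholder type, not an existing class):
{noformat}
// Sketch: keep EZ roots in a sorted map, populated at image/edits load time
// and maintained at runtime; lookups walk up the path's ancestors.
private final TreeMap<String, EncryptionZone> ezRoots =
    new TreeMap<String, EncryptionZone>();

EncryptionZone getEncryptionZoneForPath(String path) {
  // Nested EZs are not expected, so the first ancestor hit wins.
  for (String p = path; p != null; p = parentOf(p)) {
    EncryptionZone ez = ezRoots.get(p);
    if (ez != null) {
      return ez;
    }
  }
  return null;
}

private static String parentOf(String path) {
  int i = path.lastIndexOf('/');
  if (i < 0 || path.equals("/")) {
    return null;
  }
  return i == 0 ? "/" : path.substring(0, i);
}
{noformat}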



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6456) NFS: NFS server should throw error for invalid entry in dfs.nfs.exports.allowed.hosts

2014-05-28 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-6456:


 Summary: NFS: NFS server should throw error for invalid entry in 
dfs.nfs.exports.allowed.hosts
 Key: HDFS-6456
 URL: https://issues.apache.org/jira/browse/HDFS-6456
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora


Pass an invalid entry in dfs.nfs.exports.allowed.hosts, using '-' as the 
separator between hostname and access permission:

{noformat}
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>host1-rw</value>
</property>
{noformat}

This misconfiguration is not detected by the NFS server. It does not print any 
error message, and the host passed in this configuration is not able to mount 
NFS. In conclusion, no node can mount NFS with this value. A format check is 
required for this property: if the value does not follow the format, an error 
should be thrown.
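
For comparison, a correctly formatted value separates host and privilege with 
whitespace and entries with ';'; to my reading of NfsExports, something like 
(hostnames illustrative):
{noformat}
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>host1 ro;host2 rw</value>
</property>
{noformat}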



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6134) Transparent data at rest encryption

2014-05-28 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned HDFS-6134:


Assignee: Alejandro Abdelnur  (was: Charles Lamb)

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6134) Transparent data at rest encryption

2014-05-28 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned HDFS-6134:


Assignee: Charles Lamb  (was: Alejandro Abdelnur)

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Charles Lamb
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6455) NFS: Exception should be added in NFS log for invalid separator in allowed.hosts

2014-05-28 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated HDFS-6455:
-

Description: 
The error for an invalid separator in the dfs.nfs.exports.allowed.hosts 
property should be written to the NFS log file instead of the nfs.out file.

Steps to reproduce:
1. Pass an invalid separator in dfs.nfs.exports.allowed.hosts
{noformat}
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>host1 ro:host2 rw</value>
</property>
{noformat}

2. Restart the NFS server. The NFS server fails to start and prints the 
exception to the console.
{noformat}
[hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o 
UserKnownHostsFile=/dev/null host1 "sudo su - -c 
\"/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\" hdfs"
starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
formatted line 'host1 ro:host2 rw'
at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
{noformat}

The NFS log does not print any error message; the server directly shuts down. 
{noformat}
STARTUP_MSG:   java = 1.6.0_31
/
2014-05-27 18:47:13,972 INFO  nfs3.Nfs3Base (SignalLogger.java:register(91)) - 
registered UNIX signal handlers for [TERM, HUP, INT]
2014-05-27 18:47:14,169 INFO  nfs3.IdUserGroup 
(IdUserGroup.java:updateMapInternal(159)) - Updated user map size:259
2014-05-27 18:47:14,179 INFO  nfs3.IdUserGroup 
(IdUserGroup.java:updateMapInternal(159)) - Updated group map size:73
2014-05-27 18:47:14,192 INFO  nfs3.Nfs3Base (StringUtils.java:run(640)) - 
SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down Nfs3 at 
{noformat}

The nfs.out file has the exception:
{noformat}
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
formatted line 'host1 ro:host2 rw'
at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
ulimit -a for user hdfs
core file size  (blocks, -c) 409600
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 188893
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 32768
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 65536
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited
{noformat}

  was:
The error for invalid separator in dfs.nfs.exports.allowed.hosts property 
should be added in nfs log file instead nfs.out file.

Steps to reproduce:
1. Pass invalid separator in dfs.nfs.exports.allowed.hosts
{noformat}
dfs.nfs.exports.allowed.hostshost1  ro:host2 
rw
{noformat}

2. restart NFS server. NFS server fails to start and print exception console.
{noformat}
[hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o 
UserKnownHostsFile=/dev/null host1 "sudo su - -c 
\"/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\" hdfs"
starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
formatted line 'host1 ro:host2 rw'
at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
{noformat}

NFS log does not print any error message. It directly shuts down. 
{n

[jira] [Created] (HDFS-6455) NFS: Exception should be added in NFS log for invalid separator in allowed.hosts

2014-05-28 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-6455:


 Summary: NFS: Exception should be added in NFS log for invalid 
separator in allowed.hosts
 Key: HDFS-6455
 URL: https://issues.apache.org/jira/browse/HDFS-6455
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora


The error for an invalid separator in the dfs.nfs.exports.allowed.hosts 
property should be written to the NFS log file instead of the nfs.out file.

Steps to reproduce:
1. Pass an invalid separator in dfs.nfs.exports.allowed.hosts
{noformat}
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>host1 ro:host2 rw</value>
</property>
{noformat}

2. Restart the NFS server. The NFS server fails to start and prints the 
exception to the console.
{noformat}
[hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o 
UserKnownHostsFile=/dev/null host1 "sudo su - -c 
\"/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\" hdfs"
starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
formatted line 'host1 ro:host2 rw'
at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
{noformat}

The NFS log does not print any error message; the server directly shuts down. 
{noformat}
STARTUP_MSG:   java = 1.6.0_31
/
2014-05-27 18:47:13,972 INFO  nfs3.Nfs3Base (SignalLogger.java:register(91)) - 
registered UNIX signal handlers for [TERM, HUP, INT]
2014-05-27 18:47:14,169 INFO  nfs3.IdUserGroup 
(IdUserGroup.java:updateMapInternal(159)) - Updated user map size:259
2014-05-27 18:47:14,179 INFO  nfs3.IdUserGroup 
(IdUserGroup.java:updateMapInternal(159)) - Updated group map size:73
2014-05-27 18:47:14,192 INFO  nfs3.Nfs3Base (StringUtils.java:run(640)) - 
SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down Nfs3 at 
{noformat}

The nfs.out file has the exception:
{noformat}
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
formatted line 'host1 ro:host2 rw'
at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
at org.apache.hadoop.nfs.NfsExports.(NfsExports.java:151)
at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:176)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:43)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
ulimit -a for user hdfs
core file size  (blocks, -c) 409600
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 188893
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 32768
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 65536
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011383#comment-14011383
 ] 

Andrew Wang commented on HDFS-6268:
---

Hey Yongjun,

Thanks for the review. We definitely could do that, though there might be some 
very minor value in having the stale nodes also sorted by network distance 
(since the stale sort is a stable sort). I could easily be swayed either way, 
though.

ATM, does this new style look like what you had in mind?

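As a rough illustration of the stable-sort point (the comparators below are hypothetical helpers, not the patch): because {{Collections.sort}} is stable, sorting by network distance first and then moving stale nodes to the end preserves the distance ordering within each group.

{code:java}
// Illustrative only: the distance-first ordering survives the second,
// stable sort, so stale nodes end up last but remain distance-sorted.
Collections.sort(nodes, byDistanceTo(reader));  // assumed distance comparator
Collections.sort(nodes, new Comparator<Node>() {
  @Override
  public int compare(Node a, Node b) {
    return Boolean.compare(isStale(a), isStale(b));  // stale nodes sort last
  }
});
{code}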
> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> --
>
> Key: HDFS-6268
> URL: https://issues.apache.org/jira/browse/HDFS-6268
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads when reading past a block 
> boundary were all hitting this node, meaning massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6448) BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout

2014-05-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6448:
---

Summary: BlockReaderLocalLegacy should set socket timeout based on 
conf.socketTimeout  (was: BlockReaderLocalLegacy should set socket timeout via 
conf.socketTimeout)

> BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout
> 
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the DFS client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the 
> attached patch. Most of the credit should go to [~liushaohui].
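
A rough sketch of that idea (the config keys below are real, but the wiring is illustrative rather than the attached patch):

{code:java}
// Illustrative sketch: derive the RPC timeout from the client's configured
// socket timeout instead of hdfsTimeout, whose default of -1 means
// "wait forever".
int socketTimeout = conf.getInt(
    DFSConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY,
    HdfsServerConstants.READ_TIMEOUT);
// ...then pass socketTimeout when creating the ClientDatanodeProtocol proxy
// used by getBlockLocalPathInfo(), so a stuck datanode cannot hang the client.
{code}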



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6448) BlockReaderLocalLegacy should set socket timeout via conf.socketTimeout

2014-05-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6448:
---

Summary: BlockReaderLocalLegacy should set socket timeout via 
conf.socketTimeout  (was: change BlockReaderLocalLegacy timeout detail)

> BlockReaderLocalLegacy should set socket timeout via conf.socketTimeout
> ---
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the DFS client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the 
> attached patch. Most of the credit should go to [~liushaohui].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6448) BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout

2014-05-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6448:
---

  Resolution: Fixed
   Fix Version/s: 2.5.0
Target Version/s: 2.5.0
  Status: Resolved  (was: Patch Available)

> BlockReaderLocalLegacy should set socket timeout based on conf.socketTimeout
> 
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Fix For: 2.5.0
>
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the DFS client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the 
> attached patch. Most of the credit should go to [~liushaohui].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts

2014-05-28 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6442:


  Resolution: Fixed
   Fix Version/s: 2.5.0
  3.0.0
Target Version/s: 2.5.0  (was: 3.0.0)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I committed to trunk and branch-2.

Thanks for the contribution [~wuzesheng]!

> Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port 
> conflicts
> --
>
> Key: HDFS-6442
> URL: https://issues.apache.org/jira/browse/HDFS-6442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: HDFS-6442.1.patch, HDFS-6442.patch
>
>
> TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 
> 10062 to set up the mini-cluster; this may result in occasional test failures 
> when running tests with -Pparallel-tests.
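
The usual remedy, sketched below under the assumption that the tests can tolerate OS-assigned ports (illustrative, not necessarily the committed patch): bind the mini-cluster to port 0 so parallel runs never collide.

{code:java}
// Illustrative sketch: let the OS pick free ports to avoid collisions
// between tests running with -Pparallel-tests.
Configuration conf = new Configuration();
conf.set(DFSConfigKeys.DFS_NAMENODE_HTTP_ADDRESS_KEY, "127.0.0.1:0");
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .nameNodePort(0)  // 0 = any free port
    .build();
{code}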



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message

2014-05-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011374#comment-14011374
 ] 

Andrew Wang commented on HDFS-6447:
---

+1 pending, though Jenkins seems to be down right now :(

I'll keep my eye on this and re-kick the build if necessary, but please ping me 
if this isn't committed by the end of the week.

> balancer should timestamp the completion message
> 
>
> Key: HDFS-6447
> URL: https://issues.apache.org/jira/browse/HDFS-6447
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Reporter: Allen Wittenauer
>Assignee: Juan Yu
>Priority: Trivial
>  Labels: newbie
> Attachments: HDFS-6447.002.patch, HDFS-6447.patch.001
>
>
> When the balancer finishes, it doesn't report the time it finished.  It 
> should do this so that users have a better sense of how long it took to 
> complete.
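
For illustration only (variable names are assumptions, not the patch), the completion message just needs a timestamp next to the elapsed time:

{code:java}
// Illustrative sketch: stamp the balancer's completion message.
long startTime = Time.monotonicNow();
// ... balancing runs ...
long elapsedMs = Time.monotonicNow() - startTime;
System.out.println("Balancing took " + (elapsedMs / 1000.0)
    + " seconds, finished at " + new Date());
{code}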



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-05-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011357#comment-14011357
 ] 

Colin Patrick McCabe commented on HDFS-6382:


I agree with Chris' comments here.  There are just so many advantages to 
running outside the NameNode that I think that's the design we should start 
with.  If we later find something that would work better with NN support, we 
can think about it then.

Hangjun Ye wrote:
bq. Another benefit to having it inside NN is we don't have to handle the 
authentication/authorization problem in a separate system. For example we have 
a shared HDFS cluster for many internal users, we don't want someone to set TTL 
policy to other one's files. NN could handle it easily by its own 
authentication/authorization mechanism.

The client handles authentication/authorization very well, actually.  You can 
choose to run your cleanup job as superuser (can do anything) or some other 
less powerful user who is limited (safer).  But when you run inside the 
NameNode, there are no safeguards... everything is effectively superuser.  And 
you can destroy or corrupt the entire filesystem very easily that way, 
especially if your cleanup code is buggy.

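To make the external option concrete, here is a minimal sketch of such a cleanup client, with an assumed path and TTL (an illustration of the approach, not an existing tool). Because it runs as an ordinary HDFS user, the NameNode enforces permissions on every delete.

{code:java}
// Minimal sketch of an external TTL cleanup run as a non-superuser.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
long ttlMs = 30L * 24 * 60 * 60 * 1000;  // assumed ~1 month TTL
long cutoff = System.currentTimeMillis() - ttlMs;
for (FileStatus st : fs.listStatus(new Path("/backups/logs"))) {  // assumed path
  if (st.getModificationTime() < cutoff) {
    fs.delete(st.getPath(), true);  // denied by the NN if the user lacks rights
  }
}
{code}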
> HDFS File/Directory TTL
> ---
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on HDFS for some time and then have these files deleted 
> automatically. For example, we keep only 1 day's logs on the local disk due to 
> limited disk space, but we need to keep about 1 month's logs in order to 
> debug program bugs, so we keep all the logs on HDFS and delete logs which are 
> older than 1 month. This is a typical scenario for HDFS TTL, so here we 
> propose that HDFS support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after 
> the TTL expires
> 3. If a TTL is set on a directory, the child files and directories will be 
> deleted automatically after the TTL expires
> 4. The child file/directory's TTL configuration should override its parent 
> directory's
> 5. A global configuration is needed to decide whether the deleted 
> files/directories should go to the trash or not
> 6. A global configuration is needed to decide whether a directory with a TTL 
> should be deleted when it is emptied by the TTL mechanism



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6448) change BlockReaderLocalLegacy timeout detail

2014-05-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011341#comment-14011341
 ] 

stack commented on HDFS-6448:
-

+1

> change BlockReaderLocalLegacy timeout detail
> 
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the DFS client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the 
> attached patch. Most of the credit should go to [~liushaohui].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6286) adding a timeout setting for local read io

2014-05-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011337#comment-14011337
 ] 

Colin Patrick McCabe commented on HDFS-6286:


Yeah, HDFS-6450 is the more general solution, so probably the first thing to 
try.  Sounds good.

> adding a timeout setting for local read io
> --
>
> Key: HDFS-6286
> URL: https://issues.apache.org/jira/browse/HDFS-6286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
>
> Currently, if a write or a remote read is issued against a sick disk, 
> DFSClient.hdfsTimeout helps the caller return within a guaranteed time, but 
> it doesn't work for local reads. Take an HBase scan for example:
> DFSInputStream.read -> readWithStrategy -> readBuffer -> 
> BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read
> If it hits a bad disk, the low-level read I/O can take tens of seconds, and 
> what's worse, "DFSInputStream.read" always holds a lock.
> To my knowledge, there's no good mechanism to cancel a running read I/O 
> (please correct me if that's wrong), so my proposal is to add a future around 
> the read request with a timeout; if the threshold is reached, we could 
> probably add the local node to the dead-node list...
> Any thoughts?
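
A bare-bones sketch of the future-with-timeout idea from the description (the executor, buffer, and timeout variables are assumptions for illustration):

{code:java}
// Illustrative sketch: bound a local read with a Future so a sick disk
// cannot block the caller indefinitely.
ExecutorService readPool = Executors.newCachedThreadPool();
Future<Integer> pending = readPool.submit(new Callable<Integer>() {
  @Override
  public Integer call() throws IOException {
    return dataIn.read(buf);  // the potentially stuck local I/O
  }
});
try {
  int bytesRead = pending.get(readTimeoutMs, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
  pending.cancel(true);  // best effort; the blocked I/O itself may not abort
  // then mark the local node as suspect/dead and fall back to a remote read
} catch (InterruptedException | ExecutionException e) {
  // surface the underlying read failure to the caller
}
{code}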



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6448) change BlockReaderLocalLegacy timeout detail

2014-05-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011333#comment-14011333
 ] 

Colin Patrick McCabe commented on HDFS-6448:


+1.  Will commit today or tomorrow unless there's more to say here

> change BlockReaderLocalLegacy timeout detail
> 
>
> Key: HDFS-6448
> URL: https://issues.apache.org/jira/browse/HDFS-6448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6448.txt
>
>
> Our HBase is deployed on hadoop2.0. In one incident we hit HDFS-5016 on the 
> HDFS side, but we also found, on the HBase side, that the DFS client was hung 
> at getBlockReader. After reading the code, we found there is a timeout setting 
> in the current codebase, but the default hdfsTimeout value is "-1" (from 
> Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at 
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing hdfsTimeout with socketTimeout; see the 
> attached patch. Most of the credit should go to [~liushaohui].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-05-28 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011305#comment-14011305
 ] 

Chris Nauroth commented on HDFS-6382:
-

bq. ...run a job (maybe periodically) over the namespace inside the NN...

Please correct me if I misunderstood, but this sounds like execution of 
arbitrary code inside the NN process.  If so, this opens the risk of resource 
exhaustion at the NN by buggy or malicious code.  Even if there is a fork for 
process isolation, it's still sharing machine resources with the NN process.  
If the code is running as the HDFS super-user, then it has access to sensitive 
resources like the fsimage file.  If multiple such "in-process jobs" are 
submitted concurrently, then they would cause resource contention with the main 
work of the NN.  Multiple concurrent jobs also get into the realm of 
scheduling.  There are lots of tough problems here that would increase the 
complexity of the NN.

Even putting that aside, I see multiple advantages in implementing this 
externally instead of embedded inside the NN.  Here is a list of several 
problems that an embedded design would need to solve, and which I believe are 
already easily addressed by an external design.  This includes/expands on 
issues brought up by others in earlier comments too.

* Trash: The description mentions trash capability as a requirement.  Trash 
functionality is currently implemented as a client-side capability.
** Embedded: We'd need to reimplement trash inside the NN, or heavily refactor 
for code sharing.
** External: The client already has the trash capability, so this problem is 
already solved.
* Integration: Many Hadoop deployments use an alternative file system like S3 
or Azure storage.  In these deployments, there is no NameNode.
** Embedded: The feature is only usable for HDFS-based deployments.  Users of 
alternative file systems can't use the feature.
** External: The client already has the capability to target any Hadoop file 
system implementation, so this problem is already solved.
* HA: In the event of a failover, we must guarantee that the former active NN 
does not drive any expiration activity.
** Embedded: Any background thread or "in-process jobs" running inside the NN 
must coordinate shutdown during a failover.
** External: Thanks to our client-side retry policies, an external process 
automatically transitions to the new active NN after a failover, and there is 
no risk of split-brain scenario, so this problem is already solved.
* Authentication/Authorization: Who exactly is the effective user running the 
delete, and how do we manage their login and file permission enforcement?
** Embedded: You mention there is an advantage to running embedded, but I 
didn't quite understand.  Are you suggesting running the deletes inside a 
{{UserGroupInformation#doAs}} for the specific user?
** External: The client already knows how to authenticate RPC, and the NN 
already knows how to enforce authorization on files for that authenticated 
user, so this problem is already solved.
* Error Handling: How do users find out when the deletes don't work?
** Embedded: There is no mechanism for asynchronous user notification inside 
the NN.  As others have mentioned, there is a lot of complexity in this area.  
If it's email, then you need to solve the problem of reliable email delivery 
(i.e. retries if SMTP gateways are down).  If it's monitoring/alerting, then 
you need to expose new monitoring endpoints to publish sufficient information.
** External: The client's exception messages are sufficient to identify file 
paths that failed during synchronous calls, and the NN audit log is another 
source of troubleshooting information, so this problem is already solved.
* Federation: With federation, the HDFS namespace is split across multiple 
NameNodes.
** Embedded: The design needs to coordinate putting the right expiration work 
on the right NN hosting that part of the namespace.
** External: The client has the capability to configure a client-side mount 
table that joins together multiple federated namespaces, and {{ViewFileSystem}} 
then routes RPC to the correct NN depending on the target file path, so this 
problem is already solved.


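As a concrete footnote to the trash point in the list above, an external expirer can reuse the client-side trash machinery directly; a minimal sketch with an assumed path (illustrative, not a shipped tool):

{code:java}
// Sketch: move an expired path to the user's trash instead of deleting it
// permanently, falling back to delete when trash is disabled.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path expired = new Path("/backups/logs/2014-04-01");  // assumed path
if (!Trash.moveToAppropriateTrash(fs, expired, conf)) {
  fs.delete(expired, true);
}
{code}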

> HDFS File/Directory TTL
> ---
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on HDFS for some time and then have these files deleted 
> automatically. For example, we keep only 1 day's logs on the local disk due to 
> limited disk space, but we need to keep about 1 month's logs in order to 

[jira] [Commented] (HDFS-6403) Add metrics for log warnings reported by HADOOP-9618

2014-05-28 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011223#comment-14011223
 ] 

Yongjun Zhang commented on HDFS-6403:
-

Uploaded version 002 to address test failure.


> Add metrics for log warnings reported by HADOOP-9618
> 
>
> Key: HDFS-6403
> URL: https://issues.apache.org/jira/browse/HDFS-6403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6403.001.patch, HDFS-6403.002.patch
>
>
> HADOOP-9618 logs warnings when there are long GC pauses. If these are exposed 
> as metrics, then they can be monitored.


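A rough sketch of what the exposed metric could look like (class and method names here are assumptions for illustration, not the attached patch), using the metrics2 annotations:

{code:java}
// Hypothetical sketch: count long GC pauses with a metrics2 counter so
// monitoring systems can pick them up.
@Metrics(about = "JVM pause metrics", context = "dfs")
class JvmPauseMetrics {
  @Metric("GC pauses exceeding the warn threshold")
  MutableCounterLong gcWarnThresholdExceeded;

  void onWarnThresholdPause() {  // called from the pause monitor loop
    gcWarnThresholdExceeded.incr();
  }
}
{code}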

--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6403) Add metrics for log warnings reported by HADOOP-9618

2014-05-28 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6403:


Attachment: HDFS-6403.002.patch

> Add metrics for log warnings reported by HADOOP-9618
> 
>
> Key: HDFS-6403
> URL: https://issues.apache.org/jira/browse/HDFS-6403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6403.001.patch, HDFS-6403.002.patch
>
>
> HADOOP-9618 logs warnings when there are long GC pauses. If these are exposed 
> as metrics, then they can be monitored.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011185#comment-14011185
 ] 

Hudson commented on HDFS-6227:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1784 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1784/])
HDFS-6227. ShortCircuitCache#unref should purge ShortCircuitReplicas whose 
streams have been closed by java interrupts. Contributed by Colin Patrick 
McCabe. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597829)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java


> ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have 
> been closed by java interrupts
> ---
>
> Key: HDFS-6227
> URL: https://issues.apache.org/jira/browse/HDFS-6227
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jing Zhao
>Assignee: Colin Patrick McCabe
> Fix For: 2.5.0
>
> Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, 
> HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch
>
>
> While running tests in a single-node cluster, where short-circuit read is 
> enabled and multiple threads may read the same file concurrently, one of the 
> reads got ClosedChannelException and failed. For the full exception trace, see 
> the comments.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011186#comment-14011186
 ] 

Hudson commented on HDFS-6416:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1784 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1784/])
HDFS-6416. Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid 
system clock bugs. Contributed by Abhiraj Butala (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597868)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtxCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system 
> clock bugs
> 
>
> Key: HDFS-6416
> URL: https://issues.apache.org/jira/browse/HDFS-6416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.4.0
>Reporter: Brandon Li
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HDFS-6416.patch
>
>
> As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is the 
> preferred method to use since it isn't subject to system clock bugs (i.e., 
> someone resets the clock to a time in the past, and then updates don't happen 
> for a long time).
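
For illustration, the before/after pattern being applied here (a sketch; doWork() stands in for the code between the two measurements):

{code:java}
// Elapsed-time measurement with the monotonic clock: immune to wall-clock
// resets, unlike System.currentTimeMillis().
long start = Time.monotonicNow();
doWork();  // placeholder for the timed section
long elapsedMs = Time.monotonicNow() - start;
{code}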



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011184#comment-14011184
 ] 

Hudson commented on HDFS-6411:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1784 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1784/])
HDFS-6411. nfs-hdfs-gateway mount raises I/O error and hangs when a 
unauthorized user attempts to access it. Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597895)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/response/ACCESS3Response.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user 
> attempts to access it
> 
>
> Key: HDFS-6411
> URL: https://issues.apache.org/jira/browse/HDFS-6411
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Zhongyi Xie
>Assignee: Brandon Li
> Fix For: 2.4.1
>
> Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, 
> HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, 
> tcpdump-HDFS-6411-Brandon.out
>
>
> We use the NFS-HDFS gateway to expose HDFS through NFS.
> 0) Log in as root and run the NFS-HDFS gateway as a user, say, nfsserver.
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> backups  hive  mr-history  system  tmp  user
> 1) Add a user nfs-test: adduser nfs-test (make sure that this user is not a 
> proxyuser of nfsserver)
> 2) Switch to the test user: su - nfs-test
> 3) Access the HDFS NFS gateway:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot open directory /hdfs: Input/output error
> retry:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> 4) switch back to root and access hdfs nfs gateway
> [nfs-test@zhongyi-test-cluster-desktop ~]$ exit
> logout
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> The nfsserver log indicates we hit an authorization error in the RPC handler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>  User: nfsserver is not allowed to impersonate nfs-test
> and NFS3ERR_IO is returned, which explains why we see the input/output error.
> One can catch the AuthorizationException and return the correct error, 
> NFS3ERR_ACCES, to fix the error message on the client side, but that doesn't 
> solve the mount hang issue. When the mount hang happens, the gateway stops 
> writing to the nfsserver log, which makes it more difficult to figure out the 
> real cause of the hang. According to jstack and the debugger, the nfsserver 
> seems to be waiting for client requests.
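
A sketch of the error-mapping idea from the description (the surrounding handler code is assumed for illustration; see the attached patches for the actual change):

{code:java}
// Illustrative sketch: unwrap the RemoteException and map the proxy-user
// authorization failure to NFS3ERR_ACCES rather than NFS3ERR_IO.
try {
  dfsClient.getFileInfo(path);  // any HDFS call made on behalf of the NFS user
} catch (RemoteException re) {
  IOException ioe = re.unwrapRemoteException();
  if (ioe instanceof AuthorizationException) {
    return new ACCESS3Response(Nfs3Status.NFS3ERR_ACCES);
  }
  return new ACCESS3Response(Nfs3Status.NFS3ERR_IO);
}
{code}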



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011106#comment-14011106
 ] 

Hudson commented on HDFS-6416:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1757 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1757/])
HDFS-6416. Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid 
system clock bugs. Contributed by Abhiraj Butala (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597868)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtxCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system 
> clock bugs
> 
>
> Key: HDFS-6416
> URL: https://issues.apache.org/jira/browse/HDFS-6416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.4.0
>Reporter: Brandon Li
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HDFS-6416.patch
>
>
> As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is the 
> preferred method to use since it isn't subject to system clock bugs (i.e., 
> someone resets the clock to a time in the past, and then updates don't happen 
> for a long time).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011104#comment-14011104
 ] 

Hudson commented on HDFS-6411:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1757 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1757/])
HDFS-6411. nfs-hdfs-gateway mount raises I/O error and hangs when a 
unauthorized user attempts to access it. Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597895)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/response/ACCESS3Response.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user 
> attempts to access it
> 
>
> Key: HDFS-6411
> URL: https://issues.apache.org/jira/browse/HDFS-6411
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Zhongyi Xie
>Assignee: Brandon Li
> Fix For: 2.4.1
>
> Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, 
> HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, 
> tcpdump-HDFS-6411-Brandon.out
>
>
> We use the NFS-HDFS gateway to expose HDFS through NFS.
> 0) Log in as root and run the NFS-HDFS gateway as a user, say, nfsserver.
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> backups  hive  mr-history  system  tmp  user
> 1) Add a user nfs-test: adduser nfs-test (make sure that this user is not a 
> proxyuser of nfsserver)
> 2) Switch to the test user: su - nfs-test
> 3) Access the HDFS NFS gateway:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot open directory /hdfs: Input/output error
> retry:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> 4) switch back to root and access hdfs nfs gateway
> [nfs-test@zhongyi-test-cluster-desktop ~]$ exit
> logout
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> The nfsserver log indicates we hit an authorization error in the RPC handler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>  User: nfsserver is not allowed to impersonate nfs-test
> and NFS3ERR_IO is returned, which explains why we see the input/output error.
> One can catch the AuthorizationException and return the correct error, 
> NFS3ERR_ACCES, to fix the error message on the client side, but that doesn't 
> solve the mount hang issue. When the mount hang happens, the gateway stops 
> writing to the nfsserver log, which makes it more difficult to figure out the 
> real cause of the hang. According to jstack and the debugger, the nfsserver 
> seems to be waiting for client requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011105#comment-14011105
 ] 

Hudson commented on HDFS-6227:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1757 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1757/])
HDFS-6227. ShortCircuitCache#unref should purge ShortCircuitReplicas whose 
streams have been closed by java interrupts. Contributed by Colin Patrick 
McCabe. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597829)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java


> ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have 
> been closed by java interrupts
> ---
>
> Key: HDFS-6227
> URL: https://issues.apache.org/jira/browse/HDFS-6227
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jing Zhao
>Assignee: Colin Patrick McCabe
> Fix For: 2.5.0
>
> Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, 
> HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch
>
>
> While running tests in a single-node cluster, where short-circuit read is 
> enabled and multiple threads may read the same file concurrently, one of the 
> reads got ClosedChannelException and failed. For the full exception trace, see 
> the comments.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011063#comment-14011063
 ] 

Hadoop QA commented on HDFS-6453:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647104/HDFS-6453-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6994//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6994//console

This message is automatically generated.

> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6453-v2.txt, HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011027#comment-14011027
 ] 

Hudson commented on HDFS-6416:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #566 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/566/])
HDFS-6416. Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid 
system clock bugs. Contributed by Abhiraj Butala (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597868)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtxCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system 
> clock bugs
> 
>
> Key: HDFS-6416
> URL: https://issues.apache.org/jira/browse/HDFS-6416
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.4.0
>Reporter: Brandon Li
>Assignee: Abhiraj Butala
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HDFS-6416.patch
>
>
> As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is the 
> preferred method to use since it isn't subject to system clock bugs (i.e., 
> someone resets the clock to a time in the past, and then updates don't happen 
> for a long time).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011026#comment-14011026
 ] 

Hudson commented on HDFS-6227:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #566 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/566/])
HDFS-6227. ShortCircuitCache#unref should purge ShortCircuitReplicas whose 
streams have been closed by java interrupts. Contributed by Colin Patrick 
McCabe. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597829)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java


> ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have 
> been closed by java interrupts
> ---
>
> Key: HDFS-6227
> URL: https://issues.apache.org/jira/browse/HDFS-6227
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jing Zhao
>Assignee: Colin Patrick McCabe
> Fix For: 2.5.0
>
> Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, 
> HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch
>
>
> While running tests in a single-node cluster, where short-circuit read is 
> enabled and multiple threads may read the same file concurrently, one of the 
> reads got ClosedChannelException and failed. For the full exception trace, see 
> the comments.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it

2014-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011023#comment-14011023
 ] 

Hudson commented on HDFS-6411:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #566 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/566/])
HDFS-6411. nfs-hdfs-gateway mount raises I/O error and hangs when a 
unauthorized user attempts to access it. Contributed by Brandon Li (brandonli: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597895)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/response/ACCESS3Response.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user 
> attempts to access it
> 
>
> Key: HDFS-6411
> URL: https://issues.apache.org/jira/browse/HDFS-6411
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.2.0
>Reporter: Zhongyi Xie
>Assignee: Brandon Li
> Fix For: 2.4.1
>
> Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, 
> HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, 
> tcpdump-HDFS-6411-Brandon.out
>
>
> We use the NFS-HDFS gateway to expose HDFS through NFS.
> 0) Log in as root and run the NFS-HDFS gateway as a user, say, nfsserver.
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> backups  hive  mr-history  system  tmp  user
> 1) Add a user nfs-test: adduser nfs-test (make sure that this user is not a 
> proxyuser of nfsserver)
> 2) Switch to the test user: su - nfs-test
> 3) Access the HDFS NFS gateway:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot open directory /hdfs: Input/output error
> retry:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> 4) switch back to root and access hdfs nfs gateway
> [nfs-test@zhongyi-test-cluster-desktop ~]$ exit
> logout
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> The nfsserver log indicates we hit an authorization error in the RPC handler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>  User: nfsserver is not allowed to impersonate nfs-test
> and NFS3ERR_IO is returned, which explains why we see the input/output error.
> One can catch the AuthorizationException and return the correct error, 
> NFS3ERR_ACCES, to fix the error message on the client side, but that doesn't 
> solve the mount hang issue. When the mount hang happens, the gateway stops 
> writing to the nfsserver log, which makes it more difficult to figure out the 
> real cause of the hang. According to jstack and the debugger, the nfsserver 
> seems to be waiting for client requests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6453:


Attachment: HDFS-6453-v2.txt

v2 makes the test case stable.

> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6453-v2.txt, HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6454) Configurable Block Placement Policy

2014-05-28 Thread Guo Ruijing (JIRA)
Guo Ruijing created HDFS-6454:
-

 Summary: Configurable Block Placement Policy
 Key: HDFS-6454
 URL: https://issues.apache.org/jira/browse/HDFS-6454
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Guo Ruijing


In the existing implementation, the block placement priority is localhost / 
remote rack / local rack / random.
In BlockPlacementPolicyDefault, the network topology is /rack/host.
In BlockPlacementPolicyWithNodeGroup, the network topology is /rack/nodegroup/host.

This JIRA proposes that the block placement priority be configurable, e.g.:

<property>
  <name>dfs.block.replicator.priority</name>
  <value>0, 2, 1, *</value>
  <description>
    The default network topology is /level2/level1.
    The nodegroup network topology is /level3/level2/level1; the placement 
    priority can be 0 (localhost), 3 (remote rack), 2 (local rack), * (any host).
  </description>
</property>

Another example: one VM includes several dockers/containers, so the network 
topology can be /rack/nodegroup/container/host. In this case, the block 
placement priority can be
0 (localhost), 4 (remote rack), 3 (local rack), * (any host).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6453) use Time#monotonicNow to avoid system clock reset

2014-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010949#comment-14010949
 ] 

Hadoop QA commented on HDFS-6453:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647072/HDFS-6453.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.datanode.TestDiskError

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6993//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6993//console

This message is automatically generated.

> use Time#monotonicNow to avoid system clock reset
> -
>
> Key: HDFS-6453
> URL: https://issues.apache.org/jira/browse/HDFS-6453
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6453.txt
>
>
> Similar to hadoop-common, let's re-check and replace 
> System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-05-28 Thread Hangjun Ye (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010891#comment-14010891
 ] 

Hangjun Ye commented on HDFS-6382:
--

Thanks Haohui, that's clear to us now.
That's interesting, and we'd like to pursue the more general approach.
We will take time to work out a rough design and ask you guys to review it.

> HDFS File/Directory TTL
> ---
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on HDFS for some time and then have these files deleted 
> automatically. For example, we keep only 1 day's logs on the local disk due to 
> limited disk space, but we need to keep about 1 month's logs in order to 
> debug program bugs, so we keep all the logs on HDFS and delete logs which are 
> older than 1 month. This is a typical scenario for HDFS TTL, so here we 
> propose that HDFS support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after 
> the TTL expires
> 3. If a TTL is set on a directory, the child files and directories will be 
> deleted automatically after the TTL expires
> 4. The child file/directory's TTL configuration should override its parent 
> directory's
> 5. A global configuration is needed to decide whether the deleted 
> files/directories should go to the trash or not
> 6. A global configuration is needed to decide whether a directory with a TTL 
> should be deleted when it is emptied by the TTL mechanism



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-05-28 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010879#comment-14010879
 ] 

Haohui Mai commented on HDFS-6382:
--

bq. Your suggestion is that we'd better have a general mechanism/framework to 
run a job (maybe periodically) over the namespace inside the NN, and the TTL 
policy is just a specific job that might be implemented by user?

This is correct. There are a couple of additional use cases that might be 
useful to keep in mind:

# Archiving data. TTL is one of the use cases here.
# Backing up or syncing data between clusters. It's nice to back up or sync 
data between clusters for disaster recovery, without running an MR job.
# Balancing data between data nodes.

A mechanism that can support the above use cases can be quite powerful and 
improve the state of the art. I'm happy to collaborate if this is the direction 
you guys want to pursue.

bq. We are heavy users of Hadoop and also do some in-house improvements per our 
business requirement. We definitely want to contribute the improvements back to 
community.

This is great to hear. Patches are welcome.

> HDFS File/Directory TTL
> ---
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.4.0
>Reporter: Zesheng Wu
>Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on HDFS for some time and then have these files deleted 
> automatically. For example, we keep only 1 day's logs on the local disk due to 
> limited disk space, but we need to keep about 1 month's logs in order to 
> debug program bugs, so we keep all the logs on HDFS and delete logs which are 
> older than 1 month. This is a typical scenario for HDFS TTL, so here we 
> propose that HDFS support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after 
> the TTL expires
> 3. If a TTL is set on a directory, the child files and directories will be 
> deleted automatically after the TTL expires
> 4. The child file/directory's TTL configuration should override its parent 
> directory's
> 5. A global configuration is needed to decide whether the deleted 
> files/directories should go to the trash or not
> 6. A global configuration is needed to decide whether a directory with a TTL 
> should be deleted when it is emptied by the TTL mechanism



--
This message was sent by Atlassian JIRA
(v6.2#6252)