[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Status: Patch Available  (was: In Progress)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: jira-HDFS-6379.patch

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021718#comment-14021718
 ] 

Hadoop QA commented on HDFS-6379:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12648919/jira-HDFS-6379.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1283 javac 
compiler warnings (more than the trunk's current 1277 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-httpfs:

  
org.apache.hadoop.fs.http.client.TestHttpFSFileSystemLocalFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7059//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7059//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7059//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7059//console

This message is automatically generated.

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6494) In some cases, the hedged read can lead to an infinite client wait.

2014-06-09 Thread LiuLei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LiuLei updated HDFS-6494:
-

Attachment: hedged-read-bug.patch

 In some cases, the hedged read can lead to an infinite client wait.
 --

 Key: HDFS-6494
 URL: https://issues.apache.org/jira/browse/HDFS-6494
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: LiuLei
Assignee: Liang Xie
 Attachments: hedged-read-bug.patch


 When I use hedged read and there is only one live datanode, if reading from
 that datanode throws a TimeoutException or ChecksumException, the client will
 wait forever.
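
A minimal sketch of the missing guard (hypothetical names, not the actual DFSInputStream code): once every outstanding hedged attempt has failed, the wait loop must throw instead of blocking on take() forever.

{code}
import java.io.IOException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class HedgedReadLoop {
  // Waits for the first successful hedged read; throws once all attempts
  // have failed instead of blocking forever on an empty completion queue.
  static byte[] waitForFirstResult(CompletionService<byte[]> hedgedService,
                                   int outstandingAttempts) throws IOException {
    int failures = 0;
    while (failures < outstandingAttempts) {
      try {
        Future<byte[]> done = hedgedService.take(); // blocks for next result
        return done.get();                          // success: return the data
      } catch (ExecutionException e) {
        failures++;  // TimeoutException/ChecksumException end up here
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted while waiting for hedged read", e);
      }
    }
    throw new IOException("All " + outstandingAttempts + " hedged reads failed");
  }
}
{code}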



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6494) In some cases, the hedged read can lead to an infinite client wait.

2014-06-09 Thread LiuLei (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021729#comment-14021729
 ] 

LiuLei commented on HDFS-6494:
--

Hi Liang,
I uploaded a patch; I hope it is helpful for you.

 In some cases, the hedged read can lead to an infinite client wait.
 --

 Key: HDFS-6494
 URL: https://issues.apache.org/jira/browse/HDFS-6494
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: LiuLei
Assignee: Liang Xie
 Attachments: hedged-read-bug.patch


 When I use hedged read and there is only one live datanode, if reading from
 that datanode throws a TimeoutException or ChecksumException, the client will
 wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5442) Zero loss HDFS data replication for multiple datacenters

2014-06-09 Thread Dian Fu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated HDFS-5442:
--

Attachment: Disaster Recovery Solution for Hadoop.pdf

Updated the design doc to add some implementation details.

 Zero loss HDFS data replication for multiple datacenters
 

 Key: HDFS-5442
 URL: https://issues.apache.org/jira/browse/HDFS-5442
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Avik Dey
Assignee: Dian Fu
 Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster 
 Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf


 Hadoop is architected to operate efficiently at scale for normal hardware 
 failures within a datacenter. Hadoop is not designed today to handle 
 datacenter failures. Although HDFS is not designed for nor deployed in 
 configurations spanning multiple datacenters, replicating data from one 
 location to another is common practice for disaster recovery and global 
 service availability. There are current solutions available for batch 
 replication using data copy/export tools. However, while providing some 
 backup capability for HDFS data, they do not provide the capability to 
 recover all your HDFS data from a datacenter failure and be up and running 
 again with a fully operational Hadoop cluster in another datacenter in a 
 matter of minutes. For disaster recovery from a datacenter failure, we should 
 provide a fully distributed, zero data loss, low latency, high throughput and 
 secure HDFS data replication solution for multiple datacenter setup.
 Design and code for Phase-1 to follow soon.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6465) Enable the configuration of multiple clusters

2014-06-09 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021786#comment-14021786
 ] 

Dian Fu commented on HDFS-6465:
---

Updated some design details about the configurations:

Requirements:
1.  Existing deployments must be able to use the existing configuration 
without any change.
2.  As many configurations as possible should be the same across clusters; 
the cluster-specific configuration required should be minimal.

Configurations added:
•   DFS_REGION_ID(dfs.region.id) : the region id of current cluster
•   DFS_REGIONS(dfs.regions) : the region ids of all clusters, including 
both the primary cluster and mirror clusters
•   DFS_REGION_PRIMARY(dfs.region.primary) : the region id of primary 
cluster

Configurations must be suffixed with regionId:
DFS_NAMENODE_RPC_ADDRESS_KEY, DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, 
DFS_NAMENODE_HTTP_ADDRESS_KEY, DFS_NAMENODE_HTTPS_ADDRESS_KEY, 
DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY and DFS_NAMENODE_BACKUP_ADDRESS_KEY 

Configurations that may optionally be suffixed with regionId:
These include all the configurations in NameNode.NAMENODE_SPECIFIC_KEYS and 
NameNode.NAMESERVICE_SPECIFIC_KEYS, except the above configurations that must 
be suffixed with regionId:
DFS_NAMENODE_RPC_BIND_HOST_KEY, DFS_NAMENODE_NAME_DIR_KEY, 
DFS_NAMENODE_EDITS_DIR_KEY, DFS_NAMENODE_SHARED_EDITS_DIR_KEY, 
DFS_NAMENODE_CHECKPOINT_DIR_KEY, DFS_NAMENODE_CHECKPOINT_EDITS_DIR_KEY, 
DFS_NAMENODE_SERVICE_RPC_BIND_HOST_KEY, DFS_NAMENODE_HTTP_BIND_HOST_KEY,
DFS_NAMENODE_HTTPS_BIND_HOST_KEY, DFS_NAMENODE_KEYTAB_FILE_KEY, 
DFS_NAMENODE_SECONDARY_HTTPS_ADDRESS_KEY, 
DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY, DFS_NAMENODE_BACKUP_HTTP_ADDRESS_KEY, 
DFS_NAMENODE_BACKUP_SERVICE_RPC_ADDRESS_KEY, 
DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY, 
DFS_NAMENODE_KERBEROS_INTERNAL_SPNEGO_PRINCIPAL_KEY, DFS_HA_FENCE_METHODS_KEY, 
DFS_HA_ZKFC_PORT_KEY and DFS_HA_AUTO_FAILOVER_ENABLED_KEY
The above configurations can be configured in the following format to 
distinguish between clusters:
configuration key.nameservice id.namenode id.region id
If a configuration with a region id as suffix cannot be found, the 
configuration without region id as suffix will be used instead.

All other configurations which aren’t mentioned should not be suffixed with 
regionId.
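
A minimal sketch of the lookup order described above (a hypothetical helper, assuming Hadoop's Configuration API; the real resolution logic would live wherever the NN-specific keys are resolved today):

{code}
import org.apache.hadoop.conf.Configuration;

class RegionConfUtil {
  // Resolve "key.nsId.nnId.regionId" first, falling back to
  // "key.nsId.nnId" when no region-suffixed value exists.
  static String getWithRegionFallback(Configuration conf, String key,
      String nsId, String nnId, String regionId) {
    String base = key + "." + nsId + "." + nnId;
    String value = conf.get(base + "." + regionId);
    return value != null ? value : conf.get(base);
  }
}
{code}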

 Enable the configuration of multiple clusters
 -

 Key: HDFS-6465
 URL: https://issues.apache.org/jira/browse/HDFS-6465
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Dian Fu
Assignee: Dian Fu
 Attachments: HDFS-6465.1.patch, HDFS-6465.2.patch, HDFS-6465.patch


 Tracks the changes required for configuration DR.
 configurations added:
 DFS_REGION_ID(dfs.region.id) : the region id of current cluster
 DFS_REGIONS(dfs.regions) : the region ids of all clusters, including 
 both the primary cluster and mirror cluster
 DFS_REGION_PRIMARY(dfs.region.primary) : the region id of primary 
 cluster
 configurations modified:
 The configurations in NAMENODE.NAMENODE_SPECIFIC_KEYS can be configured 
 in the following format to distinguish between clusters.
 If a configuration with a region id as suffix cannot be found, the 
 configuration without region id as suffix will be used instead:
 configuration key.nameservice id.namenode id.region id
 The configurations in NAMENODE.NAMESERVICE_SPECIFIC_KEYS can be configured in 
 the following format to distinguish between clusters.
 If a configuration with a region id as suffix cannot be found, the 
 configuration without region id as suffix will be used instead:
 configuration key.nameservice id.region id



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6382) HDFS File/Directory TTL

2014-06-09 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-6382:
-

Attachment: HDFS-TTL-Design.pdf

An initial version of design doc.

 HDFS File/Directory TTL
 ---

 Key: HDFS-6382
 URL: https://issues.apache.org/jira/browse/HDFS-6382
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, namenode
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
 Attachments: HDFS-TTL-Design.pdf


 In production environment, we always have scenario like this, we want to 
 backup files on hdfs for some time and then hope to delete these files 
 automatically. For example, we keep only 1 day's logs on local disk due to 
 limited disk space, but we need to keep about 1 month's logs in order to 
 debug program bugs, so we keep all the logs on hdfs and delete logs which are 
 older than 1 month. This is a typical scenario of HDFS TTL. So here we 
 propose that hdfs can support TTL.
 Following are some details of this proposal:
 1. HDFS can support TTL on a specified file or directory
 2. If a TTL is set on a file, the file will be deleted automatically after 
 the TTL is expired
 3. If a TTL is set on a directory, the child files and directories will be 
 deleted automatically after the TTL is expired
 4. The child file/directory's TTL configuration should override its parent 
 directory's
 5. A global configuration is needed to configure that whether the deleted 
 files/directories should go to the trash or not
 6. A global configuration is needed to configure that whether a directory 
 with TTL should be deleted when it is emptied by TTL mechanism or not.
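
A minimal sketch of the expiry rule in points 2-4 (hypothetical names; the attached design doc is the authoritative proposal):

{code}
class TtlPolicy {
  static final long NO_TTL = -1;

  // Point 4: a child's own TTL, if set, overrides the nearest ancestor's.
  static long effectiveTtl(long ownTtlMs, long inheritedTtlMs) {
    return ownTtlMs != NO_TTL ? ownTtlMs : inheritedTtlMs;
  }

  // Points 2-3: an entry is eligible for automatic deletion once its
  // effective TTL has elapsed since its modification time.
  static boolean isExpired(long mtimeMs, long ttlMs, long nowMs) {
    return ttlMs != NO_TTL && nowMs - mtimeMs >= ttlMs;
  }
}
{code}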



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Zesheng Wu (JIRA)
Zesheng Wu created HDFS-6503:


 Summary: Fix typo of DFSAdmin restoreFailedStorage
 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor


Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-6503:
-

Status: Patch Available  (was: Open)

 Fix typo of DFSAdmin restoreFailedStorage
 -

 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor
 Attachments: HDFS-6503.patch


 Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-6503:
-

Attachment: HDFS-6503.patch

 Fix typo of DFSAdmin restoreFailedStorage
 -

 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor
 Attachments: HDFS-6503.patch


 Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2014-06-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6481:
-

Assignee: Ted Yu

 DatanodeManager#getDatanodeStorageInfos() should check the length of 
 storageIDs
 ---

 Key: HDFS-6481
 URL: https://issues.apache.org/jira/browse/HDFS-6481
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hdfs-6481-v1.txt


 Ian Brooks reported the following stack trace:
 {code}
 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
 /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
 hdfs.DFSClient: DataStreamer Exception
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
 at org.apache.hadoop.ipc.Client.call(Client.java:1347)
 at org.apache.hadoop.ipc.Client.call(Client.java:1300)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
 at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
 syncer encountered error, will retry. txid=211
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
 at 
 

[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2014-06-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6481:
-

Target Version/s: 2.5.0

 DatanodeManager#getDatanodeStorageInfos() should check the length of 
 storageIDs
 ---

 Key: HDFS-6481
 URL: https://issues.apache.org/jira/browse/HDFS-6481
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hdfs-6481-v1.txt


 Ian Brooks reported the following stack trace:
 {code}
 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
 /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
 hdfs.DFSClient: DataStreamer Exception
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
 at org.apache.hadoop.ipc.Client.call(Client.java:1347)
 at org.apache.hadoop.ipc.Client.call(Client.java:1300)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
 at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
 syncer encountered error, will retry. txid=211
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
 at 
 

[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2014-06-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025216#comment-14025216
 ] 

Kihwal Lee commented on HDFS-6481:
--

We can add sanity checks, but this should not happen unless we have a bug 
somewhere. The root cause needs to be addressed.
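
For illustration, the kind of sanity check being discussed might look like the following (a hypothetical sketch; the real code is in DatanodeManager#getDatanodeStorageInfos):

{code}
// Fail a malformed request with a descriptive error instead of an
// ArrayIndexOutOfBoundsException when the arrays are mismatched.
static void checkStorageIDs(Object[] datanodeIDs, String[] storageIDs)
    throws java.io.IOException {
  if (storageIDs.length < datanodeIDs.length) {
    throw new java.io.IOException("Expected " + datanodeIDs.length
        + " storage IDs but got " + storageIDs.length);
  }
}
{code}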

 DatanodeManager#getDatanodeStorageInfos() should check the length of 
 storageIDs
 ---

 Key: HDFS-6481
 URL: https://issues.apache.org/jira/browse/HDFS-6481
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hdfs-6481-v1.txt


 Ian Brooks reported the following stack trace:
 {code}
 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
 /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
 hdfs.DFSClient: DataStreamer Exception
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
 at org.apache.hadoop.ipc.Client.call(Client.java:1347)
 at org.apache.hadoop.ipc.Client.call(Client.java:1300)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
 at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
 syncer encountered error, will retry. txid=211
 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
  0
 at 
 

[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025281#comment-14025281
 ] 

Hadoop QA commented on HDFS-6503:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649235/HDFS-6503.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7060//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7060//console

This message is automatically generated.

 Fix typo of DFSAdmin restoreFailedStorage
 -

 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor
 Attachments: HDFS-6503.patch


 Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Zesheng Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025285#comment-14025285
 ] 

Zesheng Wu commented on HDFS-6503:
--

This just fixes a typo, so there's no need to add new tests.

 Fix typo of DFSAdmin restoreFailedStorage
 -

 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor
 Attachments: HDFS-6503.patch


 Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6403) Add metrics for log warnings reported by HADOOP-9618

2014-06-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025326#comment-14025326
 ] 

Yongjun Zhang commented on HDFS-6403:
-

Hi [~tlipcon], as we chatted earlier, I'd appreciate it if you could help 
review the patch. Thanks.


 Add metrics for log warnings reported by HADOOP-9618
 

 Key: HDFS-6403
 URL: https://issues.apache.org/jira/browse/HDFS-6403
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.4.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
 Attachments: HDFS-6403.001.patch, HDFS-6403.002.patch


 HADOOP-9618 logs warnings when there are long GC pauses. If this is exposed 
 as a metric, then they can be monitored.
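
A minimal sketch of exposing such a counter through the metrics2 API (hypothetical class and metric names; registration with the metrics system is omitted):

{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

@Metrics(about = "JVM pause monitor metrics", context = "dfs")
class JvmPauseMetrics {
  @Metric("Number of GC pauses longer than the warn threshold")
  MutableCounterLong warnThresholdPauses;

  // Called by the pause monitor each time it logs a long-pause warning.
  void incrWarnPause() {
    warnThresholdPauses.incr();
  }
}
{code}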



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6159) TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails if there is block missing after balancer success

2014-06-09 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025373#comment-14025373
 ] 

Arpit Agarwal commented on HDFS-6159:
-

Hi [~djp], looks unrelated to HDFS-6362 from a quick look. I also took a quick 
look at HDFS-6424 and it appears unrelated.

Please feel free to file a separate Jira for the test failure and attach the 
logs/analysis.

 TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails if there is block 
 missing after balancer success
 --

 Key: HDFS-6159
 URL: https://issues.apache.org/jira/browse/HDFS-6159
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.3.0
Reporter: Chen He
Assignee: Chen He
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-6159-v2.patch, HDFS-6159-v2.patch, HDFS-6159.patch, 
 logs.txt


 TestBalancerWithNodeGroup.testBalancerWithNodeGroup will report a false 
 failure if one or more data blocks are lost after the balancer successfully 
 finishes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file

2014-06-09 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025391#comment-14025391
 ] 

Uma Maheswara Rao G commented on HDFS-2006:
---

XAttr support for DistCP (MAPREDUCE-5898) is now committed to trunk, so I plan 
to merge this to branch-2. Do we need a separate vote for this?
What do you say, [~cnauroth] and [~andrew.wang]?

 ability to support storing extended attributes per file
 ---

 Key: HDFS-2006
 URL: https://issues.apache.org/jira/browse/HDFS-2006
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: HDFS XAttrs (HDFS-2006)
Reporter: dhruba borthakur
Assignee: Yi Liu
 Fix For: 3.0.0

 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, 
 HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, 
 HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, 
 xattrs.1.patch, xattrs.patch


 It would be nice if HDFS provides a feature to store extended attributes for 
 files, similar to the one described here: 
 http://en.wikipedia.org/wiki/Extended_file_attributes. 
 The challenge is that it has to be done in such a way that a site not using 
 this feature does not waste precious memory resources in the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage

2014-06-09 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025418#comment-14025418
 ] 

Akira AJISAKA commented on HDFS-6503:
-

+1 (non-binding).

 Fix typo of DFSAdmin restoreFailedStorage
 -

 Key: HDFS-6503
 URL: https://issues.apache.org/jira/browse/HDFS-6503
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
Priority: Minor
 Attachments: HDFS-6503.patch


 Fix typo: restoreFaileStorage should be restoreFailedStorage



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025438#comment-14025438
 ] 

Andrew Wang commented on HDFS-6257:
---

Idea looks good; the current check definitely seems racy. Only question: maybe 
we should try to check more deterministically, e.g. pause DN cache reports and 
wait for a few refresh intervals (1s each) before doing the check, as in the 
sketch below.
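
A rough sketch of that more deterministic variant (hypothetical test hooks; pauseCacheReports() and countCacheCommandsSent() stand in for whatever the test harness would actually provide):

{code}
// Freeze DN cache reports, wait out a few 1s refresh intervals, then
// assert the NN sent no extra CACHE commands during that window.
void checkNoExtraCacheCommands() throws Exception {
  pauseCacheReports();                     // hypothetical test hook
  long before = countCacheCommandsSent();  // hypothetical counter
  Thread.sleep(3 * 1000);                  // a few refresh intervals (1s each)
  org.junit.Assert.assertEquals(
      "Namenode should not send extra CACHE commands",
      0, countCacheCommandsSent() - before);
}
{code}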

 TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
 ---

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:0 but was:2
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands 
 expected:0 but was:2
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025453#comment-14025453
 ] 

Andrew Wang commented on HDFS-2006:
---

Hey Uma,  we haven't done a vote for previous branch-2 merges (e.g. caching, 
ACLs). If you post a patch or a link to a branch, I'd be happy to review. 
Unless you already plan to do something similar, I can also do a full branch-2 
test run on our internal jenkins.

 ability to support storing extended attributes per file
 ---

 Key: HDFS-2006
 URL: https://issues.apache.org/jira/browse/HDFS-2006
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: HDFS XAttrs (HDFS-2006)
Reporter: dhruba borthakur
Assignee: Yi Liu
 Fix For: 3.0.0

 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, 
 HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, 
 HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, 
 xattrs.1.patch, xattrs.patch


 It would be nice if HDFS provides a feature to store extended attributes for 
 files, similar to the one described here: 
 http://en.wikipedia.org/wiki/Extended_file_attributes. 
 The challenge is that it has to be done in such a way that a site not using 
 this feature does not waste precious memory resources in the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc

2014-06-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025455#comment-14025455
 ] 

Daryn Sharp commented on HDFS-2856:
---

Chris asked that I take a look, so I'll try to review this week.

 Fix block protocol so that Datanodes don't require root or jsvc
 ---

 Key: HDFS-2856
 URL: https://issues.apache.org/jira/browse/HDFS-2856
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, security
Affects Versions: 3.0.0, 2.4.0
Reporter: Owen O'Malley
Assignee: Chris Nauroth
 Attachments: Datanode-Security-Design.pdf, 
 Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, 
 HDFS-2856.1.patch, HDFS-2856.prototype.patch


 Since we send the block tokens unencrypted to the datanode, we currently 
 start the datanode as root using jsvc to get a secure (< 1024) port.
 If we have the datanode generate a nonce and send it on the connection, and 
 the client sends an HMAC of the nonce back instead of the block token, it 
 won't reveal any secrets. Thus, we wouldn't require a secure port and would 
 not require root or jsvc.
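
A minimal sketch of the challenge-response described above, assuming the shared block-token secret serves as the HMAC key (names are illustrative; the eventual design may differ):

{code}
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class NonceChallenge {
  // Client side: prove knowledge of the shared secret without ever
  // sending the block token itself over the unencrypted connection.
  static byte[] respond(byte[] nonceFromDatanode, byte[] sharedSecret)
      throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
    return mac.doFinal(nonceFromDatanode);
  }
}
{code}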



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025472#comment-14025472
 ] 

Andrew Wang commented on HDFS-6460:
---

Hey Yongjun, thanks for working on this. Just one review comment: two of the 
DNs have the same IP of 11.11.11.11. Otherwise +1 pending Jenkins.

 To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
 

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk

2014-06-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025495#comment-14025495
 ] 

Colin Patrick McCabe commented on HDFS-6257:


The current check should always succeed if the code being tested is correct, so 
it's not racy in that sense.  We could wait for more DN cache reports, but 
since the DNs are full they shouldn't change.  Since we test the cache reports 
elsewhere, I think it's probably fine as-is, what do you think?

 TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
 ---

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:0 but was:2
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands 
 expected:0 but was:2
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025499#comment-14025499
 ] 

Jing Zhao commented on HDFS-6315:
-

The patch looks good to me in general. Some comments:
# After moving persistBlocks/persistNewBlocks/closeFile from FSDirectory to 
FSNamesystem, we may no longer need to add DIR* FSDirectory into the log 
information.
# Looks like FSNamesystem#persistBlocks(INodeFile, boolean) can be removed. We 
can just call persistBlocks(String, INodeFile, boolean) instead.
# In FSNamesystem#setQuota, logSync cannot be called inside the write lock:
{code}
+  INodeDirectory changed = dir.setQuota(path, nsQuota, dsQuota);
+  if (changed != null) {
+final Quota.Counts q = changed.getQuotaCounts();
+getEditLog().logSetQuota(path,
+q.get(Quota.NAMESPACE), q.get(Quota.DISKSPACE));
+getEditLog().logSync();
+  }
 } finally {
   writeUnlock();
 }
-getEditLog().logSync();
{code}
# A typo in the java comment:
{code}
-  // if src indicates a snapshot file, we need to make sure the 
returned
+  // if src inSicates a snapshot file, we need to make sure the 
returned
{code}
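
For reference, the usual FSNamesystem pattern for point 3 looks roughly like this (a sketch; applySetQuota is a hypothetical stand-in for the actual mutation):

{code}
// Mutate the namespace and record the edit while holding the write lock,
// but sync the edit log only after releasing it, since logSync can block
// on I/O and must not stall other namespace operations.
writeLock();
boolean changed = false;
try {
  changed = applySetQuota(path, nsQuota, dsQuota);  // hypothetical helper
  if (changed) {
    getEditLog().logSetQuota(path, nsQuota, dsQuota);
  }
} finally {
  writeUnlock();
}
if (changed) {
  getEditLog().logSync();
}
{code}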

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implements durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: jira-HDFS-6379.patch

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch, jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Status: In Progress  (was: Patch Available)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: (was: jira-HDFS-6379.patch)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: (was: jira-HDFS-6379.patch)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025502#comment-14025502
 ] 

Andrew Wang commented on HDFS-6257:
---

Hmm, I guess that's good enough. +1, thanks Colin.

 TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
 ---

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:0 but was:2
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands 
 expected:0 but was:2
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: jira-HDFS-6379.patch

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: (was: jira-HDFS-6379.patch)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Status: Patch Available  (was: In Progress)

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally

2014-06-09 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6257:
---

Summary: TestCacheDirectives#testExceedsCapacity fails occasionally  (was: 
TestCacheDirectives#testExceedsCapacity fails occasionally in trunk)

 TestCacheDirectives#testExceedsCapacity fails occasionally
 --

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:0 but was:2
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands 
 expected:0 but was:2
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Yoder updated HDFS-6379:
-

Attachment: jira-HDFS-6379.patch

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA tracks adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025504#comment-14025504
 ] 

Andrew Wang commented on HDFS-6493:
---

I think we should keep the default value as disabled, but the property and this 
value should be documented in hdfs-default.xml.

It'd also be nice (if not true already) to pretty print this value on NN 
startup, e.g. 30 minutes rather than 1800 seconds. It'd actually be nice 
follow-on work to look for similarly unfriendly values in the logs and pretty 
print them. There are some time-related functions in DFSUtil (e.g. 
durationToString, dateToIso8601String), but feel free to write your own 
functions too.
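
A quick sketch of the kind of pretty printer meant here (illustrative only; DFSUtil#durationToString may already cover part of this):

{code}
import java.util.concurrent.TimeUnit;

class DurationFormat {
  // Render a duration in the largest sensible unit, e.g.
  // 1800s -> "30 minutes", 7200s -> "2 hours".
  static String pretty(long seconds) {
    if (seconds >= TimeUnit.HOURS.toSeconds(1)) {
      return (seconds / 3600) + " hours";
    }
    if (seconds >= TimeUnit.MINUTES.toSeconds(1)) {
      return (seconds / 60) + " minutes";
    }
    return seconds + " seconds";
  }
}
{code}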

 Propose to change dfs.namenode.startup.delay.block.deletion to second 
 instead of millisecond
 --

 Key: HDFS-6493
 URL: https://issues.apache.org/jira/browse/HDFS-6493
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Trivial

 Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, 
 the delay will be at least 30 minutes or even hours. It's not very user 
 friendly to use milliseconds when the value is likely measured in hours.
 I suggest making the following changes:
 1. change the unit of this config to seconds
 2. rename the config key from dfs.namenode.startup.delay.block.deletion.ms 
 to dfs.namenode.startup.delay.block.deletion.sec
 3. add the default value to hdfs-default.xml; what's a reasonable value, 30 
 minutes, one hour?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally

2014-06-09 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6257:
---

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

Committed, thanks.

 TestCacheDirectives#testExceedsCapacity fails occasionally
 --

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:0 but was:2
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands 
 expected:0 but was:2
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6399) FSNamesystem ACL operations should check isPermissionEnabled

2014-06-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HDFS-6399:
-

Assignee: Chris Nauroth  (was: Charles Lamb)

 FSNamesystem ACL operations should check isPermissionEnabled
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Chris Nauroth
Priority: Minor
 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6330) Move mkdirs() to FSNamesystem

2014-06-09 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6330:


Summary: Move mkdirs() to FSNamesystem  (was: Move mkdir() to FSNamesystem)

 Move mkdirs() to FSNamesystem
 -

 Key: HDFS-6330
 URL: https://issues.apache.org/jira/browse/HDFS-6330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6330.000.patch, HDFS-6330.001.patch


 Currently mkdir() automatically creates all ancestors for a directory. This 
 is implemented in FSDirectory, by calling unprotectedMkdir() along the path. 
 This jira proposes to move the function to FSNamesystem to simplify the 
 primitive that FSDirectory needs to provide.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6315:
-

Attachment: HDFS-6315.005.patch

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario

2014-06-09 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6478:
--

Attachment: HDFS-6478.patch

The patch has the following:

1. Modify the proxy chain order for NamenodeProtocol and ClientProtocol so that
NamenodeProtocolTranslatorPB/ClientNamenodeProtocolTranslatorPB directly call
NamenodeProtocolPB and ClientNamenodeProtocolPB for the non-HA case (see the
sketch below).
2. Update the unit test TestFileCreation to verify the retry count. This
depends on HADOOP-10673, so the patch also includes HADOOP-10673 so that it can
be submitted to run the unit tests.
3. Simplify the RemoteException retry policy setup in NameNodeProxies.
4. Remove the unnecessary retry policy for the create method in
DatanodeProtocolClientSideTranslatorPB.
5. DatanodeProtocolClientSideTranslatorPB still has the old proxy order. Leave
it as is, given that DatanodeProtocol doesn't do retries. We can open a
separate jira for DatanodeProtocol retries if that becomes necessary.
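
For item 1, a rough sketch of the intended wrapping order (variable names are
illustrative, not the exact patch):

{code}
// The retry proxy should sit outside the PB translator, so that the translator
// unwraps ServiceException into RemoteException before RetryInvocationHandler
// decides whether to retry.
ClientProtocol translator =
    new ClientNamenodeProtocolTranslatorPB(rawPbProxy); // unwraps ServiceException
ClientProtocol retrying = (ClientProtocol) RetryProxy.create(
    ClientProtocol.class, translator, defaultPolicy);   // retries see RemoteException
{code}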

 RemoteException can't be retried properly for non-HA scenario
 -

 Key: HDFS-6478
 URL: https://issues.apache.org/jira/browse/HDFS-6478
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-6478.patch


 For the HA case, the call stack is DFSClient -> RetryInvocationHandler ->
 ClientNamenodeProtocolTranslatorPB -> ProtobufRpcEngine. ProtobufRpcEngine
 throws ServiceException and expects the caller to unwrap it;
 ClientNamenodeProtocolTranslatorPB is the component that takes care of that.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy26.getFileInfo
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
 at sun.reflect.GeneratedMethodAccessor24.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy27.getFileInfo
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus
 {noformat}
 However, for the non-HA case, the call stack is DFSClient ->
 ClientNamenodeProtocolTranslatorPB -> RetryInvocationHandler ->
 ProtobufRpcEngine. RetryInvocationHandler gets the ServiceException, so the
 call can't be retried properly.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy9.getListing
 at sun.reflect.NativeMethodAccessorImpl.invoke0
 at sun.reflect.NativeMethodAccessorImpl.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy9.getListing
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing
 at org.apache.hadoop.hdfs.DFSClient.listPaths
 {noformat}
 Perhaps we can fix it by having NN wrap RetryInvocationHandler around
 ClientNamenodeProtocolTranslatorPB and the other PBs, instead of the current
 wrap order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario

2014-06-09 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6478:
--

Status: Patch Available  (was: Open)

 RemoteException can't be retried properly for non-HA scenario
 -

 Key: HDFS-6478
 URL: https://issues.apache.org/jira/browse/HDFS-6478
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-6478.patch


 For the HA case, the call stack is DFSClient -> RetryInvocationHandler ->
 ClientNamenodeProtocolTranslatorPB -> ProtobufRpcEngine. ProtobufRpcEngine
 throws ServiceException and expects the caller to unwrap it;
 ClientNamenodeProtocolTranslatorPB is the component that takes care of that.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy26.getFileInfo
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
 at sun.reflect.GeneratedMethodAccessor24.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy27.getFileInfo
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus
 {noformat}
 However, for the non-HA case, the call stack is DFSClient ->
 ClientNamenodeProtocolTranslatorPB -> RetryInvocationHandler ->
 ProtobufRpcEngine. RetryInvocationHandler gets the ServiceException, so the
 call can't be retried properly.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy9.getListing
 at sun.reflect.NativeMethodAccessorImpl.invoke0
 at sun.reflect.NativeMethodAccessorImpl.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy9.getListing
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing
 at org.apache.hadoop.hdfs.DFSClient.listPaths
 {noformat}
 Perhaps we can fix it by having NN wrap RetryInvocationHandler around
 ClientNamenodeProtocolTranslatorPB and the other PBs, instead of the current
 wrap order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6399) Add note about setfacl in HDFS permissions guide

2014-06-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6399:
--

Summary: Add note about setfacl in HDFS permissions guide  (was: 
FSNamesystem ACL operations should check isPermissionEnabled)

 Add note about setfacl in HDFS permissions guide
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Chris Nauroth
Priority: Minor
 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6315:
-

Attachment: (was: HDFS-6315.005.patch)

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6399) Add note about setfacl in HDFS permissions guide

2014-06-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6399:
--

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Chris.

 Add note about setfacl in HDFS permissions guide
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Chris Nauroth
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6399) FSNamesystem ACL operations should check isPermissionEnabled

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025509#comment-14025509
 ] 

Andrew Wang commented on HDFS-6399:
---

+1, thanks Chris, will commit shortly.

 FSNamesystem ACL operations should check isPermissionEnabled
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally

2014-06-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025541#comment-14025541
 ] 

Hudson commented on HDFS-6257:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5668 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5668/])
HDFS-6257. TestCacheDirectives#testExceedsCapacity fails occasionally (cmccabe) 
(cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1601473)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCacheDirectives.java


 TestCacheDirectives#testExceedsCapacity fails occasionally
 --

 Key: HDFS-6257
 URL: https://issues.apache.org/jira/browse/HDFS-6257
 Project: Hadoop HDFS
  Issue Type: Test
  Components: caching
Affects Versions: 2.4.0
Reporter: Ted Yu
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6257.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ :
 REGRESSION:  
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity
 {code}
 Error Message:
 Namenode should not send extra CACHE commands expected:<0> but was:<2>
 Stack Trace:
 java.lang.AssertionError: Namenode should not send extra CACHE commands
 expected:<0> but was:<2>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6399) Add note about setfacl in HDFS permissions guide

2014-06-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025540#comment-14025540
 ] 

Chris Nauroth commented on HDFS-6399:
-

Andrew, thank you for reviewing and committing.

 Add note about setfacl in HDFS permissions guide
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Chris Nauroth
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025516#comment-14025516
 ] 

Haohui Mai commented on HDFS-6315:
--

Thanks Jing for the review. I've uploaded the v5 patch to address Jing's 
comments.

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6399) Add note about setfacl in HDFS permissions guide

2014-06-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1402#comment-1402
 ] 

Hudson commented on HDFS-6399:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5669 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5669/])
HDFS-6399. Add note about setfacl in HDFS permissions guide. Contributed by 
Chris Nauroth. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1601476)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsPermissionsGuide.apt.vm


 Add note about setfacl in HDFS permissions guide
 

 Key: HDFS-6399
 URL: https://issues.apache.org/jira/browse/HDFS-6399
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation, namenode
Affects Versions: 2.4.0
Reporter: Charles Lamb
Assignee: Chris Nauroth
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch


 The ACL operations in FSNamesystem don't currently check isPermissionEnabled 
 before calling checkOwner(). This patch corrects that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025559#comment-14025559
 ] 

Arpit Agarwal commented on HDFS-6482:
-

{{DFS_DATANODE_NUMBLOCKS_DEFAULT}} is currently 64. I am not sure why the 
default was set so low. It would be good to know the reason before we change 
the behavior. It was quite possibly an arbitrary choice.

After ~4 million blocks we would start putting more than 256 blocks in each 
leaf subdirectory. With every 4M blocks, we'd add 256 files to each leaf. I 
think this is fine since 4 million blocks itself is going to be very unlikely. 
I recall as late as Vista NTFS directory listings would get noticeably slow 
with thousands of files per directory. Is there any performance loss with 
always having three levels of subdirectories, restricting each to 256 children 
at the most?

- Who removes empty subdirectories when blocks are deleted?
- Let's avoid suffixing hex numerals to subdir for consistency with the 
existing naming convention.
- StringBuilder looks unnecessary in {{idToBlockDir}}.
- We should add a release note stating that {{DFS_DATANODE_NUMBLOCKS_DEFAULT}} 
is obsolete.

The approach looks good and a big +1 for removing LDir.
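
For reference, a sketch of what the ID-based mapping could look like, assuming
two 256-way levels keyed off bits of the block ID (the actual patch may slice
the ID differently):

{code}
// Derive a fixed two-level directory from the block ID; no StringBuilder needed.
static String idToBlockDir(long blockId) {
  int d1 = (int) ((blockId >> 16) & 0xff); // first level: 256 dirs
  int d2 = (int) ((blockId >> 8) & 0xff);  // second level: 256 dirs
  return "subdir" + d1 + "/subdir" + d2;
}
{code}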

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-06-09 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6460:


Attachment: HDFS-6460.002.patch

Hi Andrew,

Thanks a lot for the review and the good catch. I'm uploading a new revision to
address it.



 To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
 

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6315:
-

Attachment: HDFS-6315.005.patch

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-06-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025585#comment-14025585
 ] 

Colin Patrick McCabe commented on HDFS-6382:


For the MR strategy, it seems like this could be parallelized fairly easily.  
For example, if you have 5 MR tasks, you can calculate the hash of each path, 
and then task 1 can do all the paths that are 0 mod 5, task 2 can do all the 
paths that are 1 mod 5, and so forth.  MR also doesn't introduce extra 
dependencies since HDFS and MR are packaged together.
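
A tiny sketch of the hash partitioning idea (numTasks, myTaskId, and
processPath are placeholders, not real APIs):

{code}
// Each of numTasks workers claims the paths whose hash falls in its bucket.
int bucket = (path.hashCode() & Integer.MAX_VALUE) % numTasks; // non-negative
if (bucket == myTaskId) {
  processPath(path); // this task owns the path
}
{code}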

I don't understand what you mean by "the mapreduce strategy will have
additional overheads."  What overheads are you foreseeing?

It is true that you need to avoid overloading the NameNode.  But this is a 
concern with any approach, not just the MR one.  It would be good to see a 
section on this.  I think the simplest way to do it is to rate-limit RPCs to 
the NameNode to a configurable rate.
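
A minimal sketch of such rate limiting, using Guava's RateLimiter (assuming a
Guava version that ships it; the config key is hypothetical):

{code}
// Cap NameNode RPCs issued by the scanner at a configurable rate.
RateLimiter limiter = RateLimiter.create(
    conf.getDouble("ttl.scanner.max.rpcs.per.second", 100.0));
limiter.acquire();             // blocks until a permit is available
fs.delete(expiredPath, false); // one NN RPC per permit
{code}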

bq. \[for the standalone daemon\] The major advantage of this approach is that 
we don’t need any extra work to finish the TTL work, all will be done in the 
daemon automatically. 

I don't understand what you mean by this.  What will be done automatically?

How are you going to implement HA for the standalone daemon?  I suppose if all 
the state is kept in HDFS, you can simply restart it when it fails.  However, 
it seems like you need to checkpoint how far along in the FS you are, so that 
if you die and later get restarted, you don't have to redo the whole FS scan.  
This implies reading directories in alphabetical order, or similar.  You also 
need to somehow record when the last scan was, perhaps in a file in HDFS.

I don't see a lot of discussion of logging and monitoring in general.  How is 
the user going to become aware that a file was deleted because of a TTL?  Or if 
there is an error during the delete, how will the user know?  Logging is one 
choice here.  Creating a file in HDFS is another.

The setTtl command seems reasonable.  Does this need to be an administrator 
command?

 HDFS File/Directory TTL
 ---

 Key: HDFS-6382
 URL: https://issues.apache.org/jira/browse/HDFS-6382
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, namenode
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
 Attachments: HDFS-TTL-Design.pdf


 In production environments, we often have a scenario like this: we want to
 back up files on HDFS for some time and then delete them automatically. For
 example, we keep only 1 day's logs on local disk due to limited disk space,
 but we need to keep about 1 month's logs in order to debug program bugs, so
 we keep all the logs on HDFS and delete logs that are older than 1 month.
 This is a typical scenario for HDFS TTL, so here we propose that HDFS
 support TTL.
 Following are some details of this proposal:
 1. HDFS can support TTL on a specified file or directory
 2. If a TTL is set on a file, the file will be deleted automatically after
 the TTL expires
 3. If a TTL is set on a directory, the child files and directories will be
 deleted automatically after the TTL expires
 4. A child file/directory's TTL configuration should override its parent
 directory's
 5. A global configuration is needed to specify whether deleted
 files/directories should go to the trash or not
 6. A global configuration is needed to specify whether a directory with a
 TTL should be deleted when it is emptied by the TTL mechanism or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025608#comment-14025608
 ] 

Hadoop QA commented on HDFS-6379:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649411/jira-HDFS-6379.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-httpfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7062//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7062//console

This message is automatically generated.

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA is for adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025693#comment-14025693
 ] 

Colin Patrick McCabe commented on HDFS-6482:


bq. DFS_DATANODE_NUMBLOCKS_DEFAULT is currently 64. I am not sure why the 
default was set so low. It would be good to know the reason before we change 
the behavior. It was quite possibly an arbitrary choice.

So, back in the really old days (think ext2), there were performance issues for 
directories with a large number of files (10,000+).  See wikipedia's page on 
ext2 here: http://en.wikipedia.org/wiki/Ext2.  The LDir subdirectory mechanism 
was intended to alleviate this.

More recent filesystems like ext4 (and recent revisions of ext3) have what's 
called directory indices.  This basically means that there is an index which 
allows you to look up a particular entry in a directory in less than O(N) time. 
 This makes having directories with a huge number of entries possible.

It's still nice to have multiple directories to avoid overloading {{readdir}} 
(when we have to do that-- for example, to find a metadata file without knowing 
its genstamp) and to make inspecting things easier.  Plus, it allows us to stay 
compatible with systems that don't handle giant directories well.

bq. After ~4 million blocks we would start putting more than 256 blocks in each 
leaf subdirectory. With every 4M blocks, we'd add 256 files to each leaf. I 
think this is fine since 4 million blocks itself is going to be very unlikely. 
I recall as late as Vista NTFS directory listings would get noticeably slow 
with thousands of files per directory. Is there any performance loss with 
always having three levels of subdirectories, restricting each to 256 children 
at the most?

It's an interesting idea, but after all, as you pointed out, even to get to 
1,024 blocks per subdirectory (which still isn't thousands but is a single 
thousand) under James' scheme would require 16 million blocks.  At that point, 
it seems like there will be other problems.  We can always evolve the directory 
and metadata naming structure again once 16 million blocks is on the horizon 
(and we probably will have to do other things too, like investigate off-heap 
memory storage).

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025713#comment-14025713
 ] 

Daryn Sharp commented on HDFS-6315:
---

Catching up from summit, will look at this soon.  It's sadly conflicting with 
the single path resolution patch I keep working on.

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025740#comment-14025740
 ] 

Jing Zhao commented on HDFS-6315:
-

bq. It's sadly conflicting with the single path resolution patch I keep working 
on.

Thanks for the comments, [~daryn]. This patch only makes limited changes in
FSDirectory. Most changes just move the FSEditLog#logxxx calls into
FSNamesystem. Thus the rebase should not be complicated, I guess.
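
A toy illustration of the split (not HDFS code, just the shape of the
responsibility boundary):

{code}
// The "directory" only mutates state; the "namesystem" records the edit.
class Dir {
  final java.util.Set<String> paths = new java.util.HashSet<String>();
  void unprotectedMkdir(String p) { paths.add(p); } // namespace change only
}
class Namesystem {
  final Dir dir = new Dir();
  final java.util.List<String> editLog = new java.util.ArrayList<String>();
  void mkdirs(String p) {
    dir.unprotectedMkdir(p);   // 1) mutate the namespace
    editLog.add("MKDIR " + p); // 2) record durability at this layer
  }
}
{code}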

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6330) Move mkdirs() to FSNamesystem

2014-06-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025730#comment-14025730
 ] 

Jing Zhao commented on HDFS-6330:
-

The patch looks good to me. Some minors:
# Let's use this chance to remove the empty javadoc of FSDirectory#normalizePath
# The following change may be unnecessary?
{code}
-  blockManager.getDatanodeManager().clearPendingCachingCommands();
-  blockManager.getDatanodeManager().setShouldSendCachingCommands(false);
-  // Don't want to keep replication queues when not in Active.
-  blockManager.clearQueues();
+  if (blockManager != null) {
+blockManager.getDatanodeManager().clearPendingCachingCommands();
+blockManager.getDatanodeManager().setShouldSendCachingCommands(false);
+// Don't want to keep replication queues when not in Active.
+blockManager.clearQueues();
+  }
{code}
# Nit: Some lines exceed the 80 character limit (e.g., mkdirsRecursively and 
addSymlink).
# We may need to update the log message in mkdirsRecursively since it's no
longer an FSDirectory call.


 Move mkdirs() to FSNamesystem
 -

 Key: HDFS-6330
 URL: https://issues.apache.org/jira/browse/HDFS-6330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6330.000.patch, HDFS-6330.001.patch


 Currently mkdir() automatically creates all ancestors for a directory. This 
 is implemented in FSDirectory, by calling unprotectedMkdir() along the path. 
 This jira proposes to move the function to FSNamesystem to simplify the 
 primitive that FSDirectory needs to provide.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025761#comment-14025761
 ] 

Kihwal Lee commented on HDFS-6482:
--

BlockIDs are sequential nowadays. With the proposed block distribution method,  
leaf dirs can get severely unbalanced, especially in smaller clusters.  Besides 
the cost of looking up entries in a directory, directory lock contention can 
become high and hurt performance if many files are created and read from a 
small set of directories. I think limiting the number to 64 kind of imposed a 
cap on how contentious it can be.  We might do better by more evenly 
distributing blocks. 
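
To illustrate the concern with the bit-slicing sketch earlier in the thread:
sequentially allocated IDs fill one leaf 256 blocks at a time before moving on.

{noformat}
idToBlockDir(0)   .. idToBlockDir(255) -> subdir0/subdir0  (all 256 in one leaf)
idToBlockDir(256) .. idToBlockDir(511) -> subdir0/subdir1
{noformat}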

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025773#comment-14025773
 ] 

Hadoop QA commented on HDFS-6315:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649417/HDFS-6315.005.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7061//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7061//console

This message is automatically generated.

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025783#comment-14025783
 ] 

Daryn Sharp commented on HDFS-6315:
---

Maybe it's ok, but I'll apply the patch and comment in the morning.

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025786#comment-14025786
 ] 

James Thomas commented on HDFS-6482:


Thanks for the review, Arpit, and thanks for the follow-up, Colin. I want to
clarify one thing -- the numbers 4 million and 16 million that both of you
mention are, as far as I understand, actually numbers of blocks for the ENTIRE
cluster, not just a single DN. If we had a cluster of 16 million blocks (with
sequential block IDs), we could in theory have a single DN with a directory as
large as 1024 entries, if we got unlucky with the assignment of blocks to DNs.
Assuming a uniform distribution of blocks across the DNs in the cluster and a
maximum # of blocks per DN of 2^24, we have an expected # of blocks per
directory of 256. I don't know how accurate this assumption is.
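
The arithmetic behind that estimate, spelled out (assuming the two 256-way
directory levels discussed above):

{code}
long maxBlocksPerDn  = 1L << 24;                   // 2^24 blocks on a single DN
long leafDirs        = 256L * 256L;                // two 256-way levels = 2^16 leaves
long expectedPerLeaf = maxBlocksPerDn / leafDirs;  // = 256 blocks per leaf
{code}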

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025802#comment-14025802
 ] 

James Thomas commented on HDFS-6482:


Kihwal, we were considering using some sort of deterministic probing (as in 
hash tables) to find less full directories if the initial directory for a block 
is full. Do you think the cost (and additional complexity) of this sort of 
scheme is justified given the relatively low probability (given the uniform 
block distribution assumption, at least) of directory blowup?

Additionally, I want to note that if the total number of blocks in the cluster 
is N, N/2^16 is a strict upper bound on the number of blocks in a single 
directory on any DN, assuming completely sequential block IDs. So for a small
cluster we won't see any blowup.

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025803#comment-14025803
 ] 

Hadoop QA commented on HDFS-6315:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649417/HDFS-6315.005.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7063//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7063//console

This message is automatically generated.

 Decouple recording edit logs from FSDirectory
 -

 Key: HDFS-6315
 URL: https://issues.apache.org/jira/browse/HDFS-6315
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, 
 HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, 
 HDFS-6315.005.patch


 Currently both FSNamesystem and FSDirectory record edit logs. This design 
 requires both FSNamesystem and FSDirectory to be tightly coupled together to 
 implement a durable namespace.
 This jira proposes to separate the responsibility of implementing the 
 namespace and providing durability with edit logs. Specifically, FSDirectory 
 implements the namespace (which should have no edit log operations), and 
 FSNamesystem implement durability by recording the edit logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file

2014-06-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025823#comment-14025823
 ] 

Chris Nauroth commented on HDFS-2006:
-

I agree with Andrew on the plan for merging to branch-2.  Thank you, Uma.

 ability to support storing extended attributes per file
 ---

 Key: HDFS-2006
 URL: https://issues.apache.org/jira/browse/HDFS-2006
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: HDFS XAttrs (HDFS-2006)
Reporter: dhruba borthakur
Assignee: Yi Liu
 Fix For: 3.0.0

 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, 
 HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, 
 HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, 
 xattrs.1.patch, xattrs.patch


 It would be nice if HDFS provided a feature to store extended attributes for 
 files, similar to the one described here: 
 http://en.wikipedia.org/wiki/Extended_file_attributes. 
 The challenge is that it has to be done in such a way that a site not using 
 this feature does not waste precious memory resources in the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025843#comment-14025843
 ] 

Hadoop QA commented on HDFS-6478:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649415/HDFS-6478.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7064//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7064//console

This message is automatically generated.

 RemoteException can't be retried properly for non-HA scenario
 -

 Key: HDFS-6478
 URL: https://issues.apache.org/jira/browse/HDFS-6478
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-6478.patch


 For the HA case, the call stack is DFSClient -> RetryInvocationHandler ->
 ClientNamenodeProtocolTranslatorPB -> ProtobufRpcEngine. ProtobufRpcEngine
 throws ServiceException and expects the caller to unwrap it;
 ClientNamenodeProtocolTranslatorPB is the component that takes care of that.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy26.getFileInfo
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
 at sun.reflect.GeneratedMethodAccessor24.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy27.getFileInfo
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus
 {noformat}
 However, for the non-HA case, the call stack is DFSClient ->
 ClientNamenodeProtocolTranslatorPB -> RetryInvocationHandler ->
 ProtobufRpcEngine. RetryInvocationHandler gets the ServiceException, so the
 call can't be retried properly.
 {noformat}
 at org.apache.hadoop.ipc.Client.call
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
 at com.sun.proxy.$Proxy9.getListing
 at sun.reflect.NativeMethodAccessorImpl.invoke0
 at sun.reflect.NativeMethodAccessorImpl.invoke
 at sun.reflect.DelegatingMethodAccessorImpl.invoke
 at java.lang.reflect.Method.invoke
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
 at com.sun.proxy.$Proxy9.getListing
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing
 at org.apache.hadoop.hdfs.DFSClient.listPaths
 {noformat}
 Perhaps we can fix it by having NN wrap RetryInvocationHandler around
 ClientNamenodeProtocolTranslatorPB and the other PBs, instead of the current
 wrap order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025855#comment-14025855
 ] 

Andrew Wang commented on HDFS-6395:
---

I should have realized this earlier, considering I worked on something pretty 
similar with the fs-limits and the edit log before. I agree that it's difficult 
to do this without some serious code gymnastics, so let's just table the entire 
thing for now. Please resolve this if you agree, thanks again [~hitliuyi].

 Assorted improvements to xattr limit checking
 -

 Key: HDFS-6395
 URL: https://issues.apache.org/jira/browse/HDFS-6395
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Yi Liu
 Attachments: HDFS-6395.patch


 It'd be nice to print messages during fsimage and editlog loading if we hit 
 either the # of xattrs per inode or the xattr size limits.
 We should also consider making the # of xattrs limit only apply to the user 
 namespace, or to each namespace separately, to prevent users from locking out 
 access to other namespaces.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025943#comment-14025943
 ] 

Alejandro Abdelnur commented on HDFS-6379:
--

[~michaelbyoder], nice work. Would you mind adding a testcase where ACLs are
disabled in HDFS, to verify that having them disabled does not break file
status and list status? After that I think it is ready to go.

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA is for adding that support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025944#comment-14025944
 ] 

Hadoop QA commented on HDFS-6460:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649427/HDFS-6460.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.ha.TestZKFailoverControllerStress

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7066//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7066//console

This message is automatically generated.

 To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
 

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support

2014-06-09 Thread Mike Yoder (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025956#comment-14025956
 ] 

Mike Yoder commented on HDFS-6379:
--

Looks like there's another me out there!  I'm [~yoderme], not that 
other...uh...guy with my name. :-)

[~tucu00], I totally agree with that test case, but can you send a quick 
pointer as to how to do that in an automated fashion?  All the test cases I've 
seen fire up the server part once at the start and leave it running for all 
tests.  Any way to change the server conf dynamically?

Thanks, -Mike
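
For reference, one way to arrange such a test is to bring up a separate 
mini-cluster whose configuration has ACLs switched off, and run the status 
calls against it. A minimal sketch, assuming the standard MiniDFSCluster test 
scaffolding (the path and assertions below are illustrative, not from the 
patch):

{code}
// Sketch: a cluster started with ACLs disabled must still serve
// getFileStatus/listStatus. Standard Hadoop test imports assumed
// (Configuration, DFSConfigKeys, MiniDFSCluster, FileSystem, Path, Assert).
Configuration conf = new Configuration();
conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_ACLS_ENABLED_KEY, false);
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
try {
  FileSystem fs = cluster.getFileSystem();
  Path dir = new Path("/aclsOff");            // illustrative path
  fs.mkdirs(dir);
  // Neither call should fail just because ACLs are off on the NameNode.
  Assert.assertNotNull(fs.getFileStatus(dir));
  Assert.assertTrue(fs.listStatus(new Path("/")).length >= 1);
} finally {
  cluster.shutdown();
}
{code}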

 HTTPFS - Implement ACLs support
 ---

 Key: HDFS-6379
 URL: https://issues.apache.org/jira/browse/HDFS-6379
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Alejandro Abdelnur
Assignee: Mike Yoder
 Fix For: 2.4.0

 Attachments: jira-HDFS-6379.patch


 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS.
 This JIRA is for such.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6504) NFS: invalid Keytab/principal entry should shutdown nfs server

2014-06-09 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-6504:


 Summary: NFS: invalid Keytab/principal entry should shutdown nfs 
server
 Key: HDFS-6504
 URL: https://issues.apache.org/jira/browse/HDFS-6504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora


An invalid value in 'dfs.nfs.keytab.file' or 'dfs.nfs.kerberos.principal' 
should shut down the NFS server.

Currently the NFS server does not throw any error or shut down if an invalid 
value is entered for either of these properties.
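
A fail-fast sketch of the requested behavior, using the stock 
SecurityUtil.login entry point (the surrounding error handling and logging 
are illustrative):

{code}
// Sketch: terminate the NFS gateway if the Kerberos login cannot succeed,
// instead of continuing silently with a broken configuration.
try {
  SecurityUtil.login(conf, "dfs.nfs.keytab.file",
      "dfs.nfs.kerberos.principal");
} catch (IOException e) {
  LOG.fatal("NFS gateway failed to log in with the configured "
      + "keytab/principal, shutting down", e);
  ExitUtil.terminate(1, e);
}
{code}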




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-06-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025964#comment-14025964
 ] 

Yongjun Zhang commented on HDFS-6460:
-

The failed test is unrelated to this patch; it was already reported as HADOOP-10668.


 To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
 

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025970#comment-14025970
 ] 

Andrew Wang commented on HDFS-6460:
---

+1 will commit shortly, thanks Yongjun

 To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
 

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not

2014-06-09 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025972#comment-14025972
 ] 

Brandon Li commented on HDFS-6439:
--

[~atm], are you still working on this? If you are distracted by other tasks, I 
will upload a new patch based on yours.

 NFS should not reject NFS requests to the NULL procedure whether port 
 monitoring is enabled or not
 --

 Key: HDFS-6439
 URL: https://issues.apache.org/jira/browse/HDFS-6439
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.4.0
Reporter: Brandon Li
Assignee: Aaron T. Myers
 Attachments: HDFS-6439.patch, HDFS-6439.patch, 
 linux-nfs-disallow-request-from-nonsecure-port.pcapng, 
 mount-nfs-requests.pcapng


 As discussed in HDFS-6406, this JIRA is to track the following updates:
 1. Port monitoring is the feature name used by traditional NFS servers, so we 
 may want to rename the config property (along with the related variable 
 allowInsecurePorts) to something like dfs.nfs.port.monitoring.
 2. According to RFC 2623 (http://www.rfc-editor.org/rfc/rfc2623.txt):
 {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT 
 reject NFS requests to the NULL procedure (procedure number 0). See 
 subsection 2.3.1, "NULL procedure", for a complete explanation. {quote}
 I do notice that NFS clients (most of the time) send the mount NULL and nfs 
 NULL calls from a non-privileged port. If we deny the NULL call in mountd or 
 the nfs server, the client can't mount the export even as user root.
 3. It would be nice to have the user guide updated for the port monitoring 
 feature.
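
In code terms, the RFC behavior amounts to special-casing procedure 0 ahead 
of the port check. A sketch (the helper names are illustrative, not the 
actual RPC layer's):

{code}
// Sketch: answer the NULL procedure unconditionally; apply the
// privileged-port check only to the remaining procedures.
if (procedure == 0) {                      // NULLPROC per ONC RPC
  sendAcceptedReply(xid, client);          // always answered
} else if (!allowInsecurePorts && remotePort >= 1024) {
  sendAuthErrorReply(xid, client);         // reject non-privileged ports
} else {
  dispatch(procedure, request, client);    // normal handling
}
{code}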



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025977#comment-14025977
 ] 

Hadoop QA commented on HDFS-6439:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12646408/linux-nfs-disallow-request-from-nonsecure-port.pcapng
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7067//console

This message is automatically generated.

 NFS should not reject NFS requests to the NULL procedure whether port 
 monitoring is enabled or not
 --

 Key: HDFS-6439
 URL: https://issues.apache.org/jira/browse/HDFS-6439
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.4.0
Reporter: Brandon Li
Assignee: Aaron T. Myers
 Attachments: HDFS-6439.patch, HDFS-6439.patch, 
 linux-nfs-disallow-request-from-nonsecure-port.pcapng, 
 mount-nfs-requests.pcapng


 As discussed in HDFS-6406, this JIRA is to track the following updates:
 1. Port monitoring is the feature name used by traditional NFS servers, so we 
 may want to rename the config property (along with the related variable 
 allowInsecurePorts) to something like dfs.nfs.port.monitoring.
 2. According to RFC 2623 (http://www.rfc-editor.org/rfc/rfc2623.txt):
 {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT 
 reject NFS requests to the NULL procedure (procedure number 0). See 
 subsection 2.3.1, "NULL procedure", for a complete explanation. {quote}
 I do notice that NFS clients (most of the time) send the mount NULL and nfs 
 NULL calls from a non-privileged port. If we deny the NULL call in mountd or 
 the nfs server, the client can't mount the export even as user root.
 3. It would be nice to have the user guide updated for the port monitoring 
 feature.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6460:
--

Summary: Ignore stale and decommissioned nodes in 
NetworkTopology#sortByDistance  (was: To ignore stale/decommissioned nodes in 
NetworkTopology#pseudoSortByDistance)

 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-06-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025980#comment-14025980
 ] 

Colin Patrick McCabe commented on HDFS-6482:


bq. Suppose we had a cluster of 16 million blocks (with sequential block IDs), 
we could in theory have a single DN with a directory as large as 1024 entries, 
if we got unlucky with the assignment of blocks to DNs.

I don't think this calculation is right.

Even if all the blocks end up on a single DN (maximally unbalanced), in a 16 
million block cluster, you have  (16 * 1024 * 1024) / (256 * 256) = 256 entries 
per directory.

To confirm this calculation, I ran this test program:
{code}
#include <inttypes.h>
#include <stdio.h>

#define MAX_A 256
#define MAX_B 256

uint64_t dir_entries[MAX_A][MAX_B];

int main(void)
{
  uint64_t i, j, l, a, b, c;
  uint64_t max = (16LL * 1024LL * 1024LL);

  for (i = 0; i < max; i++) {
    l = (i & 0x00ffLL);
    a = (i & 0xff00LL) >> 8LL;
    b = (i & 0x00ff0000LL) >> 16LL;
    c = (i & 0xffffffffff000000LL) >> 16LL;
    c |= l;
    //printf("%02" PRIx64 "/%02" PRIx64 "/%012" PRIx64 "\n", a, b, c);
    dir_entries[a][b]++;
  }
  max = 0;
  for (i = 0; i < MAX_A; i++) {
    for (j = 0; j < MAX_B; j++) {
      if (max < dir_entries[i][j]) {
        max = dir_entries[i][j];
      }
    }
  }
  printf("max entries per directory = %" PRId64 "\n", max);
  return 0;
}
{code}

bq. we were considering using some sort of deterministic probing (as in hash 
tables) to find less full directories if the initial directory for a block is 
full...

I don't think probing is a good idea.  It's going to slow things down in the 
common case when we're reading a block.

Maybe we should add another layer in the hierarchy so that we know we won't get 
big directories even on huge clusters.
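
For reference, a rough Java rendering of the mapping the test program above 
exercises (the method name and byte order are illustrative of the two-level 
scheme under discussion, not quoted from the patch):

{code}
// Sketch: derive a two-level directory path from a block ID by using two
// of its bytes as subdirectory indices; the remaining bits name the file.
static String idToBlockDir(long blockId) {
  int d1 = (int) ((blockId >> 8)  & 0xFF);   // first-level subdir
  int d2 = (int) ((blockId >> 16) & 0xFF);   // second-level subdir
  return String.format("subdir%d/subdir%d", d1, d2);
}
{code}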

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.0
Reporter: James Thomas
Assignee: James Thomas
 Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025981#comment-14025981
 ] 

Andrew Wang commented on HDFS-6460:
---

Committed this to trunk. Yongjun, do you mind prepping a branch-2 patch too? 
There's another test that needs to be updated.

 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025992#comment-14025992
 ] 

Hudson commented on HDFS-6460:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5671 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5671/])
HDFS-6460. Ignore stale and decommissioned nodes in 
NetworkTopology#sortByDistance. Contributed by Yongjun Zhang. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1601535)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopologyWithNodeGroup.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestNetworkTopologyWithNodeGroup.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6470) TestBPOfferService.testBPInitErrorHandling is flaky

2014-06-09 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6470:
--

Attachment: HDFS-6470.patch

It seems the test has the following issues.

1. It asserts that the number of BPServiceActors is 2 after BPOfferService has 
started. One of the BPServiceActors could have shut down due to an 
initBlockPool failure by the time the assert is called.

2. It assumes the first BPServiceActor is healthy and uses it for blockReport 
verification. It is possible that the second BPServiceActor is the healthy one.

The patch moves the size check to before BPOfferService starts. In addition, as 
long as one of the BPServiceActors can send a blockReport, the test is 
considered to pass.
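
In sketch form, the relaxed block-report check might look like the following, 
using the stock GenericTestUtils.waitFor (the mock-NameNode accessors are 
illustrative, not the test's actual names):

{code}
// Sketch: the test passes as long as either BPServiceActor manages to
// deliver a block report within the timeout.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    // Illustrative accessors on the two mocked NameNodes.
    return mockNN1.getBlockReportCount() > 0
        || mockNN2.getBlockReportCount() > 0;
  }
}, 100, 10000);
{code}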

 TestBPOfferService.testBPInitErrorHandling is flaky
 ---

 Key: HDFS-6470
 URL: https://issues.apache.org/jira/browse/HDFS-6470
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
 Attachments: HDFS-6470.patch


 Saw some test flakage in a test-patch run, stacktrace:
 {code}
 java.lang.AssertionError: expected:2 but was:1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBPInitErrorHandling(TestBPOfferService.java:334)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6470) TestBPOfferService.testBPInitErrorHandling is flaky

2014-06-09 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6470:
--

Status: Patch Available  (was: Open)

 TestBPOfferService.testBPInitErrorHandling is flaky
 ---

 Key: HDFS-6470
 URL: https://issues.apache.org/jira/browse/HDFS-6470
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
 Attachments: HDFS-6470.patch


 Saw some test flakage in a test-patch run, stacktrace:
 {code}
 java.lang.AssertionError: expected:2 but was:1
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBPInitErrorHandling(TestBPOfferService.java:334)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond

2014-06-09 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-6493 started by Juan Yu.

 Propose to change dfs.namenode.startup.delay.block.deletion to second 
 instead of millisecond
 --

 Key: HDFS-6493
 URL: https://issues.apache.org/jira/browse/HDFS-6493
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Trivial
 Attachments: HDFS-6493.001.patch


 Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, 
 the delay will be at least 30 minutes or even hours. It's not very 
 user-friendly to use milliseconds when the delay is likely measured in hours.
 I suggest making the following changes:
 1. Change the unit of this config to seconds.
 2. Rename the config key from dfs.namenode.startup.delay.block.deletion.ms 
 to dfs.namenode.startup.delay.block.deletion.sec.
 3. Add a default value to hdfs-default.xml. What's a reasonable value: 30 
 minutes, or one hour?
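
The conversion itself stays trivial once the key is read in seconds; a sketch 
with the proposed key name (the default of 0 is just a placeholder):

{code}
// Sketch: read the proposed seconds-based key and convert to ms once,
// internally. Key name as proposed above; the default is a placeholder.
long delaySec = conf.getLong(
    "dfs.namenode.startup.delay.block.deletion.sec", 0L);
long delayMs = TimeUnit.SECONDS.toMillis(delaySec);
{code}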



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond

2014-06-09 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-6493:
--

Attachment: HDFS-6493.001.patch

 Propose to change dfs.namenode.startup.delay.block.deletion to second 
 instead of millisecond
 --

 Key: HDFS-6493
 URL: https://issues.apache.org/jira/browse/HDFS-6493
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Trivial
 Attachments: HDFS-6493.001.patch


 Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, 
 the delay will be at least 30 minutes or even hours. It's not very 
 user-friendly to use milliseconds when the delay is likely measured in hours.
 I suggest making the following changes:
 1. Change the unit of this config to seconds.
 2. Rename the config key from dfs.namenode.startup.delay.block.deletion.ms 
 to dfs.namenode.startup.delay.block.deletion.sec.
 3. Add a default value to hdfs-default.xml. What's a reasonable value: 30 
 minutes, or one hour?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond

2014-06-09 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-6493:
--

Attachment: (was: HDFS-6493.001.patch)

 Propose to change dfs.namenode.startup.delay.block.deletion to second 
 instead of millisecond
 --

 Key: HDFS-6493
 URL: https://issues.apache.org/jira/browse/HDFS-6493
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Trivial

 Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, 
 the delay will be at least 30 minutes or even hours. It's not very 
 user-friendly to use milliseconds when the delay is likely measured in hours.
 I suggest making the following changes:
 1. Change the unit of this config to seconds.
 2. Rename the config key from dfs.namenode.startup.delay.block.deletion.ms 
 to dfs.namenode.startup.delay.block.deletion.sec.
 3. Add a default value to hdfs-default.xml. What's a reasonable value: 30 
 minutes, or one hour?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6502) incorrect description in distcp2 document

2014-06-09 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned HDFS-6502:
---

Assignee: Akira AJISAKA

 incorrect description in distcp2 document
 -

 Key: HDFS-6502
 URL: https://issues.apache.org/jira/browse/HDFS-6502
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0
Reporter: Yongjun Zhang
Assignee: Akira AJISAKA

 In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite
 The first statement of the Update and Overwrite section says:
 {quote}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files even if they 
 exist at the source, or have the same contents.
 {quote}
 The Command Line Options table says:
 {quote}
   -overwrite: Overwrite destination
   -update: Overwrite if src size different from dst size
 {quote}
 Based on the implementation, making the following modification would be more 
 accurate:
 The first statement of the Update and Overwrite section:
 {code}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files if they exist at 
 the target.
 {code}
 The Command Line Options table:
 {code}
   -overwrite: Overwrite destination
   -update: Overwrite destination if source and destination have different 
 contents
 {code}
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6502) incorrect description in distcp2 document

2014-06-09 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-6502:


Attachment: HDFS-6502.patch

Thanks [~yzhangal] for the report. Attaching a patch for trunk and branch-2.

 incorrect description in distcp2 document
 -

 Key: HDFS-6502
 URL: https://issues.apache.org/jira/browse/HDFS-6502
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0
Reporter: Yongjun Zhang
Assignee: Akira AJISAKA
 Attachments: HDFS-6502.patch


 In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite
 The first statement of the Update and Overwrite section says:
 {quote}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files even if they 
 exist at the source, or have the same contents.
 {quote}
 The Command Line Options table says:
 {quote}
   -overwrite: Overwrite destination
   -update: Overwrite if src size different from dst size
 {quote}
 Based on the implementation, making the following modification would be more 
 accurate:
 The first statement of the Update and Overwrite section:
 {code}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files if they exist at 
 the target.
 {code}
 The Command Line Options table:
 {code}
   -overwrite: Overwrite destination
   -update: Overwrite destination if source and destination have different 
 contents
 {code}
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6460:


Attachment: HDFS-6460-branch2.001.patch

 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, 
 HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6502) incorrect description in distcp2 document

2014-06-09 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-6502:


   Labels: newbie  (was: )
 Target Version/s: 2.5.0
Affects Version/s: (was: 2.4.0)
   2.5.0
   1.2.1
   Status: Patch Available  (was: Open)

 incorrect description in distcp2 document
 -

 Key: HDFS-6502
 URL: https://issues.apache.org/jira/browse/HDFS-6502
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.2.1, 2.5.0
Reporter: Yongjun Zhang
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: HDFS-6502.patch


 In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite
 The first statement of the Update and Overwrite section says:
 {quote}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files even if they 
 exist at the source, or have the same contents.
 {quote}
 The Command Line Options table says:
 {quote}
   -overwrite: Overwrite destination
   -update: Overwrite if src size different from dst size
 {quote}
 Based on the implementation, making the following modification would be more 
 accurate:
 The first statement of the Update and Overwrite section:
 {code}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files if they exist at 
 the target.
 {code}
 The Command Line Options table:
 {code}
   -overwrite: Overwrite destination
   -update: Overwrite destination if source and destination have different 
 contents
 {code}
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026024#comment-14026024
 ] 

Yongjun Zhang commented on HDFS-6460:
-

Many thanks Andrew! I just uploaded a patch for branch-2; the change is in 
TestHdfsNetworkTopologyWithNodeGroup.java, as you mentioned.


 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, 
 HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not

2014-06-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026027#comment-14026027
 ] 

Aaron T. Myers commented on HDFS-6439:
--

Definitely don't let me hold you up if you'd like to work on a patch, 
[~brandonli]. It'd be much appreciated, and I'd be happy to review it.

 NFS should not reject NFS requests to the NULL procedure whether port 
 monitoring is enabled or not
 --

 Key: HDFS-6439
 URL: https://issues.apache.org/jira/browse/HDFS-6439
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.4.0
Reporter: Brandon Li
Assignee: Aaron T. Myers
 Attachments: HDFS-6439.patch, HDFS-6439.patch, 
 linux-nfs-disallow-request-from-nonsecure-port.pcapng, 
 mount-nfs-requests.pcapng


 As discussed in HDFS-6406, this JIRA is to track the following updates:
 1. Port monitoring is the feature name used by traditional NFS servers, so we 
 may want to rename the config property (along with the related variable 
 allowInsecurePorts) to something like dfs.nfs.port.monitoring.
 2. According to RFC 2623 (http://www.rfc-editor.org/rfc/rfc2623.txt):
 {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT 
 reject NFS requests to the NULL procedure (procedure number 0). See 
 subsection 2.3.1, "NULL procedure", for a complete explanation. {quote}
 I do notice that NFS clients (most of the time) send the mount NULL and nfs 
 NULL calls from a non-privileged port. If we deny the NULL call in mountd or 
 the nfs server, the client can't mount the export even as user root.
 3. It would be nice to have the user guide updated for the port monitoring 
 feature.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking

2014-06-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026032#comment-14026032
 ] 

Andrew Wang commented on HDFS-6395:
---

Whoops, my bad; I forgot that this patch also fixes the # limit to not apply 
to the non-user namespaces. I had a few comments:

- It would be nice to test that the system namespace isn't affected by these 
limits; I guess we can reach into FSNamesystem or FSDirectory via 
@VisibleForTesting methods.
- Let's remove the prints when the limits hit their max, since I think that was 
a misunderstanding of Chris' comment about printing.

Thanks Yi!
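
A namespace-scoped limit check boils down to something like the following 
sketch (the field and method names are illustrative, not the patch's):

{code}
// Sketch: count only user-namespace xattrs against the per-inode limit,
// so system/security xattrs can never lock users out of an inode.
static void checkUserXAttrLimit(List<XAttr> existing, int inodeXAttrsLimit)
    throws IOException {
  int userXAttrs = 0;
  for (XAttr x : existing) {
    if (x.getNameSpace() == XAttr.NameSpace.USER) {
      userXAttrs++;
    }
  }
  if (userXAttrs >= inodeXAttrsLimit) {
    throw new IOException("Cannot add more than " + inodeXAttrsLimit
        + " user-namespace xattrs per inode");
  }
}
{code}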

 Assorted improvements to xattr limit checking
 -

 Key: HDFS-6395
 URL: https://issues.apache.org/jira/browse/HDFS-6395
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Yi Liu
 Attachments: HDFS-6395.patch


 It'd be nice to print messages during fsimage and editlog loading if we hit 
 either the # of xattrs per inode or the xattr size limits.
 We should also consider making the # of xattrs limit only apply to the user 
 namespace, or to each namespace separately, to prevent users from locking out 
 access to other namespaces.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026039#comment-14026039
 ] 

Hadoop QA commented on HDFS-6460:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12649510/HDFS-6460-branch2.001.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7069//console

This message is automatically generated.

 Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
 ---

 Key: HDFS-6460
 URL: https://issues.apache.org/jira/browse/HDFS-6460
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, 
 HDFS-6460.002.patch


 Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can 
 improve the sorting result and save a bit runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL

2014-06-09 Thread Zesheng Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026047#comment-14026047
 ] 

Zesheng Wu commented on HDFS-6382:
--

Thanks [~cmccabe] for your feedback.
bq. For the MR strategy, it seems like this could be parallelized fairly 
easily. For example, if you have 5 MR tasks, you can calculate the hash of each 
path, and then task 1 can do all the paths that are 0 mod 5, task 2 can do all 
the paths that are 1 mod 5, and so forth. MR also doesn't introduce extra 
dependencies since HDFS and MR are packaged together.
You mean that we scan the whole namespace first and then split it into 5 
pieces according to the hash of each path? Why not just complete the work 
during the first scanning pass? If I've misunderstood your meaning, please 
point it out.

bq. I don't understand what you mean by the mapreduce strategy will have 
additional overheads. What overheads are you foreseeing?
Possible overheads: starting a MapReduce job needs to split the input, start 
an AppMaster, and collect results from arbitrary machines. (Perhaps 
'overheads' is not the right word here.)

bq. I don't understand what you mean by this. What will be done automatically?
Here 'automatically' means we do not have to rely on external tools; the 
daemon itself can manage the work.

bq. How are you going to implement HA for the standalone daemon?
Good point. As you suggested, one approach is to save the state in HDFS and 
simply restart the daemon when it fails. But managing the state is complex 
work, and I am considering how to simplify it. One simpler approach is to 
treat the daemon as stateless and simply restart it when it fails: we needn't 
checkpoint at all, and just scan from the beginning on restart. Because we 
can require that the work the daemon does is idempotent, starting over from 
the beginning is harmless. Possible drawbacks of the latter approach are that 
it may waste some time and delay the work, but those are acceptable.

bq. I don't see a lot of discussion of logging and monitoring in general. How 
is the user going to become aware that a file was deleted because of a TTL? Or 
if there is an error during the delete, how will the user know? 
For simplicity, in the initial version we will use logs to record which 
files/directories are deleted by TTL, as well as any errors during the 
deletion process.

bq. Does this need to be an administrator command?
It doesn't need to be an administrator command: users can only setTtl on 
files/directories they have write permission on, and can only getTtl on 
files/directories they have read permission on.
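
For concreteness, the per-file check that any of the strategies above would 
run is roughly this sketch (ttlMs stands in for however the TTL value ends up 
being stored; the logging matches the plan above):

{code}
// Sketch: delete a path once its age exceeds its TTL, logging the
// deletion so users can audit what the TTL mechanism removed.
void checkAndExpire(FileSystem fs, FileStatus st, long ttlMs)
    throws IOException {
  long age = Time.now() - st.getModificationTime();
  if (age > ttlMs) {
    LOG.info("TTL expired, deleting " + st.getPath());
    fs.delete(st.getPath(), st.isDirectory());
  }
}
{code}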

 HDFS File/Directory TTL
 ---

 Key: HDFS-6382
 URL: https://issues.apache.org/jira/browse/HDFS-6382
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, namenode
Affects Versions: 2.4.0
Reporter: Zesheng Wu
Assignee: Zesheng Wu
 Attachments: HDFS-TTL-Design.pdf


 In production environment, we always have scenario like this, we want to 
 backup files on hdfs for some time and then hope to delete these files 
 automatically. For example, we keep only 1 day's logs on local disk due to 
 limited disk space, but we need to keep about 1 month's logs in order to 
 debug program bugs, so we keep all the logs on hdfs and delete logs which are 
 older than 1 month. This is a typical scenario of HDFS TTL. So here we 
 propose that hdfs can support TTL.
 Following are some details of this proposal:
 1. HDFS can support TTL on a specified file or directory
 2. If a TTL is set on a file, the file will be deleted automatically after 
 the TTL is expired
 3. If a TTL is set on a directory, the child files and directories will be 
 deleted automatically after the TTL is expired
 4. The child file/directory's TTL configuration should override its parent 
 directory's
 5. A global configuration is needed to configure that whether the deleted 
 files/directories should go to the trash or not
 6. A global configuration is needed to configure that whether a directory 
 with TTL should be deleted when it is emptied by TTL mechanism or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2014-06-09 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026068#comment-14026068
 ] 

Guo Ruijing commented on HDFS-6489:
---

Take an example.

Existing behavior:

1. Create a file of 60M with preferred block size 64M.
2. Append 10 bytes (disk utilization is increased by 60M + 10 bytes, for a 
total of 120M + 10 bytes).
3. Append 10 bytes (disk utilization is increased by 60M + 20 bytes, for a 
total of 180M + 30 bytes).
4. Append 10 bytes (disk utilization is increased by 60M + 30 bytes, for a 
total of 240M + 60 bytes).

Expected behavior:

1. Create a file of 60M with preferred block size 64M.
2. Append 10 bytes (disk utilization is increased by 10 bytes, for a total of 
60M + 10 bytes).
3. Append 10 bytes (disk utilization is increased by 10 bytes, for a total of 
60M + 20 bytes).
4. Append 10 bytes (disk utilization is increased by 10 bytes, for a total of 
60M + 30 bytes).
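
In other words, the append path should charge dfsUsed only for the delta it 
actually writes; a minimal sketch of that accounting (the incDfsUsed call is 
illustrative of wherever the volume's usage counter lives):

{code}
// Sketch: on an append, grow the volume's dfsUsed by the bytes actually
// written, not by the full replica length again.
long delta = newReplicaLength - oldReplicaLength;  // e.g. 10 bytes
volume.incDfsUsed(bpid, delta);                    // illustrative call
{code}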

 DFS Used space is not correct computed on frequent append operations
 

 Key: HDFS-6489
 URL: https://issues.apache.org/jira/browse/HDFS-6489
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: stanley shi

 The current implementation of the Datanode will increase the DFS used space 
 on each block write operation. This is correct in most scenarios (creating a 
 new file), but sometimes it behaves incorrectly (appending small data to a 
 large block).
 For example, I have a file with only one block (say, 60M). Then I try to 
 append to it very frequently, but each time I append only 10 bytes;
 then on each append, DFS used will be increased by the length of the 
 block (60M), not the actual data length (10 bytes).
 Consider a scenario where I use many clients to append concurrently to a 
 large number of files (1000+). Assume the block size is 32M (half of the 
 default value); then DFS used will be increased by 1000*32M = 32G on each 
 round of appends to the files, but actually I only wrote 10K bytes. This 
 will cause the datanode to report insufficient disk space on data writes.
 {quote}2014-06-04 15:27:34,719 INFO 
 org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
 BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
 exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
 Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
 FINALIZED{quote}
 But the actual disk usage:
 {quote}
 [root@hdsh143 ~]# df -h
 FilesystemSize  Used Avail Use% Mounted on
 /dev/sda3  16G  2.9G   13G  20% /
 tmpfs 1.9G   72K  1.9G   1% /dev/shm
 /dev/sda1  97M   32M   61M  35% /boot
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6502) incorrect description in distcp2 document

2014-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026069#comment-14026069
 ] 

Hadoop QA commented on HDFS-6502:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649511/HDFS-6502.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7070//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7070//console

This message is automatically generated.

 incorrect description in distcp2 document
 -

 Key: HDFS-6502
 URL: https://issues.apache.org/jira/browse/HDFS-6502
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.2.1, 2.5.0
Reporter: Yongjun Zhang
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: HDFS-6502.patch


 In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite
 The first statement of the Update and Overwrite section says:
 {quote}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files even if they 
 exist at the source, or have the same contents.
 {quote}
 The Command Line Options table says:
 {quote}
   -overwrite: Overwrite destination
   -update: Overwrite if src size different from dst size
 {quote}
 Based on the implementation, making the following modification would be more 
 accurate:
 The first statement of the Update and Overwrite section:
 {code}
 -update is used to copy files from source that don't exist at the target, or 
 have different contents. -overwrite overwrites target-files if they exist at 
 the target.
 {code}
 The Command Line Options table:
 {code}
   -overwrite: Overwrite destination
   -update: Overwrite destination if source and destination have different 
 contents
 {code}
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6505) Can not close file due to last block is marked as corrupt

2014-06-09 Thread Gordon Wang (JIRA)
Gordon Wang created HDFS-6505:
-

 Summary: Can not close file due to last block is marked as corrupt
 Key: HDFS-6505
 URL: https://issues.apache.org/jira/browse/HDFS-6505
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Gordon Wang


After appending to a file, the client could not close it, because the namenode 
could not complete the last block of the file. The UC status of the last block 
remained COMMITTED and never changed.
The namenode log was like this.
{code}
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
checkFileProgress: blk_1073741920_13948{blockUCState=COMMITTED, 
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[172.28.1.2:50010|RBW]]} has not reached 
minimal replication 1
{code}

After going through the namenode log, I found an entry like this:
{code}
INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1073741920 
added as corrupt on 172.28.1.2:50010 by sdw3/172.28.1.3 because client machine 
reported it
{code}
But the last block was actually finished successfully on the datanode, as 
shown by this datanode log:
{code}
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DataTransfer: Transmitted 
BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13808 
(numBytes=50120352) to /172.28.1.3:50010
INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.28.1.2:36860, dest: /172.28.1.2:50010, bytes: 51686616, op: HDFS_WRITE, 
cliID: libhdfs3_client_random_741511239_count_1_pid_215802_tid_140085714196576, 
offset: 0, srvID: DS-2074102060-172.28.1.2-50010-1401432768690, blockid: 
BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, duration: 
189226453336
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, 
type=LAST_IN_PIPELINE, downstreams=0:[] terminating
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6505) Can not close file due to last block is marked as corrupt

2014-06-09 Thread Gordon Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026085#comment-14026085
 ] 

Gordon Wang commented on HDFS-6505:
---

This issue causes the last block to be reported as missing and the file to be 
marked corrupt, even though the data on the DataNode is actually correct.

I went through the code, and I think a safety check is missing when the 
namenode receives a bad-block report from a datanode. See the following code 
snippet in the namenode's BlockManager:
{code}
  public void findAndMarkBlockAsCorrupt(final ExtendedBlock blk,
      final DatanodeInfo dn, String storageID, String reason) throws IOException {
    assert namesystem.hasWriteLock();
    final BlockInfo storedBlock = getStoredBlock(blk.getLocalBlock());
    if (storedBlock == null) {
      // Check if the replica is in the blockMap, if not
      // ignore the request for now. This could happen when BlockScanner
      // thread of Datanode reports bad block before Block reports are sent
      // by the Datanode on startup
      blockLog.info("BLOCK* findAndMarkBlockAsCorrupt: "
          + blk + " not found");
      return;
    }
    markBlockAsCorrupt(new BlockToMarkCorrupt(storedBlock, reason,
        Reason.CORRUPTION_REPORTED), dn, storageID);
  }
{code} 
We should compare the generation timestamp of the reported block with that of 
the stored block. If the reported block has a smaller timestamp, it should not 
be marked as corrupt: a reported block can legitimately carry a smaller 
timestamp when the client has done some pipeline recovery work.
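
The missing guard could be as small as the following sketch against the 
snippet above (comparing generation stamps, which is what the timestamp here 
refers to):

{code}
// Sketch: ignore corruption reports that carry an older generation stamp
// than the stored block -- they refer to a pre-recovery replica.
if (blk.getGenerationStamp() < storedBlock.getGenerationStamp()) {
  blockLog.info("BLOCK* findAndMarkBlockAsCorrupt: " + blk
      + " has an older generation stamp than " + storedBlock
      + ", ignoring the corruption report");
  return;
}
{code}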

 Can not close file due to last block is marked as corrupt
 -

 Key: HDFS-6505
 URL: https://issues.apache.org/jira/browse/HDFS-6505
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Gordon Wang

 After appending to a file, the client could not close it, because the 
 namenode could not complete the last block of the file. The UC status of the 
 last block remained COMMITTED and never changed.
 The namenode log was like this.
 {code}
 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
 checkFileProgress: blk_1073741920_13948{blockUCState=COMMITTED, 
 primaryNodeIndex=-1,
 replicas=[ReplicaUnderConstruction[172.28.1.2:50010|RBW]]} has not reached 
 minimal replication 1
 {code}
 After going through the namenode log, I found an entry like this:
 {code}
 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: 
 blk_1073741920 added as corrupt on 172.28.1.2:50010 by sdw3/172.28.1.3 
 because client machine reported it
 {code}
 But the last block was actually finished successfully on the datanode, as 
 shown by this datanode log:
 {code}
 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DataTransfer: 
 Transmitted BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13808 
 (numBytes=50120352) to /172.28.1.3:50010
 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
 /172.28.1.2:36860, dest: /172.28.1.2:50010, bytes: 51686616, op: HDFS_WRITE, 
 cliID: 
 libhdfs3_client_random_741511239_count_1_pid_215802_tid_140085714196576, 
 offset: 0, srvID: DS-2074102060-172.28.1.2-50010-1401432768690, blockid: 
 BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, duration: 
 189226453336
 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
 BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, 
 type=LAST_IN_PIPELINE, downstreams=0:[] terminating
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction

2014-06-09 Thread stanley shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026090#comment-14026090
 ] 

stanley shi commented on HDFS-5723:
---

Hi Vinay,
It seems my steps hit a different error, but with the same error log.
Do you want to fix it in this ticket, or would you prefer that I file a new one?

 Append failed FINALIZED replica should not be accepted as valid when that 
 block is underconstruction
 

 Key: HDFS-5723
 URL: https://issues.apache.org/jira/browse/HDFS-5723
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HDFS-5723.patch, HDFS-5723.patch


 Scenario:
 1. 3-node cluster with 
 dfs.client.block.write.replace-datanode-on-failure.enable set to false.
 2. One file is written with 3 replicas -- blk_id_gs1.
 3. One of the datanodes, DN1, goes down.
 4. The file is opened for append, and some more data is added and synced (to 
 only the 2 live nodes DN2 and DN3) -- blk_id_gs2.
 5. Now DN1 is restarted.
 6. In its block report, DN1 reports the FINALIZED block blk_id_gs1, which 
 should be marked corrupt.
 But since the NN has the appended block's state as UnderConstruction, it does 
 not detect this block as corrupt at this point and adds it to the valid block 
 locations.
 As long as the namenode stays up, this datanode will also be considered a 
 valid replica, and read/append will fail on that datanode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

