[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Status: Patch Available (was: In Progress) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding it to HttpFS. This JIRA adds that support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: jira-HDFS-6379.patch
[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021718#comment-14021718 ] Hadoop QA commented on HDFS-6379: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12648919/jira-HDFS-6379.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1283 javac compiler warnings (more than the trunk's current 1277 warnings).
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-httpfs: org.apache.hadoop.fs.http.client.TestHttpFSFileSystemLocalFileSystem
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7059//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7059//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7059//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7059//console
This message is automatically generated.
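For context on what "ACLs support" means at the REST layer: the WebHDFS ACL operations (which HttpFS mirrors) take an aclspec query parameter of comma-separated entries such as user::rwx,user:bob:r-x,group::r--,other::---. A minimal, hypothetical Java sketch of parsing such a spec is below; the class and method names are illustrative and are not taken from the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

public class AclSpec {
    /** One ACL entry: type, optional name, and permission triple. */
    public static final class Entry {
        public final String type;   // "user", "group", "other", or "mask"
        public final String name;   // empty for the owning user/group entry
        public final String perms;  // e.g. "rwx", "r-x", "---"
        Entry(String type, String name, String perms) {
            this.type = type; this.name = name; this.perms = perms;
        }
    }

    /** Splits a comma-separated aclspec into entries, validating each triple. */
    public static List<Entry> parse(String spec) {
        List<Entry> entries = new ArrayList<>();
        for (String part : spec.split(",")) {
            String[] f = part.split(":", -1);   // -1 keeps the empty name field
            if (f.length != 3 || !f[2].matches("[r-][w-][x-]")) {
                throw new IllegalArgumentException("bad ACL entry: " + part);
            }
            entries.add(new Entry(f[0], f[1], f[2]));
        }
        return entries;
    }

    public static void main(String[] args) {
        List<Entry> e = parse("user::rwx,user:bob:r-x,group::r--,other::---");
        System.out.println(e.size() + " ACL entries parsed");
    }
}
```

In the real feature the parsed entries would map to HDFS AclEntry objects; this sketch only shows the wire-format shape of the parameter.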
[jira] [Updated] (HDFS-6494) In some case, the hedged read will lead to client infinite wait.
[ https://issues.apache.org/jira/browse/HDFS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LiuLei updated HDFS-6494: - Attachment: hedged-read-bug.patch In some case, the hedged read will lead to client infinite wait. -- Key: HDFS-6494 URL: https://issues.apache.org/jira/browse/HDFS-6494 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: hedged-read-bug.patch When I use hedged read and there is only one live datanode, if the read from that datanode throws TimeoutException and ChecksumException, the client will wait forever.
[jira] [Commented] (HDFS-6494) In some case, the hedged read will lead to client infinite wait.
[ https://issues.apache.org/jira/browse/HDFS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021729#comment-14021729 ] LiuLei commented on HDFS-6494: -- Hi Liang, I uploaded a patch; I hope it is helpful for you.
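The hang described above comes from waiting indefinitely on a hedged request that can never succeed. As an illustration of the general fix idea (plain JDK concurrency utilities; this is a sketch, not the actual DFSClient code), a timed poll() on the completion service bounds the wait where an unconditional take() could block forever:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HedgedReadSketch {
    // Bounds the wait for a (possibly lone) read attempt. The hazard in the
    // report is an unbounded wait when the only attempt fails; poll() with a
    // timeout instead of take() guarantees the caller eventually returns.
    public static byte[] readWithDeadline(ExecutorService pool,
            Callable<byte[]> read, long timeoutMs) throws Exception {
        CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
        cs.submit(read);
        Future<byte[]> f = cs.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (f == null) {
            throw new TimeoutException("no datanode answered within " + timeoutMs + " ms");
        }
        return f.get();  // rethrows the attempt's failure instead of hanging
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        byte[] data = readWithDeadline(pool, () -> new byte[]{42}, 1000);
        System.out.println("read " + data.length + " byte(s)");
        pool.shutdown();
    }
}
```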
[jira] [Updated] (HDFS-5442) Zero loss HDFS data replication for multiple datacenters
[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated HDFS-5442: -- Attachment: Disaster Recovery Solution for Hadoop.pdf Updated the design doc, adding some detailed implementation notes. Zero loss HDFS data replication for multiple datacenters Key: HDFS-5442 URL: https://issues.apache.org/jira/browse/HDFS-5442 Project: Hadoop HDFS Issue Type: Improvement Reporter: Avik Dey Assignee: Dian Fu Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf Hadoop is architected to operate efficiently at scale for normal hardware failures within a datacenter. Hadoop is not designed today to handle datacenter failures. Although HDFS is not designed for, nor deployed in, configurations spanning multiple datacenters, replicating data from one location to another is common practice for disaster recovery and global service availability. There are current solutions available for batch replication using data copy/export tools. However, while providing some backup capability for HDFS data, they do not provide the capability to recover all your HDFS data from a datacenter failure and be up and running again with a fully operational Hadoop cluster in another datacenter in a matter of minutes. For disaster recovery from a datacenter failure, we should provide a fully distributed, zero data loss, low latency, high throughput and secure HDFS data replication solution for multiple datacenter setups. Design and code for Phase-1 to follow soon.
[jira] [Commented] (HDFS-6465) Enable the configuration of multiple clusters
[ https://issues.apache.org/jira/browse/HDFS-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021786#comment-14021786 ] Dian Fu commented on HDFS-6465: --- Updating some design details about configuration.
Requirements:
1. Existing deployments must be able to use the existing configuration without any change.
2. The configurations for different clusters should be the same as far as possible; the special configuration required per cluster should be minimal.
Configurations added:
• DFS_REGION_ID (dfs.region.id): the region id of the current cluster
• DFS_REGIONS (dfs.regions): the region ids of all clusters, including both the primary cluster and mirror clusters
• DFS_REGION_PRIMARY (dfs.region.primary): the region id of the primary cluster
Configurations that must be suffixed with regionId: DFS_NAMENODE_RPC_ADDRESS_KEY, DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_HTTP_ADDRESS_KEY, DFS_NAMENODE_HTTPS_ADDRESS_KEY, DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY and DFS_NAMENODE_BACKUP_ADDRESS_KEY
Configurations could be suffixed with regionId or not.
These include all the configurations in NameNode.NAMENODE_SPECIFIC_KEYS and NameNode.NAMESERVICE_SPECIFIC_KEYS except the above configurations which must be suffixed with regionId: DFS_NAMENODE_RPC_BIND_HOST_KEY, DFS_NAMENODE_NAME_DIR_KEY, DFS_NAMENODE_EDITS_DIR_KEY, DFS_NAMENODE_SHARED_EDITS_DIR_KEY, DFS_NAMENODE_CHECKPOINT_DIR_KEY, DFS_NAMENODE_CHECKPOINT_EDITS_DIR_KEY, DFS_NAMENODE_SERVICE_RPC_BIND_HOST_KEY, DFS_NAMENODE_HTTP_BIND_HOST_KEY, DFS_NAMENODE_HTTPS_BIND_HOST_KEY, DFS_NAMENODE_KEYTAB_FILE_KEY, DFS_NAMENODE_SECONDARY_HTTPS_ADDRESS_KEY, DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY, DFS_NAMENODE_BACKUP_HTTP_ADDRESS_KEY, DFS_NAMENODE_BACKUP_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY, DFS_NAMENODE_KERBEROS_INTERNAL_SPNEGO_PRINCIPAL_KEY, DFS_HA_FENCE_METHODS_KEY, DFS_HA_ZKFC_PORT_KEY and DFS_HA_AUTO_FAILOVER_ENABLED_KEY
These configurations can be configured in the following format to distinguish between clusters: configuration key.nameservice id.namenode id.region id
If a configuration with the region id as suffix cannot be found, the configuration without the region id suffix will be used instead. All other configurations not mentioned here should not be suffixed with regionId.
Enable the configuration of multiple clusters - Key: HDFS-6465 URL: https://issues.apache.org/jira/browse/HDFS-6465 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Dian Fu Assignee: Dian Fu Attachments: HDFS-6465.1.patch, HDFS-6465.2.patch, HDFS-6465.patch
Tracks the changes required for configuring DR.
Configurations added:
• DFS_REGION_ID (dfs.region.id): the region id of the current cluster
• DFS_REGIONS (dfs.regions): the region ids of all clusters, including both the primary cluster and mirror clusters
• DFS_REGION_PRIMARY (dfs.region.primary): the region id of the primary cluster
Configurations modified:
The configurations in NameNode.NAMENODE_SPECIFIC_KEYS can be configured in the following format to distinguish between clusters: configuration key.nameservice id.namenode id.region id
The configurations in NameNode.NAMESERVICE_SPECIFIC_KEYS can be configured in the following format to distinguish between clusters: configuration key.nameservice id.region id
If a configuration with the region id as suffix cannot be found, the configuration without the region id suffix will be used instead.
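The suffix-with-fallback lookup described above can be sketched with a toy resolver over a plain map (the class and method names are illustrative, not from the patch; the key names follow the usual Hadoop format):

```java
import java.util.HashMap;
import java.util.Map;

public class RegionConfSketch {
    // Try "key.ns.nn.regionId" first, then fall back to the suffix-free
    // "key.ns.nn", mirroring the fallback rule in the design comment. The
    // fallback is what lets existing deployments keep working unchanged.
    public static String get(Map<String, String> conf, String key,
            String ns, String nn, String regionId) {
        String base = key + "." + ns + "." + nn;
        String withRegion = base + "." + regionId;
        if (conf.containsKey(withRegion)) {
            return conf.get(withRegion);
        }
        return conf.get(base);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.namenode.rpc-address.ns1.nn1", "nn1.dc1:8020");
        conf.put("dfs.namenode.rpc-address.ns1.nn1.region2", "nn1.dc2:8020");
        // region1 has no suffixed entry, so it falls back to the base key;
        // region2 gets its region-specific value.
        System.out.println(get(conf, "dfs.namenode.rpc-address", "ns1", "nn1", "region1"));
        System.out.println(get(conf, "dfs.namenode.rpc-address", "ns1", "nn1", "region2"));
    }
}
```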
[jira] [Updated] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6382: - Attachment: HDFS-TTL-Design.pdf An initial version of the design doc. HDFS File/Directory TTL --- Key: HDFS-6382 URL: https://issues.apache.org/jira/browse/HDFS-6382 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-TTL-Design.pdf In production environments we often have a scenario like this: we want to back up files on HDFS for some time and then have them deleted automatically. For example, we keep only 1 day's logs on local disk due to limited disk space, but we need to keep about 1 month's logs in order to debug program bugs, so we keep all the logs on HDFS and delete logs older than 1 month. This is a typical scenario for an HDFS TTL. So here we propose that HDFS support TTL. Some details of this proposal:
1. HDFS can support a TTL on a specified file or directory
2. If a TTL is set on a file, the file will be deleted automatically after the TTL expires
3. If a TTL is set on a directory, the child files and directories will be deleted automatically after the TTL expires
4. A child file/directory's TTL configuration should override its parent directory's
5. A global configuration is needed to control whether deleted files/directories should go to the trash
6. A global configuration is needed to control whether a directory with a TTL should itself be deleted when it is emptied by the TTL mechanism
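The override rule in the proposal (a child's TTL takes precedence over its parent's) can be illustrated with a toy resolver that walks from a path up to the root and returns the nearest TTL. The class name, the day-based unit, and the -1 "never expire" convention are assumptions of this sketch, not part of the design doc:

```java
import java.util.HashMap;
import java.util.Map;

public class TtlSketch {
    // Walks from an absolute path up to the root and returns the nearest
    // TTL (in days), so a TTL on a child overrides one on any ancestor.
    // Returns -1 when no TTL is set anywhere on the path ("never expire").
    public static long effectiveTtl(Map<String, Long> ttls, String path) {
        for (String p = path; ; ) {
            Long ttl = ttls.get(p);
            if (ttl != null) {
                return ttl;
            }
            if (p.equals("/")) {
                return -1;  // reached the root without finding a TTL
            }
            int slash = p.lastIndexOf('/');
            p = (slash == 0) ? "/" : p.substring(0, slash);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> ttls = new HashMap<>();
        ttls.put("/logs", 30L);      // keep most logs for a month
        ttls.put("/logs/app", 7L);   // but this app's logs only a week
        System.out.println(effectiveTtl(ttls, "/logs/app/x.log"));
    }
}
```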
[jira] [Created] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
Zesheng Wu created HDFS-6503: Summary: Fix typo of DFSAdmin restoreFailedStorage Key: HDFS-6503 URL: https://issues.apache.org/jira/browse/HDFS-6503 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Minor Fix typo: restoreFaileStorage should be restoreFailedStorage
[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6503: - Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6503: - Attachment: HDFS-6503.patch
[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs
[ https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6481: - Assignee: Ted Yu DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs --- Key: HDFS-6481 URL: https://issues.apache.org/jira/browse/HDFS-6481 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Assignee: Ted Yu Attachments: hdfs-6481-v1.txt Ian Brooks reported the following stack trace:
{code}
2014-06-03 13:05:03,915 WARN [DataStreamer for file /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200 block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
    at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: syncer encountered error, will retry. txid=211
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
    at
{code}
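A defensive check of the kind the summary asks for might look like the following sketch (class and method names are illustrative, not from the eventual patch). Note that such a check only turns the bare ArrayIndexOutOfBoundsException into a clear error message; the underlying cause of the short storageIDs array still needs to be found:

```java
public class StorageIdCheck {
    // Hypothetical sanity check for the scenario in the stack trace: the
    // ArrayIndexOutOfBoundsException ("0") suggests the storageIDs array
    // can be shorter than the datanode list, so verify lengths up front.
    public static void checkLengths(String[] datanodeIDs, String[] storageIDs) {
        if (storageIDs.length < datanodeIDs.length) {
            throw new IllegalArgumentException("expected at least "
                + datanodeIDs.length + " storage IDs, got " + storageIDs.length);
        }
    }

    public static void main(String[] args) {
        checkLengths(new String[]{"dn1", "dn2"}, new String[]{"s1", "s2"});
        System.out.println("lengths consistent");
    }
}
```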
[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs
[ https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6481: - Target Version/s: 2.5.0
[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs
[ https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025216#comment-14025216 ] Kihwal Lee commented on HDFS-6481: -- We can add sanity checks, but this should not happen unless we have a bug somewhere. The root cause needs to be addressed.
[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025281#comment-14025281 ] Hadoop QA commented on HDFS-6503: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649235/HDFS-6503.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7060//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7060//console
This message is automatically generated.
[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025285#comment-14025285 ] Zesheng Wu commented on HDFS-6503: -- It is just a typo fix, so no new tests are needed.
[jira] [Commented] (HDFS-6403) Add metrics for log warnings reported by HADOOP-9618
[ https://issues.apache.org/jira/browse/HDFS-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025326#comment-14025326 ] Yongjun Zhang commented on HDFS-6403: - Hi [~tlipcon], as we chatted earlier, I'd appreciate it if you could help review the patch. Thanks. Add metrics for log warnings reported by HADOOP-9618 Key: HDFS-6403 URL: https://issues.apache.org/jira/browse/HDFS-6403 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6403.001.patch, HDFS-6403.002.patch HADOOP-9618 logs warnings when there are long GC pauses. If this is exposed as a metric, then they can be monitored.
[jira] [Commented] (HDFS-6159) TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails if there is block missing after balancer success
[ https://issues.apache.org/jira/browse/HDFS-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025373#comment-14025373 ] Arpit Agarwal commented on HDFS-6159: - Hi [~djp], this looks unrelated to HDFS-6362 from a quick look. I also took a quick look at HDFS-6424 and it appears unrelated. Please feel free to file a separate Jira for the test failure and attach the logs/analysis. TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails if there is block missing after balancer success -- Key: HDFS-6159 URL: https://issues.apache.org/jira/browse/HDFS-6159 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.3.0 Reporter: Chen He Assignee: Chen He Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6159-v2.patch, HDFS-6159-v2.patch, HDFS-6159.patch, logs.txt TestBalancerWithNodeGroup.testBalancerWithNodeGroup will report a false failure if one or more data blocks are lost after the balancer finishes successfully.
[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file
[ https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025391#comment-14025391 ] Uma Maheswara Rao G commented on HDFS-2006: --- XAttr support for DistCp (MAPREDUCE-5898) is now committed to trunk. So, I plan to merge this to branch-2. Do we need a separate vote for this? What do you say [~cnauroth] and [~andrew.wang] ? ability to support storing extended attributes per file --- Key: HDFS-2006 URL: https://issues.apache.org/jira/browse/HDFS-2006 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: HDFS XAttrs (HDFS-2006) Reporter: dhruba borthakur Assignee: Yi Liu Fix For: 3.0.0 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, xattrs.1.patch, xattrs.patch It would be nice if HDFS provides a feature to store extended attributes for files, similar to the one described here: http://en.wikipedia.org/wiki/Extended_file_attributes. The challenge is that it has to be done in such a way that a site not using this feature does not waste precious memory resources in the namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6503) Fix typo of DFSAdmin restoreFailedStorage
[ https://issues.apache.org/jira/browse/HDFS-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025418#comment-14025418 ] Akira AJISAKA commented on HDFS-6503: - +1 (non-binding). Fix typo of DFSAdmin restoreFailedStorage - Key: HDFS-6503 URL: https://issues.apache.org/jira/browse/HDFS-6503 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Minor Attachments: HDFS-6503.patch Fix typo: restoreFaileStorage should be restoreFailedStorage -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025438#comment-14025438 ] Andrew Wang commented on HDFS-6257: --- Idea looks good; the current check definitely seems racy. Only question: maybe we should try to check more deterministically, e.g. pause DN cache reports and wait for a few refresh intervals (1s each) before doing the check. TestCacheDirectives#testExceedsCapacity fails occasionally in trunk --- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file
[ https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025453#comment-14025453 ] Andrew Wang commented on HDFS-2006: --- Hey Uma, we haven't done a vote for previous branch-2 merges (e.g. caching, ACLs). If you post a patch or a link to a branch, I'd be happy to review. Unless you already plan to do something similar, I can also do a full branch-2 test run on our internal jenkins. ability to support storing extended attributes per file --- Key: HDFS-2006 URL: https://issues.apache.org/jira/browse/HDFS-2006 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: HDFS XAttrs (HDFS-2006) Reporter: dhruba borthakur Assignee: Yi Liu Fix For: 3.0.0 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, xattrs.1.patch, xattrs.patch It would be nice if HDFS provides a feature to store extended attributes for files, similar to the one described here: http://en.wikipedia.org/wiki/Extended_file_attributes. The challenge is that it has to be done in such a way that a site not using this feature does not waste precious memory resources in the namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025455#comment-14025455 ] Daryn Sharp commented on HDFS-2856: --- Chris asked that I take a look, so I'll try to review this week. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856.1.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
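The challenge-response idea in the HDFS-2856 description can be sketched as follows. This is a toy illustration of the general technique (datanode sends a nonce, client answers with an HMAC keyed by a shared secret), not the actual protocol or wire format; all names here are hypothetical.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Toy sketch of a nonce/HMAC handshake; names and wire format are hypothetical. */
public class NonceHandshake {
    /** Datanode side: generate a random nonce to send to the client. */
    public static byte[] newNonce() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    /** Both sides: HMAC the nonce with the shared block-token secret. */
    public static byte[] hmac(byte[] sharedSecret, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
        return mac.doFinal(nonce);
    }

    /** Datanode side: accept the connection iff the client's answer matches. */
    public static boolean verify(byte[] sharedSecret, byte[] nonce, byte[] answer)
            throws Exception {
        // Constant-time comparison to avoid leaking information via timing.
        return MessageDigest.isEqual(hmac(sharedSecret, nonce), answer);
    }
}
```

Since only the HMAC of a fresh nonce crosses the wire, the secret itself is never revealed, which is what removes the need for a privileged (< 1024) port.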
[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025472#comment-14025472 ] Andrew Wang commented on HDFS-6460: --- Hey Yongjun, thanks for working on this. Just one review comment: two of the DNs have the same IP of 11.11.11.11. Otherwise +1 pending Jenkins. To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit of runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025495#comment-14025495 ] Colin Patrick McCabe commented on HDFS-6257: The current check should always succeed if the code being tested is correct, so it's not racy in that sense. We could wait for more DN cache reports, but since the DNs are full they shouldn't change. Since we test the cache reports elsewhere, I think it's probably fine as-is, what do you think? TestCacheDirectives#testExceedsCapacity fails occasionally in trunk --- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
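The "wait for a few refresh intervals" idea discussed in this thread amounts to asserting that a condition stays true over a window of time, not just at one instant. A generic helper for that might look like the sketch below; it is not Hadoop test code, and `assertStableFor` is a hypothetical name.

```java
import java.util.function.BooleanSupplier;

/** Sketch: assert a condition holds at every check across several intervals. */
public class StabilityCheck {
    public static void assertStableFor(BooleanSupplier condition,
                                       long intervalMs, int intervals)
            throws InterruptedException {
        for (int i = 0; i < intervals; i++) {
            if (!condition.getAsBoolean()) {
                throw new AssertionError("condition failed at interval " + i);
            }
            Thread.sleep(intervalMs);
        }
    }
}
```

A test could then write something like `assertStableFor(() -> extraCacheCommands() == 0, 1000, 3)` (where `extraCacheCommands` is a hypothetical accessor) to check that no extra CACHE commands appear across a few 1s refresh intervals.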
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025499#comment-14025499 ] Jing Zhao commented on HDFS-6315: - The patch looks good to me in general. Some comments: # After moving persistBlocks/persistNewBlocks/closeFile from FSDirectory to FSNamesystem, we may no longer need to add DIR* FSDirectory into the log information. # Looks like FSNamesystem#persistBlocks(INodeFile, boolean) can be removed. We can just call persistBlocks(String, INodeFile, boolean) instead. # In FSNamesystem#setQuota, logSync cannot be called inside of the write lock: {code} + INodeDirectory changed = dir.setQuota(path, nsQuota, dsQuota); + if (changed != null) { +final Quota.Counts q = changed.getQuotaCounts(); +getEditLog().logSetQuota(path, +q.get(Quota.NAMESPACE), q.get(Quota.DISKSPACE)); +getEditLog().logSync(); + } } finally { writeUnlock(); } -getEditLog().logSync(); {code} # A typo in the java comment: {code} - // if src indicates a snapshot file, we need to make sure the returned + // if src inSicates a snapshot file, we need to make sure the returned {code} Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implements durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
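Jing's point about logSync in setQuota follows a common namenode pattern: record the edit while holding the write lock (cheap, in-memory), but flush it with logSync only after releasing the lock, so a slow disk sync never blocks other namespace operations. A minimal sketch of that ordering, using stand-in objects rather than the real Hadoop classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch of the "log the edit under the lock, sync after unlock" pattern. */
public class EditLogPattern {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final List<String> events = new ArrayList<>();  // records call order for illustration

    public void setQuota() {
        boolean changed;
        lock.writeLock().lock();
        try {
            changed = true;              // mutate the namespace; note whether anything changed
            events.add("logSetQuota");   // append the edit (in-memory, cheap)
        } finally {
            lock.writeLock().unlock();
            events.add("unlock");
        }
        if (changed) {
            events.add("logSync");       // durable flush happens outside the lock
        }
    }
}
```

The quoted diff does the opposite (logSync before writeUnlock), which is why the review flags it.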
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: jira-HDFS-6379.patch HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch, jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Status: In Progress (was: Patch Available) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: (was: jira-HDFS-6379.patch) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: (was: jira-HDFS-6379.patch) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally in trunk
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025502#comment-14025502 ] Andrew Wang commented on HDFS-6257: --- Hmm, I guess good enough. +1 thanks colin. TestCacheDirectives#testExceedsCapacity fails occasionally in trunk --- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: jira-HDFS-6379.patch HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: (was: jira-HDFS-6379.patch) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Status: Patch Available (was: In Progress) HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6257: --- Summary: TestCacheDirectives#testExceedsCapacity fails occasionally (was: TestCacheDirectives#testExceedsCapacity fails occasionally in trunk) TestCacheDirectives#testExceedsCapacity fails occasionally -- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder updated HDFS-6379: - Attachment: jira-HDFS-6379.patch HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond
[ https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025504#comment-14025504 ] Andrew Wang commented on HDFS-6493: --- I think we should keep the default value as disabled, but the property and this value should be documented in hdfs-default.xml. It'd also be nice (if not true already) to pretty-print this value on NN startup, e.g. 30 minutes rather than 1800 seconds. It'd actually be nice follow-on work to look for similarly unfriendly values in the logs and pretty-print them. There are some time-related functions in DFSUtil (e.g. durationToString, dateToIso8601String), but feel free to write your own functions too. Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond -- Key: HDFS-6493 URL: https://issues.apache.org/jira/browse/HDFS-6493 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Trivial Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, the delay will be at least 30 minutes or even hours. It's not very user friendly to use milliseconds when it's likely measured in hours. I suggest making the following changes: 1. change the unit of this config to second 2. rename the config key from dfs.namenode.startup.delay.block.deletion.ms to dfs.namenode.startup.delay.block.deletion.sec 3. add the default value to hdfs-default.xml; what's a reasonable value, 30 minutes, one hour? -- This message was sent by Atlassian JIRA (v6.2#6252)
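The pretty-printing Andrew describes (e.g. "30 minutes" rather than "1800 seconds") can be done with a small helper like the sketch below; `prettyDuration` is a hypothetical name, not the DFSUtil API.

```java
/** Sketch: render a duration in seconds as a human-friendly string. */
public class DurationFormat {
    public static String prettyDuration(long seconds) {
        // Prefer the largest unit that divides the duration evenly.
        if (seconds != 0 && seconds % 3600 == 0) {
            long h = seconds / 3600;
            return h + (h == 1 ? " hour" : " hours");
        }
        if (seconds != 0 && seconds % 60 == 0) {
            long m = seconds / 60;
            return m + (m == 1 ? " minute" : " minutes");
        }
        return seconds + (seconds == 1 ? " second" : " seconds");
    }
}
```

On startup the NN log line could then read "block deletion delayed for 30 minutes" instead of echoing the raw config value.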
[jira] [Updated] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6257: --- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) committed, thanks TestCacheDirectives#testExceedsCapacity fails occasionally -- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6399) FSNamesystem ACL operations should check isPermissionEnabled
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-6399: - Assignee: Chris Nauroth (was: Charles Lamb) FSNamesystem ACL operations should check isPermissionEnabled Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6330) Move mkdirs() to FSNamesystem
[ https://issues.apache.org/jira/browse/HDFS-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6330: Summary: Move mkdirs() to FSNamesystem (was: Move mkdir() to FSNamesystem) Move mkdirs() to FSNamesystem - Key: HDFS-6330 URL: https://issues.apache.org/jira/browse/HDFS-6330 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6330.000.patch, HDFS-6330.001.patch Currently mkdir() automatically creates all ancestors for a directory. This is implemented in FSDirectory, by calling unprotectedMkdir() along the path. This jira proposes to move the function to FSNamesystem to simplify the primitive that FSDirectory needs to provide. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6315: - Attachment: HDFS-6315.005.patch Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario
[ https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6478: -- Attachment: HDFS-6478.patch The patch has the following: 1. Modify the proxy chain order for NamenodeProtocol and ClientProtocol so that NamenodeProtocolTranslatorPB/ClientNamenodeProtocolTranslatorPB directly call NamenodeProtocolPB and ClientNamenodeProtocolPB for the non-HA case. 2. Update unit test TestFileCreation to verify retry count. This depends on HADOOP-10673, thus the patch also includes HADOOP-10673 so that it can be submitted to run unit tests. 3. Simplify the remoteException policy setup in NameNodeProxies. 4. Remove unnecessary retry policy for method create in DatanodeProtocolClientSideTranslatorPB. 5. DatanodeProtocolClientSideTranslatorPB still has the old proxy order. Leave it as it is, given DataNodeProtocol doesn't do retry. We can open a separate jira for DataNodeProtocol retry if that is necessary. RemoteException can't be retried properly for non-HA scenario - Key: HDFS-6478 URL: https://issues.apache.org/jira/browse/HDFS-6478 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6478.patch For the HA case, the call stack is DFSClient - RetryInvocationHandler - ClientNamenodeProtocolTranslatorPB - ProtobufRpcEngine. ProtobufRpcEngine throws ServiceException and expects the caller to unwrap it; ClientNamenodeProtocolTranslatorPB is the component that takes care of that. 
{noformat} at org.apache.hadoop.ipc.Client.call at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke at com.sun.proxy.$Proxy26.getFileInfo at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo at sun.reflect.GeneratedMethodAccessor24.invoke at sun.reflect.DelegatingMethodAccessorImpl.invoke at java.lang.reflect.Method.invoke at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke at com.sun.proxy.$Proxy27.getFileInfo at org.apache.hadoop.hdfs.DFSClient.getFileInfo at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus {noformat} However, for the non-HA case, the call stack is DFSClient - ClientNamenodeProtocolTranslatorPB - RetryInvocationHandler - ProtobufRpcEngine. RetryInvocationHandler gets ServiceException and can't retry properly. {noformat} at org.apache.hadoop.ipc.Client.call at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke at com.sun.proxy.$Proxy9.getListing at sun.reflect.NativeMethodAccessorImpl.invoke0 at sun.reflect.NativeMethodAccessorImpl.invoke at sun.reflect.DelegatingMethodAccessorImpl.invoke at java.lang.reflect.Method.invoke at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke at com.sun.proxy.$Proxy9.getListing at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing at org.apache.hadoop.hdfs.DFSClient.listPaths {noformat} Perhaps, we can fix it by having NN wrap RetryInvocationHandler around ClientNamenodeProtocolTranslatorPB and other PBs, instead of the current wrap order. -- This message was sent by Atlassian JIRA (v6.2#6252)
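The ordering problem in the description can be reduced to a small decorator model: the retry layer only recognizes IOException, so it must sit above the translator that unwraps ServiceException, not below it. This is an illustrative sketch, not the actual NameNodeProxies code; the classes below are hypothetical stand-ins for the real layers.

```java
import java.io.IOException;
import java.net.ConnectException;

public class RetryOrderDemo {
    interface ClientProtocol { String getFileInfo(String path) throws IOException; }

    /** Stands in for ServiceException from the RPC engine (unchecked wrapper). */
    static class ServiceException extends RuntimeException {
        ServiceException(IOException cause) { super(cause); }
    }

    /** RPC layer: fails once with a wrapped ConnectException, then succeeds. */
    static class RpcEngine implements ClientProtocol {
        int calls = 0;
        public String getFileInfo(String path) {
            if (++calls == 1) throw new ServiceException(new ConnectException("NN down"));
            return "status:" + path;
        }
    }

    /** Translator layer: unwraps ServiceException back into IOException. */
    static class Translator implements ClientProtocol {
        final RpcEngine engine;
        Translator(RpcEngine engine) { this.engine = engine; }
        public String getFileInfo(String path) throws IOException {
            try {
                return engine.getFileInfo(path);
            } catch (ServiceException e) {
                throw (IOException) e.getCause();
            }
        }
    }

    /** Retry layer: retries only on IOException, so it must wrap the Translator.
     *  If it wrapped the RpcEngine directly, it would see ServiceException,
     *  which its policy does not match, and the call would fail immediately. */
    static class Retry implements ClientProtocol {
        final ClientProtocol inner;
        Retry(ClientProtocol inner) { this.inner = inner; }
        public String getFileInfo(String path) throws IOException {
            for (int attempt = 0; ; attempt++) {
                try {
                    return inner.getFileInfo(path);
                } catch (IOException e) {
                    if (attempt >= 2) throw e;  // give up after a few tries
                }
            }
        }
    }
}
```

Composing `new Retry(new Translator(engine))` mirrors the HA wiring that works; the non-HA bug corresponds to composing the layers the other way around.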
[jira] [Updated] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario
[ https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6478: -- Status: Patch Available (was: Open) RemoteException can't be retried properly for non-HA scenario - Key: HDFS-6478 URL: https://issues.apache.org/jira/browse/HDFS-6478 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6478.patch For the HA case, the call stack is DFSClient - RetryInvocationHandler - ClientNamenodeProtocolTranslatorPB - ProtobufRpcEngine. ProtobufRpcEngine throws ServiceException and expects the caller to unwrap it; ClientNamenodeProtocolTranslatorPB is the component that takes care of that. {noformat} at org.apache.hadoop.ipc.Client.call at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke at com.sun.proxy.$Proxy26.getFileInfo at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo at sun.reflect.GeneratedMethodAccessor24.invoke at sun.reflect.DelegatingMethodAccessorImpl.invoke at java.lang.reflect.Method.invoke at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke at com.sun.proxy.$Proxy27.getFileInfo at org.apache.hadoop.hdfs.DFSClient.getFileInfo at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus {noformat} However, for the non-HA case, the call stack is DFSClient - ClientNamenodeProtocolTranslatorPB - RetryInvocationHandler - ProtobufRpcEngine. RetryInvocationHandler gets ServiceException and can't retry properly. 
{noformat} at org.apache.hadoop.ipc.Client.call at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke at com.sun.proxy.$Proxy9.getListing at sun.reflect.NativeMethodAccessorImpl.invoke0 at sun.reflect.NativeMethodAccessorImpl.invoke at sun.reflect.DelegatingMethodAccessorImpl.invoke at java.lang.reflect.Method.invoke at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke at com.sun.proxy.$Proxy9.getListing at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing at org.apache.hadoop.hdfs.DFSClient.listPaths {noformat} Perhaps, we can fix it by having NN wrap RetryInvocationHandler around ClientNamenodeProtocolTranslatorPB and other PBs, instead of the current wrap order. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6399) Add note about setfacl in HDFS permissions guide
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6399: -- Summary: Add note about setfacl in HDFS permissions guide (was: FSNamesystem ACL operations should check isPermissionEnabled) Add note about setfacl in HDFS permissions guide Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6315: - Attachment: (was: HDFS-6315.005.patch) Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6399) Add note about setfacl in HDFS permissions guide
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6399: -- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, thanks Chris Add note about setfacl in HDFS permissions guide Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Chris Nauroth Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6399) FSNamesystem ACL operations should check isPermissionEnabled
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025509#comment-14025509 ] Andrew Wang commented on HDFS-6399: --- +1 thanks chris, will commit shortly. FSNamesystem ACL operations should check isPermissionEnabled Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6257) TestCacheDirectives#testExceedsCapacity fails occasionally
[ https://issues.apache.org/jira/browse/HDFS-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025541#comment-14025541 ] Hudson commented on HDFS-6257: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5668 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5668/]) HDFS-6257. TestCacheDirectives#testExceedsCapacity fails occasionally (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1601473) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCacheDirectives.java TestCacheDirectives#testExceedsCapacity fails occasionally -- Key: HDFS-6257 URL: https://issues.apache.org/jira/browse/HDFS-6257 Project: Hadoop HDFS Issue Type: Test Components: caching Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6257.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1736/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity {code} Error Message: Namenode should not send extra CACHE commands expected:0 but was:2 Stack Trace: java.lang.AssertionError: Namenode should not send extra CACHE commands expected:0 but was:2 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1419) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6399) Add note about setfacl in HDFS permissions guide
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025540#comment-14025540 ] Chris Nauroth commented on HDFS-6399: - Andrew, thank you for reviewing and committing. Add note about setfacl in HDFS permissions guide Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Chris Nauroth Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025516#comment-14025516 ] Haohui Mai commented on HDFS-6315: -- Thanks Jing for the review. I've uploaded the v5 patch to address Jing's comments. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6399) Add note about setfacl in HDFS permissions guide
[ https://issues.apache.org/jira/browse/HDFS-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1402#comment-1402 ] Hudson commented on HDFS-6399: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5669 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5669/]) HDFS-6399. Add note about setfacl in HDFS permissions guide. Contributed by Chris Nauroth. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1601476) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsPermissionsGuide.apt.vm Add note about setfacl in HDFS permissions guide Key: HDFS-6399 URL: https://issues.apache.org/jira/browse/HDFS-6399 Project: Hadoop HDFS Issue Type: Bug Components: documentation, namenode Affects Versions: 2.4.0 Reporter: Charles Lamb Assignee: Chris Nauroth Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6399.1.patch, HDFS-6399.2.patch The ACL operations in FSNamesystem don't currently check isPermissionEnabled before calling checkOwner(). This patch corrects that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025559#comment-14025559 ] Arpit Agarwal commented on HDFS-6482: - {{DFS_DATANODE_NUMBLOCKS_DEFAULT}} is currently 64. I am not sure why the default was set so low. It would be good to know the reason before we change the behavior. It was quite possibly an arbitrary choice. After ~4 million blocks we would start putting more than 256 blocks in each leaf subdirectory. With every 4M blocks, we'd add 256 files to each leaf. I think this is fine since 4 million blocks itself is going to be very unlikely. I recall as late as Vista NTFS directory listings would get noticeably slow with thousands of files per directory. Is there any performance loss with always having three levels of subdirectories, restricting each to 256 children at the most? - Who removes empty subdirectories when blocks are deleted? - Let's avoid suffixing hex numerals to subdir for consistency with the existing naming convention. - StringBuilder looks unnecessary in {{idToBlockDir}}. - We should add a release note stating that {{DFS_DATANODE_NUMBLOCKS_DEFAULT}} is obsolete. The approach looks good and a big +1 for removing LDir. Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
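The {{idToBlockDir}} mapping under review can be sketched as follows. This is a hypothetical helper assuming two directory levels of 8 bits each (up to 256 x 256 leaf directories); the actual patch may choose different constants:

```java
// Hypothetical sketch of an idToBlockDir-style mapping: two directory
// levels derived from bits 8-23 of the block ID. Sequential block IDs
// differ only in the low 8 bits, so runs of up to 256 consecutive IDs
// land in the same leaf directory.
public class BlockDirLayout {
  public static String idToBlockDir(long blockId) {
    int level1 = (int) ((blockId >> 16) & 0xFF);
    int level2 = (int) ((blockId >> 8) & 0xFF);
    return "subdir" + level1 + "/subdir" + level2;
  }
}
```

Under a scheme like this the datanode never splits or merges directories: a replica's path is a pure function of its block ID, which is what lets the LDir machinery go away.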
[jira] [Updated] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6460: Attachment: HDFS-6460.002.patch Hi Andrew, Thanks a lot for the review and the good catch. I'm uploading new revision to address it. To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6315: - Attachment: HDFS-6315.005.patch Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025585#comment-14025585 ] Colin Patrick McCabe commented on HDFS-6382: For the MR strategy, it seems like this could be parallelized fairly easily. For example, if you have 5 MR tasks, you can calculate the hash of each path, and then task 1 can do all the paths that are 0 mod 5, task 2 can do all the paths that are 1 mod 5, and so forth. MR also doesn't introduce extra dependencies since HDFS and MR are packaged together. I don't understand what you mean by "the mapreduce strategy will have additional overheads." What overheads are you foreseeing? It is true that you need to avoid overloading the NameNode. But this is a concern with any approach, not just the MR one. It would be good to see a section on this. I think the simplest way to do it is to rate-limit RPCs to the NameNode to a configurable rate. bq. \[for the standalone daemon\] The major advantage of this approach is that we don’t need any extra work to finish the TTL work, all will be done in the daemon automatically. I don't understand what you mean by this. What will be done automatically? How are you going to implement HA for the standalone daemon? I suppose if all the state is kept in HDFS, you can simply restart it when it fails. However, it seems like you need to checkpoint how far along in the FS you are, so that if you die and later get restarted, you don't have to redo the whole FS scan. This implies reading directories in alphabetical order, or similar. You also need to somehow record when the last scan was, perhaps in a file in HDFS. I don't see a lot of discussion of logging and monitoring in general. How is the user going to become aware that a file was deleted because of a TTL? Or if there is an error during the delete, how will the user know? Logging is one choice here. Creating a file in HDFS is another. The setTtl command seems reasonable. 
Does this need to be an administrator command? HDFS File/Directory TTL --- Key: HDFS-6382 URL: https://issues.apache.org/jira/browse/HDFS-6382 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-TTL-Design.pdf In production environment, we always have scenario like this, we want to backup files on hdfs for some time and then hope to delete these files automatically. For example, we keep only 1 day's logs on local disk due to limited disk space, but we need to keep about 1 month's logs in order to debug program bugs, so we keep all the logs on hdfs and delete logs which are older than 1 month. This is a typical scenario of HDFS TTL. So here we propose that hdfs can support TTL. Following are some details of this proposal: 1. HDFS can support TTL on a specified file or directory 2. If a TTL is set on a file, the file will be deleted automatically after the TTL is expired 3. If a TTL is set on a directory, the child files and directories will be deleted automatically after the TTL is expired 4. The child file/directory's TTL configuration should override its parent directory's 5. A global configuration is needed to configure that whether the deleted files/directories should go to the trash or not 6. A global configuration is needed to configure that whether a directory with TTL should be deleted when it is emptied by TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
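The hash-partitioning idea sketched in the comment above (task i handles the paths whose hash is i mod N) could look like this; the helper names are illustrative only:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of splitting a TTL scan across N tasks by path
// hash, as suggested above. Each path maps to exactly one task, so the
// tasks cover the namespace without coordinating with each other.
public class TtlPartitioner {
  public static int taskFor(String path, int numTasks) {
    // floorMod keeps the result in [0, numTasks) even when
    // hashCode() is negative.
    return Math.floorMod(path.hashCode(), numTasks);
  }

  // Groups paths by owning task; useful for checking the assignment.
  public static Map<Integer, List<String>> partition(List<String> paths,
                                                     int numTasks) {
    Map<Integer, List<String>> buckets = new HashMap<>();
    for (String p : paths) {
      buckets.computeIfAbsent(taskFor(p, numTasks),
          k -> new ArrayList<>()).add(p);
    }
    return buckets;
  }
}
```

A rate limiter in front of the NameNode RPCs, as suggested above, would sit inside each task's scan loop; it is orthogonal to how paths are assigned to tasks.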
[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025608#comment-14025608 ] Hadoop QA commented on HDFS-6379: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649411/jira-HDFS-6379.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-httpfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7062//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7062//console This message is automatically generated. HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025693#comment-14025693 ] Colin Patrick McCabe commented on HDFS-6482: bq. DFS_DATANODE_NUMBLOCKS_DEFAULT is currently 64. I am not sure why the default was set so low. It would be good to know the reason before we change the behavior. It was quite possibly an arbitrary choice. So, back in the really old days (think ext2), there were performance issues for directories with a large number of files (10,000+). See wikipedia's page on ext2 here: http://en.wikipedia.org/wiki/Ext2. The LDir subdirectory mechanism was intended to alleviate this. More recent filesystems like ext4 (and recent revisions of ext3) have what's called directory indices. This basically means that there is an index which allows you to look up a particular entry in a directory in less than O(N) time. This makes having directories with a huge number of entries possible. It's still nice to have multiple directories to avoid overloading {{readdir}} (when we have to do that-- for example, to find a metadata file without knowing its genstamp) and to make inspecting things easier. Plus, it allows us to stay compatible with systems that don't handle giant directories well. bq. After ~4 million blocks we would start putting more than 256 blocks in each leaf subdirectory. With every 4M blocks, we'd add 256 files to each leaf. I think this is fine since 4 million blocks itself is going to be very unlikely. I recall as late as Vista NTFS directory listings would get noticeably slow with thousands of files per directory. Is there any performance loss with always having three levels of subdirectories, restricting each to 256 children at the most? It's an interesting idea, but after all, as you pointed out, even to get to 1,024 blocks per subdirectory (which still isn't thousands but is a single thousand) under James' scheme would require 16 million blocks. 
At that point, it seems like there will be other problems. We can always evolve the directory and metadata naming structure again once 16 million blocks is on the horizon (and we probably will have to do other things too, like investigate off-heap memory storage) Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025713#comment-14025713 ] Daryn Sharp commented on HDFS-6315: --- Catching up from summit, will look at this soon. It's sadly conflicting with the single path resolution patch I keep working on. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025740#comment-14025740 ] Jing Zhao commented on HDFS-6315: - bq. It's sadly conflicting with the single path resolution patch I keep working on. Thanks for the comments, [~daryn]. This patch only makes limited changes in FSDirectory. Most changes just move the FSEditLog#logxxx call into FSNamesystem. Thus the rebase should not be complicated I guess. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6330) Move mkdirs() to FSNamesystem
[ https://issues.apache.org/jira/browse/HDFS-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025730#comment-14025730 ] Jing Zhao commented on HDFS-6330: - The patch looks good to me. Some minors:
# Let's use this chance to remove the empty javadoc of FSDirectory#normalizePath
# The following change may be unnecessary?
{code}
-    blockManager.getDatanodeManager().clearPendingCachingCommands();
-    blockManager.getDatanodeManager().setShouldSendCachingCommands(false);
-    // Don't want to keep replication queues when not in Active.
-    blockManager.clearQueues();
+    if (blockManager != null) {
+      blockManager.getDatanodeManager().clearPendingCachingCommands();
+      blockManager.getDatanodeManager().setShouldSendCachingCommands(false);
+      // Don't want to keep replication queues when not in Active.
+      blockManager.clearQueues();
+    }
{code}
# Nit: Some lines exceed the 80 character limit (e.g., mkdirsRecursively and addSymlink).
# We may need to update the log information in mkdirsRecursively since it's no longer a FSDirectory call.
Move mkdirs() to FSNamesystem - Key: HDFS-6330 URL: https://issues.apache.org/jira/browse/HDFS-6330 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6330.000.patch, HDFS-6330.001.patch Currently mkdir() automatically creates all ancestors for a directory. This is implemented in FSDirectory, by calling unprotectedMkdir() along the path. This jira proposes to move the function to FSNamesystem to simplify the primitive that FSDirectory needs to provide. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025761#comment-14025761 ] Kihwal Lee commented on HDFS-6482: -- BlockIDs are sequential nowadays. With the proposed block distribution method, leaf dirs can get severely unbalanced, especially in smaller clusters. Besides the cost of looking up entries in a directory, directory lock contention can become high and hurt performance if many files are created and read from a small set of directories. I think limiting the number to 64 kind of imposed a cap on how contentious it can be. We might do better by more evenly distributing blocks. Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025773#comment-14025773 ] Hadoop QA commented on HDFS-6315: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649417/HDFS-6315.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7061//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7061//console This message is automatically generated. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. 
This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025783#comment-14025783 ] Daryn Sharp commented on HDFS-6315: --- Maybe it's ok, but I'll apply the patch and comment in the morning. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. This jira proposes to separate the responsibility of implementing the namespace and providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implement durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025786#comment-14025786 ] James Thomas commented on HDFS-6482: Thanks for the review, Arpit, and thanks for the follow-up, Colin. I want to clarify one thing -- the numbers 4 million and 16 million that both of you mention are, as far as I understand, actually numbers of blocks for the ENTIRE cluster, not just a single DN. Suppose we had a cluster of 16 million blocks (with sequential block IDs), we could in theory have a single DN with a directory as large as 1024 entries, if we got unlucky with the assignment of blocks to DNs. Assuming uniform distribution of blocks across the DNs available in the cluster and a maximum # of blocks per DN of 2^24, we have an expected # of blocks per directory of 256. I don't know how accurate this assumption is. Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025802#comment-14025802 ] James Thomas commented on HDFS-6482: Kihwal, we were considering using some sort of deterministic probing (as in hash tables) to find less full directories if the initial directory for a block is full. Do you think the cost (and additional complexity) of this sort of scheme is justified given the relatively low probability (given the uniform block distribution assumption, at least) of directory blowup? Additionally, I want to note that if the total number of blocks in the cluster is N, N/2^16 is a strict upper bound on the number of blocks in a single directory on any DN, assuming completely sequential block IDs. So for a small cluster we can't see any blowup. Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.2#6252)
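The N/2^16 bound in the comment above is easy to check numerically, assuming the two-level, 8-bits-per-level layout (2^16 leaf directories) that the bound implies; the helper names are hypothetical:

```java
// Quick arithmetic for the discussion above: with 2^16 leaf
// directories and fully sequential block IDs, N total blocks put at
// most ceil(N / 2^16) blocks into any one leaf directory.
public class LayoutMath {
  static final long LEAF_DIRS = 1L << 16;  // 256 * 256 leaves

  public static long maxBlocksPerLeaf(long totalBlocks) {
    // Ceiling division: sequential IDs fill the low bits first, so
    // leaves fill up round-robin and differ by at most one block.
    return (totalBlocks + LEAF_DIRS - 1) / LEAF_DIRS;
  }
}
```

For a DN holding 2^24 blocks this gives 256 blocks per leaf, matching the expected value James quotes under the uniform-distribution assumption.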
[jira] [Commented] (HDFS-6315) Decouple recording edit logs from FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025803#comment-14025803 ] Hadoop QA commented on HDFS-6315: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649417/HDFS-6315.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7063//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7063//console This message is automatically generated. Decouple recording edit logs from FSDirectory - Key: HDFS-6315 URL: https://issues.apache.org/jira/browse/HDFS-6315 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6315.000.patch, HDFS-6315.001.patch, HDFS-6315.002.patch, HDFS-6315.003.patch, HDFS-6315.004.patch, HDFS-6315.005.patch Currently both FSNamesystem and FSDirectory record edit logs. This design requires both FSNamesystem and FSDirectory to be tightly coupled together to implement a durable namespace. 
This jira proposes to separate the responsibility of implementing the namespace from that of providing durability with edit logs. Specifically, FSDirectory implements the namespace (which should have no edit log operations), and FSNamesystem implements durability by recording the edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2006) ability to support storing extended attributes per file
[ https://issues.apache.org/jira/browse/HDFS-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025823#comment-14025823 ] Chris Nauroth commented on HDFS-2006: - I agree with Andrew on the plan for merging to branch-2. Thank you, Uma. ability to support storing extended attributes per file --- Key: HDFS-2006 URL: https://issues.apache.org/jira/browse/HDFS-2006 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: HDFS XAttrs (HDFS-2006) Reporter: dhruba borthakur Assignee: Yi Liu Fix For: 3.0.0 Attachments: ExtendedAttributes.html, HDFS-2006-Merge-1.patch, HDFS-2006-Merge-2.patch, HDFS-XAttrs-Design-1.pdf, HDFS-XAttrs-Design-2.pdf, HDFS-XAttrs-Design-3.pdf, Test-Plan-for-Extended-Attributes-1.pdf, xattrs.1.patch, xattrs.patch It would be nice if HDFS provides a feature to store extended attributes for files, similar to the one described here: http://en.wikipedia.org/wiki/Extended_file_attributes. The challenge is that it has to be done in such a way that a site not using this feature does not waste precious memory resources in the namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6478) RemoteException can't be retried properly for non-HA scenario
[ https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025843#comment-14025843 ] Hadoop QA commented on HDFS-6478: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649415/HDFS-6478.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7064//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7064//console This message is automatically generated. RemoteException can't be retried properly for non-HA scenario - Key: HDFS-6478 URL: https://issues.apache.org/jira/browse/HDFS-6478 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6478.patch For the HA case, the call stack is DFSClient - RetryInvocationHandler - ClientNamenodeProtocolTranslatorPB - ProtobufRpcEngine. ProtobufRpcEngine throws ServiceException and expects the caller to unwrap it; ClientNamenodeProtocolTranslatorPB is the component that takes care of that. 
{noformat}
at org.apache.hadoop.ipc.Client.call
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
at com.sun.proxy.$Proxy26.getFileInfo
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
at sun.reflect.GeneratedMethodAccessor24.invoke
at sun.reflect.DelegatingMethodAccessorImpl.invoke
at java.lang.reflect.Method.invoke
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
at com.sun.proxy.$Proxy27.getFileInfo
at org.apache.hadoop.hdfs.DFSClient.getFileInfo
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus
{noformat}
However, for the non-HA case, the call stack is DFSClient - ClientNamenodeProtocolTranslatorPB - RetryInvocationHandler - ProtobufRpcEngine. RetryInvocationHandler gets ServiceException and can't retry properly.
{noformat}
at org.apache.hadoop.ipc.Client.call
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
at com.sun.proxy.$Proxy9.getListing
at sun.reflect.NativeMethodAccessorImpl.invoke0
at sun.reflect.NativeMethodAccessorImpl.invoke
at sun.reflect.DelegatingMethodAccessorImpl.invoke
at java.lang.reflect.Method.invoke
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
at com.sun.proxy.$Proxy9.getListing
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing
at org.apache.hadoop.hdfs.DFSClient.listPaths
{noformat}
Perhaps we can fix it by having NN wrap RetryInvocationHandler around ClientNamenodeProtocolTranslatorPB and other PBs, instead of the current wrap order. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking
[ https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025855#comment-14025855 ] Andrew Wang commented on HDFS-6395: --- I should have realized this earlier, considering I worked on something pretty similar with the fs-limits and the edit log before. I agree that it's difficult to do this without some serious code gymnastics, so let's just table the entire thing for now. Please resolve this if you agree, thanks again [~hitliuyi]. Assorted improvements to xattr limit checking - Key: HDFS-6395 URL: https://issues.apache.org/jira/browse/HDFS-6395 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Yi Liu Attachments: HDFS-6395.patch It'd be nice to print messages during fsimage and editlog loading if we hit either the # of xattrs per inode or the xattr size limits. We should also consider making the # of xattrs limit only apply to the user namespace, or to each namespace separately, to prevent users from locking out access to other namespaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025943#comment-14025943 ] Alejandro Abdelnur commented on HDFS-6379: -- [~michaelbyoder], nice work. Would you mind adding a testcase where ACLs are disabled in HDFS to verify that being disabled does not break file status and list status? After that I think it is ready to go. HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025944#comment-14025944 ] Hadoop QA commented on HDFS-6460: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649427/HDFS-6460.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.ha.TestZKFailoverControllerStress {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7066//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7066//console This message is automatically generated. To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025956#comment-14025956 ] Mike Yoder commented on HDFS-6379: -- Looks like there's another me out there! I'm [~yoderme], not that other...uh...guy with my name. :-) [~tucu00], I totally agree with that test case, but can you send a quick pointer as to how to do that in an automated fashion? All the test cases I've seen fire up the server part once at the start and leave it running for all tests. Any way to change the server conf dynamically? Thanks, -Mike HTTPFS - Implement ACLs support --- Key: HDFS-6379 URL: https://issues.apache.org/jira/browse/HDFS-6379 Project: Hadoop HDFS Issue Type: Bug Reporter: Alejandro Abdelnur Assignee: Mike Yoder Fix For: 2.4.0 Attachments: jira-HDFS-6379.patch HDFS-4685 added ACLs support to WebHDFS but missed adding them to HttpFS. This JIRA is for such. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6504) NFS: invalid Keytab/principal entry should shutdown nfs server
Yesha Vora created HDFS-6504: Summary: NFS: invalid Keytab/principal entry should shutdown nfs server Key: HDFS-6504 URL: https://issues.apache.org/jira/browse/HDFS-6504 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Yesha Vora An invalid value in 'dfs.nfs.keytab.file' or 'dfs.nfs.kerberos.principal' should shut down the NFS server. Currently NFS does not throw any error or shut down if an invalid value is entered in either of these properties. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025964#comment-14025964 ] Yongjun Zhang commented on HDFS-6460: - The failed test is irrelevant, and it was reported as HADOOP-10668. To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025970#comment-14025970 ] Andrew Wang commented on HDFS-6460: --- +1 will commit shortly, thanks Yongjun To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not
[ https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025972#comment-14025972 ] Brandon Li commented on HDFS-6439: -- [~atm], are you still working on this? If you are distracted by other tasks, I will upload a new patch based on yours. NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not -- Key: HDFS-6439 URL: https://issues.apache.org/jira/browse/HDFS-6439 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Brandon Li Assignee: Aaron T. Myers Attachments: HDFS-6439.patch, HDFS-6439.patch, linux-nfs-disallow-request-from-nonsecure-port.pcapng, mount-nfs-requests.pcapng As discussed in HDFS-6406, this JIRA is to track the following updates: 1. Port monitoring is the feature name with the traditional NFS server, and we may want to make the config property (along with the related variable allowInsecurePorts) something like dfs.nfs.port.monitoring. 2. According to RFC2623 (http://www.rfc-editor.org/rfc/rfc2623.txt): {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT reject NFS requests to the NULL procedure (procedure number 0). See subsection 2.3.1, NULL procedure for a complete explanation. {quote} I do notice that NFS clients (most of the time) send mount NULL and nfs NULL from a non-privileged port. If we deny the NULL call in mountd or the nfs server, the client can't mount the export even as user root. 3. It would be nice to have the user guide updated for the port monitoring feature. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not
[ https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025977#comment-14025977 ] Hadoop QA commented on HDFS-6439: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646408/linux-nfs-disallow-request-from-nonsecure-port.pcapng against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7067//console This message is automatically generated. NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not -- Key: HDFS-6439 URL: https://issues.apache.org/jira/browse/HDFS-6439 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Brandon Li Assignee: Aaron T. Myers Attachments: HDFS-6439.patch, HDFS-6439.patch, linux-nfs-disallow-request-from-nonsecure-port.pcapng, mount-nfs-requests.pcapng As discussed in HDFS-6406, this JIRA is to track the following updates: 1. Port monitoring is the feature name with the traditional NFS server, and we may want to make the config property (along with the related variable allowInsecurePorts) something like dfs.nfs.port.monitoring. 2. According to RFC2623 (http://www.rfc-editor.org/rfc/rfc2623.txt): {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT reject NFS requests to the NULL procedure (procedure number 0). See subsection 2.3.1, NULL procedure for a complete explanation. {quote} I do notice that NFS clients (most of the time) send mount NULL and nfs NULL from a non-privileged port. If we deny the NULL call in mountd or the nfs server, the client can't mount the export even as user root. 3. It would be nice to have the user guide updated for the port monitoring feature. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6460: -- Summary: Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance (was: To ignore stale/decommissioned nodes in NetworkTopology#pseudoSortByDistance) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025980#comment-14025980 ] Colin Patrick McCabe commented on HDFS-6482:
bq. Suppose we had a cluster of 16 million blocks (with sequential block IDs); then we could in theory have a single DN with a directory as large as 1024 entries, if we got unlucky with the assignment of blocks to DNs.
I don't think this calculation is right. Even if all the blocks end up on a single DN (maximally unbalanced), in a 16 million block cluster, you have (16 * 1024 * 1024) / (256 * 256) = 256 entries per directory. To confirm this calculation, I ran this test program:
{code}
#include <inttypes.h>
#include <stdio.h>

#define MAX_A 256
#define MAX_B 256

uint64_t dir_entries[MAX_A][MAX_B];

int main(void)
{
  uint64_t i, j, l, a, b, c;
  uint64_t max = (16LL * 1024LL * 1024LL);

  for (i = 0; i < max; i++) {
    l = (i & 0x00ffLL);
    a = (i & 0xff00LL) >> 8LL;
    b = (i & 0x00ff0000LL) >> 16LL;
    c = (i & 0xff000000LL) >> 16LL;
    c |= l;
    //printf("%02" PRIx64 "/%02" PRIx64 "/%012" PRIx64 "\n", a, b, c);
    dir_entries[a][b]++;
  }
  max = 0;
  for (i = 0; i < MAX_A; i++) {
    for (j = 0; j < MAX_B; j++) {
      if (max < dir_entries[i][j]) {
        max = dir_entries[i][j];
      }
    }
  }
  printf("max entries per directory = %" PRId64 "\n", max);
  return 0;
}
{code}
bq. we were considering using some sort of deterministic probing (as in hash tables) to find less full directories if the initial directory for a block is full...
I don't think probing is a good idea. It's going to slow things down in the common case when we're reading a block. Maybe we should add another layer in the hierarchy so that we know we won't get big directories even on huge clusters. 
Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025981#comment-14025981 ] Andrew Wang commented on HDFS-6460: --- Committed this to trunk. Yongjun, do you mind prepping a branch-2 patch too? There's another test that needs to be updated. Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025992#comment-14025992 ] Hudson commented on HDFS-6460: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5671 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5671/]) HDFS-6460. Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance. Contributed by Yongjun Zhang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1601535) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestNetworkTopologyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6470) TestBPOfferService.testBPInitErrorHandling is flaky
[ https://issues.apache.org/jira/browse/HDFS-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6470: -- Attachment: HDFS-6470.patch It seems the test has the following issues. 1. It asserts that the size of BPServiceActor is 2 after BPOfferService started. One of the BPServiceActors could have shut down due to initBlockPool failure by the time the assert is called. 2. It assumes the first BPServiceActor is healthy and uses that for blockReport verification. It is possible the second BPServiceActor is healthy. The patch moves the size check before BPOfferService starts. In addition, as long as one of the BPServiceActors can send a blockReport, the test is considered passed. TestBPOfferService.testBPInitErrorHandling is flaky --- Key: HDFS-6470 URL: https://issues.apache.org/jira/browse/HDFS-6470 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Andrew Wang Attachments: HDFS-6470.patch Saw some test flakage in a test-patch run, stacktrace:
{code}
java.lang.AssertionError: expected:<2> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBPInitErrorHandling(TestBPOfferService.java:334)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6470) TestBPOfferService.testBPInitErrorHandling is flaky
[ https://issues.apache.org/jira/browse/HDFS-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6470: -- Status: Patch Available (was: Open) TestBPOfferService.testBPInitErrorHandling is flaky --- Key: HDFS-6470 URL: https://issues.apache.org/jira/browse/HDFS-6470 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Andrew Wang Attachments: HDFS-6470.patch Saw some test flakage in a test-patch run, stacktrace:
{code}
java.lang.AssertionError: expected:<2> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBPInitErrorHandling(TestBPOfferService.java:334)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond
[ https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6493 started by Juan Yu. Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond -- Key: HDFS-6493 URL: https://issues.apache.org/jira/browse/HDFS-6493 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Trivial Attachments: HDFS-6493.001.patch Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, the delay will be at least 30 minutes or even hours. It's not very user-friendly to use milliseconds when it's likely measured in hours. I suggest making the following changes: 1. change the unit of this config to seconds 2. rename the config key from dfs.namenode.startup.delay.block.deletion.ms to dfs.namenode.startup.delay.block.deletion.sec 3. add the default value to hdfs-default.xml; what's a reasonable value: 30 minutes, or one hour? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond
[ https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6493: -- Attachment: HDFS-6493.001.patch Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond -- Key: HDFS-6493 URL: https://issues.apache.org/jira/browse/HDFS-6493 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Trivial Attachments: HDFS-6493.001.patch Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, the delay will be at least 30 minutes or even hours. It's not very user-friendly to use milliseconds when it's likely measured in hours. I suggest making the following changes: 1. change the unit of this config to seconds 2. rename the config key from dfs.namenode.startup.delay.block.deletion.ms to dfs.namenode.startup.delay.block.deletion.sec 3. add the default value to hdfs-default.xml; what's a reasonable value: 30 minutes, or one hour? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6493) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond
[ https://issues.apache.org/jira/browse/HDFS-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6493: -- Attachment: (was: HDFS-6493.001.patch) Propose to change dfs.namenode.startup.delay.block.deletion to second instead of millisecond -- Key: HDFS-6493 URL: https://issues.apache.org/jira/browse/HDFS-6493 Project: Hadoop HDFS Issue Type: Bug Reporter: Juan Yu Assignee: Juan Yu Priority: Trivial Based on the discussion in https://issues.apache.org/jira/browse/HDFS-6186, the delay will be at least 30 minutes or even hours. It's not very user-friendly to use milliseconds when it's likely measured in hours. I suggest making the following changes: 1. change the unit of this config to seconds 2. rename the config key from dfs.namenode.startup.delay.block.deletion.ms to dfs.namenode.startup.delay.block.deletion.sec 3. add the default value to hdfs-default.xml; what's a reasonable value: 30 minutes, or one hour? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6502) incorrect description in distcp2 document
[ https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned HDFS-6502: --- Assignee: Akira AJISAKA incorrect description in distcp2 document - Key: HDFS-6502 URL: https://issues.apache.org/jira/browse/HDFS-6502 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Akira AJISAKA In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite The first statement of the Update and Overwrite section says: {quote} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files even if they exist at the source, or have the same contents. {quote} The Command Line Options table says : {quote} -overwrite: Overwrite destination -update: Overwrite if src size different from dst size {quote} Based on the implementation, making the following modification would be more accurate: The first statement of the Update and Overwrite section: {code} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files if they exist at the target. {code} The Command Line Options table: {code} -overwrite: Overwrite destination -update: Overwrite destination if source and destination have different contents {code} Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6502) incorrect description in distcp2 document
[ https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6502: Attachment: HDFS-6502.patch Thanks [~yzhangal] for the report. Attaching a patch for trunk and branch-2. incorrect description in distcp2 document - Key: HDFS-6502 URL: https://issues.apache.org/jira/browse/HDFS-6502 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Akira AJISAKA Attachments: HDFS-6502.patch In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite The first statement of the Update and Overwrite section says: {quote} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files even if they exist at the source, or have the same contents. {quote} The Command Line Options table says : {quote} -overwrite: Overwrite destination -update: Overwrite if src size different from dst size {quote} Based on the implementation, making the following modification would be more accurate: The first statement of the Update and Overwrite section: {code} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files if they exist at the target. {code} The Command Line Options table: {code} -overwrite: Overwrite destination -update: Overwrite destination if source and destination have different contents {code} Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6460: Attachment: HDFS-6460-branch2.001.patch Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6502) incorrect description in distcp2 document
[ https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6502: Labels: newbie (was: ) Target Version/s: 2.5.0 Affects Version/s: (was: 2.4.0) 2.5.0 1.2.1 Status: Patch Available (was: Open) incorrect description in distcp2 document - Key: HDFS-6502 URL: https://issues.apache.org/jira/browse/HDFS-6502 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 1.2.1, 2.5.0 Reporter: Yongjun Zhang Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-6502.patch In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite The first statement of the Update and Overwrite section says: {quote} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files even if they exist at the source, or have the same contents. {quote} The Command Line Options table says : {quote} -overwrite: Overwrite destination -update: Overwrite if src size different from dst size {quote} Based on the implementation, making the following modification would be more accurate: The first statement of the Update and Overwrite section: {code} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files if they exist at the target. {code} The Command Line Options table: {code} -overwrite: Overwrite destination -update: Overwrite destination if source and destination have different contents {code} Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026024#comment-14026024 ] Yongjun Zhang commented on HDFS-6460: - Many thanks Andrew! Just uploaded a patch for branch-2, the change is in TestHdfsNetworkTopologyWithNodeGroup.java as you mentioned. Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6439) NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not
[ https://issues.apache.org/jira/browse/HDFS-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026027#comment-14026027 ] Aaron T. Myers commented on HDFS-6439: -- Definitely don't let me hold you up if you'd like to work on a patch, [~brandonli]. It'd be much appreciated, and I'd be happy to review it. NFS should not reject NFS requests to the NULL procedure whether port monitoring is enabled or not -- Key: HDFS-6439 URL: https://issues.apache.org/jira/browse/HDFS-6439 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Brandon Li Assignee: Aaron T. Myers Attachments: HDFS-6439.patch, HDFS-6439.patch, linux-nfs-disallow-request-from-nonsecure-port.pcapng, mount-nfs-requests.pcapng As discussed in HDFS-6406, this JIRA is to track the following updates: 1. Port monitoring is the feature name with traditional NFS servers, and we may want to make the config property (along with the related variable allowInsecurePorts) something like dfs.nfs.port.monitoring. 2. According to RFC2623 (http://www.rfc-editor.org/rfc/rfc2623.txt): {quote}Whether port monitoring is enabled or not, NFS servers SHOULD NOT reject NFS requests to the NULL procedure (procedure number 0). See subsection 2.3.1, NULL procedure for a complete explanation. {quote} I do notice that NFS clients (most of the time) send mount NULL and nfs NULL from a non-privileged port. If we deny NULL calls in mountd or the nfs server, the client can't mount the export even as user root. 3. It would be nice to have the user guide updated for the port monitoring feature. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6395) Assorted improvements to xattr limit checking
[ https://issues.apache.org/jira/browse/HDFS-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026032#comment-14026032 ] Andrew Wang commented on HDFS-6395: --- Woops, my bad, I forgot that this patch also fixes the # limit to not apply to the non-user namespaces. I had a few comments: - Would be nice to test that the system namespace isn't affected by these limits, I guess reach into FSNamesystem or FSDirectory via @VisibleForTesting methods. - Let's remove the prints when the limits hit their max, since I think that was a misunderstanding of Chris' comment about printing. Thanks Yi! Assorted improvements to xattr limit checking - Key: HDFS-6395 URL: https://issues.apache.org/jira/browse/HDFS-6395 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Yi Liu Attachments: HDFS-6395.patch It'd be nice to print messages during fsimage and editlog loading if we hit either the # of xattrs per inode or the xattr size limits. We should also consider making the # of xattrs limit only apply to the user namespace, or to each namespace separately, to prevent users from locking out access to other namespaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6460) Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance
[ https://issues.apache.org/jira/browse/HDFS-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026039#comment-14026039 ] Hadoop QA commented on HDFS-6460: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649510/HDFS-6460-branch2.001.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7069//console This message is automatically generated. Ignore stale and decommissioned nodes in NetworkTopology#sortByDistance --- Key: HDFS-6460 URL: https://issues.apache.org/jira/browse/HDFS-6460 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Minor Attachments: HDFS-6460-branch2.001.patch, HDFS-6460.001.patch, HDFS-6460.002.patch Per discussion in HDFS-6268, filing this jira as a follow-up, so that we can improve the sorting result and save a bit runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026047#comment-14026047 ] Zesheng Wu commented on HDFS-6382: -- Thanks [~cmccabe] for your feedback. bq. For the MR strategy, it seems like this could be parallelized fairly easily. For example, if you have 5 MR tasks, you can calculate the hash of each path, and then task 1 can do all the paths that are 0 mod 5, task 2 can do all the paths that are 1 mod 5, and so forth. MR also doesn't introduce extra dependencies since HDFS and MR are packaged together. Do you mean that we scan the whole namespace first and then split it into 5 pieces according to the hash of each path? Why don't we just complete the work during the first scan? If I misunderstand your meaning, please point it out. bq. I don't understand what you mean by the mapreduce strategy will have additional overheads. What overheads are you foreseeing? Possible overheads: starting a mapreduce job needs to split the input, start an AppMaster, and collect results from random machines (perhaps 'overheads' is not the proper word here). bq. I don't understand what you mean by this. What will be done automatically? Here automatically means we do not have to rely on external tools; the daemon itself can manage the work well. bq. How are you going to implement HA for the standalone daemon? Good point. As you suggested, one approach is to save the state in HDFS and simply restart the daemon when it fails. But managing the state is complex work, and I am considering how to simplify this. One possible simpler approach is to consider the daemon stateless and simply restart it when it fails. We needn't checkpoint, and can just scan from the beginning when it restarts. Because we can require that the work the daemon does is idempotent, starting from the beginning will be harmless. 
Possible drawbacks of the latter approach are that it may waste some time and may delay the work, but these are acceptable. bq. I don't see a lot of discussion of logging and monitoring in general. How is the user going to become aware that a file was deleted because of a TTL? Or if there is an error during the delete, how will the user know? For simplicity, in the initial version we will use logs to record which files/directories are deleted by TTL, and any errors during the deletion process. bq. Does this need to be an administrator command? It doesn't need to be an administrator command; users can only setTtl on files/directories they have write permission on, and can only getTtl on files/directories they have read permission on. HDFS File/Directory TTL --- Key: HDFS-6382 URL: https://issues.apache.org/jira/browse/HDFS-6382 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-TTL-Design.pdf In production environments we often have scenarios like this: we want to back up files on HDFS for some time and then delete them automatically. For example, we keep only 1 day's logs on local disk due to limited disk space, but we need to keep about 1 month's logs in order to debug program bugs, so we keep all the logs on HDFS and delete logs which are older than 1 month. This is a typical scenario for HDFS TTL. So here we propose that HDFS support TTL. Following are some details of this proposal: 1. HDFS can support TTL on a specified file or directory 2. If a TTL is set on a file, the file will be deleted automatically after the TTL is expired 3. If a TTL is set on a directory, the child files and directories will be deleted automatically after the TTL is expired 4. The child file/directory's TTL configuration should override its parent directory's 5. 
A global configuration is needed to configure that whether the deleted files/directories should go to the trash or not 6. A global configuration is needed to configure that whether a directory with TTL should be deleted when it is emptied by TTL mechanism or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
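The hash-mod-N parallelization quoted in the comment above (each of N tasks handles the paths whose hash is congruent to its index mod N) can be sketched as follows; the class and method names are illustrative, not from the HDFS-6382 design doc:

```java
// Hedged sketch of splitting TTL scanning work across N workers by
// hashing each path. Names are assumptions for illustration only.
public class TtlPartitioner {
    // Returns the index of the worker responsible for this path.
    static int workerFor(String path, int numWorkers) {
        // Math.floorMod keeps the result in [0, numWorkers) even when
        // hashCode() is negative, so every path maps to exactly one worker.
        return Math.floorMod(path.hashCode(), numWorkers);
    }
}
```

Because the assignment is a pure function of the path, a restarted (stateless) worker recomputes the same partition, which fits the idempotent-restart HA approach discussed above.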
[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations
[ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026068#comment-14026068 ] Guo Ruijing commented on HDFS-6489: --- For example, existing behavior: 1. create a 60M file with preferred block size 64M. 2. append 10 bytes (disk utilization is increased by 60M + 10 bytes, totally 120M + 10 bytes) 3. append 10 bytes (disk utilization is increased by 60M + 20 bytes, totally 120M + 30 bytes) 4. append 10 bytes (disk utilization is increased by 60M + 30 bytes, totally 180M + 60bytes) expected behavior: 1. create a 60M file with preferred block size 64M. 2. append 10 bytes (disk utilization is increased 10 bytes, totally 60M + 10 bytes) 3. append 10 bytes (disk utilization is increased 10 bytes, totally 60M + 20 bytes) 4. append 10 bytes (disk utilization is increased 10 bytes, totally 60M + 30 bytes) DFS Used space is not correct computed on frequent append operations Key: HDFS-6489 URL: https://issues.apache.org/jira/browse/HDFS-6489 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.2.0 Reporter: stanley shi The current implementation of the Datanode will increase the DFS used space on each block write operation. This is correct in most scenarios (creating a new file), but sometimes it behaves incorrectly (appending small data to a large block). For example, I have a file with only one block (say, 60M). Then I try to append to it very frequently, but each time I append only 10 bytes. Then on each append, dfs used will be increased by the length of the block (60M), not the actual data length (10 bytes). Consider a scenario where I use many clients to append concurrently to a large number of files (1000+); assume the block size is 32M (half of the default value), then the dfs used will be increased 1000*32M = 32G on each append to the files, but actually I only write 10K bytes; this will cause the datanode to report insufficient disk space on data write. 
{quote}2014-06-04 15:27:34,719 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, FINALIZED{quote} But the actual disk usage: {quote} [root@hdsh143 ~]# df -h FilesystemSize Used Avail Use% Mounted on /dev/sda3 16G 2.9G 13G 20% / tmpfs 1.9G 72K 1.9G 1% /dev/shm /dev/sda1 97M 32M 61M 35% /boot {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6502) incorrect description in distcp2 document
[ https://issues.apache.org/jira/browse/HDFS-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026069#comment-14026069 ] Hadoop QA commented on HDFS-6502: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649511/HDFS-6502.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7070//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7070//console This message is automatically generated. 
incorrect description in distcp2 document - Key: HDFS-6502 URL: https://issues.apache.org/jira/browse/HDFS-6502 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 1.2.1, 2.5.0 Reporter: Yongjun Zhang Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-6502.patch In http://hadoop.apache.org/docs/r1.2.1/distcp2.html#UpdateAndOverwrite The first statement of the Update and Overwrite section says: {quote} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files even if they exist at the source, or have the same contents. {quote} The Command Line Options table says : {quote} -overwrite: Overwrite destination -update: Overwrite if src size different from dst size {quote} Based on the implementation, making the following modification would be more accurate: The first statement of the Update and Overwrite section: {code} -update is used to copy files from source that don't exist at the target, or have different contents. -overwrite overwrites target-files if they exist at the target. {code} The Command Line Options table: {code} -overwrite: Overwrite destination -update: Overwrite destination if source and destination have different contents {code} Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6505) Can not close file due to last block is marked as corrupt
Gordon Wang created HDFS-6505: - Summary: Can not close file due to last block is marked as corrupt Key: HDFS-6505 URL: https://issues.apache.org/jira/browse/HDFS-6505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Gordon Wang After appending to a file, the client could not close it, because the namenode could not complete the last block of the file. The UC status of the last block remained COMMITTED and never changed. The namenode log was like this. {code} INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1073741920_13948{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.28.1.2:50010|RBW]]} has not reached minimal replication 1 {code} After going through the namenode log, I found an entry like this {code} INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1073741920 added as corrupt on 172.28.1.2:50010 by sdw3/172.28.1.3 because client machine reported it {code} But actually, the last block was finished successfully on the datanode, because I could find this log on the datanode {code} INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DataTransfer: Transmitted BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13808 (numBytes=50120352) to /172.28.1.3:50010 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.28.1.2:36860, dest: /172.28.1.2:50010, bytes: 51686616, op: HDFS_WRITE, cliID: libhdfs3_client_random_741511239_count_1_pid_215802_tid_140085714196576, offset: 0, srvID: DS-2074102060-172.28.1.2-50010-1401432768690, blockid: BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, duration: 189226453336 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, type=LAST_IN_PIPELINE, downstreams=0:[] terminating {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6505) Can not close file due to last block is marked as corrupt
[ https://issues.apache.org/jira/browse/HDFS-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026085#comment-14026085 ] Gordon Wang commented on HDFS-6505: --- This issue causes the last block to be marked missing and the file to be corrupted, but actually, the data on the DataNode is correct. I went through the code, and I think a safety check is missing when the namenode receives a bad block report from a datanode. See the following code snippet in the namenode's BlockManager {code} public void findAndMarkBlockAsCorrupt(final ExtendedBlock blk, final DatanodeInfo dn, String storageID, String reason) throws IOException { assert namesystem.hasWriteLock(); final BlockInfo storedBlock = getStoredBlock(blk.getLocalBlock()); if (storedBlock == null) { // Check if the replica is in the blockMap, if not // ignore the request for now. This could happen when BlockScanner // thread of Datanode reports bad block before Block reports are sent // by the Datanode on startup blockLog.info("BLOCK* findAndMarkBlockAsCorrupt: " + blk + " not found"); return; } markBlockAsCorrupt(new BlockToMarkCorrupt(storedBlock, reason, Reason.CORRUPTION_REPORTED), dn, storageID); } {code} We should check the timestamp (generation stamp) of the reported block against the stored block. If the reported block has a smaller timestamp, it should not be marked as corrupt. It is possible for the reported block to have a smaller timestamp when the client has done some pipeline recovery work. Can not close file due to last block is marked as corrupt - Key: HDFS-6505 URL: https://issues.apache.org/jira/browse/HDFS-6505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Gordon Wang After appending to a file, the client could not close it, because the namenode could not complete the last block of the file. The UC status of the last block remained COMMITTED and never changed. The namenode log was like this. 
{code} INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1073741920_13948{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.28.1.2:50010|RBW]]} has not reached minimal replication 1 {code} After going through the namenode log, I found an entry like this {code} INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1073741920 added as corrupt on 172.28.1.2:50010 by sdw3/172.28.1.3 because client machine reported it {code} But actually, the last block was finished successfully on the datanode, because I could find this log on the datanode {code} INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DataTransfer: Transmitted BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13808 (numBytes=50120352) to /172.28.1.3:50010 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.28.1.2:36860, dest: /172.28.1.2:50010, bytes: 51686616, op: HDFS_WRITE, cliID: libhdfs3_client_random_741511239_count_1_pid_215802_tid_140085714196576, offset: 0, srvID: DS-2074102060-172.28.1.2-50010-1401432768690, blockid: BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, duration: 189226453336 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-649434182-172.28.1.251-1401432753616:blk_1073741920_13948, type=LAST_IN_PIPELINE, downstreams=0:[] terminating {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
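The safety check proposed in the comment above, ignoring corrupt-block reports that carry an older generation stamp than the stored block, can be sketched like this; the class and method are assumptions for illustration, not the actual BlockManager API:

```java
// Hedged sketch of the staleness check proposed for HDFS-6505.
// Names are illustrative; the real fix would live in BlockManager.
public class CorruptReportCheck {
    // Returns true if the corrupt-block report should be ignored:
    // a smaller generation stamp means the reporter saw a pre-recovery
    // replica, so marking the current block corrupt would be wrong.
    static boolean isStaleReport(long reportedGenStamp, long storedGenStamp) {
        return reportedGenStamp < storedGenStamp;
    }
}
```

In the logs above, the transferred replica has generation stamp 13808 while the committed block is blk_1073741920_13948, exactly the stale-report case this check would filter out.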
[jira] [Commented] (HDFS-5723) Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction
[ https://issues.apache.org/jira/browse/HDFS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026090#comment-14026090 ] stanley shi commented on HDFS-5723: --- Hi Vinay, it seems my steps produce a different error but with the same error log. Do you want to fix it in this ticket, or would you prefer that I file another one? Append failed FINALIZED replica should not be accepted as valid when that block is underconstruction Key: HDFS-5723 URL: https://issues.apache.org/jira/browse/HDFS-5723 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-5723.patch, HDFS-5723.patch Scenario: 1. 3 node cluster with dfs.client.block.write.replace-datanode-on-failure.enable set to false. 2. One file is written with 3 replicas, blk_id_gs1 3. One of the datanodes, DN1, is down. 4. File was opened with append and some more data is added to the file and synced. (to only 2 live nodes DN2 and DN3)-- blk_id_gs2 5. Now DN1 is restarted 6. In its block report, DN1 reported FINALIZED block blk_id_gs1; this should be marked corrupt, but since the NN has the appended block's state as UnderConstruction, it does not detect this block as corrupt and adds it to the valid block locations. As long as the namenode is alive, this datanode will be considered a valid replica, and read/append will fail on that datanode. -- This message was sent by Atlassian JIRA (v6.2#6252)