[jira] [Commented] (HDFS-5637) try to refeatchToken while local read InvalidToken occurred

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841110#comment-13841110
 ] 

Hadoop QA commented on HDFS-5637:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617342/HDFS-5637.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5656//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5656//console

This message is automatically generated.

 try to refeatchToken while local read InvalidToken occurred
 ---

 Key: HDFS-5637
 URL: https://issues.apache.org/jira/browse/HDFS-5637
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, security
Affects Versions: 2.0.5-alpha, 2.2.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-5637.txt


 we observed several warning logs like the one below from region server nodes:
 2013-12-05,13:22:26,042 WARN org.apache.hadoop.hdfs.DFSClient: Failed to 
 connect to /10.2.201.110:11402 for block, add to deadNodes and continue. 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
 block_token_identifier (expiryDate=1386060141977, keyId=-333530248, 
 userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, 
 blockId=-190217754078101701, access modes=[READ]) is expired.
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
 block_token_identifier (expiryDate=1386060141977, keyId=-333530248, 
 userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, 
 blockId=-190217754078101701, access modes=[READ]) is expired.
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033)
 at 
 
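The improvement proposed here is that, when a local (short-circuit) read fails with InvalidToken because the block token has expired, the client should refetch the token and retry the local read instead of adding the local datanode to deadNodes. A minimal sketch of that retry shape, using hypothetical helper names rather than the real DFSClient internals:

{code}
import java.io.IOException;

// Illustrative only: the reader type and the two hooks below are stand-ins,
// not the actual DFSClient API.
public class LocalReadRetrySketch {

  interface LocalBlockReader {
    int read(byte[] buf) throws IOException;
  }

  /** Stand-in for SecretManager.InvalidToken thrown when the block token has expired. */
  static class InvalidTokenException extends IOException {
    InvalidTokenException(String msg) { super(msg); }
  }

  LocalBlockReader openLocalReader() throws IOException {
    throw new UnsupportedOperationException("hypothetical hook");
  }

  void refetchBlockToken() throws IOException {
    throw new UnsupportedOperationException("hypothetical hook");
  }

  int readWithTokenRetry(byte[] buf) throws IOException {
    try {
      return openLocalReader().read(buf);
    } catch (InvalidTokenException e) {
      // Instead of marking the local datanode dead, refresh the expired
      // block token once and retry the local read.
      refetchBlockToken();
      return openLocalReader().read(buf);
    }
  }
}
{code}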

[jira] [Commented] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841123#comment-13841123
 ] 

Hadoop QA commented on HDFS-5312:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617346/HDFS-5312.008.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5657//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5657//console

This message is automatically generated.

 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow NN / DN / JN to serve HTTPS only, using the HTTPS_ONLY 
 policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the need to figure out the scheme separately. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.
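
To make the intent concrete, here is a minimal sketch of the second point (choosing the scheme from the policy and returning a URI rather than a bare authority). The enum and method names are illustrative, not the actual DFSUtil code:

{code}
import java.net.URI;
import java.net.URISyntaxException;

public class InfoServerUriSketch {

  enum HttpPolicy { HTTP_ONLY, HTTPS_ONLY, HTTP_AND_HTTPS }

  /**
   * Return a full URI for the remote info server, picking the scheme from the
   * configured policy, so callers no longer have to guess http vs https.
   */
  static URI getInfoServerUri(String authority, HttpPolicy policy)
      throws URISyntaxException {
    String scheme = (policy == HttpPolicy.HTTPS_ONLY) ? "https" : "http";
    return new URI(scheme + "://" + authority);
  }

  public static void main(String[] args) throws URISyntaxException {
    // Prints https://nn.example.com:50470
    System.out.println(getInfoServerUri("nn.example.com:50470", HttpPolicy.HTTPS_ONLY));
  }
}
{code}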



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5633) Improve OfflineImageViewer to use less memory

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841180#comment-13841180
 ] 

Hudson commented on HDFS-5633:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #413 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/413/])
HDFS-5633. Improve OfflineImageViewer to use less memory. Contributed by Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548359)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FileDistributionVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java


 Improve OfflineImageViewer to use less memory
 -

 Key: HDFS-5633
 URL: https://issues.apache.org/jira/browse/HDFS-5633
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5633.000.patch


 Currently after we rename a file/dir which is included in a snapshot, the 
 file/dir can be linked with two different reference INodes. To avoid 
 saving/loading the inode multiple times in/from FSImage, we use a temporary 
 map to record whether we have visited this inode before.
 However, in OfflineImageViewer (specifically, in ImageLoaderCurrent), the 
 current implementation simply records all the directory inodes. This can take 
 a lot of memory when the fsimage is big. We should only record an inode in 
 the temp map when it is referenced by an INodeReference, just like what we do 
 in FSImageFormat.
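
A minimal sketch of the proposed change in spirit: only inodes that an INodeReference can point at again need to go into the temporary map, instead of every directory inode. The types below are simplified stand-ins, not the ImageLoaderCurrent code:

{code}
import java.util.HashMap;
import java.util.Map;

public class ReferenceMapSketch {

  static class Inode {
    final long id;
    final boolean referencedByINodeReference; // known while loading the image
    Inode(long id, boolean referenced) {
      this.id = id;
      this.referencedByINodeReference = referenced;
    }
  }

  private final Map<Long, Inode> visited = new HashMap<Long, Inode>();

  /** Record an inode only when a reference node may revisit it later. */
  void maybeRecord(Inode inode) {
    if (inode.referencedByINodeReference) {
      visited.put(inode.id, inode);
    }
  }

  boolean alreadyVisited(long id) {
    return visited.containsKey(id);
  }
}
{code}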



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5630) Hook up cache directive and pool usage statistics

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841176#comment-13841176
 ] 

Hudson commented on HDFS-5630:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #413 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/413/])
HDFS-5630. Hook up cache directive and pool usage statistics. (wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548309)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirective.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirectiveStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CachePool.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/CacheAdmin.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientNamenodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCacheDirectives.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testCacheAdminConf.xml


 Hook up cache directive and pool usage statistics
 -

 Key: HDFS-5630
 URL: https://issues.apache.org/jira/browse/HDFS-5630
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 3.0.0

 Attachments: hdfs-5630-1.patch, hdfs-5630-2.patch


 Right now we have stubs for bytes/files statistics for cache pools, but we 
 need to hook them up so they're actually being tracked.
 This is a pre-requisite for enforcing per-pool quotas.
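
As a rough illustration of what "hooking up" the stubs means: the per-pool counters have to be accumulated while directives are scanned and then exposed through the stats objects. A minimal sketch with simplified names (the real fields live in CachePool/CacheDirective and are reported via CachePoolStats):

{code}
import java.util.concurrent.atomic.AtomicLong;

public class CachePoolUsageSketch {
  private final AtomicLong bytesNeeded = new AtomicLong();
  private final AtomicLong bytesCached = new AtomicLong();
  private final AtomicLong filesCached = new AtomicLong();

  // Called by the (hypothetical) scan that matches directives against cached blocks.
  void addBytesNeeded(long delta) { bytesNeeded.addAndGet(delta); }
  void addBytesCached(long delta) { bytesCached.addAndGet(delta); }
  void addFilesCached(long delta) { filesCached.addAndGet(delta); }

  // Values a per-pool stats snapshot would report, and that a quota check could read.
  long getBytesNeeded() { return bytesNeeded.get(); }
  long getBytesCached() { return bytesCached.get(); }
  long getFilesCached() { return filesCached.get(); }
}
{code}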



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841183#comment-13841183
 ] 

Hudson commented on HDFS-5514:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #413 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/413/])
Neglected to add new file in HDFS-5514 (daryn) (daryn: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548167)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystemLock.java
HDFS-5514. FSNamesystem's fsLock should allow custom implementation (daryn) 
(daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548161)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java


 FSNamesystem's fsLock should allow custom implementation
 

 Key: HDFS-5514
 URL: https://issues.apache.org/jira/browse/HDFS-5514
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5514.patch, HDFS-5514.patch


 Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API-compatible 
 class that encapsulates the rwLock will allow for more sophisticated locking 
 implementations, such as fine-grained locking.
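
A minimal sketch of the wrapper idea, assuming the committed class keeps the ReentrantReadWriteLock call surface so FSNamesystem code does not change while subclasses can swap in finer-grained behavior (names simplified from the actual FSNamesystemLock):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class NamesystemLockSketch {
  private final ReentrantReadWriteLock coarseLock = new ReentrantReadWriteLock(true);

  // Same call surface FSNamesystem already uses, so it stays API compatible.
  public void readLock()    { coarseLock.readLock().lock(); }
  public void readUnlock()  { coarseLock.readLock().unlock(); }
  public void writeLock()   { coarseLock.writeLock().lock(); }
  public void writeUnlock() { coarseLock.writeLock().unlock(); }

  public boolean hasWriteLock() {
    return coarseLock.isWriteLockedByCurrentThread();
  }
}
{code}

A fine-grained implementation could override these methods to partition the namespace and lock only the affected portion.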



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841177#comment-13841177
 ] 

Hudson commented on HDFS-5590:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #413 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/413/])
HDFS-5590. Block ID and generation stamp may be reused when persistBlocks is 
set to false. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548368)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java


 Block ID and generation stamp may be reused when persistBlocks is set to false
 --

 Key: HDFS-5590
 URL: https://issues.apache.org/jira/browse/HDFS-5590
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.3.0

 Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch


 In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
 have data loss in the following case:
 # a client creates file1, requests a block from the NN, and gets blk_id1_gs1
 # the client writes blk_id1_gs1 to the DN
 # the NN is restarted; because persistBlocks is false, blk_id1_gs1 may not be 
 persisted to disk
 # another client creates file2 and the NN allocates a new block with the 
 same block id blk_id1_gs1, since block IDs and generation stamps are both 
 increased sequentially.
 Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
 same gs) in the system. This will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841250#comment-13841250
 ] 

Hudson commented on HDFS-5514:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1630 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1630/])
Neglected to add new file in HDFS-5514 (daryn) (daryn: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548167)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystemLock.java
HDFS-5514. FSNamesystem's fsLock should allow custom implementation (daryn) 
(daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548161)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java


 FSNamesystem's fsLock should allow custom implementation
 

 Key: HDFS-5514
 URL: https://issues.apache.org/jira/browse/HDFS-5514
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5514.patch, HDFS-5514.patch


 Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API-compatible 
 class that encapsulates the rwLock will allow for more sophisticated locking 
 implementations, such as fine-grained locking.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841244#comment-13841244
 ] 

Hudson commented on HDFS-5590:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1630 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1630/])
HDFS-5590. Block ID and generation stamp may be reused when persistBlocks is 
set to false. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548368)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java


 Block ID and generation stamp may be reused when persistBlocks is set to false
 --

 Key: HDFS-5590
 URL: https://issues.apache.org/jira/browse/HDFS-5590
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.3.0

 Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch


 In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
 have data loss in the following case:
 # a client creates file1, requests a block from the NN, and gets blk_id1_gs1
 # the client writes blk_id1_gs1 to the DN
 # the NN is restarted; because persistBlocks is false, blk_id1_gs1 may not be 
 persisted to disk
 # another client creates file2 and the NN allocates a new block with the 
 same block id blk_id1_gs1, since block IDs and generation stamps are both 
 increased sequentially.
 Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
 same gs) in the system. This will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5630) Hook up cache directive and pool usage statistics

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841243#comment-13841243
 ] 

Hudson commented on HDFS-5630:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1630 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1630/])
HDFS-5630. Hook up cache directive and pool usage statistics. (wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548309)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirective.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirectiveStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CachePool.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/CacheAdmin.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientNamenodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCacheDirectives.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testCacheAdminConf.xml


 Hook up cache directive and pool usage statistics
 -

 Key: HDFS-5630
 URL: https://issues.apache.org/jira/browse/HDFS-5630
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 3.0.0

 Attachments: hdfs-5630-1.patch, hdfs-5630-2.patch


 Right now we have stubs for bytes/files statistics for cache pools, but we 
 need to hook them up so they're actually being tracked.
 This is a pre-requisite for enforcing per-pool quotas.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5633) Improve OfflineImageViewer to use less memory

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841247#comment-13841247
 ] 

Hudson commented on HDFS-5633:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1630 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1630/])
HDFS-5633. Improve OfflineImageViewer to use less memory. Contributed by Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548359)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FileDistributionVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java


 Improve OfflineImageViewer to use less memory
 -

 Key: HDFS-5633
 URL: https://issues.apache.org/jira/browse/HDFS-5633
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5633.000.patch


 Currently after we rename a file/dir which is included in a snapshot, the 
 file/dir can be linked with two different reference INodes. To avoid 
 saving/loading the inode multiple times in/from FSImage, we use a temporary 
 map to record whether we have visited this inode before.
 However, in OfflineImageViewer (specifically, in ImageLoaderCurrent), the 
 current implementation simply records all the directory inodes. This can take 
 a lot of memory when the fsimage is big. We should only record an inode in 
 the temp map when it is referenced by an INodeReference, just like what we do 
 in FSImageFormat.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841261#comment-13841261
 ] 

Hudson commented on HDFS-5590:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1604 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1604/])
HDFS-5590. Block ID and generation stamp may be reused when persistBlocks is 
set to false. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548368)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java


 Block ID and generation stamp may be reused when persistBlocks is set to false
 --

 Key: HDFS-5590
 URL: https://issues.apache.org/jira/browse/HDFS-5590
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.3.0

 Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch


 In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
 have data loss in the following case:
 # a client creates file1, requests a block from the NN, and gets blk_id1_gs1
 # the client writes blk_id1_gs1 to the DN
 # the NN is restarted; because persistBlocks is false, blk_id1_gs1 may not be 
 persisted to disk
 # another client creates file2 and the NN allocates a new block with the 
 same block id blk_id1_gs1, since block IDs and generation stamps are both 
 increased sequentially.
 Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
 same gs) in the system. This will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5630) Hook up cache directive and pool usage statistics

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841260#comment-13841260
 ] 

Hudson commented on HDFS-5630:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1604 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1604/])
HDFS-5630. Hook up cache directive and pool usage statistics. (wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548309)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirective.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CacheDirectiveStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolStats.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CachePool.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/CacheAdmin.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientNamenodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCacheDirectives.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testCacheAdminConf.xml


 Hook up cache directive and pool usage statistics
 -

 Key: HDFS-5630
 URL: https://issues.apache.org/jira/browse/HDFS-5630
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 3.0.0

 Attachments: hdfs-5630-1.patch, hdfs-5630-2.patch


 Right now we have stubs for bytes/files statistics for cache pools, but we 
 need to hook them up so they're actually being tracked.
 This is a pre-requisite for enforcing per-pool quotas.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841267#comment-13841267
 ] 

Hudson commented on HDFS-5514:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1604 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1604/])
Neglected to add new file in HDFS-5514 (daryn) (daryn: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548167)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystemLock.java
HDFS-5514. FSNamesystem's fsLock should allow custom implementation (daryn) 
(daryn: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548161)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java


 FSNamesystem's fsLock should allow custom implementation
 

 Key: HDFS-5514
 URL: https://issues.apache.org/jira/browse/HDFS-5514
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5514.patch, HDFS-5514.patch


 Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API-compatible 
 class that encapsulates the rwLock will allow for more sophisticated locking 
 implementations, such as fine-grained locking.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5633) Improve OfflineImageViewer to use less memory

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841264#comment-13841264
 ] 

Hudson commented on HDFS-5633:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1604 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1604/])
HDFS-5633. Improve OfflineImageViewer to use less memory. Contributed by Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548359)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FileDistributionVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java


 Improve OfflineImageViewer to use less memory
 -

 Key: HDFS-5633
 URL: https://issues.apache.org/jira/browse/HDFS-5633
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5633.000.patch


 Currently after we rename a file/dir which is included in a snapshot, the 
 file/dir can be linked with two different reference INodes. To avoid 
 saving/loading the inode multiple times in/from FSImage, we use a temporary 
 map to record whether we have visited this inode before.
 However, in OfflineImageViewer (specifically, in ImageLoaderCurrent), the 
 current implementation simply records all the directory inodes. This can take 
 a lot of memory when the fsimage is big. We should only record an inode in 
 the temp map when it is referenced by an INodeReference, just like what we do 
 in FSImageFormat.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0

2013-12-06 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841326#comment-13841326
 ] 

Konstantin Shvachko commented on HDFS-4114:
---

[~sureshms], I missed your comment of Nov 8 while travelling; sorry, still 
recovering.
As it stands today, BackupNode is the only extension of the NameNode in the 
current code base.
It still provides important bindings to downstream projects that I believe we 
both care about, by its mere existence and by its test coverage.
You are right that it's been a while and that I still owe proper ones, which is 
on my todo list.
I understand the burden of supporting it, and in the meantime I want to 
reiterate my readiness to promptly address any related issues.
Let me know if I missed any or can help with any.

I glanced through your patch.
I saw some things that probably count as collateral damage, like the 
documentation about Import Checkpoint, which is not related to the BackupNode.
But thanks, it nicely scopes out for me the essence of the bindings required.
If you wish, we can assign this issue to me so that I can take care of it in 
the future.

 Deprecate the BackupNode and CheckpointNode in 2.0
 --

 Key: HDFS-4114
 URL: https://issues.apache.org/jira/browse/HDFS-4114
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Eli Collins
Assignee: Suresh Srinivas
 Attachments: HDFS-4114.patch


 Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the 
 BackupNode and CheckpointNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-4983:


Attachment: HDFS-4983.005.patch

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}
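
For reference, a quick check that shows why a purely numeric name fails the current pattern (the first character must be a letter or underscore), and how a relaxed pattern that also allows a leading digit would accept it. The relaxed pattern here is illustrative only, not necessarily the one adopted by the patch:

{code}
import java.util.regex.Pattern;

public class UserParamPatternDemo {
  public static void main(String[] args) {
    Pattern current = Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");
    Pattern relaxed = Pattern.compile("^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");

    System.out.println(current.matcher("123").matches()); // false -> "does not belong to the domain"
    System.out.println(relaxed.matcher("123").matches()); // true
  }
}
{code}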



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841360#comment-13841360
 ] 

Yongjun Zhang commented on HDFS-4983:
-

Thanks a lot, Andrew. I just uploaded a new version with the slight change you 
pointed out.


 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841447#comment-13841447
 ] 

Jonathan Eagles commented on HDFS-5023:
---

Mit, thanks for the patch. It looks like you have left some debugging code in the 
submitted patch:

//testSnapshotPathINodesAfterModification();


 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5594) FileSystem API for ACLs.

2013-12-06 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5594:


Attachment: HDFS-5594.2.patch

Here is a new version of the patch that updates {{TestHarFileSystem}} to fix 
the prior test failure.

 FileSystem API for ACLs.
 

 Key: HDFS-5594
 URL: https://issues.apache.org/jira/browse/HDFS-5594
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5594.1.patch, HDFS-5594.2.patch


 Add new methods to {{FileSystem}} for manipulating ACLs.
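
A rough sketch of the kind of surface under discussion, written as a standalone interface; the exact method set and signatures are defined by the attached patch, not by this snippet:

{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclStatus;

/** Illustrative only; see the attached patch for the real FileSystem changes. */
public interface AclOperationsSketch {
  void setAcl(Path path, List<AclEntry> aclSpec) throws IOException;
  void modifyAclEntries(Path path, List<AclEntry> aclSpec) throws IOException;
  void removeAclEntries(Path path, List<AclEntry> aclSpec) throws IOException;
  void removeAcl(Path path) throws IOException;
  AclStatus getAclStatus(Path path) throws IOException;
}
{code}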



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841472#comment-13841472
 ] 

Jing Zhao commented on HDFS-5312:
-

+1 for the 008 patch. I will commit it shortly.

 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow NN / DN / JN to serve HTTPS only, using the HTTPS_ONLY 
 policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the need to figure out the scheme separately. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5312:


   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk.

 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 3.0.0

 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow NN / DN / JN to serve HTTPS only, using the HTTPS_ONLY 
 policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the need to figure out the scheme separately. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5629:


Status: Patch Available  (was: Open)

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841482#comment-13841482
 ] 

Hadoop QA commented on HDFS-4983:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617409/HDFS-4983.005.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5659//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5659//console

This message is automatically generated.

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841499#comment-13841499
 ] 

Hudson commented on HDFS-5312:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4846 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4846/])
HDFS-5312. Generate HTTP/HTTPS URL in DFSUtil#getInfoServer() based on the 
configured http policy. Contributed by Haohui Mai. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548629)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ClusterJspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestTransferFsImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAConfiguration.java


 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 3.0.0

 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow NN / DN / JN to serve HTTPS only, using the HTTPS_ONLY 
 policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the need to figure out the scheme separately. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5594) FileSystem API for ACLs.

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841501#comment-13841501
 ] 

Hadoop QA commented on HDFS-5594:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617424/HDFS-5594.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5660//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5660//console

This message is automatically generated.

 FileSystem API for ACLs.
 

 Key: HDFS-5594
 URL: https://issues.apache.org/jira/browse/HDFS-5594
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5594.1.patch, HDFS-5594.2.patch


 Add new methods to {{FileSystem}} for manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841508#comment-13841508
 ] 

Jing Zhao commented on HDFS-4983:
-

+1 for the new patch. Thanks Yongjun!

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841511#comment-13841511
 ] 

Vinay commented on HDFS-5312:
-

I have verified this with HDFS-3405; uploading the fsimage with HTTPS_ONLY 
succeeds.
Without HDFS-3405 it fails because the SNN does not have HTTPS and the NN tries 
to access the http port with the https scheme. Anyway, this will no longer be 
an issue once HDFS-3405 goes in.

Thanks Haohui Mai and Jing.

 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 3.0.0

 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow NN / DN / JN to serve HTTPS only, using the HTTPS_ONLY 
 policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the need to figure out the scheme separately. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2013-12-06 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-3405:


Attachment: HDFS-3405.patch

Rebased patch after HDFS-5312

 Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged 
 fsimages
 

 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
Reporter: Aaron T. Myers
Assignee: Vinay
 Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch


 As Todd points out in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
  the current scheme for a checkpointing daemon to upload a merged fsimage 
 file to an NN is to issue an HTTP get request to tell the target NN to issue 
 another GET request back to the checkpointing daemon to retrieve the merged 
 fsimage file. There's no fundamental reason the checkpointing daemon can't 
 just use an HTTP POST or PUT to send back the merged fsimage file, rather 
 than the double-GET scheme.
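
A minimal sketch of the single-request alternative: the checkpointing daemon streams the merged fsimage to the NameNode with one PUT, instead of asking the NameNode to issue a second GET back. The servlet URL and parameters are illustrative, not the real TransferFsImage protocol:

{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

public class FsImagePutSketch {
  static void uploadImage(URL imageServlet, Path fsimage) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) imageServlet.openConnection();
    conn.setDoOutput(true);
    conn.setRequestMethod("PUT");
    conn.setFixedLengthStreamingMode(Files.size(fsimage));
    try (OutputStream out = conn.getOutputStream();
         InputStream in = Files.newInputStream(fsimage)) {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
      throw new IOException("fsimage upload failed: HTTP " + conn.getResponseCode());
    }
  }
}
{code}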



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5312) Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841523#comment-13841523
 ] 

Jing Zhao commented on HDFS-5312:
-

Thanks Vinay! Yes, currently the SNN does not have https, so the GET-GET scheme 
will not work for the HTTPS_ONLY policy. The HTTPS_ONLY policy also does not work 
in HA for the JN. These depend on HDFS-3405 and HDFS-5629.

 Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured 
 http policy
 

 Key: HDFS-5312
 URL: https://issues.apache.org/jira/browse/HDFS-5312
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 3.0.0

 Attachments: HDFS-5312.000.patch, HDFS-5312.001.patch, 
 HDFS-5312.002.patch, HDFS-5312.003.patch, HDFS-5312.004.patch, 
 HDFS-5312.005.patch, HDFS-5312.006.patch, HDFS-5312.007.patch, 
 HDFS-5312.008.patch


 DFSUtil#getInfoServer() returns only the authority (i.e., host+port) when 
 searching for the http / https server. This is insufficient because HDFS-5536 
 and related jiras allow the NN / DN / JN to open an HTTPS-only server using the 
 HTTPS_ONLY policy.
 This JIRA addresses two issues. First, DFSUtil#getInfoServer() should return 
 a URI instead of a string, so that the scheme is an inherent part of the 
 return value, which eliminates the task of figuring out the scheme by design. 
 Second, it introduces a new function to choose whether http or https should 
 be used to connect to the remote server based on dfs.http.policy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4201) NPE in BPServiceActor#sendHeartBeat

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HDFS-4201:
--

Status: Open  (was: Patch Available)

Looking into the test failures.

 NPE in BPServiceActor#sendHeartBeat
 ---

 Key: HDFS-4201
 URL: https://issues.apache.org/jira/browse/HDFS-4201
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Eli Collins
Assignee: Jimmy Xiang
Priority: Critical
 Fix For: 3.0.0

 Attachments: trunk-4201.patch


 Saw the following NPE in a log.
 Think this is likely due to {{dn}} or {{dn.getFSDataset()}} being null (not 
 {{bpRegistration}}), due to a configuration or local directory failure.
 {code}
 2012-09-25 04:33:20,782 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
 For namenode svsrs00127/11.164.162.226:8020 using DELETEREPORT_INTERVAL of 
 30 msec  BLOCKREPORT_INTERVAL of 2160msec Initial delay: 0msec; 
 heartBeatInterval=3000
 2012-09-25 04:33:20,782 ERROR 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService 
 for Block pool BP-1678908700-11.164.162.226-1342785481826 (storage id 
 DS-1031100678-11.164.162.251-5010-1341933415989) service to 
 svsrs00127/11.164.162.226:8020
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:434)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:520)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673)
 at java.lang.Thread.run(Thread.java:722)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)
Ming Ma created HDFS-5639:
-

 Summary: rpc scheduler abstraction
 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma


We have run into various issues in namenode and hbase w.r.t. rpc handling in 
multi-tenant clusters. The examples are:

https://issues.apache.org/jira/i#browse/HADOOP-9640
 https://issues.apache.org/jira/i#browse/HBASE-8836

There are different ideas on how to prioritize rpc requests. It could be based 
on user id, or on whether it is a read request or a write request, or it could use 
a specific rule like "datanode RPC is more important than client RPC".

We want to enable people to implement and experiment with different rpc schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841537#comment-13841537
 ] 

Haohui Mai commented on HDFS-4983:
--

Good patch. A couple of comments:

{code}
+  static {
+setUserPattern(DFS_WEBHDFS_USER_PATTERN_DEFAULT);
+  }
{code}

The execution of a static code block for a class is fairly unpredictable. Since 
you've initialized it explicitly (thus no NPE), you can delete it.

{code}
+  private static String userPattern = null;
+  private static Domain domain = null;

+  public static void setUserPattern(Configuration conf) {
+String pattern = conf.get(DFS_WEBHDFS_USER_PATTERN_KEY, 
DFS_WEBHDFS_USER_PATTERN_DEFAULT);
+setUserPattern(pattern);
+  }

+  @VisibleForTesting
+  public static String getUserPattern() {
+return userPattern;
+  }
+
+  @VisibleForTesting
+  public static void setUserPattern(String pattern) {
+userPattern = pattern;
+Pattern pt = Pattern.compile(userPattern);
+domain = new Domain(NAME, pt);
+  }
{code}

A better way to do it is to change the signature to setUserPattern(String). By 
doing this you can remove the userPattern field from the class, which reduces 
the overhead, as Jersey will construct an instance of this class for each 
request.

{code}
+property
+  namewebhdfs.user.provider.user.pattern/name
...
{code}

Can you rename the configuration to dfs.webhdfs.user.provider.user.pattern to 
make things consistent?

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}
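
For reference, a small standalone demonstration of why the current pattern rejects 
a numeric user and what a relaxed, configurable pattern would accept; the relaxed 
expression here is only an example, not necessarily the default the patch settles on:

{code}
import java.util.regex.Pattern;

// Demo only: "strict" is the current DOMAIN pattern quoted above; "relaxed"
// additionally allows a leading digit, so a user named "123" passes.
public class UserPatternDemo {
  public static void main(String[] args) {
    Pattern strict  = Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");
    Pattern relaxed = Pattern.compile("^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");
    System.out.println(strict.matcher("123").matches());   // false -> the -ls above fails
    System.out.println(relaxed.matcher("123").matches());  // true
  }
}
{code}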



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Attachment: HDMIT-5092.patch

The patch borrows lots of work from 
https://issues.apache.org/jira/i#browse/HBASE-8884 and 
https://issues.apache.org/jira/i#browse/HBASE-9461. It improves a couple of things 
specific to Hadoop.
 
1. The scheduler could be a global object that can be shared among different rpc 
servers. This is useful in the NN case, where there could be two RPC servers: 
one for client requests and one for service requests. Currently there is no way 
to prioritize requests between client RPC and DN RPC. The patch includes both 
the new rpc scheduler API in hadoop-common-project and NN's usage of this API. 
NN's default RPC scheduler takes care of the scenario where NN uses a client RPC 
server and a service RPC server. A new scheduler can be plugged in via the config 
dfs.namenode.rpc.scheduler.factory.class (a rough sketch of the interface shape 
follows this list).
 
2. This can also be useful in the case of the YARN RM, where several RPC servers are 
used; for example it can prioritize AM RPCs over some client RPCs. The default 
behavior for YARN is still one scheduler per RPC server unless it changes to 
use a global rpc scheduler.
 
3. There shouldn't be any change in terms of how RPC scheduling is done for any 
Hadoop service.
 
4. Fix the handling of queueSizePerHandler when a specific value is passed in. 
The fix is maxQueueSize = handlerCount * queueSizePerHandler.
 
5. Update RPCCallBenchmark to support the external rpc scheduler; include a 
test RpcScheduler implementation.
 
6. Move the CallQueueLength metric from RPCMetrics to DefaultRpcSchedulerMetrics.
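
To make the shape of the abstraction concrete, here is a minimal sketch of a 
pluggable scheduler and its configuration-driven lookup. The interface, class and 
method names are assumptions for illustration (the config key is used here to load 
the scheduler directly, a simplification of the factory indirection described 
above); this is not the attached patch:

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Illustrative sketch only: a scheduler decides how incoming calls are queued and
// in what order they are handed to handlers; implementations are swapped via config.
interface RpcSchedulerSketch {
  void enqueue(Object call, String user, boolean isServiceRpc) throws InterruptedException;
  Object dequeue() throws InterruptedException;
}

// Default behavior: plain FIFO, ignoring user and request type.
class FifoRpcSchedulerSketch implements RpcSchedulerSketch {
  private final BlockingQueue<Object> queue = new LinkedBlockingQueue<Object>();
  public void enqueue(Object call, String user, boolean isServiceRpc)
      throws InterruptedException {
    queue.put(call);
  }
  public Object dequeue() throws InterruptedException {
    return queue.take();
  }
}

class RpcSchedulerLoaderSketch {
  static RpcSchedulerSketch load(Configuration conf) {
    Class<? extends RpcSchedulerSketch> clazz = conf.getClass(
        "dfs.namenode.rpc.scheduler.factory.class",   // key mentioned in item 1 above
        FifoRpcSchedulerSketch.class, RpcSchedulerSketch.class);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}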

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDMIT-5092.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Status: Patch Available  (was: Open)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDMIT-5092.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Status: Open  (was: Patch Available)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Attachment: (was: HDMIT-5092.patch)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Attachment: HDFS-5639.patch

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Status: Patch Available  (was: Open)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize rpc requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule like "datanode RPC is more important than client RPC".
 We want to enable people to implement and experiment with different rpc 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Attachment: HDFS-5023.patch

Thanks Jon for the comments. Uploading the updated patch.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841561#comment-13841561
 ] 

Hadoop QA commented on HDFS-5023:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617446/HDFS-5023.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5664//console

This message is automatically generated.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841560#comment-13841560
 ] 

Haohui Mai commented on HDFS-4983:
--

Sorry, I didn't see that userPattern is a static field. To clarify, here is a sketch:

{code}
// Some javadocs
public static void setUserNameDomain(String pattern) {
  domain = Pattern.compile(pattern);
}
{code}

{code}
if (webhdfs is enabled) {
  UserParam.setUserPattern(conf.get(DFS_WEBHDFS_USER_PATTERN_KEY, 
DFS_WEBHDFS_USER_PATTERN_DEFAULT));
  ...
}
{code}

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Status: Open  (was: Patch Available)

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Status: Patch Available  (was: Open)

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841577#comment-13841577
 ] 

Jing Zhao commented on HDFS-5629:
-

The patch looks good to me. Some comments so far:
# The following code needs to be cleaned up in DFSUtil (the same resource is added twice):
{code}
+sslConf.addResource(conf.get(
+DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_KEY,
+DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_DEFAULT));
+
+sslConf.addResource(conf.get(
+DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_KEY,
+DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_DEFAULT));
{code}
# Nit: the indentation needs to be fixed here:
{code}
+HttpServer.Builder builder = new HttpServer.Builder()
+.setName("journal")
+.setConf(conf)
+.setACL(new AccessControlList(conf.get(DFS_ADMIN, " ")))
+.setSecurityEnabled(UserGroupInformation.isSecurityEnabled())
+.setUsernameConfKey(
+DFSConfigKeys.DFS_NAMENODE_INTERNAL_SPNEGO_USER_NAME_KEY)
+.setKeytabConfKey(
+DFSUtil.getSpnegoKeytabKey(conf,
+DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
{code}
# You may also want to fix the following in SecondaryNameNode:
{code}
URI httpEndpoint = URI.create("http://" + 
NetUtils.getHostPortString(infoSocAddr));

infoServer = new HttpServer.Builder().setName("secondary")
.addEndpoint(httpEndpoint)
.setFindPort(tmpInfoPort == 0).setConf(conf).setACL(
new AccessControlList(conf.get(DFS_ADMIN, " ")))
.setSecurityEnabled(UserGroupInformation.isSecurityEnabled())
.setUsernameConfKey(
DFSConfigKeys.DFS_SECONDARY_NAMENODE_INTERNAL_SPNEGO_USER_NAME_KEY)
.setKeytabConfKey(DFSUtil.getSpnegoKeytabKey(conf,
DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY)).build();
{code}
It is also OK to fix this in a separate jira.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841584#comment-13841584
 ] 

Jing Zhao commented on HDFS-5629:
-

4. NAMENODE should be JOURNALNODE in the following code:
{code}
+.setUsernameConfKey(
+DFSConfigKeys.DFS_NAMENODE_INTERNAL_SPNEGO_USER_NAME_KEY)
+.setKeytabConfKey(
+DFSUtil.getSpnegoKeytabKey(conf,
+DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
{code}

5. Please post your system test results.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Attachment: HDFS-5023.patch

Updated the patch that failed to apply.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Status: Open  (was: Patch Available)

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated HDFS-5023:


Status: Patch Available  (was: Open)

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841597#comment-13841597
 ] 

Yongjun Zhang commented on HDFS-4983:
-

Many thanks to you all for the reviews and comments.

Hi Haohui,

Thanks for your detailed review. Some clarification here.

About the static block, the intention is that it always gets executed when the 
class is loaded, so as to ensure the default is initialized correctly.

I made the userPattern string static, made it retrievable by the test code 
(please refer to the change in TestParam.java), and annotated it as 
@VisibleForTesting.
 
About the property name, I think the webhdfs. prefix was meant to be consistent with 
the counterpart in httpfs, and this is what the original bug requested.

I made the public interface setUserPattern(Configuration) instead of 
setUserPattern(String), so that if in the future we want to set something 
differently based on other configuration, we can just change it inside the 
UserParam class without changing the caller.

Does that make sense to you?

Thanks.

--Yongjun




 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0

2013-12-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841600#comment-13841600
 ] 

Suresh Srinivas commented on HDFS-4114:
---

bq. As it stands today BackupNode is the only extension of the NameNode in the 
current code base.
I do not think it is a sufficient reason to retain BackupNode. If you really 
want to show how Namenode can be extended, you could contribute another 
simpler, easier-to-maintain example that extends Namenode. In fact some of the 
constructs that are used only by BackupNode are, I reckon, not what extensions 
of Namenode would use. Some examples:
# It  uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. 
This is no longer necessary with the improvements in edits, where checkpointing 
can be done any time without the need to roll, start checkpoint, end checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. 
This code is not well documented. Adds unnecessary complexity.

As you can see from the early patch, we can remove approximately 5000 lines of 
code. This code belongs to functionality that no one tests or uses. In fact I 
will not be surprised if there are bugs lurking in that functionality that 
might cause major issues for a misguided user who ends up using it.

Given that, I believe BackupNode should be removed. As regards whether any code 
that helps extend the namenode is being removed, I would like to see a proposal 
on what extending a namenode means and which of the functionality relevant to 
that is being removed in my patch.

bq. You are right it's been a while and I have a debt to provide proper ones, 
which is on my todo list.
I fail to understand what the plans for BackupNode are and why it is relevant 
anymore. Describing that would help.

bq.  If you wish we can assign this issue to me so that I could take care of it 
in the future.
I wish just assigning a bug to you were that easy. When making 
changes in the code with a feature in mind, there is a lot of this unused code 
and tests that also need to change. This is currently a tax that feature 
developers are paying. The folks working on a feature have a time frame that 
they are working towards. Having to depend on you for related changes means 
having to co-ordinate the work with you and getting the work done within the 
timeline. This will not only be work for you, but also work for people working 
on features. It is hard for me to see why we should spend all that effort.

I can give you a few examples where folks had to do all this unnecessary work:
- When we did protobuf support we had to add support for all the protocols that 
are only used by BackupNode.
- In HA, considerable coding and testing effort went into supporting BackupNode.
- Recently, when I worked on the retry cache, I spent a lot of time just 
understanding how all this works and added support for retriability.
- I also know that [~jingzhao] and [~wheat9] spent time on BackupNode-specific 
functionality when working on http policy and https support related cleanup.

Unless there are justified reasons for retaining this functionality, regular 
contributors of HDFS will have to continue to pay this cost. We have waited almost 
a year for a plan for taking BackupNode forward. I also think that with Namenode HA 
stabilizing, even if there is a plan, I am not sure how relevant it would be.

A suggestion is to move this functionality to github and maintain it there as HDFS 
changes. This in essence is equivalent to involving you to maintain 
BackupNode-related functionality for features added to HDFS, without the cost 
of co-ordination.

 Deprecate the BackupNode and CheckpointNode in 2.0
 --

 Key: HDFS-4114
 URL: https://issues.apache.org/jira/browse/HDFS-4114
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Eli Collins
Assignee: Suresh Srinivas
 Attachments: HDFS-4114.patch


 Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the 
 BackupNode and CheckpointNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0

2013-12-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841600#comment-13841600
 ] 

Suresh Srinivas edited comment on HDFS-4114 at 12/6/13 7:54 PM:


bq. As it stands today BackupNode is the only extension of the NameNode in the 
current code base.
I do not think it is a sufficient reason to retain BackupNode. If you really 
want to show how Namenode can be extended, you could contribute another 
simpler, easier-to-maintain example that extends Namenode. In fact some of the 
constructs that are used only by BackupNode are, I reckon, not what extensions 
of Namenode would use. Some examples:
# It  uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. 
This is no longer necessary with the improvements in edits, where checkpointing 
can be done any time without the need to roll, start checkpoint, end checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. 
This code is not well documented. Adds unnecessary complexity.

As you can see from the early patch, we can remove approximately 5000 lines of 
code. This code belongs to functionality that no one tests or uses. In fact I 
will not be surprised if there are bugs lurking in that functionality that 
might cause major issues for a misguided user who ends up using it.

Given that, I believe BackupNode should be removed. As regards whether any code 
that helps extend the namenode is being removed, I would like to see a proposal 
on what extending a namenode means and which of the functionality relevant to 
that is being removed in my patch.

bq. You are right it's been a while and I have a debt to provide proper ones, 
which is on my todo list.
I fail to understand what the plans for BackupNode are and why it is relevant 
anymore. Describing that would help.

bq.  If you wish we can assign this issue to me so that I could take care of it 
in the future.
I wish just assigning a bug to you were that easy. When making 
changes in the code with a feature in mind, there is a lot of this unused code 
and tests that also need to change. This is currently a tax that feature 
developers are paying. The folks working on a feature have a time frame that 
they are working towards. Having to depend on you for related changes means 
having to co-ordinate the work with you and getting the work done within the 
timeline. This will not only be work for you, but also work for people working 
on features. It is hard for me to see why we should spend all that effort.

I can give you a few examples where folks had to do all this unnecessary work:
- When we did protobuf support we had to add support for all the protocols that 
are only used by BackupNode.
- In HA, considerable coding and testing effort went into supporting BackupNode.
- Recently, when I worked on the retry cache, I spent a lot of time just 
understanding how all this works and added support for retriability.
- I also know that [~jingzhao] and [~wheat9] spent time on BackupNode-specific 
functionality when working on http policy and https support related cleanup.

Unless there are justified reasons for retaining this functionality, regular 
contributors of HDFS will have to continue to pay this cost. We have waited almost 
a year for a plan for taking BackupNode forward. I also think that with Namenode HA 
stabilizing, even if there is a plan, I am not sure how relevant it would be.

A suggestion is to move this functionality to github and maintain it there as HDFS 
changes. This in essence is equivalent to involving you to maintain 
BackupNode-related functionality for features added to HDFS, without the cost 
of co-ordination.


was (Author: sureshms):
bq. As it stands today BackupNode is the only extension of the NameNode in the 
current code base.
I do not think it is a sufficient reason to retain BackupNode. If you really 
want to shot how Namenode can be extended, you could contribute another 
simpler, easier to maintain example that extends Namenode. In fact some of the 
constructs that are used only by BackupNode, I reckon, are not what extensions 
of Namenode would use. Some examples:
# It  uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. 
This is no longer necessary with the improvements in edits, where checkpointing 
can be done any time without the need to roll, start checkpoint, end checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. 
This code is not well documented. Adds unnecessary complexity.

As you see from the early patch, we can remove approximately 5000 lines of 
code. This code belongs to a functionality that no one tests or uses. In fact I 
will not be surprised that there are bugs lurking in that functionality that 
might cause major issues for a misguided user that ends up using it.

Given that I believe BackupNode should be removed. As regards 

[jira] [Commented] (HDFS-5431) support cachepool-based quota management in path-based caching

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841628#comment-13841628
 ] 

Colin Patrick McCabe commented on HDFS-5431:


{code}
if (in.readBoolean()) {
  info.setOwnerName(Text.readString(in));
}
if (in.readBoolean())  {
  info.setGroupName(Text.readString(in));
}
if (in.readBoolean()) {
  info.setMode(FsPermission.read(in));
}
if (in.readBoolean()) {
  info.setReservation(in.readLong());
}
if (in.readBoolean()) {
  info.setQuota(in.readLong());
}
if (in.readBoolean()) {
  info.setWeight(in.readInt());
}
{code}

I don't think the backwards-compatibility stuff here is really going to work.  
The problem is, if we add more booleans, the old code won't know they're there, 
and will ignore them.  Then we will interpret those bytes as something else, 
which could cause some really bad results.

I think the best way to do this is to start with a 32-bit word, which we can 
treat as a bitfield.  We can then load or not load field N according to whether 
bit N is set.  If there are bits set that we don't know how to interpret, we 
can bail out with a nice error message rather than trying to load garbage 
and possibly corrupting the fsimage.  We probably should use this approach for 
cache directives as well.
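
A minimal sketch of that feature-flag idea, assuming a DataOutput/DataInput style 
serializer like the code quoted above; the flag constants, field set and writeUTF 
encoding are made up for the example:

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Illustrative sketch only: a leading 32-bit word says which optional fields follow.
// A reader that sees bits it does not understand fails loudly instead of
// misinterpreting the remaining bytes.
class CachePoolFlagsSketch {
  static final int HAS_OWNER = 1 << 0;
  static final int HAS_GROUP = 1 << 1;
  static final int HAS_MODE  = 1 << 2;
  static final int KNOWN_FLAGS = HAS_OWNER | HAS_GROUP | HAS_MODE;

  static void write(DataOutput out, String owner, String group, Integer mode)
      throws IOException {
    int flags = (owner != null ? HAS_OWNER : 0)
              | (group != null ? HAS_GROUP : 0)
              | (mode  != null ? HAS_MODE  : 0);
    out.writeInt(flags);
    if (owner != null) { out.writeUTF(owner); }
    if (group != null) { out.writeUTF(group); }
    if (mode  != null) { out.writeInt(mode.intValue()); }
  }

  static void read(DataInput in) throws IOException {
    int flags = in.readInt();
    if ((flags & ~KNOWN_FLAGS) != 0) {
      throw new IOException("Unknown feature flags in serialized cache pool info: 0x"
          + Integer.toHexString(flags & ~KNOWN_FLAGS));
    }
    String owner = ((flags & HAS_OWNER) != 0) ? in.readUTF() : null;
    String group = ((flags & HAS_GROUP) != 0) ? in.readUTF() : null;
    Integer mode = ((flags & HAS_MODE)  != 0) ? Integer.valueOf(in.readInt()) : null;
    // ...populate the pool info object from the decoded fields...
  }
}
{code}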

{code}
int mode = Integer.parseInt(modeString, 8);
info.setMode(new FsPermission((short)mode));
{code}
hey, there's a {{Short.parseShort}} too :)

About terminology: isn't "maximum" a better name for what we're implementing 
here than "quota"?  If we implement something more sophisticated later, it 
could get confusing if we just use the term "quota" here.  I also think we 
should rip out "weight" completely if we're not going to support it any more.  I 
see a few places where "weight" is lingering now.  The feature flag stuff 
should allow us to add it forwards-compatibly (although not 
backwards-compatibly) in the future, if we want to.  I feel the same way about 
"reservation".

I'm not sure that we want a cache directive addition to fail when the maximum 
has been exceeded.  The problem is, there isn't any good way to implement this 
kind of simple check for more sophisticated quota methods like fair share or 
minimum share, etc.  Also, this is dependent on things like what we think the 
sizes are of files and directories in the cluster, which may change.  The 
result is very inconsistent behavior from the user's point of view.  For 
example, maybe he can add cache directives if a datanode has not come up, but 
can't add them once it comes up and we determine the full size of a certain 
file.  Or maybe he could add them by manually editing the edit log, but not 
from the command-line.  It just feels inconsistent.  I would rather we teach 
people to rely on looking at {{bytesNeeded}} versus {{bytesCached}} to 
determine if they had enough space.

I wonder if we should add another metric that somehow allows users to 
disambiguate between bytes not cached because of maximums / quotas / other 
executive decision and bytes not cached because the DN had an issue.  Right 
now all the user can do is subtract bytesNeeded from bytesCached and see that 
there is some gap, but he would have to check the logs to know why.

 support cachepool-based quota management in path-based caching
 --

 Key: HDFS-5431
 URL: https://issues.apache.org/jira/browse/HDFS-5431
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Andrew Wang
 Attachments: hdfs-5431-1.patch


 We should support cachepool-based quota management in path-based caching.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841655#comment-13841655
 ] 

Haohui Mai commented on HDFS-4983:
--

bq. About the static block, the intention is that it always get executed when 
the class is loaded, so to assure the default is initialized correctly.

The execution order of static blocks has been a long headache for us. The order 
is determined by the Java class loader and it is unpredictable. I would much 
rather make the initialization explicit. It makes things much easier to 
debug.

bq. I made the userPattern string static, and made it retrievable by the test 
code (please refer to the change in TestParam.java), and annotated as 
visibleForTesting.

What I'm saying is that you don't need the userPattern field at all -- you can 
just compile the pattern and store it in the domain field. The code seems 
redundant to me and can go away.
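
A minimal sketch of that suggested shape, as an assumption rather than the 
committed patch: only the compiled pattern is kept as a static field, and the 
setter takes a plain String so the class never touches Configuration.

{code}
import java.util.regex.Pattern;

// Sketch only: no separate userPattern String field; the pattern is compiled once
// per (re)configuration and stored directly.
public class UserParamSketch {
  private static volatile Pattern domain =
      Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");

  public static void setUserPattern(String pattern) {
    domain = Pattern.compile(pattern);
  }

  static boolean isValid(String user) {
    return domain.matcher(user).matches();
  }
}
{code}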

bq. About the property name, I think the webhdfs. prefix was meant to be consistent 
with the counterpart in httpfs, and this is what the original bug requested.

What httpfs does is irrelevant. We have dfs.webhdfs.enabled already, so please 
make it consistent.

bq. I made the public interface setUserPattern(Configuration) instead of 
setUserPattern(String), so if in the future we want to set something 
differently based on other configuration, we can just change inside UserParam 
class without changing the caller.

Configuration in UserParam seems like the wrong abstraction to me. For no apparent 
reason, UserParam now depends on Configuration. If later we want to build a 
thin webhdfs client and reuse the class, it can be problematic. If this is 
not a big deal for you, I'd appreciate it if you could fix it.

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5636) Enforce a max TTL per cache pool

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841653#comment-13841653
 ] 

Colin Patrick McCabe commented on HDFS-5636:


I'm not sure about this one.  Generally different cache pools are associated 
with different users, and users don't age out in most of the clusters I'm 
familiar with.  Let's see how people make use of the existing TTL stuff.  We 
can always add this later if need be.

 Enforce a max TTL per cache pool
 

 Key: HDFS-5636
 URL: https://issues.apache.org/jira/browse/HDFS-5636
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang

 It'd be nice for administrators to be able to specify a maximum TTL for 
 directives in a cache pool. This forces all directives to eventually age out.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841664#comment-13841664
 ] 

Hadoop QA commented on HDFS-5629:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617341/HDFS-5629.000.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5661//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5661//console

This message is automatically generated.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5182) BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid

2013-12-06 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841685#comment-13841685
 ] 

Todd Lipcon commented on HDFS-5182:
---

bq. It's edge-triggered in the sense that both the DN and the client send 
notifications when something changes

Hrm, so that means that the data xceiver protocol would become bidirectional 
instead of a simple request/response protocol? This seems like it could get 
kind of hairy.

 BlockReaderLocal must allow zero-copy  reads only when the DN believes it's 
 valid
 -

 Key: HDFS-5182
 URL: https://issues.apache.org/jira/browse/HDFS-5182
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 BlockReaderLocal must allow zero-copy reads only when the DN believes it's 
 valid.  This implies adding a new field to the response to 
 REQUEST_SHORT_CIRCUIT_FDS.  We also need some kind of heartbeat from the 
 client to the DN, so that the DN can inform the client when the mapped region 
 is no longer locked into memory.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841688#comment-13841688
 ] 

Hadoop QA commented on HDFS-3405:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617434/HDFS-3405.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-client hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDistributedFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5662//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5662//console

This message is automatically generated.

 Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged 
 fsimages
 

 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
Reporter: Aaron T. Myers
Assignee: Vinay
 Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch


 As Todd points out in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
  the current scheme for a checkpointing daemon to upload a merged fsimage 
 file to an NN is to issue an HTTP GET request to tell the target NN to issue 
 another GET request back to the checkpointing daemon to retrieve the merged 
 fsimage file. There's no fundamental reason the checkpointing daemon can't 
 just use an HTTP POST or PUT to send back the merged fsimage file, rather 
 than the double-GET scheme.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841717#comment-13841717
 ] 

Chris Nauroth commented on HDFS-5541:
-

Hi [~stevebovy],

I had a chance to look at this a bit more today.  Thanks again for sharing your 
work.  To help move it forward, I'd like to suggest that we close out this 
issue and replace it with the following set of issues focused on completing 
more specific tasks.  Splitting the work up helps make code review easier and 
ultimately helps get the code committed.

# libHDFS Windows compatibility - This would be the bare minimum patch required 
for Windows compatibility.  I think the scope would include things like the JVM 
mutex macros, uthash, build script changes and C89 compatibility stuff like 
declaring variables at the top of the function.
# libHDFS AIX compatibility - Colin has suggested a build script change to 
support this instead of changing the comment style.
# libHDFS performance improvements - The above issues would not include any of 
the performance improvement work, so if you want to keep pursuing that, then 
we'd do it in a separate patch here.  Depending on the scope, this also might 
split into multiple performance improvement patches.

For each of these, the process would be to post patch files applicable to 
trunk, and we can code review and test them.  From your notes, it sounds like 
you're also interested in getting this into branch-1 or branch-1-win.  If so, 
then you can provide patches for those branches too.  Do you think this plan 
makes sense?

I have a couple of comments related to the code I saw in the attachment:

* This version splits the headers and implementation files into separate inc 
and src directories.  If you want to propose a change in the source layout, 
then let's handle that in its own separate issue, without any actual code 
changes mixed in.
* I saw the include for {{uthash.h}}, but I also still saw calls to {{hcreate}} 
and {{hsearch}}.  I was expecting to see these call sites switch to using the 
uthash equivalents.
* We'll want to add .sln and .vcxproj files, similar to what we do for 
winutils.exe and hadoop.dll.  The supported compilers on Windows are the free 
Windows SDK or Visual Studio 2010 Professional.

bq. And How do I get the NativeCodeLoader to work ??

Assuming you have a build of libhadoop.so or hadoop.dll, you'd need to enable 
the libhdfs process to dynamically link to it.  One way to do this is to launch 
the JVM with -Djava.library.path=<path to libhadoop.so or hadoop.dll>.  You can 
set the environment variable {{LIBHDFS_OPTS}} to control the JVM arguments that 
libhdfs passes to its embedded JVM.  The other way to do it is to use the 
dynamic linking capabilities provided by the OS, i.e. {{LD_LIBRARY_PATH}} on 
Linux or {{PATH}} on Windows.

bq. Dag nab it  I cannot figure this one out  the append does not work

Sorry for the late reply, but this is due to append being disabled by default 
in the 1.x line.  I think you've figured this part out already.


 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms, and many different compilers.
 1) The code should be cross platform ( no Linux extras )
 2) The code should compile on standard c89 compilers, the
   {least common denominator rule applies here} !!   
 C code with a .c extension should follow the rules of the C standard  
 All variables must be declared at the beginning of scope, and no (//) 
 comments allowed 
  I just spent a week white-washing the code back to normal C standards so 
  that it could compile and build across a wide range of platforms  
 Now on to performance questions 
 1) If threads are not used why do a thread attach ( when threads are not used 
 all the thread attach nonsense is a waste of time and a performance killer ) 
 2) The JVM init code should not be embedded within the context of every 
 function call.  The JVM init code should be in a stand-alone LIBINIT 
 function that is only invoked once.   The JVM * and the JNI * should be 
 global variables for use when no threads are utilized.  
 3) When threads are utilized the attach function can use the GLOBAL jvm * 
 created by the LIBINIT  { WHICH IS INVOKED ONLY ONCE } and thus safely 
 outside the scope of any LOOP that is using the functions 
 4) Hash Table and Locking  Why ?
 When threads are used the hash table locking is going to hurt 

[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Stephen Bovy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841733#comment-13841733
 ] 

Stephen Bovy commented on HDFS-5541:


Thanks Chris,

All of this sounds good.

Can we discuss this further off-line  ?  

Since Teradata and HortonWorks are partners 
( I have some private issues and questions )


 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!
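
A minimal sketch of what the LIBINIT pattern in points 2 and 3 might look like, 
assuming pthreads, is below; the function names and the class path option are 
placeholders rather than anything taken from libhdfs itself:

{code}
#include <jni.h>
#include <pthread.h>
#include <stddef.h>

static JavaVM *g_vm = NULL;                       /* global JVM pointer, created once */
static pthread_once_t g_once = PTHREAD_ONCE_INIT;

static void libhdfs_init_once(void) {
    JavaVMInitArgs vm_args;
    JavaVMOption options[1];
    JNIEnv *env = NULL;

    options[0].optionString = (char *) "-Djava.class.path=/path/to/hadoop/jars"; /* placeholder */
    vm_args.version = JNI_VERSION_1_6;
    vm_args.nOptions = 1;
    vm_args.options = options;
    vm_args.ignoreUnrecognized = JNI_TRUE;
    /* Create the JVM exactly once; every later call reuses g_vm. */
    JNI_CreateJavaVM(&g_vm, (void **) &env, &vm_args);
}

/* Hypothetical LIBINIT entry point: callers invoke this once before any other call. */
void libhdfs_init(void) {
    pthread_once(&g_once, libhdfs_init_once);
}

/* Threaded callers attach to the global VM instead of re-initializing it; the
 * returned JNIEnv could also be cached in thread-local storage, which is the
 * same idea point 4 suggests for the hash table. */
JNIEnv *libhdfs_get_env(void) {
    JNIEnv *env = NULL;
    (*g_vm)->AttachCurrentThread(g_vm, (void **) &env, NULL);
    return env;
}
{code}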



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Stephen Bovy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841735#comment-13841735
 ] 

Stephen Bovy commented on HDFS-5541:


My Phone # is 310-616-2325

If anyone else is following this thread please do not flood me with phone calls 
:)

 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841737#comment-13841737
 ] 

Hadoop QA commented on HDFS-5639:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617443/HDFS-5639.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5663//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5663//console

This message is automatically generated.

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639.patch


 We have run into various issues in namenode and hbase w.r.t. rpc handling in 
 multi-tenant clusters. The examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
  https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize RPC requests. It could be 
 based on user id, or on whether it is a read request or a write request, or it 
 could use a specific rule such as datanode RPC being more important than 
 client RPC. We want to enable people to implement and experiment with 
 different RPC schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5636) Enforce a max TTL per cache pool

2013-12-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841754#comment-13841754
 ] 

Andrew Wang commented on HDFS-5636:
---

The issue is that if a pool is hanging on to forgotten cached data, that 
prevents another pool from using it. This is true even with quotas/limits. It's 
somewhat fixed if we also add minimum reservations for pools, but even then you 
might want max-TTLs so that forgotten cache is returned to be used by fair 
share.

Basically, right now TTLs are opt-in rather than opt-out, and admins might 
sometimes want them to be opt-out instead. There's an impedance mismatch here 
between admins and the users of a pool: just because a user has access to a 
cache pool doesn't necessarily mean they should be able to do whatever they 
want with it, because that might be bad for overall system performance. I see 
max-TTL as an admin-friendly feature, since it'll help them avoid manually 
cleaning up cache pools.

A couple of imagined use cases:

* A scratch / temp cache pool with a low max TTL (say, 1 hr) and 0777 
permissions, so all users can do some ad-hoc data exploration. The admin 
doesn't need to worry about constantly cleaning up forgotten directives.
* When caching time-series data, you might only care about caching the last day 
of data. Thus, the admin could set a max TTL of 24H to enforce this.

Sort of related, we might also want a command like {{hdfs cacheadmin 
-removeExpiredDirectives [-pool pool]}} to help people clean up their expired 
directives. Maybe even a trash-like functionality where directives that have 
been expired for long enough are automatically removed.

 Enforce a max TTL per cache pool
 

 Key: HDFS-5636
 URL: https://issues.apache.org/jira/browse/HDFS-5636
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang

 It'd be nice for administrators to be able to specify a maximum TTL for 
 directives in a cache pool. This forces all directives to eventually age out.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5629:
-

Attachment: HDFS-5629.001.patch

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5318) Account for shared storage in NameNode replica counting algorithm

2013-12-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841760#comment-13841760
 ] 

Suresh Srinivas commented on HDFS-5318:
---

Currently core Hadoop probably does not require this functionality. Should we 
consider a pluggable interface to enable this?

 Account for shared storage in NameNode replica counting algorithm
 -

 Key: HDFS-5318
 URL: https://issues.apache.org/jira/browse/HDFS-5318
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Eric Sirianni

 There are several use cases for using shared-storage for datanode block 
 storage in an HDFS environment (storing cold blocks on a NAS device, Amazon 
 S3, etc.).
 With shared-storage, there is a distinction between:
 # a distinct physical copy of a block
 # an access-path to that block via a datanode.  
 A single 'replication count' metric cannot accurately capture both aspects.  
 However, for most of the current uses of 'replication count' in the Namenode, 
 the number of physical copies aspect seems to be the appropriate semantic.
 I propose altering the replication counting algorithm in the Namenode to 
 accurately infer distinct physical copies in a shared storage environment.  
 With HDFS-5115, a {{StorageID}} is a UUID.  I propose associating some minor 
 additional semantics to the {{StorageID}} - namely that multiple datanodes 
 attaching to the same physical shared storage pool should report the same 
 {{StorageID}} for that pool.  A minor modification would be required in the 
 DataNode to enable the generation of {{StorageID}} s to be pluggable behind 
 the {{FsDatasetSpi}} interface.  
 With those semantics in place, the number of physical copies of a block in a 
 shared storage environment can be calculated as the number of _distinct_ 
 {{StorageID}} s associated with that block.
 Consider the following combinations for two {{(DataNode ID, Storage ID)}} 
 pairs {{(DN_A, S_A) (DN_B, S_B)}} for a given block B:
 * {{DN_A != DN_B && S_A != S_B}} - *different* access paths to *different* 
 physical replicas (i.e. the traditional HDFS case with local disks)
 ** -> Block B has {{ReplicationCount == 2}}
 * {{DN_A != DN_B && S_A == S_B}} - *different* access paths to the *same* 
 physical replica (e.g. HDFS datanodes mounting the same NAS share)
 ** -> Block B has {{ReplicationCount == 1}}
 For example, if block B has the following location tuples:
 * {{DN_1, STORAGE_A}}
 * {{DN_2, STORAGE_A}}
 * {{DN_3, STORAGE_B}}
 * {{DN_4, STORAGE_B}},
 the effect of this proposed change would be to calculate the replication 
 factor in the namenode as *2* instead of *4*.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841765#comment-13841765
 ] 

Hadoop QA commented on HDFS-5023:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617455/HDFS-5023.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5665//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5665//console

This message is automatically generated.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841783#comment-13841783
 ] 

Haohui Mai commented on HDFS-5629:
--

The v1 patch also implements HTTPS support for the secondary namenode. The SNN 
can do the checkpoint through HTTPS when the policy is set to HTTPS_ONLY.

I've also tested the JN in a secure HA setup. The NN is getting the journal 
and fsimage through HTTPS as expected.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5353) Short circuit reads fail when dfs.encrypt.data.transfer is enabled

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841799#comment-13841799
 ] 

Haohui Mai commented on HDFS-5353:
--

The patch looks good to me. The only comment I have is that it might be better 
to move some of the javadoc in DomainPeer#hasSecureChannel() to 
Peer#hasSecureChannel().

+1 once this is addressed.

 Short circuit reads fail when dfs.encrypt.data.transfer is enabled
 --

 Key: HDFS-5353
 URL: https://issues.apache.org/jira/browse/HDFS-5353
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Haohui Mai
Assignee: Colin Patrick McCabe
Priority: Blocker
 Attachments: HDFS-5353.001.patch


 DataXceiver tries to establish secure channels via SASL when 
 dfs.encrypt.data.transfer is turned on. However, domain socket traffic seems 
 to be unencrypted; therefore the client cannot communicate with the data node 
 via domain sockets, which makes short-circuit reads non-functional.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5353) Short circuit reads fail when dfs.encrypt.data.transfer is enabled

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841800#comment-13841800
 ] 

Haohui Mai commented on HDFS-5353:
--

btw, there's trailing whitespace in the patch. You might want to fix it when 
you update the patch.

 Short circuit reads fail when dfs.encrypt.data.transfer is enabled
 --

 Key: HDFS-5353
 URL: https://issues.apache.org/jira/browse/HDFS-5353
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Haohui Mai
Assignee: Colin Patrick McCabe
Priority: Blocker
 Attachments: HDFS-5353.001.patch


 DataXceiver tries to establish secure channels via SASL when 
 dfs.encrypt.data.transfer is turned on. However, domain socket traffic seems 
 to be unencrypted; therefore the client cannot communicate with the data node 
 via domain sockets, which makes short-circuit reads non-functional.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5353) Short circuit reads fail when dfs.encrypt.data.transfer is enabled

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841807#comment-13841807
 ] 

Jing Zhao commented on HDFS-5353:
-

+1. Thanks Colin and Haohui!

 Short circuit reads fail when dfs.encrypt.data.transfer is enabled
 --

 Key: HDFS-5353
 URL: https://issues.apache.org/jira/browse/HDFS-5353
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Haohui Mai
Assignee: Colin Patrick McCabe
Priority: Blocker
 Attachments: HDFS-5353.001.patch


 DataXceiver tries to establish secure channels via SASL when 
 dfs.encrypt.data.transfer is turned on. However, domain socket traffic seems 
 to be unencrypted; therefore the client cannot communicate with the data node 
 via domain sockets, which makes short-circuit reads non-functional.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841815#comment-13841815
 ] 

Jing Zhao commented on HDFS-5023:
-

Thanks for the fix, Mit! Could you please also comment on the cause of the 
failure? This can help us prevent similar issues in the future.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841826#comment-13841826
 ] 

Jing Zhao commented on HDFS-5629:
-

The current patch looks good to me. [~vinayrpet], do you also want to take a 
look at the patch and try it in your setup?

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841838#comment-13841838
 ] 

Mit Desai commented on HDFS-5023:
-

Sure Jing.
-The main reason for the test failure was the Java version. The tests were 
written with Java 6 in mind, where they would work fine. But with Java 7 
running the tests in a random order, the tests would start failing.
-The tests create files/snapshots during the entire test series. The problem 
was that these were not being cleaned up.
-When a test that created a snapshot ended, it was not actually deleting the 
snapshot, which was then misinterpreted by the next test to run, and the 
assert failed due to inconsistent values.
-I think making sure that the tests are independent of each other, and that 
every modification a test makes is limited to that test, would prevent such 
issues in the future.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5629:
-

Attachment: HDFS-5629.002.patch

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841850#comment-13841850
 ] 

Haohui Mai commented on HDFS-5629:
--

The v2 patch fixes a typo in NameNodeHttpServer.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841875#comment-13841875
 ] 

Andrew Wang commented on HDFS-2832:
---

Hi folks, sorry for just now looping back to this JIRA. I read the updated 
design doc, and had a few more questions:

* Storage is not always truly hierarchical, it depends on your provisioning. 
The current strategy of always falling back from SSD to HDD is more ambiguous 
when you have more than two storage types, especially with something like a 
tape or NAS tier. Maybe this should be configurable somehow.
* I'd like to see more discussion in the doc of migrating blocks that are 
currently open for short-circuit read. SCR is very common, and if this isn't 
handled, it makes the feature hard to use for an application like HBase, which 
tries to open all of its files once via SCR. FWIW, the HBase committers I've 
talked to are very interested in HSM and SSDs, so it might be helpful to get 
their thoughts on this topic and the feature more generally.
* Is this going to work with rolling upgrades?
* Do you foresee heartbeats and block reports always being combined in realistic 
scenarios? Or are there reasons to split them? Is there any additional overhead 
from splitting? Can we save any complexity by not supporting split reports? I 
see this on the test matrix.
* Have you looked at the additional memory overhead on the NN and DN from 
splitting up storages? With 10 disks on a DN, this could mean effectively 10x 
the number of DNs as before. I think this is still insignificant, but you all 
know better than me.
* I'd like to see more description of the client API, namely the file attribute 
APIs. I'll also note that LocatedBlock is not a public API; you can hack around 
by downcasting BlockLocation to HdfsBlockLocation to fish out the LocatedBlock, 
but ultimately we probably want to expose StorageType in BlockLocation itself. 
API examples would be great, from both the command line and the programmatic 
API.
* Have you put any thought about metrics and tooling to help users and admins 
debug their quota usage and issues with migrating files to certain storage 
types? Especially because of SCR.
* One of the mentioned potential uses is to do automatic migration between 
storage types based on usage patterns. In this kind of scenario, it's necessary 
to support more expressive forms of resource management, e.g. YARN's fair 
scheduler. Quotas by themselves aren't sufficient.
* I think this earlier question/answer didn't make it into the doc: what 
happens when this file is distcp'd or copied? Arpit's earlier answer of 
clearing this field makes sense (or maybe we need a {{cp -a}} command).

 Enable support for heterogeneous storages in HDFS
 -

 Key: HDFS-2832
 URL: https://issues.apache.org/jira/browse/HDFS-2832
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: 20130813-HeterogeneousStorage.pdf, 
 20131125-HeterogeneousStorage-TestPlan.pdf, 
 20131125-HeterogeneousStorage.pdf, 
 20131202-HeterogeneousStorage-TestPlan.pdf, 
 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
 editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
 h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
 h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
 h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
 h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
 h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
 h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
 h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
 h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
 h2832_20131203.patch


 HDFS currently supports a configuration where storages are a list of 
 directories. Typically each of these directories corresponds to a volume with 
 its own file system. All these directories are homogeneous and therefore 
 identified as a single storage at the namenode. I propose changing the 
 current model, where a Datanode *is a* storage, to one where a Datanode *is a 
 collection of* storages.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5637) try to refeatchToken while local read InvalidToken occurred

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841883#comment-13841883
 ] 

stack commented on HDFS-5637:
-

+1 on the patch.  Will commit in next few days unless objection.

 try to refeatchToken while local read InvalidToken occurred
 ---

 Key: HDFS-5637
 URL: https://issues.apache.org/jira/browse/HDFS-5637
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, security
Affects Versions: 2.0.5-alpha, 2.2.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-5637.txt


 we observed several warning logs like below from region server nodes:
 2013-12-05,13:22:26,042 WARN org.apache.hadoop.hdfs.DFSClient: Failed to 
 connect to /10.2.201.110:11402 for block, add to deadNodes and continue. 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
 block_token_identifier (expiryDate=1386060141977, keyId=-333530248, 
 userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, 
 blockId=-190217754078101701, access modes=[READ]) is expired.
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with 
 block_token_identifier (expiryDate=1386060141977, keyId=-333530248, 
 userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, 
 blockId=-190217754078101701, access modes=[READ]) is expired.
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280)
 at 
 org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
 at 
 

[jira] [Commented] (HDFS-5182) BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841882#comment-13841882
 ] 

Colin Patrick McCabe commented on HDFS-5182:


The data xceiver protocol would stay the same as it is now.  It's just that 
after the SCR request, we would hang on to the socket and use a different 
protocol, which I call the SCR notification protocol.  The data xceiver 
threads, protocols, etc. are not involved at that point, just the same as now.

 BlockReaderLocal must allow zero-copy  reads only when the DN believes it's 
 valid
 -

 Key: HDFS-5182
 URL: https://issues.apache.org/jira/browse/HDFS-5182
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 BlockReaderLocal must allow zero-copy reads only when the DN believes it's 
 valid.  This implies adding a new field to the response to 
 REQUEST_SHORT_CIRCUIT_FDS.  We also need some kind of heartbeat from the 
 client to the DN, so that the DN can inform the client when the mapped region 
 is no longer locked into memory.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841893#comment-13841893
 ] 

Colin Patrick McCabe commented on HDFS-5541:


Great proposals, Chris.  I agree with all of them.  I guess the next step is to 
file the follow-up JIRAs.

bq. Can we discuss this further off-line ?  If anyone else is following this 
thread please do not flood me with phone calls

As this is an Apache project, I think these discussions belong in the open.  If 
you want to call a webex, that would be fine, but you should allow the whole 
community to participate, not just a subset.

Personally, I don't really see a need for a webex since there seems to be 
agreement here (I think?), but feel free to call one if you want.

 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841905#comment-13841905
 ] 

Colin Patrick McCabe commented on HDFS-5541:


By the way, thanks for looking at these issues.  I realize libhdfs can be a 
little intimidating when you first look at it, but it's actually simpler than 
it looks.  If you have any questions, even if they seem trivial, you should 
post them to hdfs-dev, and we'll try to answer as best we can.  Probably a lot of 
other people have the same question.

Another thing I would kind of like to see is the Windows build using CMake.  
That would avoid the need to manually generate vcproj files and maintain two 
sets of build information files.  Windows support for libhdfs might be a good 
trial run for this.

 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841920#comment-13841920
 ] 

Hadoop QA commented on HDFS-5629:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617481/HDFS-5629.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServer
  org.apache.hadoop.hdfs.server.namenode.TestCorruptFilesJsp
  org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
  org.apache.hadoop.hdfs.server.namenode.TestHostsFiles
  org.apache.hadoop.hdfs.server.namenode.ha.TestHAWebUI
  org.apache.hadoop.hdfs.TestMissingBlocksAlert

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5666//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5666//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5666//console

This message is automatically generated.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch


 Currently JournalNode has HTTP support only. This jira tracks the effort to 
 add HTTPS support to JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2

2013-12-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841927#comment-13841927
 ] 

Jing Zhao commented on HDFS-5023:
-

Thanks Mit! Your explanation is very helpful. 

So since the current patch deletes the snapshot for every test case, how about 
always using the same snapshot name, and moving the delete-snapshot and 
disallow-snapshot code to an @After method? This can make the code cleaner.

 TestSnapshotPathINodes.testAllowSnapshot is failing in branch-2
 ---

 Key: HDFS-5023
 URL: https://issues.apache.org/jira/browse/HDFS-5023
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots, test
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Mit Desai
  Labels: test
 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, 
 TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, 
 org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt


 The assertion on line 91 is failing. I am using Fedora 19 + JDK7. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841932#comment-13841932
 ] 

Chris Nauroth commented on HDFS-5541:
-

bq. Great proposals, Chris. I agree with all of them. I guess the next step is 
to file the follow-up JIRAs.

Thanks, Colin.  I plan to file these later today and close out HDFS-5541.

bq. As this is an Apache project, I think these discussions belong in the open.

Stephen just wanted to know when I, personally, would have availability to put 
significant effort into this.  (Answer: probably not until January.)  I promise 
that nothing relevant has been withheld from the discussion here on Apache.  
Stephen, feel free to email me directly for similar questions in the future.

bq. Another thing I would kind of like to see is the Windows build using CMake.

Part of the rationale for checking in the vcxproj files was to provide a good 
dev experience for a typical Windows developer using Visual Studio.  When I had 
looked at using CMake, it seemed we would have to compromise on that.  If we 
revisit this, then I'd like to get feedback from a Visual Studio user.  (I 
don't use it myself.)  We definitely have a maintenance cost right now due to 
duplicated build logic, particularly around optional things like Snappy support.

 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841935#comment-13841935
 ] 

Suresh Srinivas commented on HDFS-2832:
---

bq. Hi folks, sorry for just now looping back to this JIRA.
Is this not too late to loop back now, after the design was published and work 
started many months ago? Doing this after the merge vote is called (with 3 days 
to wrap up the voting) seems like a strange choice of timing to me. As for the 
client APIs, we can certainly discuss them after the phase 1 merge, when the 
work starts on phase 2 in the relevant jiras.

Hopefully [~arpitagarwal] can provide answers to the technical questions.

 Enable support for heterogeneous storages in HDFS
 -

 Key: HDFS-2832
 URL: https://issues.apache.org/jira/browse/HDFS-2832
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: 20130813-HeterogeneousStorage.pdf, 
 20131125-HeterogeneousStorage-TestPlan.pdf, 
 20131125-HeterogeneousStorage.pdf, 
 20131202-HeterogeneousStorage-TestPlan.pdf, 
 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
 editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
 h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
 h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
 h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
 h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
 h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
 h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
 h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
 h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
 h2832_20131203.patch


 HDFS currently supports a configuration where storages are a list of 
 directories. Typically each of these directories corresponds to a volume with 
 its own file system. All these directories are homogeneous and therefore 
 identified as a single storage at the namenode. I propose changing the 
 current model, where a Datanode *is a* storage, to one where a Datanode *is a 
 collection of* storages.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-4983:


Attachment: HDFS-4983.006.patch

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch, 
 HDFS-4983.006.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841947#comment-13841947
 ] 

Yongjun Zhang commented on HDFS-4983:
-

Hi Haohui,

Many thanks for the comments. I just uploaded a new version with all of them 
addressed.

Best regards,

--Yongjun


 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch, 
 HDFS-4983.006.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 seems to fail for some reason 
 (tried on insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5353) Short circuit reads fail when dfs.encrypt.data.transfer is enabled

2013-12-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5353:
---

Attachment: HDFS-5353.002.patch

 Short circuit reads fail when dfs.encrypt.data.transfer is enabled
 --

 Key: HDFS-5353
 URL: https://issues.apache.org/jira/browse/HDFS-5353
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Haohui Mai
Assignee: Colin Patrick McCabe
Priority: Blocker
 Attachments: HDFS-5353.001.patch, HDFS-5353.002.patch


 DataXceiver tries to establish secure channels via SASL when 
 dfs.encrypt.data.transfer is turned on. However, domain socket traffic seems 
 to be unencrypted; therefore the client cannot communicate with the data node 
 via domain sockets, which makes short-circuit reads non-functional.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5541) LIBHDFS questions and performance suggestions

2013-12-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841956#comment-13841956
 ] 

Colin Patrick McCabe commented on HDFS-5541:


I have not looked into it, but I had heard that CMake's integration with Visual 
Studio was actually very good.  Certainly, it generates .vcproj files, which 
allow you to use Visual Studio natively.  Hopefully, a Visual Studio user can 
check it out at some point.

I also wrote a maven cmake plugin that I would like to get working at some 
point.  It's been on the back burner for a while, but it would be nicer than 
the ant-based way of executing C builds we have now.

But this is all somewhat of a tangent.  Long story short, thanks for looking at 
this, guys, looking forward to seeing the follow-up.

 LIBHDFS questions and performance suggestions
 -

 Key: HDFS-5541
 URL: https://issues.apache.org/jira/browse/HDFS-5541
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen Bovy
Priority: Minor
 Attachments: pdclibhdfs.zip


 Since libhdfs is a client interface, and especially because it is a C 
 interface, it should be assumed that the code will be used across many 
 different platforms and many different compilers.
 1) The code should be cross-platform (no Linux extras).
 2) The code should compile on standard C89 compilers; the least common 
 denominator rule applies here. C code with a .c extension should follow the 
 rules of the C standard: all variables must be declared at the beginning of 
 scope, and no (//) comments allowed.
 I just spent a week white-washing the code back to normal C standards so that 
 it could compile and build across a wide range of platforms.
 Now on to performance questions:
 1) If threads are not used, why do a thread attach? (When threads are not 
 used, all the thread attach nonsense is a waste of time and a performance 
 killer.)
 2) The JVM init code should not be embedded within the context of every 
 function call. The JVM init code should be in a stand-alone LIBINIT function 
 that is only invoked once. The JVM * and the JNI * should be global variables 
 for use when no threads are utilized.
 3) When threads are utilized, the attach function can use the GLOBAL jvm * 
 created by the LIBINIT (which is invoked only once) and thus safely outside 
 the scope of any loop that is using the functions.
 4) Hash table and locking: why?
 When threads are used, the hash table locking is going to hurt performance. 
 Why not use thread-local storage for the hash table? That way no locking is 
 required either with or without threads.
 5) Finally, Windows compatibility: do not use POSIX features if they cannot 
 easily be replaced on other platforms!



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Attachment: HDFS-5639-2.patch

Fix the javadoc and findbugs issues.

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639-2.patch, HDFS-5639.patch


 We have run into various issues in the namenode and hbase w.r.t. RPC handling 
 in multi-tenant clusters. Examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
 https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize RPC requests. It could be based 
 on user id, or on whether it is a read or a write request, or it could use a 
 specific rule such as treating a datanode's RPC as more important than client 
 RPC. We want to enable people to implement and experiment with different RPC 
 schedulers.
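 Purely as an illustration of the kind of pluggable abstraction being proposed (hypothetical names, not the attached patch), such a scheduler interface might look like the sketch below:
 {code}
 // Hypothetical sketch only; names do not match the attached patch.
 public interface RpcScheduler {
 
   /** Return a priority level for an incoming call, e.g. based on user or call type. */
   int getPriorityLevel(String userName, String methodName);
 
   /** Example policy: favor reads over writes; datanode RPCs could be special-cased similarly. */
   class ReadFavoringScheduler implements RpcScheduler {
     @Override
     public int getPriorityLevel(String userName, String methodName) {
       return methodName.startsWith("get") ? 0 : 1; // 0 = higher priority in this sketch
     }
   }
 }
 {code}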



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Status: Open  (was: Patch Available)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639-2.patch, HDFS-5639.patch


 We have run into various issues in the namenode and hbase w.r.t. RPC handling 
 in multi-tenant clusters. Examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
 https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize RPC requests. It could be based 
 on user id, or on whether it is a read or a write request, or it could use a 
 specific rule such as treating a datanode's RPC as more important than client 
 RPC. We want to enable people to implement and experiment with different RPC 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5639) rpc scheduler abstraction

2013-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-5639:
--

Status: Patch Available  (was: Open)

 rpc scheduler abstraction
 -

 Key: HDFS-5639
 URL: https://issues.apache.org/jira/browse/HDFS-5639
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
 Attachments: HDFS-5639-2.patch, HDFS-5639.patch


 We have run into various issues in the namenode and hbase w.r.t. RPC handling 
 in multi-tenant clusters. Examples are
 https://issues.apache.org/jira/i#browse/HADOOP-9640
 https://issues.apache.org/jira/i#browse/HBASE-8836
 There are different ideas on how to prioritize RPC requests. It could be based 
 on user id, or on whether it is a read or a write request, or it could use a 
 specific rule such as treating a datanode's RPC as more important than client 
 RPC. We want to enable people to implement and experiment with different RPC 
 schedulers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841974#comment-13841974
 ] 

Andrew Wang commented on HDFS-2832:
---

Most of these questions pertain to phase 2 features, not phase 1. If phase 
1 is done and phase 2 is starting, now seems like the appropriate time to be 
asking phase 2 design questions. I don't have any technical issues with the 
code in the branch right now.

I'd really appreciate it if phase 1 and phase 2 (and phase x?) features could be 
divided up in the design doc, though, since I don't think this phased 
implementation plan is mentioned in there right now. I'm sure it'd help other 
reviewers too.

 Enable support for heterogeneous storages in HDFS
 -

 Key: HDFS-2832
 URL: https://issues.apache.org/jira/browse/HDFS-2832
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: 20130813-HeterogeneousStorage.pdf, 
 20131125-HeterogeneousStorage-TestPlan.pdf, 
 20131125-HeterogeneousStorage.pdf, 
 20131202-HeterogeneousStorage-TestPlan.pdf, 
 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
 editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
 h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
 h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
 h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
 h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
 h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
 h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
 h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
 h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
 h2832_20131203.patch


 HDFS currently supports a configuration where storages are a list of 
 directories. Typically each of these directories corresponds to a volume with 
 its own file system. All these directories are homogeneous and therefore 
 identified as a single storage at the namenode. I propose changing the 
 current model, where a Datanode *is a* storage, to one where a Datanode 
 *is a collection of* storages.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841987#comment-13841987
 ] 

Hadoop QA commented on HDFS-5629:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617501/HDFS-5629.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5667//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5667//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5667//console

This message is automatically generated.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch


 Currently the JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to the JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5629:
-

Attachment: HDFS-5629.003.patch

Fix a findbugs warning.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch, HDFS-5629.003.patch


 Currently the JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to the JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841991#comment-13841991
 ] 

Haohui Mai commented on HDFS-4983:
--

+1. Thanks [~yzhangal]!

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch, 
 HDFS-4983.006.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 fails validation (tried on an 
 insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2013-12-06 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842011#comment-13842011
 ] 

Vinay commented on HDFS-3405:
-

I am almost sure that this failure is not because of the patch. 

 Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged 
 fsimages
 

 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
Reporter: Aaron T. Myers
Assignee: Vinay
 Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch


 As Todd points out in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
  the current scheme for a checkpointing daemon to upload a merged fsimage 
 file to an NN is to issue an HTTP GET request to tell the target NN to issue 
 another GET request back to the checkpointing daemon to retrieve the merged 
 fsimage file. There's no fundamental reason the checkpointing daemon can't 
 just use an HTTP POST or PUT to send back the merged fsimage file, rather 
 than the double-GET scheme.
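 Purely as an illustration of the push-style upload the description argues for (not the attached patch; the URL, servlet path, and file path below are hypothetical), the checkpointer-side transfer could be as simple as:
 {code}
 // Illustrative only: a checkpointer pushing the merged fsimage with a single HTTP PUT,
 // instead of asking the NN to issue a second GET back to it.
 import java.io.FileInputStream;
 import java.io.InputStream;
 import java.io.OutputStream;
 import java.net.HttpURLConnection;
 import java.net.URL;
 
 public class FsImagePush {
   public static void main(String[] args) throws Exception {
     URL url = new URL(args[0]);            // e.g. a hypothetical http://nn:50070/imagetransfer
     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
     conn.setRequestMethod("PUT");
     conn.setDoOutput(true);
     try (InputStream in = new FileInputStream(args[1]);   // path to the merged fsimage
          OutputStream out = conn.getOutputStream()) {
       byte[] buf = new byte[64 * 1024];
       int n;
       while ((n = in.read(buf)) != -1) {
         out.write(buf, 0, n);              // one push; no double-GET round trip
       }
     }
     System.out.println("Upload HTTP status: " + conn.getResponseCode());
   }
 }
 {code}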



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842027#comment-13842027
 ] 

Arpit Agarwal commented on HDFS-2832:
-

{quote}
Is there any additional overhead from splitting?
{quote}
DNs are not splitting block reports right now, and there is no extra overhead in 
the NN to handle it.

{quote}
Have you looked at the additional memory overhead on the NN and DN from 
splitting up storages? With 10 disks on a DN, this could mean effectively 10x 
the number of DNs as before. I think this is still insignificant, but you all 
know better than me.
{quote}
Yes, the bulk of the space is consumed by the block information; the volumes 
themselves are insignificant.

{quote}
I'd like to see more description of the client API, namely the file attribute 
APIs. I'll also note that LocatedBlock is not a public API; you can hack around 
it by downcasting BlockLocation to HdfsBlockLocation to fish out the LocatedBlock, 
but ultimately we probably want to expose StorageType in BlockLocation itself. 
API examples would be great, from both the command line and the programmatic 
API.
I think this earlier question/answer didn't make it into the doc: what happens 
when this file is distcp'd or copied? Arpit's earlier answer of clearing this 
field makes sense (or maybe we need a cp -a command).
{quote}
As mentioned earlier, we'll document these in more detail once we start work on 
them, i.e. post phase-1 merge.
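For reference, a rough sketch of the downcast workaround mentioned in the quote above. This is not part of any patch here, HdfsBlockLocation is not a public/stable API, and the method names should be verified against trunk:
{code}
// Sketch of fishing the LocatedBlock out of a BlockLocation via HdfsBlockLocation.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.HdfsBlockLocation;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

public class LocatedBlockPeek {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus stat = fs.getFileStatus(new Path(args[0]));
    BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
    for (BlockLocation loc : locs) {
      // The downcast is the "hack" referred to above; it only works against an HDFS FileSystem.
      LocatedBlock lb = ((HdfsBlockLocation) loc).getLocatedBlock();
      System.out.println(lb.getBlock());
    }
  }
}
{code}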

{quote}
I'd like to see more discussion in the doc of migrating blocks that are 
currently open for short-circuit read. SCR is very common
{quote}
It's in the doc. The operating system, be it Unix or Windows, will not let you 
remove a file as long as any process has an open handle to it. Even if the file 
looks deleted, it will remain on disk at least until the last open handle goes 
away. Quota permitting, the application can create additional replicas on 
alternate storage media while it keeps the existing handle open.

{quote}
Is this going to work with rolling upgrades?
{quote}
HDFS does not support rolling upgrades today.

{quote}
Storage is not always truly hierarchical, it depends on your provisioning. The 
current strategy of always falling back from SSD to HDD is more ambiguous when 
you have more than two storage types, especially with something like a tape or 
NAS tier. Maybe this should be configurable somehow.
One of the mentioned potential uses is to do automatic migration between 
storage types based on usage patterns. In this kind of scenario, it's necessary 
to support more expressive forms of resource management, e.g. YARN's fair 
scheduler. Quotas by themselves aren't sufficient.
{quote}
We have tried to avoid the term 'hierarchical' because we are not adding any 
awareness of a storage hierarchy. HDD is just the most sensible default, 
although, as we mention in the future-work section, we can look into adding 
support for tapes etc. Automatic migration is mentioned for completeness, but 
we are not thinking about the design for it now. Anyone from the community is 
free to post ideas, however.

Andrew, the design is the same one you reviewed in August, and we have discussed 
some of these points earlier, so I encourage you to read through the comment history.

 Enable support for heterogeneous storages in HDFS
 -

 Key: HDFS-2832
 URL: https://issues.apache.org/jira/browse/HDFS-2832
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: 20130813-HeterogeneousStorage.pdf, 
 20131125-HeterogeneousStorage-TestPlan.pdf, 
 20131125-HeterogeneousStorage.pdf, 
 20131202-HeterogeneousStorage-TestPlan.pdf, 
 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
 editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
 h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
 h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
 h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
 h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
 h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
 h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
 h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
 h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
 h2832_20131203.patch


 HDFS currently supports a configuration where storages are a list of 
 directories. Typically each of these directories corresponds to a volume with 
 its own file system. All these directories are homogeneous and therefore 
 identified as a single storage at the namenode. I propose changing the 
 current model, where a Datanode *is a* storage, to one where a Datanode 
 *is a collection of* storages.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HDFS-5353) Short circuit reads fail when dfs.encrypt.data.transfer is enabled

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842040#comment-13842040
 ] 

Hadoop QA commented on HDFS-5353:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617516/HDFS-5353.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
-14 warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5668//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5668//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5668//console

This message is automatically generated.

 Short circuit reads fail when dfs.encrypt.data.transfer is enabled
 --

 Key: HDFS-5353
 URL: https://issues.apache.org/jira/browse/HDFS-5353
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Haohui Mai
Assignee: Colin Patrick McCabe
Priority: Blocker
 Attachments: HDFS-5353.001.patch, HDFS-5353.002.patch


 DataXceiver tries to establish secure channels via SASL when 
 dfs.encrypt.data.transfer is turned on. However, domain socket traffic appears 
 to be unencrypted, so the client cannot communicate with the datanode via 
 domain sockets, which makes short-circuit reads non-functional.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842054#comment-13842054
 ] 

Hadoop QA commented on HDFS-4983:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617515/HDFS-4983.006.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5669//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5669//console

This message is automatically generated.

 Numeric usernames do not work with WebHDFS FS
 -

 Key: HDFS-4983
 URL: https://issues.apache.org/jira/browse/HDFS-4983
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Yongjun Zhang
  Labels: patch
 Attachments: HDFS-4983.001.patch, HDFS-4983.002.patch, 
 HDFS-4983.003.patch, HDFS-4983.004.patch, HDFS-4983.005.patch, 
 HDFS-4983.006.patch


 Per the file 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
 Given this, using a username such as 123 fails validation (tried on an 
 insecure setup):
 {code}
 [123@host-1 ~]$ whoami
 123
 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
 -ls: Invalid value: 123 does not belong to the domain 
 ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
 Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...]
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5629) Support HTTPS in JournalNode

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842067#comment-13842067
 ] 

Hadoop QA commented on HDFS-5629:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617527/HDFS-5629.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
-2 warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5670//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5670//console

This message is automatically generated.

 Support HTTPS in JournalNode
 

 Key: HDFS-5629
 URL: https://issues.apache.org/jira/browse/HDFS-5629
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-5629.000.patch, HDFS-5629.001.patch, 
 HDFS-5629.002.patch, HDFS-5629.003.patch


 Currently the JournalNode has HTTP support only. This jira tracks the effort 
 to add HTTPS support to the JournalNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4201) NPE in BPServiceActor#sendHeartBeat

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HDFS-4201:
--

Status: Patch Available  (was: Open)

 NPE in BPServiceActor#sendHeartBeat
 ---

 Key: HDFS-4201
 URL: https://issues.apache.org/jira/browse/HDFS-4201
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Eli Collins
Assignee: Jimmy Xiang
Priority: Critical
 Fix For: 3.0.0

 Attachments: trunk-4201.patch, trunk-4201_v2.patch


 Saw the following NPE in a log.
 I think this is likely due to {{dn}} or {{dn.getFSDataset()}} being null (not 
 {{bpRegistration}}), due to a configuration or local directory failure.
 {code}
 2012-09-25 04:33:20,782 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
 For namenode svsrs00127/11.164.162.226:8020 using DELETEREPORT_INTERVAL of 
 30 msec  BLOCKREPORT_INTERVAL of 2160msec Initial delay: 0msec; 
 heartBeatInterval=3000
 2012-09-25 04:33:20,782 ERROR 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService 
 for Block pool BP-1678908700-11.164.162.226-1342785481826 (storage id 
 DS-1031100678-11.164.162.251-5010-1341933415989) service to 
 svsrs00127/11.164.162.226:8020
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:434)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:520)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673)
 at java.lang.Thread.run(Thread.java:722)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

