[jira] [Commented] (HDFS-5163) miscellaneous cache pool RPC fixes

2013-09-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759979#comment-13759979
 ] 

Andrew Wang commented on HDFS-5163:
---

Thanks Colin, deferring the PB changes to another JIRA is fine. Mostly nitpicky 
comments this go-around, and a few small things I missed last time:

* My earlier comment about modifyCachePool still applies: it should be annotated 
AtMostOnce in ClientProtocol.
* In CachePool, we can assign the mode via the FsPermission copy constructor 
rather than converting to a short and back. {{#setMode}} should probably also 
use the copy constructor, since FsPermission is mutable (which is kinda uncool). 
See the sketch below.
* In CachePoolInfo#toString, we can just use FsPermission's toString rather 
than calling String.format to zero-pad the short.
* Shouldn't we clear the pool maps as well in {{CacheManager#clear}}?
* Removing a pool should also remove the entries associated with that pool; you 
could reuse the currently committed code.

+1 once these are addressed, TYs.
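
For the two FsPermission points, here is a minimal sketch of the shape I have in 
mind (illustrative only, not the actual CachePool code):

{code}
import org.apache.hadoop.fs.permission.FsPermission;

class CachePool {
  private FsPermission mode;

  CachePool(FsPermission mode) {
    // copy constructor instead of round-tripping through a short
    this.mode = new FsPermission(mode);
  }

  public void setMode(FsPermission mode) {
    // copy here too: FsPermission is mutable, so holding the caller's
    // instance would let them change our state behind our back
    this.mode = new FsPermission(mode);
  }

  @Override
  public String toString() {
    // FsPermission#toString renders "rwxr-xr-x"-style text directly,
    // so no String.format zero-padding of the short value is needed
    return "CachePool{mode=" + mode + "}";
  }
}
{code}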

 miscellaneous cache pool RPC fixes
 --

 Key: HDFS-5163
 URL: https://issues.apache.org/jira/browse/HDFS-5163
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-5163-caching.001.patch, HDFS-5163-caching.002.patch


 some minor fixes-- see below.



[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759983#comment-13759983
 ] 

Hadoop QA commented on HDFS-5167:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601763/HDFS-5167.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4937//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4937//console

This message is automatically generated.

 Add metrics about the NameNode retry cache
 --

 Key: HDFS-5167
 URL: https://issues.apache.org/jira/browse/HDFS-5167
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Priority: Minor
 Attachments: HDFS-5167.1.patch


 It will be helpful to have metrics in the NameNode about the retry cache, 
 such as the retry count.



[jira] [Commented] (HDFS-5118) Provide testing support for DFSClient to drop RPC responses

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759984#comment-13759984
 ] 

Hadoop QA commented on HDFS-5118:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601746/HDFS-5118.004.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4936//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4936//console

This message is automatically generated.

 Provide testing support for DFSClient to drop RPC responses
 ---

 Key: HDFS-5118
 URL: https://issues.apache.org/jira/browse/HDFS-5118
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-5118.000.patch, HDFS-5118.001.patch, 
 HDFS-5118.002.patch, HDFS-5118.003.patch, HDFS-5118.004.patch


 We plan to add a capability to DFSClient so that the client is able to 
 intentionally drop responses of NameNode RPC calls according to settings in 
 the configuration. In this way we can do better system testing of the NameNode 
 retry cache, especially when NN failover happens.



[jira] [Commented] (HDFS-5164) deleteSnapshot should check if OperationCategory.WRITE is possible before taking write lock

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759987#comment-13759987
 ] 

Hudson commented on HDFS-5164:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4376 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4376/])
HDFS-5164.  DeleteSnapshot should check if OperationCategory.WRITE is possible 
before taking write lock (contributed by Colin Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520492)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 deleteSnapshot should check if OperationCategory.WRITE is possible before 
 taking write lock
 ---

 Key: HDFS-5164
 URL: https://issues.apache.org/jira/browse/HDFS-5164
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.3.0

 Attachments: HDFS-5164.001.patch


 deleteSnapshot should check if OperationCategory.WRITE is possible before 
 taking the write lock, to help avoid lock contention
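
A schematic sketch of the pattern the summary describes (checkOperation, 
writeLock, and writeUnlock follow FSNamesystem conventions, but treat the body 
as illustrative rather than the committed diff):

{code}
void deleteSnapshot(String snapshotRoot, String snapshotName) throws IOException {
  // fail fast on a standby NN before contending for the write lock
  checkOperation(OperationCategory.WRITE);
  writeLock();
  try {
    // re-check under the lock, since the HA state may have changed while blocked
    checkOperation(OperationCategory.WRITE);
    // ... perform the snapshot deletion and log the edit ...
  } finally {
    writeUnlock();
  }
}
{code}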



[jira] [Commented] (HDFS-4491) Parallel testing HDFS

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760123#comment-13760123
 ] 

Hudson commented on HDFS-4491:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #324 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/324/])
HDFS-4491. Add/delete files missed in prior commit. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520482)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/URLConnectionFactory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/URLUtils.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/test/PathUtils.java
HDFS-4491. Parallel testing HDFS. Contributed by Andrey Klochkov. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520479)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/pom.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HsftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/fs/TestFiRename.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestResolveHdfsSymlink.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUrlStreamHandler.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/loadGenerator/TestLoadGenerator.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientReportBadBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFSInputChecker.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppendRestart.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCorruption.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHDFSServerPorts.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpURLTimeouts.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/TestNNWithQJM.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournalNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestOverReplicatedBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAllowFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java
* 

[jira] [Commented] (HDFS-5164) deleteSnapshot should check if OperationCategory.WRITE is possible before taking write lock

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760122#comment-13760122
 ] 

Hudson commented on HDFS-5164:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #324 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/324/])
HDFS-5164.  DeleteSnapshot should check if OperationCategory.WRITE is possible 
before taking write lock (contributed by Colin Patrick McCabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520492)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 deleteSnapshot should check if OperationCategory.WRITE is possible before 
 taking write lock
 ---

 Key: HDFS-5164
 URL: https://issues.apache.org/jira/browse/HDFS-5164
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.3.0

 Attachments: HDFS-5164.001.patch


 deleteSnapshot should check if OperationCategory.WRITE is possible before 
 taking the write lock, to help avoid lock contention



[jira] [Commented] (HDFS-5159) Secondary NameNode fails to checkpoint if error occurs downloading edits on first checkpoint

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760120#comment-13760120
 ] 

Hudson commented on HDFS-5159:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #324 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/324/])
HDFS-5159. Secondary NameNode fails to checkpoint if error occurs downloading 
edits on first checkpoint. Contributed by Aaron T. Myers. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520363)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java


 Secondary NameNode fails to checkpoint if error occurs downloading edits on 
 first checkpoint
 

 Key: HDFS-5159
 URL: https://issues.apache.org/jira/browse/HDFS-5159
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.1.0-beta
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 2.1.1-beta

 Attachments: HDFS-5159.patch, HDFS-5159.patch


 The 2NN will avoid downloading/loading a new fsimage if its local copy of 
 fsimage is the same as the version on the NN. However, the decision to *load* 
 the fsimage from disk into memory is based only on the on-disk fsimage 
 version. If an error occurs between downloading and loading the fsimage on 
 the first checkpoint attempt, the 2NN will never load the fsimage, and then 
 on subsequent checkpoint attempts it will not load the on-disk fsimage and 
 thus will never checkpoint successfully.
 An example error message is given in the first comment of this ticket.
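
A hypothetical reduction of the failure mode (illustrative names only; the real 
logic lives in SecondaryNameNode and is exercised by TestCheckpoint):

{code}
class CheckpointSketch {
  long localImageTxId;
  long nnImageTxId;
  boolean imageLoadedIntoMemory;

  void doCheckpoint() {
    if (localImageTxId < nnImageTxId) {
      localImageTxId = nnImageTxId;  // download succeeds, local copy is now current
      loadImage();                   // a crash HERE on the first attempt is the bug:
                                     // on retry the branch above is skipped, the
                                     // image is never loaded into memory, and no
                                     // checkpoint can ever complete
    }
  }

  void loadImage() { imageLoadedIntoMemory = true; }
}
{code}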



[jira] [Moved] (HDFS-5168) BlockPlacementPolicy does not work for cross rack/node group dependencies

2013-09-06 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran moved HADOOP-9936 to HDFS-5168:
--

Key: HDFS-5168  (was: HADOOP-9936)
Project: Hadoop HDFS  (was: Hadoop Common)

 BlockPlacementPolicy does not work for cross rack/node group dependencies
 -

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Nikola Vujic
Priority: Critical

 Block placement policies do not work for cross rack/node group dependencies. 
 In practice this is needed when compute servers and storage fall into two 
 independent fault domains; in that case, neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Let's suppose that we have a Hadoop cluster with one rack with two servers, 
 and we run 2 VMs per server. The node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This works fine as long as server and storage fall into the same fault 
 domain, but if storage is in a different fault domain from the server, we will 
 not be able to handle that. For example, if the storage of server1-vm1 is in 
 the same fault domain as the storage of server2-vm1, then we must not place 
 two replicas on these two nodes although they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space (see the sketch below). 
 This looks like the cleanest way to fix this, as it requires minor changes in 
 the BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall into more than one node 
 group. When we choose a node to hold a replica, we have to exclude from the 
 search space all nodes from the node groups where the chosen node belongs. 
 This approach may require major changes in the NetworkTopology.
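
A minimal sketch of the first approach (the dependency map and method names are 
hypothetical, not the actual BlockPlacementPolicy API):

{code}
import java.util.*;

class DependencyAwareExclusion {
  // for each node, the nodes whose storage shares its fault domain,
  // regardless of rack or node group
  private final Map<String, Set<String>> crossDependencies = new HashMap<>();

  Set<String> excludedAfterChoosing(String chosenNode, Set<String> alreadyExcluded) {
    Set<String> excluded = new HashSet<>(alreadyExcluded);
    excluded.add(chosenNode);
    // exclude every dependent node from the rest of the search, so two
    // replicas can never land in the same storage fault domain
    excluded.addAll(crossDependencies.getOrDefault(chosenNode, Collections.emptySet()));
    return excluded;
  }
}
{code}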



[jira] [Commented] (HDFS-5168) BlockPlacementPolicy does not work for cross rack/node group dependencies

2013-09-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760132#comment-13760132
 ] 

Steve Loughran commented on HDFS-5168:
--

Moved to HDFS issues

 BlockPlacementPolicy does not work for cross rack/node group dependencies
 -

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Nikola Vujic
Priority: Critical

 Block placement policies do not work for cross rack/node group dependencies. 
 In practice this is needed when compute servers and storage fall into two 
 independent fault domains; in that case, neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Let's suppose that we have a Hadoop cluster with one rack with two servers, 
 and we run 2 VMs per server. The node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This works fine as long as server and storage fall into the same fault 
 domain, but if storage is in a different fault domain from the server, we will 
 not be able to handle that. For example, if the storage of server1-vm1 is in 
 the same fault domain as the storage of server2-vm1, then we must not place 
 two replicas on these two nodes although they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space. This looks like the 
 cleanest way to fix this, as it requires minor changes in the 
 BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall into more than one node 
 group. When we choose a node to hold a replica, we have to exclude from the 
 search space all nodes from the node groups where the chosen node belongs. 
 This approach may require major changes in the NetworkTopology.



[jira] [Updated] (HDFS-5118) Provide testing support for DFSClient to drop RPC responses

2013-09-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5118:


   Resolution: Fixed
Fix Version/s: 2.3.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. 

 Provide testing support for DFSClient to drop RPC responses
 ---

 Key: HDFS-5118
 URL: https://issues.apache.org/jira/browse/HDFS-5118
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.3.0

 Attachments: HDFS-5118.000.patch, HDFS-5118.001.patch, 
 HDFS-5118.002.patch, HDFS-5118.003.patch, HDFS-5118.004.patch


 We plan to add a capability to DFSClient so that the client is able to 
 intentionally drop responses of NameNode RPC calls according to settings in 
 the configuration. In this way we can do better system testing of the NameNode 
 retry cache, especially when NN failover happens.



[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache

2013-09-06 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760353#comment-13760353
 ] 

Tsuyoshi OZAWA commented on HDFS-5167:
--

[~sureshms], I see. NameNode has RPCMetrics, so your idea looks good to me. 
Should we create a new jira on hadoop-common?

 Add metrics about the NameNode retry cache
 --

 Key: HDFS-5167
 URL: https://issues.apache.org/jira/browse/HDFS-5167
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Priority: Minor
 Attachments: HDFS-5167.1.patch


 It will be helpful to have metrics in the NameNode about the retry cache, 
 such as the retry count.



[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache

2013-09-06 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760367#comment-13760367
 ] 

Tsuyoshi OZAWA commented on HDFS-5167:
--

[~jingzhao], OK, thanks.

 Add metrics about the NameNode retry cache
 --

 Key: HDFS-5167
 URL: https://issues.apache.org/jira/browse/HDFS-5167
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Priority: Minor
 Attachments: HDFS-5167.1.patch


 It will be helpful to have metrics in the NameNode about the retry cache, 
 such as the retry count.



[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache

2013-09-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760354#comment-13760354
 ] 

Jing Zhao commented on HDFS-5167:
-

[~ozawa], you can move this jira to hadoop-common if necessary. No need to 
create a new jira.

 Add metrics about the NameNode retry cache
 --

 Key: HDFS-5167
 URL: https://issues.apache.org/jira/browse/HDFS-5167
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Priority: Minor
 Attachments: HDFS-5167.1.patch


 It will be helpful to have metrics in the NameNode about the retry cache, 
 such as the retry count.



[jira] [Updated] (HDFS-5169) hdfs.c: translateZCRException: null pointer deref when translating some exceptions

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5169:
---

Attachment: HDFS-5169-caching.001.patch

 hdfs.c: translateZCRException: null pointer deref when translating some 
 exceptions
 --

 Key: HDFS-5169
 URL: https://issues.apache.org/jira/browse/HDFS-5169
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-5169-caching.001.patch


 hdfs.c: translateZCRException: there is a null pointer deref when translating 
 some exceptions.



[jira] [Work started] (HDFS-5169) hdfs.c: translateZCRException: null pointer deref when translating some exceptions

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-5169 started by Colin Patrick McCabe.

 hdfs.c: translateZCRException: null pointer deref when translating some 
 exceptions
 --

 Key: HDFS-5169
 URL: https://issues.apache.org/jira/browse/HDFS-5169
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-5169-caching.001.patch


 hdfs.c: translateZCRException: there is a null pointer deref when translating 
 some exceptions.



[jira] [Commented] (HDFS-5118) Provide testing support for DFSClient to drop RPC responses

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760484#comment-13760484
 ] 

Hudson commented on HDFS-5118:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4378 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4378/])
Move HDFS-5118 to 2.1.1-beta section. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520650)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Provide testing support for DFSClient to drop RPC responses
 ---

 Key: HDFS-5118
 URL: https://issues.apache.org/jira/browse/HDFS-5118
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.3.0

 Attachments: HDFS-5118.000.patch, HDFS-5118.001.patch, 
 HDFS-5118.002.patch, HDFS-5118.003.patch, HDFS-5118.004.patch


 We plan to add a capability to DFSClient so that the client is able to 
 intentionally drop responses of NameNode RPC calls according to settings in 
 the configuration. In this way we can do better system testing of the NameNode 
 retry cache, especially when NN failover happens.



[jira] [Commented] (HDFS-5029) Token operations should not block read operations

2013-09-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760491#comment-13760491
 ] 

Kihwal Lee commented on HDFS-5029:
--

Since they all acquire the read lock, if there are two token operations going 
on against the same token, the order of operations and the order of edit 
logging can differ.
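
A minimal runnable illustration of that interleaving (hypothetical names; not 
FSNamesystem code):

{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TokenRaceSketch {
  static final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  static final AtomicLong expiry = new AtomicLong();               // in-memory state
  static final List<Long> editLog = new CopyOnWriteArrayList<>();  // logged order

  static void renew(long newExpiry) {
    fsLock.readLock().lock();  // the read lock does NOT exclude the other renewal
    try {
      expiry.set(newExpiry);   // apply to memory
      // the other thread can run entirely inside this window
      editLog.add(newExpiry);  // log; may now disagree with the apply order
    } finally {
      fsLock.readLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> renew(1)), t2 = new Thread(() -> renew(2));
    t1.start(); t2.start(); t1.join(); t2.join();
    // expiry may end at 1 while the log order is [1, 2], or vice versa
    System.out.println("expiry=" + expiry.get() + " log=" + editLog);
  }
}
{code}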

 Token operations should not block read operations
 -

 Key: HDFS-5029
 URL: https://issues.apache.org/jira/browse/HDFS-5029
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-5029.2.patch, HDFS-5029.branch-23.patch, 
 HDFS-5029.patch, HDFS-5029.patch


 Token operations unnecessarily obtain the write lock on the namespace.  Edits 
 for token operations are independent of edits for other namespace write 
 operations, and the edits have no ordering requirement with respect to 
 namespace changes.



[jira] [Commented] (HDFS-5118) Provide testing support for DFSClient to drop RPC responses

2013-09-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760494#comment-13760494
 ] 

Jing Zhao commented on HDFS-5118:
-

Also committed to branch-2.1-beta.

 Provide testing support for DFSClient to drop RPC responses
 ---

 Key: HDFS-5118
 URL: https://issues.apache.org/jira/browse/HDFS-5118
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.1.1-beta

 Attachments: HDFS-5118.000.patch, HDFS-5118.001.patch, 
 HDFS-5118.002.patch, HDFS-5118.003.patch, HDFS-5118.004.patch


 We plan to add a capability to DFSClient so that the client is able to 
 intentionally drop responses of NameNode RPC calls according to settings in 
 the configuration. In this way we can do better system testing of the NameNode 
 retry cache, especially when NN failover happens.



[jira] [Commented] (HDFS-4879) Add blocked ArrayList collection to avoid CMS full GCs

2013-09-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760529#comment-13760529
 ] 

Hudson commented on HDFS-4879:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4380 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4380/])
HDFS-4879. Add BlockedArrayList collection to avoid CMS full GCs (Contributed 
by Todd Lipcon) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520667)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/ChunkedArrayList.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestChunkedArrayList.java


 Add blocked ArrayList collection to avoid CMS full GCs
 

 Key: HDFS-4879
 URL: https://issues.apache.org/jira/browse/HDFS-4879
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0, 2.0.4-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 2.3.0

 Attachments: hdfs-4879.txt, hdfs-4879.txt, hdfs-4879.txt, 
 hdfs-4879.txt


 We recently saw an issue where a large deletion was issued which caused 25M 
 blocks to be collected during {{deleteInternal}}. Currently, the list of 
 collected blocks is an ArrayList, meaning that we had to allocate a 
 contiguous 25M-entry array (~400MB). After a NN has been running for a long 
 time, the old generation may become fragmented such that it's hard 
 to find a 400MB contiguous chunk of heap.
 In general, we should try to design the NN such that the only large objects 
 are long-lived and created at startup time. We can improve this particular 
 case (and perhaps some others) by introducing a new List implementation which 
 is made of a linked list of arrays, each of which is size-limited (e.g. to 1MB).
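
A minimal sketch of the chunked-list idea (the committed class is 
ChunkedArrayList; this reduction is illustrative, not that code):

{code}
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChunkedListSketch<T> {
  private static final int MAX_CHUNK = 8 * 1024;  // entries per chunk, illustrative
  private final LinkedList<List<T>> chunks = new LinkedList<>();

  void add(T elem) {
    if (chunks.isEmpty() || chunks.getLast().size() >= MAX_CHUNK) {
      // each allocation is bounded, so no huge contiguous array is ever
      // requested from a possibly fragmented old generation
      chunks.add(new ArrayList<T>(MAX_CHUNK));
    }
    chunks.getLast().add(elem);
  }
}
{code}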



[jira] [Updated] (HDFS-5118) Provide testing support for DFSClient to drop RPC responses

2013-09-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5118:


Fix Version/s: (was: 2.3.0)
   2.1.1-beta
 Release Note: Used for testing when NameNode HA is enabled. Users can use 
a new configuration property, dfs.client.test.drop.namenode.response.number, to 
specify the number of responses that DFSClient will drop in each RPC call. This 
feature can help test functionality such as the NameNode retry cache.
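
A minimal usage sketch based on that release note (the property name is quoted 
from the note; the surrounding test setup is illustrative, not the committed 
patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class DropResponseSketch {
  public static Configuration dropOneResponsePerCall() {
    Configuration conf = new HdfsConfiguration();
    // DFSClient will drop this many NameNode RPC responses per call,
    // forcing the retry path (and, under HA, the retry cache) to kick in
    conf.setInt("dfs.client.test.drop.namenode.response.number", 1);
    return conf;
  }
}
{code}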

 Provide testing support for DFSClient to drop RPC responses
 ---

 Key: HDFS-5118
 URL: https://issues.apache.org/jira/browse/HDFS-5118
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.1.1-beta

 Attachments: HDFS-5118.000.patch, HDFS-5118.001.patch, 
 HDFS-5118.002.patch, HDFS-5118.003.patch, HDFS-5118.004.patch


 We plan to add a capability to DFSClient so that the client is able to 
 intentionally drop responses of NameNode RPC calls according to settings in 
 the configuration. In this way we can do better system testing of the NameNode 
 retry cache, especially when NN failover happens.



[jira] [Updated] (HDFS-4879) Add blocked ArrayList collection to avoid CMS full GCs

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4879:
---

  Resolution: Fixed
   Fix Version/s: 2.3.0
Target Version/s: 2.3.0  (was: 3.0.0)
  Status: Resolved  (was: Patch Available)

committed to branch 2.3

 Add blocked ArrayList collection to avoid CMS full GCs
 

 Key: HDFS-4879
 URL: https://issues.apache.org/jira/browse/HDFS-4879
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0, 2.0.4-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 2.3.0

 Attachments: hdfs-4879.txt, hdfs-4879.txt, hdfs-4879.txt, 
 hdfs-4879.txt


 We recently saw an issue where a large deletion was issued which caused 25M 
 blocks to be collected during {{deleteInternal}}. Currently, the list of 
 collected blocks is an ArrayList, meaning that we had to allocate a 
 contiguous 25M-entry array (~400MB). After a NN has been running for a long 
 time, the old generation may become fragmented such that it's hard 
 to find a 400MB contiguous chunk of heap.
 In general, we should try to design the NN such that the only large objects 
 are long-lived and created at startup time. We can improve this particular 
 case (and perhaps some others) by introducing a new List implementation which 
 is made of a linked list of arrays, each of which is size-limited (e.g. to 1MB).



[jira] [Commented] (HDFS-5169) hdfs.c: translateZCRException: null pointer deref when translating some exceptions

2013-09-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760545#comment-13760545
 ] 

Andrew Wang commented on HDFS-5169:
---

+1, thanks Colin.

 hdfs.c: translateZCRException: null pointer deref when translating some 
 exceptions
 --

 Key: HDFS-5169
 URL: https://issues.apache.org/jira/browse/HDFS-5169
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-5169-caching.001.patch


 hdfs.c: translateZCRException: there is a null pointer deref when translating 
 some exceptions.



[jira] [Resolved] (HDFS-5169) hdfs.c: translateZCRException: null pointer deref when translating some exceptions

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HDFS-5169.


   Resolution: Fixed
Fix Version/s: HDFS-4949

 hdfs.c: translateZCRException: null pointer deref when translating some 
 exceptions
 --

 Key: HDFS-5169
 URL: https://issues.apache.org/jira/browse/HDFS-5169
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: HDFS-4949

 Attachments: HDFS-5169-caching.001.patch


 hdfs.c: translateZCRException: there is a null pointer deref when translating 
 some exceptions.



[jira] [Resolved] (HDFS-5163) miscellaneous cache pool RPC fixes

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HDFS-5163.


Resolution: Fixed

committed to branch, thanks

 miscellaneous cache pool RPC fixes
 --

 Key: HDFS-5163
 URL: https://issues.apache.org/jira/browse/HDFS-5163
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-5163-caching.001.patch, HDFS-5163-caching.002.patch


 some minor fixes-- see below.



[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-09-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760509#comment-13760509
 ] 

Arpit Agarwal commented on HDFS-2832:
-

Thanks for the feedback, Eric. Both of these would be good to have and came up 
during design discussions, but we have not addressed either.

For #2, in addition to your points, there are other locations where storage 
directories are assumed to be File-addressable. I am not sure of the amount of 
work involved here.

Supporting multiple replicas per Datanode looks easier and can be done on top 
of the Heterogeneous Storage work. We would need phase 1 of the feature to 
support multiple storages.

 Enable support for heterogeneous storages in HDFS
 -

 Key: HDFS-2832
 URL: https://issues.apache.org/jira/browse/HDFS-2832
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: 20130813-HeterogeneousStorage.pdf


 HDFS currently supports a configuration where storages are a list of 
 directories. Typically each of these directories corresponds to a volume with 
 its own file system. All these directories are homogeneous and therefore 
 identified as a single storage at the namenode. I propose changing the 
 current model, where a Datanode *is a* storage, to one where a Datanode *is a 
 collection of* storages.



[jira] [Updated] (HDFS-5041) Add the time of last heartbeat to dead server Web UI

2013-09-06 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-5041:
-

Attachment: NameNode-dfsnodelist-dead.png

I attach a prototype image of the dead datanodes list.

 Add the time of last heartbeat to dead server Web UI
 

 Key: HDFS-5041
 URL: https://issues.apache.org/jira/browse/HDFS-5041
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ted Yu
Priority: Minor
 Attachments: NameNode-dfsnodelist-dead.png


 On the Live Server page, there is a column 'Last Contact'.
 On the dead server page, a similar column can be added which shows when the 
 last heartbeat came from the respective dead node.



[jira] [Updated] (HDFS-5041) Add the time of last heartbeat to dead server Web UI

2013-09-06 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-5041:
-

Attachment: HDFS-5041.patch

I attach the patch implementing what I showed in the image.

 Add the time of last heartbeat to dead server Web UI
 

 Key: HDFS-5041
 URL: https://issues.apache.org/jira/browse/HDFS-5041
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ted Yu
Priority: Minor
 Attachments: HDFS-5041.patch, NameNode-dfsnodelist-dead.png


 On the Live Server page, there is a column 'Last Contact'.
 On the dead server page, a similar column can be added which shows when the 
 last heartbeat came from the respective dead node.



[jira] [Updated] (HDFS-5041) Add the time of last heartbeat to dead server Web UI

2013-09-06 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-5041:
-

 Target Version/s: 3.0.0
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

 Add the time of last heartbeat to dead server Web UI
 

 Key: HDFS-5041
 URL: https://issues.apache.org/jira/browse/HDFS-5041
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Ted Yu
Priority: Minor
 Attachments: HDFS-5041.patch, NameNode-dfsnodelist-dead.png


 On the Live Server page, there is a column 'Last Contact'.
 On the dead server page, a similar column can be added which shows when the 
 last heartbeat came from the respective dead node.



[jira] [Commented] (HDFS-5041) Add the time of last heartbeat to dead server Web UI

2013-09-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760627#comment-13760627
 ] 

Ted Yu commented on HDFS-5041:
--

Looks good.

 Add the time of last heartbeat to dead server Web UI
 

 Key: HDFS-5041
 URL: https://issues.apache.org/jira/browse/HDFS-5041
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ted Yu
Priority: Minor
 Attachments: NameNode-dfsnodelist-dead.png


 On the Live Server page, there is a column 'Last Contact'.
 On the dead server page, a similar column can be added which shows when the 
 last heartbeat came from the respective dead node.



[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760682#comment-13760682
 ] 

Owen O'Malley commented on HDFS-4953:
-

Colin, please read my suggestion and my analysis of the difference before 
commenting.

The simplified API absolutely provides a means of releasing the ByteBuffer, and 
yet it is 2 lines long instead of 20. Furthermore, I didn't even realize that I 
was supposed to close the zero-copy cursor, since it just came in from Closeable.

My complaint stands. The API currently in this branch is very error-prone 
and difficult to explain. Using it requires complex handling, including 
exception handlers, to deal with arbitrary file systems.

 enable HDFS local reads via mmap
 

 Key: HDFS-4953
 URL: https://issues.apache.org/jira/browse/HDFS-4953
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: HDFS-4949

 Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
 HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
 HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch


 Currently, the short-circuit local read pathway allows HDFS clients to access 
 files directly without going through the DataNode.  However, all of these 
 reads involve a copy at the operating system level, since they rely on the 
 read() / pread() / etc family of kernel interfaces.
 We would like to enable HDFS to read local files via mmap.  This would enable 
 truly zero-copy reads.
 In the initial implementation, zero-copy reads will only be performed when 
 checksums are disabled.  Later, we can use the DataNode's cache awareness to 
 only perform zero-copy reads when we know that the checksum has already been 
 verified.



[jira] [Updated] (HDFS-5085) Support Kerberos authentication in NFSv3 gateway

2013-09-06 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5085:
-

Assignee: Jing Zhao

 Support Kerberos authentication in NFSv3 gateway
 

 Key: HDFS-5085
 URL: https://issues.apache.org/jira/browse/HDFS-5085
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Jing Zhao





[jira] [Updated] (HDFS-5067) Support symlink operations

2013-09-06 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5067:
-

Assignee: Brandon Li

 Support symlink operations
 --

 Key: HDFS-5067
 URL: https://issues.apache.org/jira/browse/HDFS-5067
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li

 Given that the symlink issues (e.g., HDFS-4765) are getting fixed, NFS can 
 support the symlink-related requests, which include the NFSv3 calls SYMLINK 
 and READLINK.



[jira] [Updated] (HDFS-5086) Support RPCSEC_GSS authentication in NFSv3 gateway

2013-09-06 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5086:
-

Assignee: Jing Zhao

 Support RPCSEC_GSS authentication in NFSv3 gateway
 --

 Key: HDFS-5086
 URL: https://issues.apache.org/jira/browse/HDFS-5086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: nfs
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Jing Zhao





[jira] [Commented] (HDFS-5041) Add the time of last heartbeat to dead server Web UI

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760748#comment-13760748
 ] 

Hadoop QA commented on HDFS-5041:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601909/HDFS-5041.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4938//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4938//console

This message is automatically generated.

 Add the time of last heartbeat to dead server Web UI
 

 Key: HDFS-5041
 URL: https://issues.apache.org/jira/browse/HDFS-5041
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Ted Yu
Priority: Minor
 Attachments: HDFS-5041.patch, NameNode-dfsnodelist-dead.png


 In Live Server page, there is a column 'Last Contact'.
 On the dead server page, similar column can be added which shows when the 
 last heartbeat came from the respective dead node.



[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760761#comment-13760761
 ] 

Colin Patrick McCabe commented on HDFS-4953:


Your proposed API doesn't address one of the big asks we had when designing 
ZCR, which is to provide a mechanism for notifying the user that he cannot get 
an mmap.  As I mentioned earlier, for performance reasons, many users who might 
like to have access to a 128 MB mmap segment do not want to copy into a 128MB 
backing buffer.  Doing such a large copy would blow the L2 cache (and 
possibly the page cache), and rather than improving performance, might degrade 
it.  Similarly, users don't want to get multiple byte buffers back-- the big 
advantage of mmap is getting a single buffer back (in the cases where that's 
possible).

What if the user wants to use a direct byte buffer as his fallback buffer?  
With the current code, that is easy-- I just call 
setFallbackBuffer(ByteBuffer.allocateDirect(...)).  With your proposed API, 
there's no way to do this.

Creating a new ByteBuffer for each read is going to be slower than reusing the 
same ByteBuffer-- especially for direct ByteBuffers.  Sure, we could have some 
kind of ByteBuffer cache inside the FSDataInputStream, but that's going to be 
very complicated.  What if someone needs a ByteBuffer of size 100 but we only 
have ones of size 10 and 900 in the cache?  Do we use the big one for the small 
read or leave it around?  How long do we cache them?  Do we prefer the 
direct ones?  And so on.  Really, the only design that makes sense is having 
the user pass in the fallback buffer.  We do not want to be re-inventing malloc 
inside FSDataInputStream.

The design principles of the current API are:
* some users want a fallback path, and some don't.  We have to satisfy both.
* we don't want to manage buffers inside FSDataInputStream.  It's a messy and 
hard problem with no optimal solutions that fit all cases.
* nobody wants to receive more than one buffer in response to a read.
* most programmers don't correctly handle short reads, so there should be an 
option to disable them.

One thing that we could and should do is provide a generic fallback path that 
is independent of filesystem.
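
For reference, the fallback pattern under the current branch API looks roughly 
like this (only setFallbackBuffer is quoted above; the other cursor calls are 
approximations and may not match the branch exactly):

{code}
// Rough sketch of the current-branch usage described above; method names
// other than setFallbackBuffer are approximations.
ZeroCopyCursor cursor = in.createZeroCopyCursor();
try {
  // the fallback is a caller-owned buffer, sized by the caller, used only
  // when an mmap cannot be produced
  cursor.setFallbackBuffer(ByteBuffer.allocateDirect(128 * 1024 * 1024));
  cursor.read(length);
  ByteBuffer data = cursor.getData();  // a single buffer back, mmap when possible
  process(data);
} finally {
  cursor.close();  // easy to forget, which is part of the complaint above
}
{code}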

 enable HDFS local reads via mmap
 

 Key: HDFS-4953
 URL: https://issues.apache.org/jira/browse/HDFS-4953
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: HDFS-4949

 Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
 HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
 HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch


 Currently, the short-circuit local read pathway allows HDFS clients to access 
 files directly without going through the DataNode.  However, all of these 
 reads involve a copy at the operating system level, since they rely on the 
 read() / pread() / etc family of kernel interfaces.
 We would like to enable HDFS to read local files via mmap.  This would enable 
 truly zero-copy reads.
 In the initial implementation, zero-copy reads will only be performed when 
 checksums are disabled.  Later, we can use the DataNode's cache awareness to 
 only perform zero-copy reads when we know that the checksum has already been 
 verified.
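
As background, a java.nio illustration of what an mmap-backed local read looks 
like at the JVM level (illustration only, not the HDFS-4953 code path; the file 
path is a placeholder):

{code}
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapReadSketch {
  public static void main(String[] args) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile("/path/to/local/file", "r");
         FileChannel channel = raf.getChannel()) {
      // Map the file into the address space: subsequent reads hit the page
      // cache directly, with no read()/pread() copy into a user buffer.
      MappedByteBuffer buf =
          channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
      while (buf.hasRemaining()) {
        buf.get();  // consume bytes straight from the mapping
      }
    }
  }
}
{code}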

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760804#comment-13760804
 ] 

Colin Patrick McCabe commented on HDFS-4953:


I've been thinking about this, and I think it might be possible to improve on 
the current API.

Maybe all we need is something like this:
{code}
in DFSInputStream:
  ZeroBuffer readZero(ByteBuffer fallback, int maxLength);

ZeroBuffer:
  implements Closeable (for close)
  implements eof() (returns true if there are no more bytes to read)
  implements all ByteBuffer methods by forwarding them to the enclosed 
ByteBuffer
{code}

This API would be implemented for every filesystem, not just HDFS.

The constraints here would be:
* maxLength >= 0
* you can't reuse a fallback buffer until you close the associated ZeroBuffer 
(we can enforce this by throwing an exception in this case)
* ZeroBuffers are immutable once created-- until you call close on them.

This gets rid of a few of the awkward issues with the current API, which I 
think are:
* the current API requires users to special-case HDFS (since other FSes throw 
ZeroCopyUnavailableException)
* the current API shares the file position between the cursors and the stream, 
which is unintuitive.
* the current API puts the read call inside the cursor object, which is 
different from the other read methods.
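
A compact Java rendering of that proposal (a sketch under the constraints 
above; the exact ZeroBuffer surface, such as the accessor for the enclosed 
buffer, is an assumption):

{code}
import java.io.Closeable;
import java.nio.ByteBuffer;

// Sketch of the proposed contract; ByteBuffer forwarding methods elided.
public interface ZeroBuffer extends Closeable {
  boolean eof();              // true if there are no more bytes to read
  ByteBuffer asByteBuffer();  // assumed accessor for the enclosed ByteBuffer
}

// On the input stream of every filesystem, not just HDFS:
// ZeroBuffer readZero(ByteBuffer fallback, int maxLength) throws IOException;
{code}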

 enable HDFS local reads via mmap
 

 Key: HDFS-4953
 URL: https://issues.apache.org/jira/browse/HDFS-4953
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: HDFS-4949

 Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
 HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
 HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch


 Currently, the short-circuit local read pathway allows HDFS clients to access 
 files directly without going through the DataNode.  However, all of these 
 reads involve a copy at the operating system level, since they rely on the 
 read() / pread() / etc family of kernel interfaces.
 We would like to enable HDFS to read local files via mmap.  This would enable 
 truly zero-copy reads.
 In the initial implementation, zero-copy reads will only be performed when 
 checksums are disabled.  Later, we can use the DataNode's cache awareness to 
 only perform zero-copy reads when we know that the checksum has already been 
 verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-5120) add command-line support for manipulating cache pools

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5120:
---

Attachment: HDFS-5163-caching.004.patch

I rebased this on the current branch, and added some help text about mode and 
weight.

The description of mode is more detailed now.

I realize the description of weight is inadequate, but we plan on adding many 
more resource management tunables, so let's leave it in for now as a placeholder.

 add command-line support for manipulating cache pools
 -

 Key: HDFS-5120
 URL: https://issues.apache.org/jira/browse/HDFS-5120
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-5120-caching.001.patch, HDFS-5163-caching.004.patch


 We should add command-line support for creating, removing, and listing cache 
 directives and manipulating cache pools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-5170) BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging

2013-09-06 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-5170:
-

 Summary: BlockPlacementPolicyDefault uses the wrong classname when 
alerting to enable debug logging
 Key: HDFS-5170
 URL: https://issues.apache.org/jira/browse/HDFS-5170
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial


{code}
  private static final String enableDebugLogging =
    "For more information, please enable DEBUG log level on "
    + LOG.getClass().getName();
{code}

This inserts the LOG's class rather than BlockPlacementPolicy's class.
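
A minimal sketch of the likely fix (assumed here, not necessarily the attached 
patch): name the policy class explicitly instead of going through the LOG object.

{code}
// LOG.getClass() names the logger implementation (e.g. a commons-logging
// wrapper class), not BlockPlacementPolicy, so reference the class directly.
private static final String enableDebugLogging =
    "For more information, please enable DEBUG log level on "
    + BlockPlacementPolicy.class.getName();
{code}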

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-5120) add command-line support for manipulating cache pools

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5120:
---

Attachment: HDFS-5120-caching.002.patch

 add command-line support for manipulating cache pools
 -

 Key: HDFS-5120
 URL: https://issues.apache.org/jira/browse/HDFS-5120
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-5120-caching.001.patch, HDFS-5120-caching.002.patch


 We should add command-line support for creating, removing, and listing cache 
 directives and manipulating cache pools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-5120) add command-line support for manipulating cache pools

2013-09-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5120:
---

Attachment: (was: HDFS-5163-caching.004.patch)

 add command-line support for manipulating cache pools
 -

 Key: HDFS-5120
 URL: https://issues.apache.org/jira/browse/HDFS-5120
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-5120-caching.001.patch, HDFS-5120-caching.002.patch


 We should add command-line support for creating, removing, and listing cache 
 directives and manipulating cache pools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5170) BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging

2013-09-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760822#comment-13760822
 ] 

Colin Patrick McCabe commented on HDFS-5170:


+1 pending jenkins

 BlockPlacementPolicyDefault uses the wrong classname when alerting to enable 
 debug logging
 --

 Key: HDFS-5170
 URL: https://issues.apache.org/jira/browse/HDFS-5170
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Attachments: HDFS-5170-1.patch


 {code}
   private static final String enableDebugLogging =
     "For more information, please enable DEBUG log level on "
     + LOG.getClass().getName();
 {code}
 This inserts the LOG's class rather than BlockPlacementPolicy's class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-5170) BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging

2013-09-06 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5170:
--

Attachment: HDFS-5170-1.patch

Trivial patch attached, compile-tested.

 BlockPlacementPolicyDefault uses the wrong classname when alerting to enable 
 debug logging
 --

 Key: HDFS-5170
 URL: https://issues.apache.org/jira/browse/HDFS-5170
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Attachments: HDFS-5170-1.patch


 {code}
   private static final String enableDebugLogging =
     "For more information, please enable DEBUG log level on "
     + LOG.getClass().getName();
 {code}
 This inserts the LOG's class rather than BlockPlacementPolicy's class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-5169) hdfs.c: translateZCRException: null pointer deref when translating some exceptions

2013-09-06 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-5169:
--

 Summary: hdfs.c: translateZCRException: null pointer deref when 
translating some exceptions
 Key: HDFS-5169
 URL: https://issues.apache.org/jira/browse/HDFS-5169
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


hdfs.c: translateZCRException: there is a null pointer deref when translating 
some exceptions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-5170) BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging

2013-09-06 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5170:
--

Affects Version/s: 2.1.0-beta
   Status: Patch Available  (was: Open)

 BlockPlacementPolicyDefault uses the wrong classname when alerting to enable 
 debug logging
 --

 Key: HDFS-5170
 URL: https://issues.apache.org/jira/browse/HDFS-5170
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Attachments: HDFS-5170-1.patch


 {code}
   private static final String enableDebugLogging =
     "For more information, please enable DEBUG log level on "
     + LOG.getClass().getName();
 {code}
 This inserts the LOG's class rather than BlockPlacementPolicy's class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5168) BlockPlacementPolicy does not work for cross rack/node group dependencies

2013-09-06 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760872#comment-13760872
 ] 

Junping Du commented on HDFS-5168:
--

That's a good idea, Nikola! I think you are referring to the case of VMs running 
on shared storage, i.e. a SAN. Is that right? 
Previously, DAS (HDD) was considered the default and only backing storage type 
for Hadoop, even in the virtualization case. Now, we are addressing different 
storage tiers (including SSD, remote storage, etc.) under HDFS-2832. I made 
similar comments there that storage failure groups should be addressed when we 
enable remote storage. I would prefer the first approach, especially since it 
could become easier after enabling storage type awareness. The second approach 
would break some basic assumptions of Hadoop - hierarchical network topology - 
which seems unnecessary to me. Thoughts? 

 BlockPlacementPolicy does not work for cross rack/node group dependencies
 -

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Nikola Vujic
Priority: Critical

 Block placement policies do not work for cross rack/node group dependencies. 
 In reality, this is needed when compute servers and storage fall into two 
 independent fault domains; in that case, neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Let's suppose that we have Hadoop cluster with one rack with two servers, and 
 we run 2 VMs per server. Node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This works fine as long as the server and its storage fall into the same fault 
 domain, but if the storage is in a different fault domain from the server, we 
 will not be able to handle that. For example, if the storage of server1-vm1 is 
 in the same fault domain as the storage of server2-vm1, then we must not place 
 two replicas on these two nodes although they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space (see the sketch after 
 this list). This looks like the cleanest way to fix this, as it requires only 
 minor changes in the BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall into more than one node 
 group. When we choose a node to hold a replica, we have to exclude from the 
 search space all nodes from the node groups to which the chosen node belongs. 
 This approach may require major changes in NetworkTopology.
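
A minimal Java sketch of the first approach (all names are assumptions for 
illustration, not existing Hadoop APIs): keep a map of cross rack/node group 
dependencies and consult it when excluding candidates for the next replica.

{code}
import java.util.*;

// Sketch: nodes whose storage shares a fault domain must not co-host replicas.
class CrossDependencyExclusion {
  // Maps each datanode to the peers whose storage is in its fault domain.
  private final Map<String, Set<String>> storageFaultDomainPeers = new HashMap<>();

  // When a node is chosen for a replica, exclude it and every dependent peer
  // from the search space for the remaining replicas.
  void excludeWithDependencies(String chosenNode, Set<String> excludedNodes) {
    excludedNodes.add(chosenNode);
    excludedNodes.addAll(
        storageFaultDomainPeers.getOrDefault(chosenNode, Collections.emptySet()));
  }
}
{code}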

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5170) BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760912#comment-13760912
 ] 

Hadoop QA commented on HDFS-5170:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601941/HDFS-5170-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4939//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4939//console

This message is automatically generated.

 BlockPlacementPolicyDefault uses the wrong classname when alerting to enable 
 debug logging
 --

 Key: HDFS-5170
 URL: https://issues.apache.org/jira/browse/HDFS-5170
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Attachments: HDFS-5170-1.patch


 {code}
   private static final String enableDebugLogging =
     "For more information, please enable DEBUG log level on "
     + LOG.getClass().getName();
 {code}
 This inserts the LOG's class rather than BlockPlacementPolicy's class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira