[jira] [Commented] (HDFS-6292) Display HDFS per user and per group usage on the webUI

2014-04-28 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982786#comment-13982786
 ] 

Vinayakumar B commented on HDFS-6292:
-

Good one Ravi.

I think calculating on the Secondary NN side is OK. But having users navigate to 
the SNN page just to get these statistics does not seem like a good idea.
How about keeping track of these on the NameNode side from the start, and 
updating the statistics (same as other metrics) on every operation that modifies 
them? That avoids recalculating the whole statistics later and holding the 
namesystem lock for a long time.
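
As a rough sketch of the incremental approach (class and method names below are 
hypothetical, not from any attached patch):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch only: keep per-user usage as a running counter that is
 * updated by every namespace operation that changes it, so the web UI can read
 * it without re-walking the namespace or holding the namesystem lock.
 */
class PerUserUsageTracker {
  private final Map<String, AtomicLong> usageByUser = new ConcurrentHashMap<>();

  /** Called from each operation that adds or removes bytes for a user. */
  void update(String user, long deltaBytes) {
    usageByUser.computeIfAbsent(user, u -> new AtomicLong()).addAndGet(deltaBytes);
  }

  /** Cheap snapshot for the web UI / metrics system. */
  Map<String, Long> snapshot() {
    Map<String, Long> copy = new HashMap<>();
    usageByUser.forEach((user, bytes) -> copy.put(user, bytes.get()));
    return copy;
  }
}
{code}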

 Display HDFS per user and per group usage on the webUI
 --

 Key: HDFS-6292
 URL: https://issues.apache.org/jira/browse/HDFS-6292
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-6292.patch, HDFS-6292.png


 It would be nice to show HDFS usage per user and per group on a web ui.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6291) FSImage may be left unclosed in BootstrapStandby#doRun()

2014-04-28 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982788#comment-13982788
 ] 

Vinayakumar B commented on HDFS-6291:
-

Yes, you are right. 
image.close() should be in a finally clause.

 FSImage may be left unclosed in BootstrapStandby#doRun()
 

 Key: HDFS-6291
 URL: https://issues.apache.org/jira/browse/HDFS-6291
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor

 At around line 203:
 {code}
   if (!checkLogsAvailableForRead(image, imageTxId, curTxId)) {
 return ERR_CODE_LOGS_UNAVAILABLE;
   }
 {code}
 If we return following the above check, image is not closed.
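
 A minimal sketch of the fix suggested in the comment above (simplified; not the 
 actual BootstrapStandby code):
 {code}
 // Simplified sketch: close the image in a finally clause so early returns,
 // such as the ERR_CODE_LOGS_UNAVAILABLE path above, do not leak it.
 FSImage image = new FSImage(conf);
 try {
   if (!checkLogsAvailableForRead(image, imageTxId, curTxId)) {
     return ERR_CODE_LOGS_UNAVAILABLE;
   }
   // ... download and save the checkpoint ...
   return 0;
 } finally {
   image.close();
 }
 {code}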



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5168) BlockPlacementPolicy does not work for cross node group dependencies

2014-04-28 Thread Nikola Vujic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikola Vujic updated HDFS-5168:
---

Attachment: HDFS-5168.patch

Attaching the original patch again.

 BlockPlacementPolicy does not work for cross node group dependencies
 

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Nikola Vujic
Assignee: Nikola Vujic
Priority: Critical
 Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch, 
 HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch


 Block placement policies do not work for cross rack/node group dependencies. 
 In practice this is needed when compute servers and storage fall into two 
 independent fault domains; in that case neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Suppose we have a Hadoop cluster with one rack with two servers, and we run 
 2 VMs per server. The node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This works fine as long as server and storage fall into the same fault 
 domain, but if storage is in a different fault domain from the server, we 
 cannot handle that. For example, if the storage of server1-vm1 is in the 
 same fault domain as the storage of server2-vm1, then we must not place two 
 replicas on these two nodes even though they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space. This looks like the 
 cleanest way to fix this, as it requires only minor changes in the 
 BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall into more than one node 
 group. When we choose a node to hold a replica, we have to exclude from the 
 search space all nodes from the node groups to which the chosen node belongs. 
 This approach may require major changes in NetworkTopology.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6261) Add document for enabling node group layer in HDFS

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982798#comment-13982798
 ] 

Hadoop QA commented on HDFS-6261:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642186/HDFS-6261.v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDistributedFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6753//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6753//console

This message is automatically generated.

 Add document for enabling node group layer in HDFS
 --

 Key: HDFS-6261
 URL: https://issues.apache.org/jira/browse/HDFS-6261
 Project: Hadoop HDFS
  Issue Type: Task
  Components: documentation
Reporter: Wenwu Peng
Assignee: Binglin Chang
  Labels: documentation
 Attachments: 2-layer-topology.png, 3-layer-topology.png, 
 3layer-topology.png, 4layer-topology.png, HDFS-6261.v1.patch, 
 HDFS-6261.v1.patch, HDFS-6261.v2.patch


 Most of the patches from umbrella JIRA HADOOP-8468 have been committed. However, 
 there is no documentation that introduces NodeGroup awareness (Hadoop 
 Virtualization Extensions) or explains how to configure it, so we need to 
 document it:
 1.  Document NodeGroup-aware related content in http://hadoop.apache.org/docs/current 
 2.  Document NodeGroup-aware properties in core-default.xml.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5168) BlockPlacementPolicy does not work for cross node group dependencies

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982933#comment-13982933
 ] 

Hadoop QA commented on HDFS-5168:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642193/HDFS-5168.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6754//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6754//console

This message is automatically generated.

 BlockPlacementPolicy does not work for cross node group dependencies
 

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Nikola Vujic
Assignee: Nikola Vujic
Priority: Critical
 Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch, 
 HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch


 Block placement policies do not work for cross rack/node group dependencies. 
 In practice this is needed when compute servers and storage fall into two 
 independent fault domains; in that case neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Suppose we have a Hadoop cluster with one rack with two servers, and we run 
 2 VMs per server. The node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This works fine as long as server and storage fall into the same fault 
 domain, but if storage is in a different fault domain from the server, we 
 cannot handle that. For example, if the storage of server1-vm1 is in the 
 same fault domain as the storage of server2-vm1, then we must not place two 
 replicas on these two nodes even though they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space. This looks like the 
 cleanest way to fix this, as it requires only minor changes in the 
 BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall into more than one node 
 group. When we choose a node to hold a replica, we have to exclude from the 
 search space all nodes from the node groups to which the chosen node belongs. 
 This approach may require major changes in NetworkTopology.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6258) Support XAttrs from NameNode and implements XAttr APIs for DistributedFileSystem

2014-04-28 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983007#comment-13983007
 ] 

Charles Lamb commented on HDFS-6258:


Hi Yi,

Here are a few minor things that I picked up on as well as some javadoc fixups.

Charles

Index: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/XAttr.java
===
--- 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/XAttr.java   
(revision 0)
+++ 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/XAttr.java   
(working copy)
@@ -0,0 +1,138 @@
+/**
+ * XAttr is POSIX Extended Attribute model, similar to the one in traditional 
Operating Systems.
+ * Extended Attribute consists of a name and associated data, and 4 namespaces 
are defined: user, 
+ * trusted, security and system.
+ *   1). USER namespace extended attribute may be assigned for storing 
arbitrary additional 
+ *   information, and its access permissions are defined by file/directory 
permission bits.
+ *   2). TRUSTED namespace extended attribute are visible and accessible only 
to privilege user 
+ *   (file/directory owner or fs admin), and it is available from both user 
space (filesystem 
+ *   API) and fs kernel.
+ *   3). SYSTEM namespace extended attribute is used by fs kernel to store 
system objects, 
+ *   and only available in fs kernel. It's not visible to users.
+ *   4). SECURITY namespace extended attribute is used by fs kernel for 
security features, and 
+ *   it's not visible to users.

XAttr is the POSIX Extended Attribute model similar to that found in
traditional Operating Systems.  Extended Attributes consist of one
or more name/value pairs associated with a file or directory. Four
namespaces are defined: user, trusted, security and system.
  1) USER namespace attributes may be used by any user to store
  arbitrary information. Access permissions in this namespace are
  defined by a file directory's permission bits.
  2) TRUSTED namespace attributes are only visible and accessible to
  privileged users (a file or directory's owner or the fs
  admin). This namespace is available from both user space
  (filesystem API) and fs kernel.
  3) SYSTEM namespace attributes are used by the fs kernel to store
  system objects.  This namespace is only available in the fs
  kernel. It is not visible to users.
  4) SECURITY namespace attributes are used by the fs kernel for
  security features. It is not visible to users.

Index: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
===
--- 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
 (revision 1589028)
+++ 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
 (working copy)
@@ -109,6 +109,8 @@
 import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
 import org.apache.hadoop.fs.MD5MD5CRC32GzipFileChecksum;
 import org.apache.hadoop.fs.Options;
+import org.apache.hadoop.fs.XAttr;
+import org.apache.hadoop.fs.XAttrSetFlag;
 import org.apache.hadoop.fs.Options.ChecksumOpt;
 import org.apache.hadoop.fs.ParentNotDirectoryException;
 import org.apache.hadoop.fs.Path;
@@ -191,6 +193,8 @@
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Joiner;
 import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
 import com.google.common.net.InetAddresses;
 
 /
@@ -2757,6 +2761,127 @@
  UnresolvedPathException.class);
 }
   }
+  
+  XAttr constructXAttr(String name, byte[] value) {
+    if (name == null) {
+      throw new NullPointerException("XAttr name can not be null.");
+    }
+
+    int prefixIndex = name.indexOf(".");
+    if (prefixIndex == -1) {
+      throw new IllegalArgumentException("XAttr name must be prefixed with user/trusted/security/system which followed by '.'");

s/which/and/

+    }
+
+    XAttr.NameSpace ns;
+    String prefix = name.substring(0, prefixIndex).toUpperCase();
+    if (prefix.equals(XAttr.NameSpace.USER.toString())) {
+      ns = XAttr.NameSpace.USER;
+    } else if (prefix.equals(XAttr.NameSpace.TRUSTED.toString())) {
+      ns = XAttr.NameSpace.TRUSTED;
+    } else if (prefix.equals(XAttr.NameSpace.SECURITY.toString())) {
+      ns = XAttr.NameSpace.SECURITY;
+    } else if (prefix.equals(XAttr.NameSpace.SYSTEM.toString())) {
+      ns = XAttr.NameSpace.SYSTEM;
+    } else {
+      throw new IllegalArgumentException("XAttr name must be prefixed with user/trusted/security/system which followed by '.'");

s/which/and/

I'm unclear as to whether namespaces are case-sensitive or insensitive
(I believe they are case-insensitive). The 

[jira] [Updated] (HDFS-6218) Audit log should use true client IP for proxied webhdfs operations

2014-04-28 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6218:
-

   Resolution: Fixed
Fix Version/s: 2.5.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. Thanks for the fix, Daryn.

 Audit log should use true client IP for proxied webhdfs operations
 --

 Key: HDFS-6218
 URL: https://issues.apache.org/jira/browse/HDFS-6218
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, webhdfs
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-6218.patch


 When using an HTTP proxy, it's not very useful for the audit log to contain 
 the proxy's IP address.  Similar to proxy superusers, the NN should allow 
 configuration of trusted proxy servers and use the X-Forwarded-For header 
 when logging the client request.
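
 As a rough illustration of the idea (hypothetical helper, not the committed 
 patch):
 {code}
 import java.util.Set;
 import javax.servlet.http.HttpServletRequest;

 /**
  * Hypothetical illustration only: trust X-Forwarded-For solely when the
  * request comes from a configured trusted proxy; otherwise audit-log the
  * connection's own remote address.
  */
 class ClientAddressResolver {
   private final Set<String> trustedProxies;

   ClientAddressResolver(Set<String> trustedProxies) {
     this.trustedProxies = trustedProxies;
   }

   String addressForAuditLog(HttpServletRequest request) {
     String remoteAddr = request.getRemoteAddr();
     String forwardedFor = request.getHeader("X-Forwarded-For");
     if (forwardedFor != null && trustedProxies.contains(remoteAddr)) {
       // X-Forwarded-For can be a comma-separated chain; the first entry is
       // the originating client.
       return forwardedFor.split(",")[0].trim();
     }
     return remoteAddr;
   }
 }
 {code}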



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6269) NameNode Audit Log should differentiate between webHDFS open and HDFS open.

2014-04-28 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6269:
-

Target Version/s: 2.5.0  (was: 2.4.1)

 NameNode Audit Log should differentiate between webHDFS open and HDFS open.
 ---

 Key: HDFS-6269
 URL: https://issues.apache.org/jira/browse/HDFS-6269
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode, webhdfs
Affects Versions: 2.4.0
Reporter: Eric Payne
Assignee: Eric Payne
 Attachments: HDFS-6269-AuditLogWebOpen.txt, 
 HDFS-6269-AuditLogWebOpen.txt, HDFS-6269-AuditLogWebOpen.txt


 To enhance traceability, the NameNode audit log should use a different string 
 for open in the cmd= part of the audit entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6269) NameNode Audit Log should differentiate between webHDFS open and HDFS open.

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983024#comment-13983024
 ] 

Kihwal Lee commented on HDFS-6269:
--

The test failure is due to HDFS-6250.

 NameNode Audit Log should differentiate between webHDFS open and HDFS open.
 ---

 Key: HDFS-6269
 URL: https://issues.apache.org/jira/browse/HDFS-6269
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode, webhdfs
Affects Versions: 2.4.0
Reporter: Eric Payne
Assignee: Eric Payne
 Attachments: HDFS-6269-AuditLogWebOpen.txt, 
 HDFS-6269-AuditLogWebOpen.txt, HDFS-6269-AuditLogWebOpen.txt


 To enhance traceability, the NameNode audit log should use a different string 
 for open in the cmd= part of the audit entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6218) Audit log should use true client IP for proxied webhdfs operations

2014-04-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983030#comment-13983030
 ] 

Hudson commented on HDFS-6218:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5579 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5579/])
HDFS-6218. Audit log should use true client IP for proxied webhdfs operations. 
Contributed by Daryn Sharp. (kihwal: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1590640)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java


 Audit log should use true client IP for proxied webhdfs operations
 --

 Key: HDFS-6218
 URL: https://issues.apache.org/jira/browse/HDFS-6218
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, webhdfs
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-6218.patch


 When using an HTTP proxy, it's not very useful for the audit log to contain 
 the proxy's IP address.  Similar to proxy superusers, the NN should allow 
 configuration of trusted proxy servers and use the X-Forwarded-For header 
 when logging the client request.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6269) NameNode Audit Log should differentiate between webHDFS open and HDFS open.

2014-04-28 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983031#comment-13983031
 ] 

Eric Payne commented on HDFS-6269:
--

The patch for this JIRA did not cause the unit test failure (see 
https://builds.apache.org/job/PreCommit-HDFS-Build/6742/, which also has this 
same failure).

I ran the specific unit test 
(org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup) in my build 
environment and it passes successfully, both with and without the patch from this 
JIRA.

 NameNode Audit Log should differentiate between webHDFS open and HDFS open.
 ---

 Key: HDFS-6269
 URL: https://issues.apache.org/jira/browse/HDFS-6269
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode, webhdfs
Affects Versions: 2.4.0
Reporter: Eric Payne
Assignee: Eric Payne
 Attachments: HDFS-6269-AuditLogWebOpen.txt, 
 HDFS-6269-AuditLogWebOpen.txt, HDFS-6269-AuditLogWebOpen.txt


 To enhance traceability, the NameNode audit log should use a different string 
 for open in the cmd= part of the audit entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-6293:


 Summary: Issues with OIV processing PB-based fsimages
 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker


There are issues with OIV when processing fsimages in protobuf. 

Due to the internal layout changes introduced by the protobuf-based fsimage, 
OIV consumes an excessive amount of memory.  We have tested with an fsimage 
containing about 140M files/directories. The peak heap usage when processing this 
image in the pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After 
converting the image to the protobuf format on 2.4.0, OIV would OOM even with 
80GB of heap (max new size was 1GB).  It should be possible to process any image 
with the default heap size of 1.5GB.

Another issue is the complete change of format/content in OIV's XML output.  I 
also noticed that the secret manager section has no tokens while there were 
unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6176) Remove assignments to method arguments

2014-04-28 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983066#comment-13983066
 ] 

Charles Lamb commented on HDFS-6176:


Hi Suresh,

Thanks. I've been tied up with other things. I'll try to generate a patch in a 
while. There are several hundred places where the code does this.

Charles


 Remove assignments to method arguments
 --

 Key: HDFS-6176
 URL: https://issues.apache.org/jira/browse/HDFS-6176
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Charles Lamb
Priority: Minor

 There are many places in the code where assignments are made to method 
 arguments. Eclipse is quite happy to flag this if the appropriate warning is 
 enabled.
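
 For illustration, a hypothetical example of the pattern and one common way to 
 fix it:
 {code}
 // Anti-pattern: reassigning a method parameter (flagged by the Eclipse warning).
 static int clampToZero(int value) {
   if (value < 0) {
     value = 0;
   }
   return value;
 }

 // Preferred: leave the parameter alone and work on a local variable.
 static int clampToZeroFixed(final int value) {
   int result = value;
   if (result < 0) {
     result = 0;
   }
   return result;
 }
 {code}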



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983095#comment-13983095
 ] 

Kihwal Lee commented on HDFS-6293:
--

Outputting in the new XML format is fast and consumes little memory because it 
is essentially dumping what is in the image, in order. However, it does not 
provide readily usable directory/file information the way the pre-2.4 
(pre-protobuf) versions did.

Using something like the ls -l format, or any custom visitor that dumps the 
file system tree, will require loading all inodes upfront and linking them 
afterwards.  This requires a considerably larger amount of memory; the smallest 
footprint would be similar to the NN's without triplets, which is clearly 
unacceptable.  Reducing memory consumption at the price of considerably longer 
processing time is also unacceptable.

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker

 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes excessive amount of memory.  We have tested with a fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6293:
-

Attachment: Heap Histogram.html

Attaching heap histogram of OIV. The max heap was set to 2GB to make it go out 
of heap early and dump the heap.  I only loaded up about 3M files/dirs before 
crashing. If we optimize the PB inefficiencies, we might be able to make it 
work with 50% of the heap. But that will still be too much.

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes excessive amount of memory.  We have tested with a fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983105#comment-13983105
 ] 

Kihwal Lee edited comment on HDFS-6293 at 4/28/14 3:33 PM:
---

Attaching heap histogram of OIV. The max heap was set to 2GB to make it go out 
of heap early and dump the heap.  It only loaded  about 3M files/dirs before 
crashing. If we optimize the PB inefficiencies, we might be able to make it 
work with 50% of the heap. But that will still be too much.


was (Author: kihwal):
Attaching heap histogram of OIV. The max heap was set to 2GB to make it go out 
of heap early and dump the heap.  I only loaded up about 3M files/dirs before 
crashing. If we optimize the PB inefficiencies, we might be able to make it 
work with 50% of the heap. But that will still be too much.

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes excessive amount of memory.  We have tested with a fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983131#comment-13983131
 ] 

Kihwal Lee commented on HDFS-6293:
--

The 2.4.0 pb-fsimage does contain tokens, but OIV does not show any tokens.

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes excessive amount of memory.  We have tested with a fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HDFS-4793) uploading file larger than the spaceQuota limit should not create 0 byte file

2014-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze resolved HDFS-4793.
---

Resolution: Duplicate

Duplicate of HDFS-172.

 uploading file larger than the spaceQuota limit should not create 0 byte file
 -

 Key: HDFS-4793
 URL: https://issues.apache.org/jira/browse/HDFS-4793
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0
Reporter: Yesha Vora

 Set the spaceQuota size for Dir A = 64MB. 
 Try to upload a large file of 1GB into Dir A. 
 The copyFromLocal operation fails but creates a 0-byte file on HDFS. 
 [User1@Machine1]$ hadoop fs -ls /A
 Found 1 items
 -rwx--   1 User1 User1  0 2013-05-02 22:29 /A/1GB
 Expected Behavior:
 It should not create any 0-byte file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6287) Add vecsum test of libhdfs read access times

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983196#comment-13983196
 ] 

Colin Patrick McCabe commented on HDFS-6287:


OK, time to implement auto-detection of SSE, I guess...

 Add vecsum test of libhdfs read access times
 

 Key: HDFS-6287
 URL: https://issues.apache.org/jira/browse/HDFS-6287
 Project: Hadoop HDFS
  Issue Type: Test
  Components: libhdfs, test
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6282.001.patch, HDFS-6287.002.patch


 Add vecsum, a benchmark that tests libhdfs access times.  This includes 
 short-circuit, zero-copy, and standard libhdfs access modes.  It also has a 
 local filesystem mode for comparison.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6286) adding a timeout setting for local read io

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983194#comment-13983194
 ] 

Colin Patrick McCabe commented on HDFS-6286:


It seems like enabling hedged reads (which has been merged as HDFS-5776) is a 
better solution to the problem of high-latency local reads.

bq. Per my knowledge, there's no good mechanism to cancel a running read 
io(Please correct me if it's wrong),

You are correct that there is no mechanism for userspace to cancel a 
synchronous I/O operation in the kernel.

bq. my opinion is adding a future around the read request, and we could set a 
timeout there, if the threshold reached, we can add the local node into 
deadnode probably... Any thought?

We can't afford to construct a future on each read.  Reads are often quite 
small and that would generate too much garbage.  We could potentially calculate 
the time each read took, by calling {{System.nanoTime}} or similar.  (On most 
Linux variants, this is a low-cost call which doesn't need to transition to 
kernel space.)
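
A hypothetical sketch of that low-overhead measurement (the variable names are 
assumed from the call chain in the description, not proposed patch code):

{code}
// Hypothetical sketch: time a local read with System.nanoTime() instead of
// wrapping it in a Future, then record a metric when it was unusually slow.
long start = System.nanoTime();
int nread = dataIn.read(buf);                       // the actual local read
long elapsedMicros = (System.nanoTime() - start) / 1000;
if (elapsedMicros > slowLocalReadThresholdMicros) {
  // Counting/logging slow reads is cheap; actually failing the read or
  // marking the node dead based on this is a much riskier policy decision.
  slowLocalReadCounter.incrementAndGet();
}
{code}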

But setting a timeout is going to be very problematic.  For one thing, if the 
client gets a GC, all of its local reads might then shut down due to the 
timeout, which would just make performance worse.  I've seen perfectly good 
disks become slow when under heavy load, but only occasionally.

I think it's better just to use hedged reads when latency is a concern (such as 
in HBase.)  This gets you all the same benefits, and doesn't require any code 
changes.  It also benefits you when you are doing non-local reads, which this 
change would not.

 adding a timeout setting for local read io
 --

 Key: HDFS-6286
 URL: https://issues.apache.org/jira/browse/HDFS-6286
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0, 2.4.0
Reporter: Liang Xie
Assignee: Liang Xie

 Currently, if a write or remote read hits a sick disk, DFSClient.hdfsTimeout 
 gives the caller a guaranteed bound on how long the call takes to return, but 
 it doesn't work for local reads. Take an HBase scan for example:
 DFSInputStream.read -> readWithStrategy -> readBuffer -> 
 BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read
 If this hits a bad disk, the low-level read I/O can take tens of seconds, and 
 what's worse, DFSInputStream.read holds a lock the whole time.
 Per my knowledge, there's no good mechanism to cancel a running read 
 io(Please correct me if it's wrong), so my opinion is adding a future around 
 the read request, and we could set a timeout there, if the threshold reached, 
 we can add the local node into deadnode probably...
 Any thought?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983198#comment-13983198
 ] 

Colin Patrick McCabe commented on HDFS-6288:


+1.  Thanks, Juan.

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Attachments: HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983206#comment-13983206
 ] 

Colin Patrick McCabe commented on HDFS-6288:


Oops, there is a test failure that needs to be addressed.  The test failure in 
{{TestPread}} is because of this new addition:

{code}
  private void doPread(FSDataInputStream stm, long position, byte[] buffer,
   int offset, int length) throws IOException {
int nread = 0;
if (!(stm.getWrappedStream() instanceof DFSInputStream)) {
  throw new IOException(not DFSInputStream);
}
...
{code}

We need to support non-{{DFSInputStream}} objects here so that we can test 
{{LocalFS}} and so forth.
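
One possible shape for that (a sketch only, not necessarily what the follow-up 
patch does): only collect the HDFS read statistics when the wrapped stream 
really is a {{DFSInputStream}}, and let other FileSystems through.

{code}
private void doPread(FSDataInputStream stm, long position, byte[] buffer,
    int offset, int length) throws IOException {
  // Sketch: tolerate non-DFSInputStream streams (e.g. LocalFS) instead of
  // throwing, and only check ReadStatistics when they are available.
  DFSInputStream dfsIn = (stm.getWrappedStream() instanceof DFSInputStream)
      ? (DFSInputStream) stm.getWrappedStream() : null;
  long before = (dfsIn == null) ? 0 : dfsIn.getReadStatistics().getTotalBytesRead();

  int nread = 0;
  while (nread < length) {
    int nbytes = stm.read(position + nread, buffer, offset + nread, length - nread);
    assertTrue("Error in pread", nbytes > 0);
    nread += nbytes;
  }

  if (dfsIn != null) {
    long delta = dfsIn.getReadStatistics().getTotalBytesRead() - before;
    assertTrue("Pread should update ReadStatistics", delta >= length);
  }
}
{code}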

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-6288:
--

Attachment: HDFS-6288.002.patch

Fix unit test testPreadLocalFS

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983215#comment-13983215
 ] 

Andrew Wang commented on HDFS-6288:
---

I think Juan's new patch handles this case, so I'm +1 pending Jenkins.

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6287) Add vecsum test of libhdfs read access times

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6287:
---

Attachment: HDFS-6287.003.patch

Here's a version that tries to compile with SSE intrinsics, and falls back on a 
simple cross-platform loop if that fails.

 Add vecsum test of libhdfs read access times
 

 Key: HDFS-6287
 URL: https://issues.apache.org/jira/browse/HDFS-6287
 Project: Hadoop HDFS
  Issue Type: Test
  Components: libhdfs, test
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6282.001.patch, HDFS-6287.002.patch, 
 HDFS-6287.003.patch


 Add vecsum, a benchmark that tests libhdfs access times.  This includes 
 short-circuit, zero-copy, and standard libhdfs access modes.  It also has a 
 local filesystem mode for comparison.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-1309) FileSystem.rename will fail silently

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983259#comment-13983259
 ] 

Colin Patrick McCabe commented on HDFS-1309:


Is this still an issue in branch 2?  It looks like 
{{DistributedFileSystem#rename}} throws {{IOException}} if something goes 
wrong, and has for a while.  Unless I'm missing something, rename is not a 
filesystem operation that fails silently.
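
For callers on the older boolean-returning FileSystem#rename, the usual 
defensive pattern is simply to check the return value (illustrative snippet):

{code}
// Illustrative only: the generic FileSystem#rename returns false on many
// failures instead of throwing, so check the result explicitly.
if (!fs.rename(srcPath, dstPath)) {
  throw new IOException("Failed to rename " + srcPath + " to " + dstPath);
}
{code}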

 FileSystem.rename will fail silently
 

 Key: HDFS-1309
 URL: https://issues.apache.org/jira/browse/HDFS-1309
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 0.20.2
 Environment: Linux version 2.6.31-302-ec2 (buildd@yellow) (gcc 
 version 4.4.1 (Ubuntu 4.4.1-4ubuntu7) ) #7-Ubuntu SMP Tue Oct 13 19:55:22 UTC 
 2009
Reporter: Kris Nuttycombe
Priority: Minor

 Some filesystem operations (such as rename) will fail silently. In the 
 attached example, a failure message will be written to the hadoop log, but it 
 would be much better if the operation were to fail fast by throwing a 
 checked exception and forcing the caller to handle the problem; failing to do 
 so can easily lead to inadvertent data corruption.
   val coalesceBasePath = new Path(eventLog.basePath, coalesceTo)
   val backupBasePath = new Path(eventLog.basePath, relocateTo)
   eventLog.fs.mkdirs(backupBasePath)
   for (path <- coalesced; time <- HDFSEventLog.timePart(path, eventType)) {
     val backupPath = HDFSEventLog.path(backupBasePath, eventType, time)
     log.info("Relocating " + path + " to " + backupPath)
     eventLog.fs.rename(path, backupPath)
   }
 INF [20100715-16:11:20.727] reporting: Relocating 
 hdfs://localhost:9000/test-batchEventLog/metrics/metrics_1279226067707 to 
 hdfs://localhost:9000/test-batchEventLog/pre-coalesce/metrics/metrics_1279226067707
 INF [20100715-16:11:20.752] reporting: Relocating 
 hdfs://localhost:9000/test-batchEventLog/metrics/metrics_1279226077707 to 
 hdfs://localhost:9000/test-batchEventLog/pre-coalesce/metrics/metrics_1279226077707
 INF [20100715-16:11:20.754] reporting: Relocating 
 hdfs://localhost:9000/test-batchEventLog/metrics/metrics_1279226457707 to 
 hdfs://localhost:9000/test-batchEventLog/pre-coalesce/metrics/metrics_1279226457707
 INF [20100715-16:11:20.757] reporting: Relocating 
 hdfs://localhost:9000/test-batchEventLog/metrics/metrics_1279229126727 to 
 hdfs://localhost:9000/test-batchEventLog/pre-coalesce/metrics/metrics_1279229126727
 Complete.
 [knuttycombe@floorshow reporting (reporting-coalesce)]$ hadoop fs -ls 
 hdfs://localhost:9000/test-batchEventLog/
 Found 3 items
 drwxr-xr-x   - knuttycombe supergroup  0 2010-07-15 14:54 
 /test-batchEventLog/coalesced
 drwxr-xr-x   - knuttycombe supergroup  0 2010-07-15 14:35 
 /test-batchEventLog/metrics
 drwxr-xr-x   - knuttycombe supergroup  0 2010-07-15 16:11 
 /test-batchEventLog/pre-coalesce
 [knuttycombe@floorshow reporting (reporting-coalesce)]$ hadoop fs -ls 
 hdfs://localhost:9000/test-batchEventLog/metrics
 Found 4 items
 -rw-r--r--   3 knuttycombe supergroup2017122 2010-07-15 14:34 
 /test-batchEventLog/metrics/metrics_1279226067707
 -rw-r--r--   3 knuttycombe supergroup4122951 2010-07-15 14:34 
 /test-batchEventLog/metrics/metrics_1279226077707
 -rw-r--r--   3 knuttycombe supergroup512 2010-07-15 14:35 
 /test-batchEventLog/metrics/metrics_1279226457707
 -rw-r--r--   3 knuttycombe supergroup8638301 2010-07-15 14:26 
 /test-batchEventLog/metrics/metrics_1279229126727
 [knuttycombe@floorshow reporting (reporting-coalesce)]$ hadoop fs -ls 
 hdfs://localhost:9000/test-batchEventLog/pre-coalesce
 [knuttycombe@floorshow reporting (reporting-coalesce)]$



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6269) NameNode Audit Log should differentiate between webHDFS open and HDFS open.

2014-04-28 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983265#comment-13983265
 ] 

Daryn Sharp commented on HDFS-6269:
---

+1 Looks good to me.

 NameNode Audit Log should differentiate between webHDFS open and HDFS open.
 ---

 Key: HDFS-6269
 URL: https://issues.apache.org/jira/browse/HDFS-6269
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode, webhdfs
Affects Versions: 2.4.0
Reporter: Eric Payne
Assignee: Eric Payne
 Attachments: HDFS-6269-AuditLogWebOpen.txt, 
 HDFS-6269-AuditLogWebOpen.txt, HDFS-6269-AuditLogWebOpen.txt


 To enhance traceability, the NameNode audit log should use a different string 
 for open in the cmd= part of the audit entry.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6294:
--

 Summary: Use INode IDs to avoid conflicts when a file open for 
write is renamed
 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


Now that we have a unique INode ID for each INode, clients with files that are 
open for write can use this unique ID rather than a file path when they are 
requesting more blocks or closing the open file.  This will avoid conflicts 
when a file which is open for write is renamed, and another file with that name 
is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6294:
---

Attachment: HDFS-6294.001.patch

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983282#comment-13983282
 ] 

Colin Patrick McCabe commented on HDFS-6294:


One reason why this comes up is because of the NFS gateway.  Since the gateway 
keeps files open for about 10 minutes after the last packet arrives from the 
client (by default, at least), there are a lot of times when someone copies a 
file to NFS via the gateway and then moves it.

My approach was just to use inode IDs for all the operations done by a file 
open for write: {{complete}}, {{addBlock}}, {{fsync}},  {{abandonBlock}}, and 
{{getAdditionalDataNodes}}.  In the cases where an inode ID was not being 
passed over the wire, I added one to the protobuf.  This is backwards 
compatible because the new protobuf fields are optional.  If the inode ID is 
not present, we fall back on the old behavior of using the full path instead.
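
A rough sketch of that fallback on the NameNode side (the method and field names 
here are invented for illustration; this is not the attached patch):

{code}
// Hypothetical sketch: prefer the inode ID when a new client supplied one via
// the optional protobuf field; otherwise resolve by path so old clients that
// never send an ID keep working.
INodeFile resolveFileForWrite(String src, long fileId) throws IOException {
  if (fileId != UNSPECIFIED_INODE_ID) {        // new client sent an inode ID
    INodeFile file = inodeIdMap.get(fileId);   // invented lookup helper
    if (file == null) {
      throw new FileNotFoundException("No file with inode id " + fileId);
    }
    return file;
  }
  return resolveByPath(src);                   // old path-based behavior
}
{code}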

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983286#comment-13983286
 ] 

Steve Loughran commented on HDFS-6294:
--

There's a test for this in HADOOP-9361 which attempts to [rename a file being 
appended 
to|https://github.com/steveloughran/hadoop-trunk/blob/stevel/HADOOP-9361-filesystem-contract/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractAppendContractTest.java#L113].
 Presumably that test will pass once this patch has gone through?

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6287) Add vecsum test of libhdfs read access times

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983287#comment-13983287
 ] 

Hadoop QA commented on HDFS-6287:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642283/HDFS-6287.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6756//console

This message is automatically generated.

 Add vecsum test of libhdfs read access times
 

 Key: HDFS-6287
 URL: https://issues.apache.org/jira/browse/HDFS-6287
 Project: Hadoop HDFS
  Issue Type: Test
  Components: libhdfs, test
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6282.001.patch, HDFS-6287.002.patch, 
 HDFS-6287.003.patch


 Add vecsum, a benchmark that tests libhdfs access times.  This includes 
 short-circuit, zero-copy, and standard libhdfs access modes.  It also has a 
 local filesystem mode for comparison.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6294:
---

Attachment: (was: HDFS-6294.001.patch)

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983288#comment-13983288
 ] 

Marcelo Vanzin commented on HDFS-6293:
--

Hi Kihwal,

We have developed some code internally that mitigates (but does not eliminate) 
some of these problems. For an image with 140M entries it would need in the 
ballpark of 7-8GB of heap space, from my pencil-and-napkin calculations. Also, 
it does not generate entries in order like LsrPBImage does, and it's tailored 
for the use case of listing the contents of the file system (so it completely 
ignores things like snapshots).

(The reason it still requires a lot of memory is, as you note, that it needs to 
load information about all inodes in memory; our code is just a little smarter 
about what information it loads. I don't think it's possible to make it much 
better without changing the data in the fsimage itself.)

If people are ok with those limitations, we could clean up our code and post it 
as a patch.

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes excessive amount of memory.  We have tested with a fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6294:
---

Attachment: HDFS-6294.001.patch

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6294:
---

Status: Patch Available  (was: Open)

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6287) Add vecsum test of libhdfs read access times

2014-04-28 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6287:
---

Attachment: HDFS-6287.004.patch

Looks like on older glibc versions, like the one our Jenkins machines are using, 
you need to link with librt to use {{clock_gettime}}.  Added.  I also fixed a 
warning message in {{test_libhdfs_threaded}}.

 Add vecsum test of libhdfs read access times
 

 Key: HDFS-6287
 URL: https://issues.apache.org/jira/browse/HDFS-6287
 Project: Hadoop HDFS
  Issue Type: Test
  Components: libhdfs, test
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6282.001.patch, HDFS-6287.002.patch, 
 HDFS-6287.003.patch, HDFS-6287.004.patch


 Add vecsum, a benchmark that tests libhdfs access times.  This includes 
 short-circuit, zero-copy, and standard libhdfs access modes.  It also has a 
 local filesystem mode for comparison.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6252) Namenode old webUI should be deprecated

2014-04-28 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983314#comment-13983314
 ] 

Jing Zhao commented on HDFS-6252:
-

The current patch looks good to me. The failed tests should be unrelated. +1

Maybe we should wait a couple of days to let others who are still using the 
old WebUI review the patch as well. We may also need to open new jiras to add 
the features that exist only in the old UI to the new UI.

 Namenode old webUI should be deprecated
 ---

 Key: HDFS-6252
 URL: https://issues.apache.org/jira/browse/HDFS-6252
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.0
Reporter: Fengdong Yu
Assignee: Haohui Mai
Priority: Minor
 Attachments: HDFS-6252.000.patch, HDFS-6252.001.patch, 
 HDFS-6252.002.patch, HDFS-6252.003.patch, HDFS-6252.004.patch, 
 HDFS-6252.005.patch, HDFS-6252.006.patch


 We've deprecated hftp and hsftp in HDFS-5570, so if we try to download a file 
 via the "download this file" link on browseDirectory.jsp, it will throw an 
 error:
 Problem accessing /streamFile/***
 because the streamFile servlet was deleted in HDFS-5570.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6252) Namenode old webUI should be deprecated

2014-04-28 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6252:


Target Version/s: 3.0.0

 Namenode old webUI should be deprecated
 ---

 Key: HDFS-6252
 URL: https://issues.apache.org/jira/browse/HDFS-6252
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.0
Reporter: Fengdong Yu
Assignee: Haohui Mai
Priority: Minor
 Attachments: HDFS-6252.000.patch, HDFS-6252.001.patch, 
 HDFS-6252.002.patch, HDFS-6252.003.patch, HDFS-6252.004.patch, 
 HDFS-6252.005.patch, HDFS-6252.006.patch


 We've deprecated hftp and hsftp in HDFS-5570, so if we try to download a file 
 via the "download this file" link on browseDirectory.jsp, it will throw an 
 error:
 Problem accessing /streamFile/***
 because the streamFile servlet was deleted in HDFS-5570.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983334#comment-13983334
 ] 

Colin Patrick McCabe commented on HDFS-6294:


bq. There's a test for this in HADOOP-9361 which attempts to rename a file 
being appended to. Presumably that test will pass once this patch has gone 
through?

Yeah, I believe that will pass after this patch.

There is also a test included as part of this patch which is HDFS-specific, 
called {{testLeaseAfterRenameAndRecreate}}, which tests a similar thing.  The 
HDFS test also looks at some HDFS-specific stuff like leases.
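
For readers following along, a minimal sketch of the scenario such a test 
exercises, using only the public FileSystem API (paths here are illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RenameWhileOpenSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path original = new Path("/tmp/openfile");
    Path renamed = new Path("/tmp/openfile-renamed");

    FSDataOutputStream out = fs.create(original);   // file is now open for write
    out.write("first half".getBytes("UTF-8"));

    fs.rename(original, renamed);                   // rename while still open
    fs.create(original).close();                    // re-create a file at the old name

    // With path-based tracking the writer could collide with the new file;
    // with inode-id-based tracking it keeps following the renamed file.
    out.write("second half".getBytes("UTF-8"));
    out.close();
  }
}
{code}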

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983361#comment-13983361
 ] 

Haohui Mai commented on HDFS-6293:
--

bq. Another issue is the complete change of format/content in OIV's XML output.

The XML format in both the legacy and the PB-based code is intended to match the 
physical layout of the FSImage for fast processing. The layout of the FSImage 
is totally private, which means that there are very few compatibility 
guarantees that you can rely on. We should have clarified this early on.

bq.  It does not provide readily usable directory/file information as it used 
to in pre-2.4/protobuf versions.

This is by design. A format based on records instead of a hierarchical structure 
is more robust (especially with snapshots), and it allows parallel processing. 
The rationale is articulated in the document attached to HDFS-5698.

With an FSImage as big as yours, I suggest parsing the protobuf records directly 
and importing them into Hive/Pig for more efficient queries. This has been 
discussed in HDFS-5952.
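
As a rough sketch of that direction (assuming an InputStream already positioned 
at the start of an uncompressed INODE section, whose offset and length come from 
the FileSummary at the end of the image; class names are from the generated 
FsImageProto protobuf code):

{code}
import java.io.InputStream;
import org.apache.hadoop.hdfs.server.namenode.FsImageProto.INodeSection;

// Hypothetical helper: emits one tab-separated row per inode for Hive/Pig.
public class InodeRecordDumper {
  public static void dump(InputStream in) throws Exception {
    // The section starts with an INodeSection header, followed by one
    // length-delimited INode record per inode.
    INodeSection header = INodeSection.parseDelimitedFrom(in);
    for (long i = 0; i < header.getNumInodes(); i++) {
      INodeSection.INode inode = INodeSection.INode.parseDelimitedFrom(in);
      System.out.println(inode.getId() + "\t"
          + inode.getName().toStringUtf8() + "\t"
          + inode.getType());
    }
  }
}
{code}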

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes an excessive amount of memory.  We have tested with an fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983362#comment-13983362
 ] 

Kihwal Lee commented on HDFS-6293:
--

[~va...@rededc.com.br]: Thanks for sharing your experience. That's certainly an 
improvement, but that's still too big and 140M is not the largest name space we 
have to deal with.  

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes an excessive amount of memory.  We have tested with an fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983412#comment-13983412
 ] 

Kihwal Lee commented on HDFS-6293:
--

bq. This is by design.
I understand that it has merits over the old way. But you cannot simply ignore 
existing use cases.  

 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes an excessive amount of memory.  We have tested with an fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6258) Support XAttrs from NameNode and implements XAttr APIs for DistributedFileSystem

2014-04-28 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983413#comment-13983413
 ] 

Charles Lamb commented on HDFS-6258:


Yi,

Probably the easiest thing to do is for you to commit your patch and then I'll 
generate a patch with my comments.

Charles


 Support XAttrs from NameNode and implements XAttr APIs for 
 DistributedFileSystem
 

 Key: HDFS-6258
 URL: https://issues.apache.org/jira/browse/HDFS-6258
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS XAttrs (HDFS-2006)
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: HDFS-6258.1.patch, HDFS-6258.2.patch, HDFS-6258.3.patch, 
 HDFS-6258.patch


 This JIRA is to implement extended attributes in HDFS: support XAttrs from 
 NameNode, implements XAttr APIs for DistributedFileSystem and so on.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983459#comment-13983459
 ] 

Hadoop QA commented on HDFS-6288:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642270/HDFS-6288.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6755//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6755//console

This message is automatically generated.

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6288:
--

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks for the contribution Juan!

 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983508#comment-13983508
 ] 

Hudson commented on HDFS-6288:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5581 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5581/])
HDFS-6288. DFSInputStream Pread doesn't update ReadStatistics. Contributed by 
Juan Yu. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1590776)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java


 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6288) DFSInputStream Pread doesn't update ReadStatistics

2014-04-28 Thread Juan Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983511#comment-13983511
 ] 

Juan Yu commented on HDFS-6288:
---

cool, thanks.





 DFSInputStream Pread doesn't update ReadStatistics
 --

 Key: HDFS-6288
 URL: https://issues.apache.org/jira/browse/HDFS-6288
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
 Fix For: 2.5.0

 Attachments: HDFS-6288.002.patch, HDFS-6288.1.patch


 DFSInputStream Pread doesn't update ReadStatistics.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6287) Add vecsum test of libhdfs read access times

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983547#comment-13983547
 ] 

Hadoop QA commented on HDFS-6287:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642293/HDFS-6287.004.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6757//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6757//console

This message is automatically generated.

 Add vecsum test of libhdfs read access times
 

 Key: HDFS-6287
 URL: https://issues.apache.org/jira/browse/HDFS-6287
 Project: Hadoop HDFS
  Issue Type: Test
  Components: libhdfs, test
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6282.001.patch, HDFS-6287.002.patch, 
 HDFS-6287.003.patch, HDFS-6287.004.patch


 Add vecsum, a benchmark that tests libhdfs access times.  This includes 
 short-circuit, zero-copy, and standard libhdfs access modes.  It also has a 
 local filesystem mode for comparison.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-5851) Support memory as a storage medium

2014-04-28 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981608#comment-13981608
 ] 

Sanjay Radia edited comment on HDFS-5851 at 4/28/14 10:23 PM:
--

Added a comparison to Tachyon in the doc. There is also an implementation 
difference that I don't cover (Tachyon, I believe, uses RamFs rather than 
memory that is mapped to an HDFS file -- but I need to verify that).

I have reproduced the text from the updated doc here for convenience:
Recently, Spark has added an RDD implementation called Tachyon [4]. Tachyon is 
outside the address space of an application and allows sharing RDDs across 
applications. Both Tachyon and DDMs use memory mapped files and lazy writing to 
reduce the need to recompute. Tachyon, since it is an RDD implementation, 
records the computation in order to regenerate the data in case of loss whereas 
DDMs rely on the application to regenerate. Tachyon and RDDs do not have a 
notion of discardability, which is fundamental to DDMs where data can be 
discarded when it is under memory and/or backing store pressure. DDMs are 
closest to virtual memory/anti-caching in that they virtualize memory, with the 
twist that data can be discarded.
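
As a point of reference only, a minimal Java sketch of memory-mapping a local 
backing file (illustrative; this is neither the DDM nor the Tachyon code, and 
the path is made up):

{code}
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapSketch {
  public static void main(String[] args) throws Exception {
    try (FileChannel ch = FileChannel.open(Paths.get("/tmp/ddm-backing-file"),
        StandardOpenOption.CREATE, StandardOpenOption.READ,
        StandardOpenOption.WRITE)) {
      // Map 4 KB of the file; writes land in the page cache first.
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
      buf.put("hello".getBytes("UTF-8"));
      buf.force();   // dirty pages are written back lazily unless forced
    }
  }
}
{code}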



was (Author: sanjay.radia):
Added a comparison to Tachyon in the doc. There is also an implementation 
difference that I don't cover (Tachyon, I believe, uses RamFs rather than 
memory that is mapped to an HDFS file -- but I need to verify that).

I have reproduced the text from the updated doc here for convenience:
Recently, Spark has added an RDD implementation called Tachyon [4]. Tachyon is 
outside the address space of an application and allows sharing RDDs across 
applications. Both Tachyon and DDMs use memory mapped files and lazy writing to 
reduce the need to recompute. Tachyon, since it is an RDD implementation, 
records the computation in order to regenerate the data in case of loss whereas 
DDMs rely on the application to regenerate. Tachyon and RDDs do not have a 
notion of discardability, which is fundamental to DDMs where data can be 
discarded when it is under memory and/or backing store pressure.


 Support memory as a storage medium
 --

 Key: HDFS-5851
 URL: https://issues.apache.org/jira/browse/HDFS-5851
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: 
 SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, 
 SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf


 Memory can be used as a storage medium for smaller/transient files for fast 
 write throughput.
 More information/design will be added later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983679#comment-13983679
 ] 

Aaron T. Myers commented on HDFS-6289:
--

I feel confident that the TestBalancerWithNodeGroup failure is spurious. It 
passes fine on my box, isn't really related to this code, and has been flaky 
off and on for a long time.

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6294) Use INode IDs to avoid conflicts when a file open for write is renamed

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983692#comment-13983692
 ] 

Hadoop QA commented on HDFS-6294:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642289/HDFS-6294.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.TestSymlinkHdfsFileContext
  org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication
  
org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
  org.apache.hadoop.hdfs.server.namenode.TestMetaSave
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion
  
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
  org.apache.hadoop.hdfs.TestFileCreation
  org.apache.hadoop.hdfs.server.namenode.TestFSDirectory
  org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
  org.apache.hadoop.hdfs.TestQuota
  org.apache.hadoop.hdfs.TestFileAppend3
  org.apache.hadoop.fs.TestHDFSFileContextMainOperations
  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
  org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode
  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  org.apache.hadoop.fs.TestSymlinkHdfsFileSystem
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
  org.apache.hadoop.hdfs.server.namenode.TestINodeFile
  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
  org.apache.hadoop.hdfs.server.namenode.TestFileLimit
  org.apache.hadoop.cli.TestHDFSCLI
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
  
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
org.apache.hadoop.hdfs.TestSetrepDecreasing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6758//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6758//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6758//console

This message is automatically generated.

 Use INode IDs to avoid conflicts when a file open for write is renamed
 --

 Key: HDFS-6294
 URL: https://issues.apache.org/jira/browse/HDFS-6294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.20.1
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6294.001.patch


 Now that we have a unique INode ID for each INode, clients with files that 
 are open for write can use this unique ID rather than a file path when they 
 are requesting more blocks or closing the open file.  This will avoid 
 conflicts when a file which is open for write is renamed, and another file 
 with that name is created.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6293) Issues with OIV processing PB-based fsimages

2014-04-28 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983734#comment-13983734
 ] 

Suresh Srinivas commented on HDFS-6293:
---

OfflineImageViewer just dumps the fsimage in a readable format. In the past, 
given the hierarchical nature of the fsimage, the information printed was 
directly consumable. Now it is no longer so.

One solution would be to add an option that prints directory tree information 
(along the lines of ls -R) against the fsimage. Given that the information 
printed would no longer depend on the fsimage structure itself, the output can 
be backward compatible (with the caveat that tools have to deal with extra 
information for newly added features such as ACLs). Once this is in place, we 
can set backward compatibility expectations on it. What do you guys think? 
We could also consider either building a tool that works efficiently in memory 
or reorganizing the fsimage to make that possible (I hope we do not have to 
change the fsimage, due to incompatibility issues).

[~kihwal], can you please provide the use cases you are using OIV for?


 Issues with OIV processing PB-based fsimages
 

 Key: HDFS-6293
 URL: https://issues.apache.org/jira/browse/HDFS-6293
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: Heap Histogram.html


 There are issues with OIV when processing fsimages in protobuf. 
 Due to the internal layout changes introduced by the protobuf-based fsimage, 
 OIV consumes an excessive amount of memory.  We have tested with an fsimage with 
 about 140M files/directories. The peak heap usage when processing this image 
 in pre-protobuf (i.e. pre-2.4.0) format was about 350MB.  After converting 
 the image to the protobuf format on 2.4.0, OIV would OOM even with 80GB of 
 heap (max new size was 1GB).  It should be possible to process any image with 
 the default heap size of 1.5GB.
 Another issue is the complete change of format/content in OIV's XML output.  
 I also noticed that the secret manager section has no tokens while there were 
 unexpired tokens in the original image (pre-2.4.0).  I did not check whether 
 they were also missing in the new pb fsimage.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983802#comment-13983802
 ] 

Todd Lipcon commented on HDFS-6289:
---

{code}
+// TODO(atm): This should be s/storedBlock/block, since we should be
+// postponing the info of the reported block, not the stored block,
+// though that actually exacerbates the bug, doesn't fix it.
{code}

Out of context, this comment won't make much sense -- what's the bug it's 
referring to? Maybe you should file a separate follow-up JIRA here for this 
second issue, since you aren't fixing it here?

Otherwise lgtm.

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983805#comment-13983805
 ] 

Aaron T. Myers commented on HDFS-6289:
--

Thanks for the review, Todd.

bq. Maybe you should file a separate follow-up JIRA here for this second issue, 
since you aren't fixing it here?

I could also just fix it here. It seems pretty transparently obvious that we 
should make that change. Do you agree? If so, I'll just post a patch fixing 
that as well.

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983825#comment-13983825
 ] 

Todd Lipcon commented on HDFS-6289:
---

Is there any test you could write to show that bug? I agree with your logic, 
but I'm surprised that there isn't some bug that it causes. Given that the 
current test isn't a regression test for that bug, maybe we should tackle it 
separately?

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5851) Support memory as a storage medium

2014-04-28 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983835#comment-13983835
 ] 

Arpit Agarwal commented on HDFS-5851:
-

I scheduled a Google+ hangout for 4/30 3-4pm PDT - [link 
here|https://plus.google.com/events/ckvo7ui46qihd6cfq0sqptrhogo?authkey=CMvgrcTOv9n12wE].

Let me know if you are unable to access it.

 Support memory as a storage medium
 --

 Key: HDFS-5851
 URL: https://issues.apache.org/jira/browse/HDFS-5851
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 3.0.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: 
 SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, 
 SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf


 Memory can be used as a storage medium for smaller/transient files for fast 
 write throughput.
 More information/design will be added later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6289:
-

Attachment: HDFS-6289.patch

You're right, we should probably take care of this separately. I only 
incidentally discovered it while looking into this issue, but it really is a 
separate bug. I'll file another JIRA once this one's committed.

The latest patch makes the TODO clearer when read out of context, and addresses 
Yongjun's feedback.

Todd, this look OK to you?

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch, HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6295) Add decommissioning state and node state filtering to dfsadmin

2014-04-28 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-6295:
-

 Summary: Add decommissioning state and node state filtering to 
dfsadmin
 Key: HDFS-6295
 URL: https://issues.apache.org/jira/browse/HDFS-6295
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang


One of the few admin-friendly ways of viewing the list of decommissioning nodes 
is via hdfs dfsadmin -report. However, this lists *all* the datanodes on the 
cluster, which is prohibitive for large clusters, and also requires manual 
parsing to look at the decom status. It'd be nicer if we could fetch and 
display only decommissioning nodes (or just live and dead nodes for that 
matter).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6295) Add decommissioning state and node state filtering to dfsadmin

2014-04-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6295:
--

Attachment: hdfs-6295-1.patch

Patch attached. This adds a new DN state, "decommissioning", and lets users 
query just certain states through dfsadmin.

This meant changing the output format of -report slightly. I also took the 
opportunity to fix up the whitespace in the report function (it had an extra 
indent), and improved the whitespace in dfsadmin's usage/help text for -report.
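
A minimal client-side sketch of the same kind of filtering, using the existing 
DistributedFileSystem and DatanodeInfo APIs (the exact dfsadmin flag names are 
up to the patch, so none are shown here):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDecommissioning {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS in the configuration points at an HDFS cluster.
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Walk all known datanodes and keep only the ones mid-decommission.
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {
      if (dn.isDecommissionInProgress()) {
        System.out.println(dn.getHostName() + " " + dn.getXferAddr());
      }
    }
  }
}
{code}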

 Add decommissioning state and node state filtering to dfsadmin
 

 Key: HDFS-6295
 URL: https://issues.apache.org/jira/browse/HDFS-6295
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6295-1.patch


 One of the few admin-friendly ways of viewing the list of decommissioning 
 nodes is via hdfs dfsadmin -report. However, this lists *all* the datanodes 
 on the cluster, which is prohibitive for large clusters, and also requires 
 manual parsing to look at the decom status. It'd be nicer if we could fetch 
 and display only decommissioning nodes (or just live and dead nodes for that 
 matter).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6295) Add decommissioning state and node state filtering to dfsadmin

2014-04-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6295:
--

Status: Patch Available  (was: Open)

 Add decommissioning state and node state filtering to dfsadmin
 

 Key: HDFS-6295
 URL: https://issues.apache.org/jira/browse/HDFS-6295
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6295-1.patch


 One of the few admin-friendly ways of viewing the list of decommissioning 
 nodes is via hdfs dfsadmin -report. However, this lists *all* the datanodes 
 on the cluster, which is prohibitive for large clusters, and also requires 
 manual parsing to look at the decom status. It'd be nicer if we could fetch 
 and display only decommissioning nodes (or just live and dead nodes for that 
 matter).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2014-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983870#comment-13983870
 ] 

Tsz Wo Nicholas Sze commented on HDFS-2882:
---

Vinay, the patch cannot be applied anymore.  Could you update it?

 DN continues to start up, even if block pool fails to initialize
 

 Key: HDFS-2882
 URL: https://issues.apache.org/jira/browse/HDFS-2882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Vinayakumar B
 Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
 HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, hdfs-2882.txt


 I started a DN on a machine that was completely out of space on one of its 
 drives. I saw the following:
 2012-02-02 09:56:50,499 FATAL 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
 block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
 DS-507718931-172.29.5.194-11072-12978
 42002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
 java.io.IOException: Mkdirs failed to create 
 /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
 at 
 org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
 but the DN continued to run, spewing NPEs when it tried to do block reports, 
 etc. This was on the HDFS-1623 branch but may affect trunk as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HDFS-4211) failed volume causes DataNode#getVolumeInfo NPEs on multi-BP DN

2014-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze resolved HDFS-4211.
---

Resolution: Duplicate

Resolving this as a duplicate of HDFS-2882.

 failed volume causes DataNode#getVolumeInfo NPEs on multi-BP DN
 ---

 Key: HDFS-4211
 URL: https://issues.apache.org/jira/browse/HDFS-4211
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson

 On a DN with {{failed.volumes.tolerated=0}} a disk went bad. After restarting 
 the DN, the following backtrace was observed when accessing {{/jmx}}:
 {code}
 2012-06-12 16:21:43,248 ERROR org.apache.hadoop.jmx.JMXJsonServlet:
 getting attribute VolumeInfo of
 Hadoop:service=DataNode,name=DataNodeInfo threw an exception
 javax.management.RuntimeMBeanException: java.lang.NullPointerException
at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:856)
at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:869)
at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:670)
at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
at 
 org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:315)
at 
 org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:293)
at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:193)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at 
 org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
 org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:947)
at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Caused by: java.lang.NullPointerException
at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.getVolumeInfo(DataNode.java:2130)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
 com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:167)
at 
 com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:96)
at 
 com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:33)
at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at 
 com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
 {code}
 Since tolerated=0 the DN should have errored out rather than starting up, but 
 due to having multiple BPs configured the DN does not exit correctly in this 
 situation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983920#comment-13983920
 ] 

Hadoop QA commented on HDFS-2882:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615147/HDFS-2882.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6760//console

This message is automatically generated.

 DN continues to start up, even if block pool fails to initialize
 

 Key: HDFS-2882
 URL: https://issues.apache.org/jira/browse/HDFS-2882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Vinayakumar B
 Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
 HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, hdfs-2882.txt


 I started a DN on a machine that was completely out of space on one of its 
 drives. I saw the following:
 2012-02-02 09:56:50,499 FATAL 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
 block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
 DS-507718931-172.29.5.194-11072-12978
 42002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
 java.io.IOException: Mkdirs failed to create 
 /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
 at 
 org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
 but the DN continued to run, spewing NPEs when it tried to do block reports, 
 etc. This was on the HDFS-1623 branch but may affect trunk as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983926#comment-13983926
 ] 

Hadoop QA commented on HDFS-2882:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615147/HDFS-2882.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6761//console

This message is automatically generated.

 DN continues to start up, even if block pool fails to initialize
 

 Key: HDFS-2882
 URL: https://issues.apache.org/jira/browse/HDFS-2882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Vinayakumar B
 Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
 HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, hdfs-2882.txt


 I started a DN on a machine that was completely out of space on one of its 
 drives. I saw the following:
 2012-02-02 09:56:50,499 FATAL 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
 block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
 DS-507718931-172.29.5.194-11072-12978
 42002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
 java.io.IOException: Mkdirs failed to create 
 /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
 at 
 org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
 but the DN continued to run, spewing NPEs when it tried to do block reports, 
 etc. This was on the HDFS-1623 branch but may affect trunk as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5168) BlockPlacementPolicy does not work for cross node group dependencies

2014-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-5168:
--

Component/s: namenode

 BlockPlacementPolicy does not work for cross node group dependencies
 

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Nikola Vujic
Assignee: Nikola Vujic
Priority: Critical
 Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch, 
 HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch


 Block placement policies do not work for cross rack/node group dependencies. 
 In reality this is needed when compute servers and storage fall into two 
 independent fault domains; in that case, neither BlockPlacementPolicyDefault 
 nor BlockPlacementPolicyWithNodeGroup is able to provide proper block 
 placement.
 Let's suppose that we have Hadoop cluster with one rack with two servers, and 
 we run 2 VMs per server. Node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This is working fine as long as server and storage fall into the same fault 
 domain but if storage is in a different fault domain from the server, we will 
 not be able to handle that. For example, if storage of server1-vm1 is in the 
 same fault domain as storage of server2-vm1, then we must not place two 
 replicas on these two nodes although they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space. This looks like the 
 cleanest way to fix this, as it requires only minor changes in the 
 BlockPlacementPolicy classes.
 - Other approach would be to allow nodes to fall in more than one node group. 
 When we chose a node to hold a replica we have to exclude from the search 
 space all nodes from the node groups where the chosen node belongs. This 
 approach may require major changes in the NetworkTopology.
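
For illustration only, a minimal sketch of the first approach (the class and 
method names below are made up; this is not the attached patch): once a replica 
target is chosen, every node that shares its fault domain is added to the 
excluded set before the next target is picked.

{code}
// Editorial sketch only -- illustrates the "exclude dependent nodes" idea;
// this is not BlockPlacementPolicy code and not the attached patch.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DependencyAwareExclusion {
  // Hypothetical map from a node to the nodes sharing its fault domain,
  // e.g. VMs whose storage sits on the same physical array.
  private final Map<String, Set<String>> dependencies =
      new HashMap<String, Set<String>>();

  void addDependency(String a, String b) {
    getOrCreate(a).add(b);
    getOrCreate(b).add(a);
  }

  // After choosing a replica target, exclude the node itself and everything
  // it depends on, so the next target lands in an independent fault domain.
  void excludeChosenNode(String chosen, Set<String> excludedNodes) {
    excludedNodes.add(chosen);
    Set<String> deps = dependencies.get(chosen);
    if (deps != null) {
      excludedNodes.addAll(deps);
    }
  }

  private Set<String> getOrCreate(String node) {
    Set<String> deps = dependencies.get(node);
    if (deps == null) {
      deps = new HashSet<String>();
      dependencies.put(node, deps);
    }
    return deps;
  }
}
{code}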



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5168) BlockPlacementPolicy does not work for cross node group dependencies

2014-04-28 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983960#comment-13983960
 ] 

Tsz Wo Nicholas Sze commented on HDFS-5168:
---

DNSToSwitchMapping is a \@Public \@Evolving interface, so we have to change 
it in a compatible manner (otherwise we cannot commit this to branch-2).  We 
should avoid adding the new getDependency(..) method to it.  How about we add 
another interface class, say DNSToSwitchMappingWithDependency, and keep 
DNSToSwitchMapping unchanged?

More details (a rough sketch follows after this list):
- DNSToSwitchMappingWithDependency extends DNSToSwitchMapping and adds the new 
getDependency(..) method.
- ScriptBasedMappingWithDependency extends ScriptBasedMapping and 
RawScriptBasedMappingWithDependency extends RawScriptBasedMapping; change 
ScriptBasedMapping and RawScriptBasedMapping to allow inheritance.
- Add dependency cache support to ScriptBasedMappingWithDependency.
- DatanodeManager checks if dnsToSwitchMapping is an instance of 
DNSToSwitchMappingWithDependency.  If yes, cast the object and get the 
dependencies; otherwise, use an empty list.
- CachedDNSToSwitchMapping and TableMapping remain unchanged.
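
For illustration only, a rough sketch of the proposed split (the 
DNSToSwitchMapping stand-in below is simplified to a single method, and 
DatanodeManagerSketch is a made-up name, not the real class):

{code}
// Editorial sketch of the proposed interface split -- not the committed code.
import java.util.Collections;
import java.util.List;

// Simplified stand-in for the existing @Public @Evolving interface,
// which stays unchanged.
interface DNSToSwitchMapping {
  List<String> resolve(List<String> names);
}

// New sub-interface that carries the dependency lookup.
interface DNSToSwitchMappingWithDependency extends DNSToSwitchMapping {
  // Nodes that share a fault domain with the given node.
  List<String> getDependency(String name);
}

class DatanodeManagerSketch {
  private final DNSToSwitchMapping dnsToSwitchMapping;

  DatanodeManagerSketch(DNSToSwitchMapping mapping) {
    this.dnsToSwitchMapping = mapping;
  }

  // Only mappings that opt in to the new interface provide dependencies;
  // everything else falls back to an empty list, keeping compatibility.
  List<String> getDependencies(String node) {
    if (dnsToSwitchMapping instanceof DNSToSwitchMappingWithDependency) {
      return ((DNSToSwitchMappingWithDependency) dnsToSwitchMapping)
          .getDependency(node);
    }
    return Collections.emptyList();
  }
}
{code}

Keeping the lookup behind an instanceof check means existing DNSToSwitchMapping 
implementations keep working unmodified.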

 BlockPlacementPolicy does not work for cross node group dependencies
 

 Key: HDFS-5168
 URL: https://issues.apache.org/jira/browse/HDFS-5168
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Nikola Vujic
Assignee: Nikola Vujic
Priority: Critical
 Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch, 
 HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch


 Block placement policies do not work for cross rack/node group dependencies. 
 In reality this is needed when compute servers and storage fall in two 
 independent fault domains, then both BlockPlacementPolicyDefault and 
 BlockPlacementPolicyWithNodeGroup are not able to provide proper block 
 placement.
 Let's suppose that we have Hadoop cluster with one rack with two servers, and 
 we run 2 VMs per server. Node group topology for this cluster would be:
  server1-vm1 - /d1/r1/n1
  server1-vm2 - /d1/r1/n1
  server2-vm1 - /d1/r1/n2
  server2-vm2 - /d1/r1/n2
 This is working fine as long as server and storage fall into the same fault 
 domain but if storage is in a different fault domain from the server, we will 
 not be able to handle that. For example, if storage of server1-vm1 is in the 
 same fault domain as storage of server2-vm1, then we must not place two 
 replicas on these two nodes although they are in different node groups.
 Two possible approaches:
 - One approach would be to define cross rack/node group dependencies and to 
 use them when excluding nodes from the search space. This looks like the 
 cleanest way to fix this, as it requires only minor changes in the 
 BlockPlacementPolicy classes.
 - The other approach would be to allow nodes to fall in more than one node group. 
 When we choose a node to hold a replica, we have to exclude from the search 
 space all nodes from the node groups to which the chosen node belongs. This 
 approach may require major changes in the NetworkTopology.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983964#comment-13983964
 ] 

Hadoop QA commented on HDFS-6289:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642380/HDFS-6289.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6759//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6759//console

This message is automatically generated.

 HA failover can fail if there are pending DN messages for DNs which no longer 
 exist
 ---

 Key: HDFS-6289
 URL: https://issues.apache.org/jira/browse/HDFS-6289
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical
 Attachments: HDFS-6289.patch, HDFS-6289.patch


 In an HA setup, the standby NN may receive messages from DNs for blocks which 
 the standby NN is not yet aware of. It queues up these messages and replays 
 them when it next reads from the edit log or fails over. On a failover, all 
 of these pending DN messages must be processed successfully in order for the 
 failover to succeed. If one of these pending DN messages refers to a DN 
 storageId that no longer exists (because the DN with that transfer address 
 has been reformatted and has re-registered with the same transfer address) 
 then on transition to active the NN will not be able to process this DN 
 message and will suicide with an error like the following:
 {noformat}
 2014-04-25 14:23:17,922 FATAL namenode.NameNode 
 (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN 
 shutdown. Shutting down immediately.
 java.io.IOException: Cannot mark 
 blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 
 127.0.0.1:33324 does not exist
 {noformat}
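
For illustration only, a simplified sketch of the mechanism described above 
(all names below are made up; this is not NameNode code): queued reports are 
keyed by the reporting datanode's storage id, and replaying a message whose 
storage id is no longer registered fails.

{code}
// Editorial sketch of the failure mode described above -- not NameNode code.
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

class PendingDataNodeMessagesSketch {
  private final Set<String> registeredStorageIds = new HashSet<String>();
  private final Queue<String> pendingMessageStorageIds = new ArrayDeque<String>();

  void registerDatanode(String storageId) {
    registeredStorageIds.add(storageId);
  }

  void queueMessage(String storageId) {
    pendingMessageStorageIds.add(storageId);
  }

  // On transition to active, every queued message must be replayed.  A message
  // whose storage id no longer exists (the DN was reformatted and re-registered
  // under a new id) cannot be resolved, and in the real NameNode that error
  // brings the failover down.
  void replayOnFailover() throws IOException {
    for (String storageId : pendingMessageStorageIds) {
      if (!registeredStorageIds.contains(storageId)) {
        throw new IOException("Cannot process pending message: datanode "
            + storageId + " does not exist");
      }
      // ... apply the queued block report / corrupt-replica message here ...
    }
  }
}
{code}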



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6165) hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory

2014-04-28 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6165:


Attachment: HDFS-6165.005.patch

 hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory 
 --

 Key: HDFS-6165
 URL: https://issues.apache.org/jira/browse/HDFS-6165
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6165.001.patch, HDFS-6165.002.patch, 
 HDFS-6165.003.patch, HDFS-6165.004.patch, HDFS-6165.004.patch, 
 HDFS-6165.005.patch


 Given a directory owned by user A with WRITE permission containing an empty 
 directory owned by user B, it is not possible to delete user B's empty 
 directory with either hdfs dfs -rm -r or hdfs dfs -rmdir, because the 
 current implementation requires FULL permission on the empty directory and 
 throws an exception. 
 On the other hand, on Linux, the rm -r and rmdir commands can remove an empty 
 directory as long as the parent directory has WRITE permission (and the prefix 
 components of the path have EXECUTE permission). On the tested OSes, some 
 prompt the user for confirmation and some don't.
 Here's a reproduction:
 {code}
 [root@vm01 ~]# hdfs dfs -ls /user/
 Found 4 items
 drwxr-xr-x   - userabc users       0 2013-05-03 01:55 /user/userabc
 drwxr-xr-x   - hdfs    supergroup  0 2013-05-03 00:28 /user/hdfs
 drwxrwxrwx   - mapred  hadoop      0 2013-05-03 00:13 /user/history
 drwxr-xr-x   - hdfs    supergroup  0 2013-04-14 16:46 /user/hive
 [root@vm01 ~]# hdfs dfs -ls /user/userabc
 Found 8 items
 drwx------   - userabc users  0 2013-05-02 17:00 /user/userabc/.Trash
 drwxr-xr-x   - userabc users  0 2013-05-03 01:34 /user/userabc/.cm
 drwx------   - userabc users  0 2013-05-03 01:06 /user/userabc/.staging
 drwxr-xr-x   - userabc users  0 2013-04-14 18:31 /user/userabc/apps
 drwxr-xr-x   - userabc users  0 2013-04-30 18:05 /user/userabc/ds
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:54 /user/userabc/foo
 drwxr-xr-x   - userabc users  0 2013-04-30 16:18 /user/userabc/maven_source
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:40 /user/userabc/test-restore
 [root@vm01 ~]# hdfs dfs -ls /user/userabc/foo/
 [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -r -skipTrash /user/userabc/foo
 rm: Permission denied: user=userabc, access=ALL, 
 inode=/user/userabc/foo:hdfs:users:drwxr-xr-x
 {code}
 The super user can delete the directory.
 {code}
 [root@vm01 ~]# sudo -u hdfs hdfs dfs -rm -r -skipTrash /user/userabc/foo
 Deleted /user/userabc/foo
 {code}
 The same is not true for files, however. They have the correct behavior.
 {code}
 [root@vm01 ~]# sudo -u hdfs hdfs dfs -touchz /user/userabc/foo-file
 [root@vm01 ~]# hdfs dfs -ls /user/userabc/
 Found 8 items
 drwx------   - userabc users  0 2013-05-02 17:00 /user/userabc/.Trash
 drwxr-xr-x   - userabc users  0 2013-05-03 01:34 /user/userabc/.cm
 drwx------   - userabc users  0 2013-05-03 01:06 /user/userabc/.staging
 drwxr-xr-x   - userabc users  0 2013-04-14 18:31 /user/userabc/apps
 drwxr-xr-x   - userabc users  0 2013-04-30 18:05 /user/userabc/ds
 -rw-r--r--   1 hdfs    users  0 2013-05-03 02:11 /user/userabc/foo-file
 drwxr-xr-x   - userabc users  0 2013-04-30 16:18 /user/userabc/maven_source
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:40 /user/userabc/test-restore
 [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -skipTrash /user/userabc/foo-file
 Deleted /user/userabc/foo-file
 {code}
 Using hdfs dfs -rmdir command:
 {code}
 bash-4.1$ hadoop fs -lsr /
 lsr: DEPRECATED: Please use 'ls -R' instead.
 drwxr-xr-x   - hdfs supergroup  0 2014-03-25 16:29 /user
 drwxr-xr-x   - hdfs   supergroup  0 2014-03-25 16:28 /user/hdfs
 drwxr-xr-x   - usrabc users   0 2014-03-28 23:39 /user/usrabc
 drwxr-xr-x   - abc    abc         0 2014-03-28 23:39 /user/usrabc/foo-empty1
 [root@vm01 usrabc]# su usrabc
 [usrabc@vm01 ~]$ hdfs dfs -rmdir /user/usrabc/foo-empty1
 rmdir: Permission denied: user=usrabc, access=ALL, 
 inode=/user/usrabc/foo-empty1:abc:abc:drwxr-xr-x
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6165) hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory

2014-04-28 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983980#comment-13983980
 ] 

Yongjun Zhang commented on HDFS-6165:
-

Hi,

Thanks a lot for your earlier comments, and thanks Andrew a lot for the 
detailed review!

I just updated the patch to version 005 to address all of them.

For rmdir, it's the solution I described above;
for the rm -r solution, I actually did
{code}
void checkPermission(String path, INodeDirectory root, boolean doCheckOwner,
  FsAction ancestorAccess, FsAction parentAccess, FsAction access,
  FsAction subAccess, boolean ignoreEmptyDir, boolean resolveLink)
{code}
The two parameters subAccess and ignoreEmptyDir work together (a rough sketch 
follows below):
- if subAccess is not NULL, the access permission of each subdirectory is checked;
- when subAccess is checked and ignoreEmptyDir is true, empty directories are 
  skipped.
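
For illustration only, a simplified, self-contained sketch of that interaction 
(the Dir type and its fields are made up; this is not the actual 
FSPermissionChecker code):

{code}
// Editorial sketch -- shows how subAccess checking could skip empty
// directories when ignoreEmptyDir is true; not the real FSPermissionChecker.
import java.util.ArrayList;
import java.util.List;

class SubAccessCheckSketch {
  // Hypothetical stand-in for a directory inode.
  static class Dir {
    final String path;
    final boolean callerHasSubAccess;   // does the caller hold the requested access?
    final List<Dir> children = new ArrayList<Dir>();

    Dir(String path, boolean callerHasSubAccess) {
      this.path = path;
      this.callerHasSubAccess = callerHasSubAccess;
    }

    boolean isEmpty() { return children.isEmpty(); }
  }

  // Walk the subtree.  An empty directory is skipped when ignoreEmptyDir is
  // true, so it can be removed as long as its parent grants WRITE (matching
  // the Linux rm -r / rmdir behavior described in this JIRA).
  static void checkSubAccess(Dir dir, boolean ignoreEmptyDir) {
    if (ignoreEmptyDir && dir.isEmpty()) {
      return;
    }
    if (!dir.callerHasSubAccess) {
      throw new SecurityException("Permission denied: " + dir.path);
    }
    for (Dir child : dir.children) {
      checkSubAccess(child, ignoreEmptyDir);
    }
  }
}
{code}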

To address Andrew's comments
{quote}
I think the semantics for a recursive delete via DistributedFileSystem#delete 
are still not quite right. The change you made will work for the shell since it 
does its own recursion, but we need to do the same remove if empty dir with 
read when recursing via recursive = true too. You might be able to do this by 
modifying FSPermissionChecker#checkSubAccess appropriately, but a new flag or 
new code would be safer.
{quote}
Thanks a lot for pointing this out; indeed there was a problem there. See the 
solution described above, except that we agreed we don't need to check 
permissions for empty dirs.

{quote}
isDirectory, can we add per-parameter javadoc rather than stacking on the 
@return? I think renaming empty to isEmpty would also help.
Nit, also need a space in the ternary empty? and dir.isEmptyDirectory(src)?.
{quote}
These are now gone with new solution.

{quote}
In Delete, I think it's a bit cleaner to do an instanceof 
PathIsNotEmptyDirectoryException.class check instead.
{quote}
This is handled in a better way now. I discovered a bug, HADOOP-10543, when 
looking at this (and posted a patch). With HADOOP-10543 committed, I would be 
able to do exactly what Andrew suggested, but I think what I have in this new 
revision should be fine too.
{quote}
Some lines longer than 80 chars
{quote}
Hopefully all addressed :-)

{quote}
TestFsShellPrivilege:
I gave this a quick pass, but overall it may be better to rewrite these to use 
the DFS API instead of the shell. We need to test recursive delete, which the 
shell doesn't do, and we don't really have any shell changes in the latest rev, 
which lessens the importance of having new shell tests.
{quote}
I think adding test infra like what I added gives another option here; 
hopefully the new revision looks better :-)

{quote}
execCmd needs to do some try/finally to close and restore the streams if 
there's an exception. Also an extra commented line there.
{quote}
FsShell actually takes care of catching the exception, so the streams will not 
get lost. The extra commented line is removed.

{quote}
Could we rename this file to TestFsShellPermission? Permission is a more 
standard term.
{quote}
Done.

{quote}
This file also should not be in hadoop-tools, but rather hadoop-common.
{quote}
Because it uses MiniDFSCluster, it cannot be in hadoop-common, but I moved it 
to the HDFS test area now.

{quote}
This does a lot of starting and stopping of a MiniCluster for running 
single-line tests. Can we combine these into a single test? We also don't need 
any DNs for this cluster, since we're just testing perms.
{quote}
I refactored the code to take care of this. Since we create files, I still 
keep the DNs.

{quote}
We have FileSystemTestHelper#createFile for creating files, can save some 
code
Use of @Before and @After blocks might also clarify what's going on.
This also should be a JUnit4 test with @Test annotations, not JUnit3.
USER_UGI should not be all caps, it's not static final
It's a bit ugly how we pass UNEXPECTED_RESULT in for a lot of tests. Can we 
just pass a boolean for expectSuccess or expectFailure, or maybe a String that 
we can call assertExceptionContains on?
{quote} 
All are taken care of, except that I forgot @Before and @After, but hopefully 
it looks much better now.

{quote}
FileEntry looks basically like a FileStatus, can we just use that instead?
{quote}
FileEntry only has the fields needed for this test, and it's easier to manage 
in the test area. I'm worried that using FileStatus would not be as easy to 
control, so I didn't do that. Hope it's acceptable.

Thanks in advance for a further review.



 hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory 
 --

 Key: HDFS-6165
 URL: https://issues.apache.org/jira/browse/HDFS-6165
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0

[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup

2014-04-28 Thread Sanghyun Yun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983998#comment-13983998
 ] 

Sanghyun Yun commented on HDFS-5147:


hdfs dfsadmin -safemode enter
With this command, only the first namenode (whether or not it is the active 
one) goes into safemode. I think it should be the active namenode, or both.

hdfs dfsadmin -fs hdfs://CLUSTERNAME -safemode enter
This command's result is the same.

Is this the intended behavior?

I tested this on release 2.2.0.

 Certain dfsadmin commands such as safemode do not interact with the active 
 namenode in ha setup
 ---

 Key: HDFS-5147
 URL: https://issues.apache.org/jira/browse/HDFS-5147
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Assignee: Jing Zhao

 There are certain commands in dfsadmin that return the status of the first 
 namenode specified in the configs rather than interacting with the active 
 namenode.
 For example, issue
 hdfs dfsadmin -safemode get
 and it will return the status of the first namenode in the configs rather 
 than the active namenode.
 I think all dfsadmin commands should determine which is the active namenode 
 and do the operation on it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6165) hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984008#comment-13984008
 ] 

Hadoop QA commented on HDFS-6165:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642397/HDFS-6165.005.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6763//console

This message is automatically generated.

 hdfs dfs -rm -r and hdfs -rmdir commands can't remove empty directory 
 --

 Key: HDFS-6165
 URL: https://issues.apache.org/jira/browse/HDFS-6165
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HDFS-6165.001.patch, HDFS-6165.002.patch, 
 HDFS-6165.003.patch, HDFS-6165.004.patch, HDFS-6165.004.patch, 
 HDFS-6165.005.patch


 Given a directory owned by user A with WRITE permission containing an empty 
 directory owned by user B, it is not possible to delete user B's empty 
 directory with either hdfs dfs -rm -r or hdfs dfs -rmdir, because the 
 current implementation requires FULL permission on the empty directory and 
 throws an exception. 
 On the other hand, on Linux, the rm -r and rmdir commands can remove an empty 
 directory as long as the parent directory has WRITE permission (and the prefix 
 components of the path have EXECUTE permission). On the tested OSes, some 
 prompt the user for confirmation and some don't.
 Here's a reproduction:
 {code}
 [root@vm01 ~]# hdfs dfs -ls /user/
 Found 4 items
 drwxr-xr-x   - userabc users       0 2013-05-03 01:55 /user/userabc
 drwxr-xr-x   - hdfs    supergroup  0 2013-05-03 00:28 /user/hdfs
 drwxrwxrwx   - mapred  hadoop      0 2013-05-03 00:13 /user/history
 drwxr-xr-x   - hdfs    supergroup  0 2013-04-14 16:46 /user/hive
 [root@vm01 ~]# hdfs dfs -ls /user/userabc
 Found 8 items
 drwx------   - userabc users  0 2013-05-02 17:00 /user/userabc/.Trash
 drwxr-xr-x   - userabc users  0 2013-05-03 01:34 /user/userabc/.cm
 drwx------   - userabc users  0 2013-05-03 01:06 /user/userabc/.staging
 drwxr-xr-x   - userabc users  0 2013-04-14 18:31 /user/userabc/apps
 drwxr-xr-x   - userabc users  0 2013-04-30 18:05 /user/userabc/ds
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:54 /user/userabc/foo
 drwxr-xr-x   - userabc users  0 2013-04-30 16:18 /user/userabc/maven_source
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:40 /user/userabc/test-restore
 [root@vm01 ~]# hdfs dfs -ls /user/userabc/foo/
 [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -r -skipTrash /user/userabc/foo
 rm: Permission denied: user=userabc, access=ALL, 
 inode=/user/userabc/foo:hdfs:users:drwxr-xr-x
 {code}
 The super user can delete the directory.
 {code}
 [root@vm01 ~]# sudo -u hdfs hdfs dfs -rm -r -skipTrash /user/userabc/foo
 Deleted /user/userabc/foo
 {code}
 The same is not true for files, however. They have the correct behavior.
 {code}
 [root@vm01 ~]# sudo -u hdfs hdfs dfs -touchz /user/userabc/foo-file
 [root@vm01 ~]# hdfs dfs -ls /user/userabc/
 Found 8 items
 drwx------   - userabc users  0 2013-05-02 17:00 /user/userabc/.Trash
 drwxr-xr-x   - userabc users  0 2013-05-03 01:34 /user/userabc/.cm
 drwx------   - userabc users  0 2013-05-03 01:06 /user/userabc/.staging
 drwxr-xr-x   - userabc users  0 2013-04-14 18:31 /user/userabc/apps
 drwxr-xr-x   - userabc users  0 2013-04-30 18:05 /user/userabc/ds
 -rw-r--r--   1 hdfs    users  0 2013-05-03 02:11 /user/userabc/foo-file
 drwxr-xr-x   - userabc users  0 2013-04-30 16:18 /user/userabc/maven_source
 drwxr-xr-x   - hdfs    users  0 2013-05-03 01:40 /user/userabc/test-restore
 [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -skipTrash /user/userabc/foo-file
 Deleted /user/userabc/foo-file
 {code}
 Using hdfs dfs -rmdir command:
 {code}
 bash-4.1$ hadoop fs -lsr /
 lsr: DEPRECATED: Please use 'ls -R' instead.
 drwxr-xr-x   - hdfs supergroup  0 2014-03-25 16:29 /user
 drwxr-xr-x   - hdfs   supergroup  0 2014-03-25 16:28 /user/hdfs
 drwxr-xr-x   - usrabc users   0 2014-03-28 23:39 /user/usrabc
 drwxr-xr-x   - abc    abc         0 2014-03-28 23:39 /user/usrabc/foo-empty1
 [root@vm01 usrabc]# su usrabc
 [usrabc@vm01 ~]$ hdfs dfs -rmdir 

[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup

2014-04-28 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984016#comment-13984016
 ] 

Jing Zhao commented on HDFS-5147:
-

[~yunsh], you need to specify a specific NN URI in the -fs option, i.e., instead 
of -fs hdfs://CLUSTERNAME, you may want to use -fs hdfs://NN2_HOST:NN2_PORT 
if you want to put NN2 into safemode.

 Certain dfsadmin commands such as safemode do not interact with the active 
 namenode in ha setup
 ---

 Key: HDFS-5147
 URL: https://issues.apache.org/jira/browse/HDFS-5147
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Assignee: Jing Zhao

 There are certain commands in dfsadmin that return the status of the first 
 namenode specified in the configs rather than interacting with the active 
 namenode.
 For example, issue
 hdfs dfsadmin -safemode get
 and it will return the status of the first namenode in the configs rather 
 than the active namenode.
 I think all dfsadmin commands should determine which is the active namenode 
 and do the operation on it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6297) Add new CLI cases to reflect new features of dfs and dfsadmin

2014-04-28 Thread Dasha Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984021#comment-13984021
 ] 

Dasha Boudnik commented on HDFS-6297:
-

I'm looking into this.

 Add new CLI cases to reflect new features of dfs and dfsadmin
 -

 Key: HDFS-6297
 URL: https://issues.apache.org/jira/browse/HDFS-6297
 Project: Hadoop HDFS
  Issue Type: Test
Affects Versions: 2.3.0, 2.4.0
Reporter: Dasha Boudnik
 Fix For: 3.0.0


 Some new features of HDFS aren't covered by the existing TestCLI test cases 
 (snapshot, upgrade, a few other minor ones).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6296) Add new CLI cases to reflect new features of dfs and dfsadmin

2014-04-28 Thread Dasha Boudnik (JIRA)
Dasha Boudnik created HDFS-6296:
---

 Summary: Add new CLI cases to reflect new features of dfs and 
dfsadmin
 Key: HDFS-6296
 URL: https://issues.apache.org/jira/browse/HDFS-6296
 Project: Hadoop HDFS
  Issue Type: Test
Affects Versions: 2.4.0, 2.3.0
Reporter: Dasha Boudnik
 Fix For: 3.0.0


Some new features of HDFS aren't covered by the existing TestCLI test cases 
(snapshot, upgrade, a few other minor ones).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6297) Add new CLI cases to reflect new features of dfs and dfsadmin

2014-04-28 Thread Dasha Boudnik (JIRA)
Dasha Boudnik created HDFS-6297:
---

 Summary: Add new CLI cases to reflect new features of dfs and 
dfsadmin
 Key: HDFS-6297
 URL: https://issues.apache.org/jira/browse/HDFS-6297
 Project: Hadoop HDFS
  Issue Type: Test
Affects Versions: 2.4.0, 2.3.0
Reporter: Dasha Boudnik
 Fix For: 3.0.0


Some new features of HDFS aren't covered by the existing TestCLI test cases 
(snapshot, upgrade, a few other minor ones).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6295) Add decommissioning state and node state filtering to dfsadmin

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984044#comment-13984044
 ] 

Hadoop QA commented on HDFS-6295:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642381/hdfs-6295-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.web.TestWebHDFS

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6762//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6762//console

This message is automatically generated.

 Add decommissioning state and node state filtering to dfsadmin
 

 Key: HDFS-6295
 URL: https://issues.apache.org/jira/browse/HDFS-6295
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6295-1.patch


 One of the few admin-friendly ways of viewing the list of decommissioning 
 nodes is via hdfs dfsadmin -report. However, this lists *all* the datanodes 
 on the cluster, which is prohibitive for large clusters, and also requires 
 manual parsing to look at the decom status. It'd be nicer if we could fetch 
 and display only decommissioning nodes (or just live and dead nodes for that 
 matter).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2014-04-28 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-2882:


Attachment: HDFS-2882.patch

Attaching the updated patch.

 DN continues to start up, even if block pool fails to initialize
 

 Key: HDFS-2882
 URL: https://issues.apache.org/jira/browse/HDFS-2882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Vinayakumar B
 Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
 HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
 hdfs-2882.txt


 I started a DN on a machine that was completely out of space on one of its 
 drives. I saw the following:
 2012-02-02 09:56:50,499 FATAL 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
 block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
 DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
 java.io.IOException: Mkdirs failed to create 
 /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
 at 
 org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
 but the DN continued to run, spewing NPEs when it tried to do block reports, 
 etc. This was on the HDFS-1623 branch but may affect trunk as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup

2014-04-28 Thread Sanghyun Yun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984047#comment-13984047
 ] 

Sanghyun Yun commented on HDFS-5147:


[~jingzhao], thanks for your answer.
I know about the -fs option and it works well.
But I think the dfsadmin command should affect the active namenode when I don't 
specify a specific NN URI or when I use -fs hdfs://CLUSTERNAME.
Currently, it affects the first namenode, regardless of whether it is active or 
standby.

 Certain dfsadmin commands such as safemode do not interact with the active 
 namenode in ha setup
 ---

 Key: HDFS-5147
 URL: https://issues.apache.org/jira/browse/HDFS-5147
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Assignee: Jing Zhao

 There are certain commands in dfsadmin that return the status of the first 
 namenode specified in the configs rather than interacting with the active 
 namenode.
 For example, issue
 hdfs dfsadmin -safemode get
 and it will return the status of the first namenode in the configs rather 
 than the active namenode.
 I think all dfsadmin commands should determine which is the active namenode 
 and do the operation on it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)