[jira] [Commented] (HDFS-4502) WebHdfsFileSystem handling of fileId field breaks compatibility and breaks HttpFS

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583096#comment-13583096
 ] 

Hudson commented on HDFS-4502:
--

Integrated in Hadoop-Yarn-trunk #134 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/134/])
HDFS-4502. JsonUtil.toFileStatus(..) should check if the fileId property 
exists.  Contributed by Brandon Li (Revision 1448502)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448502
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java


> WebHdfsFileSystem handling of fileId field breaks compatibility and breaks 
> HttpFS
> 
>
> Key: HDFS-4502
> URL: https://issues.apache.org/jira/browse/HDFS-4502
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Brandon Li
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-4502.patch
>
>
> HDFS-4340 introduced a new field {{fileId}} in WebHDFS FileStatus JSON 
> representation.
> There are two issues with this:
> * This changes the WebHDFS REST API, and the change has not been documented.
> * WebHdfsFileSystem should not fail when that field is not present (this is 
> the case when using HttpFS against a FS implementation other than HDFS, 
> which does not handle fileId).
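
The compatible behavior being asked for here amounts to treating {{fileId}} as 
optional on the client side. A minimal hedged sketch of that idea, with 
made-up names (FileStatusJsonSketch, UNKNOWN_FILE_ID), not the actual JsonUtil 
change:

{code}
import java.util.Map;

/** Hypothetical sketch only; not the real JsonUtil.toFileStatus(..) code. */
class FileStatusJsonSketch {
  /** Made-up sentinel for "server did not send a fileId". */
  static final long UNKNOWN_FILE_ID = 0L;

  static long fileIdOf(Map<String, Object> json) {
    // HttpFS in front of a non-HDFS FileSystem omits fileId, so a missing
    // key must degrade to a default instead of throwing.
    Object id = json.get("fileId");
    return (id == null) ? UNKNOWN_FILE_ID : ((Number) id).longValue();
  }
}
{code}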

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4502) WebHdfsFileSystem handling of fileId field breaks compatibility and breaks HttpFS

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583160#comment-13583160
 ] 

Hudson commented on HDFS-4502:
--

Integrated in Hadoop-Hdfs-trunk #1323 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1323/])
HDFS-4502. JsonUtil.toFileStatus(..) should check if the fileId property 
exists.  Contributed by Brandon Li (Revision 1448502)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448502
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java


> WebHdfsFileSystem handling of fileId field breaks compatibility and breaks 
> HttpFS
> 
>
> Key: HDFS-4502
> URL: https://issues.apache.org/jira/browse/HDFS-4502
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Brandon Li
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-4502.patch
>
>
> HDFS-4340 introduced a new field {{fileId}} in WebHDFS FileStatus JSON 
> representation.
> There are two issues with this:
> * This changes the WebHDFS REST API, and the change has not been documented.
> * WebHdfsFileSystem should not fail when that field is not present (this is 
> the case when using HttpFS against a FS implementation other than HDFS, 
> which does not handle fileId).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4503) Update computeContentSummary, spaceConsumedInTree and diskspaceConsumed for snapshot

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583165#comment-13583165
 ] 

Hudson commented on HDFS-4503:
--

Integrated in Hadoop-Hdfs-Snapshots-Branch-build #108 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/108/])
HDFS-4503. Update computeContentSummary(..), spaceConsumedInTree(..) and 
diskspaceConsumed(..) in INode for snapshot. (Revision 1448373)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448373
Files : 
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EnumCounters.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Quota.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileUnderConstructionWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/diff/Diff.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestINodeFileUnderConstructionWithSnapshot.java


> Update computeContentSummary, spaceConsumedInTree and diskspaceConsumed for 
> snapshot
> 
>
> Key: HDFS-4503
> URL: https://issues.apache.org/jira/browse/HDFS-4503
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: h4503_20130214.patch, h4503_20130215.patch, 
> h4503_20130216.patch, h4503_20130218.patch, h4503_20130219b.patch, 
> h4503_20130219.patch
>
>
> There are three methods, computeContentSummary, spaceConsumedInTree and 
> diskspaceConsumed, for computing namespace/diskspace usages.  They need to 
> be updated for the snapshot feature.
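
As a concrete illustration of why these computations need updating, here is a 
self-contained sketch under assumed, made-up types (Block, FileDiff); the real 
INode changes are more involved. Blocks that remain referenced only by 
snapshots must stay in the diskspace count:

{code}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the HDFS-4503 patch itself. */
class SnapshotUsageSketch {
  static class Block { long numBytes; Block(long n) { numBytes = n; } }

  /** Blocks removed from the current file but still referenced by a snapshot. */
  static class FileDiff {
    List<Block> snapshotOnlyBlocks = new ArrayList<Block>();
  }

  /**
   * Diskspace must count current blocks plus blocks kept alive only by
   * snapshots; ignoring the diffs would under-report usage after deletes.
   */
  static long diskspaceConsumed(List<Block> currentBlocks,
                                List<FileDiff> diffs,
                                short replication) {
    long bytes = 0;
    for (Block b : currentBlocks) bytes += b.numBytes;
    for (FileDiff d : diffs)
      for (Block b : d.snapshotOnlyBlocks) bytes += b.numBytes;
    return bytes * replication;
  }
}
{code}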

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4499) Fix file/directory/snapshot deletion for file diff

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583164#comment-13583164
 ] 

Hudson commented on HDFS-4499:
--

Integrated in Hadoop-Hdfs-Snapshots-Branch-build #108 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/108/])
HDFS-4499. Fix file/directory/snapshot deletion for file diff.  Contributed 
by Jing Zhao (Revision 1448504)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448504
Files : 
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileUnderConstructionWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/diff/Diff.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


> Fix file/directory/snapshot deletion for file diff
> --
>
> Key: HDFS-4499
> URL: https://issues.apache.org/jira/browse/HDFS-4499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4499.000.patch, HDFS-4499.001.patch, 
> HDFS-4499.002.patch, HDFS-4499.003.patch
>
>
> After recording file diffs, the original file/directory/snapshot deletion 
> process needs to be updated. This jira revisits and fixes the whole deletion 
> process. New unit tests will also be added to cover different cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4502) WebHdfsFileSystem handling of fileId field breaks compatibility and breaks HttpFS

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583175#comment-13583175
 ] 

Hudson commented on HDFS-4502:
--

Integrated in Hadoop-Mapreduce-trunk #1351 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1351/])
HDFS-4502. JsonUtil.toFileStatus(..) should check if the fileId property 
exists.  Contributed by Brandon Li (Revision 1448502)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448502
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java


> WebHdfsFileSystem handling of fileId field breaks compatibility and breaks 
> HttpFS
> 
>
> Key: HDFS-4502
> URL: https://issues.apache.org/jira/browse/HDFS-4502
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Brandon Li
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-4502.patch
>
>
> HDFS-4340 introduced a new field {{fileId}} in WebHDFS FileStatus JSON 
> representation.
> There are two issues with this:
> * This changes the WebHDFS REST API, and the change has not been documented.
> * WebHdfsFileSystem should not fail when that field is not present (this is 
> the case when using HttpFS against a FS implementation other than HDFS, 
> which does not handle fileId).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-02-21 Thread Vadim Bondarev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Bondarev updated HDFS-4510:
-

Attachment: HADOOP-4510-trunk-b.patch
HADOOP-4510-branch-2-b.patch
HADOOP-4510-branch-0.23-b.patch

> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-2-a.patch, 
> HADOOP-4510-branch-2-b.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4508) Two minor improvements to the QJM Deployment docs

2013-02-21 Thread Daisuke Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisuke Kobayashi reassigned HDFS-4508:
---

Assignee: Daisuke Kobayashi

> Two minor improvements to the QJM Deployment docs
> -
>
> Key: HDFS-4508
> URL: https://issues.apache.org/jira/browse/HDFS-4508
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.3-alpha
>Reporter: Harsh J
>Assignee: Daisuke Kobayashi
>Priority: Minor
>
> Suggested by ML user Azurry, the docs at 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deployment_details
>  can be improved for two specific lines:
> {quote}
> * If you have already formatted the NameNode, or are converting a 
> non-HA-enabled cluster to be HA-enabled, you should now copy over the 
> contents of your NameNode metadata directories to the other, unformatted 
> NameNode by running the command "hdfs namenode -bootstrapStandby" on the 
> unformatted NameNode. Running this command will also ensure that the 
> JournalNodes (as configured by dfs.namenode.shared.edits.dir) contain 
> sufficient edits transactions to be able to start both NameNodes.
> * If you are converting a non-HA NameNode to be HA, you should run the 
> command "hdfs -initializeSharedEdits", which will initialize the JournalNodes 
> with the edits data from the local NameNode edits directories.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4508) Two minor improvements to the QJM Deployment docs

2013-02-21 Thread Daisuke Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisuke Kobayashi updated HDFS-4508:


Attachment: HDFS-4508.patch

modified.

> Two minor improvements to the QJM Deployment docs
> -
>
> Key: HDFS-4508
> URL: https://issues.apache.org/jira/browse/HDFS-4508
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.3-alpha
>Reporter: Harsh J
>Assignee: Daisuke Kobayashi
>Priority: Minor
> Attachments: HDFS-4508.patch
>
>
> Suggested by ML user Azurry, the docs at 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deployment_details
>  can be improved for two specific lines:
> {quote}
> * If you have already formatted the NameNode, or are converting a 
> non-HA-enabled cluster to be HA-enabled, you should now copy over the 
> contents of your NameNode metadata directories to the other, unformatted 
> NameNode by running the command "hdfs namenode -bootstrapStandby" on the 
> unformatted NameNode. Running this command will also ensure that the 
> JournalNodes (as configured by dfs.namenode.shared.edits.dir) contain 
> sufficient edits transactions to be able to start both NameNodes.
> * If you are converting a non-HA NameNode to be HA, you should run the 
> command "hdfs -initializeSharedEdits", which will initialize the JournalNodes 
> with the edits data from the local NameNode edits directories.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583245#comment-13583245
 ] 

Daryn Sharp commented on HDFS-4222:
---

bq. Not sure how to make this work. When does thread local variable get 
initialized and when is it cleared, given a thread gets used for different 
current users?

Perhaps initialized in the same places where {{getPermissionChecker}} is being 
invoked, or ideally at a higher level, to avoid every command method having 
to "do the right thing".

bq. bq. Another thought might be an option to tell a UGI to "lock-in" it's 
group list. Something earlier on at a high level, maybe the NN's RPC server, 
could call UserGroupInformation.getCurrentUser().lockGroups().
bq. Not sure I understood this.

"lockGroups" would internally fetch the groups and then make them immutable in 
the UGI.  It could be invoked where {{getPermissionChecker}} is being invoked, 
or ideally at a higher level chokepoint for calls so it's a one-line change.  
Maybe in the rpc call's doAs since a call shouldn't be running long enough that 
the groups will change.  This would inoculate future methods or overlooked 
methods from taking the lookup penalty within a lock.

In either case, I'm just trying to think of how to simplify the change and 
future-proof against similar issues.  Again though, I really like this change.
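
A minimal sketch of the proposed "lock-in" semantics, assuming a made-up 
holder class since UGI has no such lockGroups() method; the point is only that 
the possibly LDAP-backed fetch happens once, before any namesystem lock is 
taken:

{code}
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the proposed lockGroups() semantics. */
class LockedGroupsSketch {
  private volatile List<String> groups;  // immutable once locked in

  /** Fetch once (may hit LDAP) and freeze; call before taking the FSN lock. */
  void lockGroups(List<String> fetchedGroups) {
    groups = Collections.unmodifiableList(fetchedGroups);
  }

  /** Later permission checks read the frozen list and never block on LDAP. */
  List<String> getGroups() {
    return groups;
  }
}
{code}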

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access issues caused by networking problems) while 
> holding the single FSNamesystem lock. That results in the NN becoming 
> unresponsive and losing the heartbeats of DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client will not block other threads by hogging the single lock. This is 
> especially helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. 
> So even if DFS clients hang due to LDAP issues, the NN will still be able to 
> process requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-02-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583252#comment-13583252
 ] 

Hadoop QA commented on HDFS-4510:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12570307/HADOOP-4510-trunk-b.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

  {color:red}-1 one of tests included doesn't have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3991//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3991//console

This message is automatically generated.

> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-2-a.patch, 
> HADOOP-4510-branch-2-b.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4517) Cover class RemoteBlockReader with unit tests

2013-02-21 Thread Vadim Bondarev (JIRA)
Vadim Bondarev created HDFS-4517:


 Summary: Cover class RemoteBlockReader with unit tests
 Key: HDFS-4517
 URL: https://issues.apache.org/jira/browse/HDFS-4517
 Project: Hadoop HDFS
  Issue Type: Test
Affects Versions: 2.0.3-alpha, 3.0.0, 0.23.6
Reporter: Vadim Bondarev




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583299#comment-13583299
 ] 

Suresh Srinivas commented on HDFS-4222:
---

bq. Perhaps init-ed in the same places where getPermissionChecker is being 
invoked, or ideally at a higher level to avoid all command methods from having 
"to do the right".
The only problem is that if subsequent methods are not passed the 
FSPermissionChecker, they might end up calling getPermissionChecker (due to a 
bug) as well, this time inside the lock. With parameter passing, the 
likelihood of that is low.

bq. "lockGroups" would internally fetch the groups and then make them immutable 
in the UGI
We should certainly explore this in a subsequent jira.

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access issues caused by networking problems) while 
> holding the single FSNamesystem lock. That results in the NN becoming 
> unresponsive and losing the heartbeats of DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client will not block other threads by hogging the single lock. This is 
> especially helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. 
> So even if DFS clients hang due to LDAP issues, the NN will still be able to 
> process requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583310#comment-13583310
 ] 

Kihwal Lee commented on HDFS-4222:
--

bq. "lockGroups" would internally fetch the groups and then make them immutable 
in the UGI. 

This will ensure group lookups happen before entering a critical section. But 
since mappings are cached in Groups and the UGI is short-lived, simply calling 
getGroups() early will be enough. One concern with early lookup is whether it 
will generate excessive extra lookups. Some methods in client protocol don't 
need permission checks as they are lease-based. For example, addBlock(), 
complete() and renewLease() are lease-based and their request rate is quite 
high. In a busy grid, they add up to over 450 requests/sec. But again, caching 
may reduce actual lookups, so it may not be a big deal.

If unconditionally forced early lookup is not acceptable, we could selectively 
do early lookup/cache-filling in NameNodeRpcServer for those calls that 
actually need group lookups and leave FSNamesystem mostly unchanged.
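
A hedged sketch of that selective early lookup, with made-up names 
(EarlyGroupLookupSketch, Caller); lease-based calls skip the warm-up while 
other calls fill the Groups cache before the FSNamesystem lock is taken:

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch only; not NameNodeRpcServer code. */
class EarlyGroupLookupSketch {
  /** Lease-based calls that need no permission check (per the comment above). */
  static final Set<String> LEASE_BASED =
      new HashSet<String>(Arrays.asList("addBlock", "complete", "renewLease"));

  interface Caller { void getGroups(); /* fills the Groups cache */ }

  static void maybeWarmGroups(String method, Caller caller) {
    // Skip the (possibly LDAP-backed) lookup for high-rate lease-based calls;
    // everything else resolves groups before the FSNamesystem lock is taken.
    if (!LEASE_BASED.contains(method)) {
      caller.getGroups();
    }
  }
}
{code}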

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access issues caused by networking problems) while 
> holding the single FSNamesystem lock. That results in the NN becoming 
> unresponsive and losing the heartbeats of DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client will not block other threads by hogging the single lock. This is 
> especially helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. 
> So even if DFS clients hang due to LDAP issues, the NN will still be able to 
> process requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4424) fsdataset Mkdirs failed cause nullpointexception and many files in bbw

2013-02-21 Thread Li Junjun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Junjun updated HDFS-4424:


Summary: fsdataset  Mkdirs failed  cause  nullpointexception and many files 
in bbw  (was: fsdataset  Mkdirs failed  cause  nullpointexception and other bad 
 consequence )

> fsdataset  Mkdirs failed  cause  nullpointexception and many files in bbw
> -
>
> Key: HDFS-4424
> URL: https://issues.apache.org/jira/browse/HDFS-4424
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 1.0.1
>Reporter: Li Junjun
>Assignee: Li Junjun
> Fix For: 1.0.1
>
> Attachments: patch.txt
>
>
> File: /hadoop-1.0.1/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
> from line 205:
> {code}
>   if (children == null || children.length == 0) {
> children = new FSDir[maxBlocksPerDir];
> for (int idx = 0; idx < maxBlocksPerDir; idx++) {
>   children[idx] = new FSDir(new File(dir, 
> DataStorage.BLOCK_SUBDIR_PREFIX+idx));
> }
>   }
> {code}
> If the FSDir constructor fails (e.g., space is full, so mkdir fails), the 
> children array is still used!
> Then, when a write comes (after I run the balancer) and an FSDir is chosen 
> at line 192:
> File file = children[idx].addBlock(b, src, false, resetIdx);
> it causes exceptions like this:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.addBlock(FSDataset.java:192)
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.addBlock(FSDataset.java:192)
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.addBlock(FSDataset.java:158)
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.addBlock(FSDataset.java:495)
> {code}
> 
> Should it be like this?
> {code}
>   if (children == null || children.length == 0) {
>     List<FSDir> childrenList = new ArrayList<FSDir>();
>     for (int idx = 0; idx < maxBlocksPerDir; idx++) {
>       try {
>         childrenList.add(new FSDir(new File(dir,
>             DataStorage.BLOCK_SUBDIR_PREFIX + idx)));
>       } catch (Exception e) {
>         // skip the subdir whose mkdir failed instead of keeping a null slot
>       }
>     }
>     children = childrenList.toArray(new FSDir[childrenList.size()]);
>   }
> {code}
> 
> The bad consequence: in my cluster, this datanode's block count became 0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4482) ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file.

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583332#comment-13583332
 ] 

Hudson commented on HDFS-4482:
--

Integrated in Hadoop-trunk-Commit #3373 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3373/])
HDFS-4482. ReplicationMonitor thread can exit with NPE due to the race 
between delete and replication of same file. Contributed by Uma Maheswara Rao 
G. (Revision 1448708)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448708
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java


> ReplicationMonitor thread can exit with NPE due to the race between delete 
> and replication of same file.
> 
>
> Key: HDFS-4482
> URL: https://issues.apache.org/jira/browse/HDFS-4482
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.0.1-alpha
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Blocker
> Attachments: HDFS-4482-1.patch, HDFS-4482.patch, HDFS-4482.patch
>
>
> Trace:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1442)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:269)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:163)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:131)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1157)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1063)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3085)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3047)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}
> What I am seeing here is:
> 1) Create a file and write with 2 DNs.
> 2) Close the file.
> 3) Kill one DN.
> 4) Let replication start.
>   Info:
> {code}
>  // choose replication targets: NOT HOLDING THE GLOBAL LOCK
>   // It is costly to extract the filename for which chooseTargets is 
> called,
>   // so for now we pass in the block collection itself.
>   rw.targets = blockplacement.chooseTarget(rw.bc,
>   rw.additionalReplRequired, rw.srcNode, rw.liveReplicaNodes,
>   excludedNodes, rw.block.getNumBytes());{code}
> Here we are choosing the target outside the global lock. Inside, we try to 
> get the src path from the blockCollection (nothing but an INodeFile here).
> See the code for FSDirectory#getFullPathName:
> first it increments the depth until there is no parent, and later it 
> iterates and accesses the parents again in a second loop.
> 5) Before the second loop in FSDirectory#getFullPathName runs, if the file 
> is deleted by a client, its parent will have been set to null. So accessing 
> the parent here can cause an NPE, because it is not under the lock.
> [~brahmareddy] reported this issue.
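
One way to picture a defensive fix is a single-pass walk that re-checks the 
parent pointer on every hop, so a concurrent delete that nulls a parent ends 
the walk instead of throwing; this is only a hedged sketch with a stand-in 
Node type, not the committed patch:

{code}
/** Hypothetical sketch; Node stands in for INode. */
class PathNameSketch {
  static class Node {
    volatile Node parent;  // a concurrent delete can null this at any time
    String name;
    Node(Node parent, String name) { this.parent = parent; this.name = name; }
  }

  /** Walks up once, tolerating a parent that disappears mid-walk. */
  static String fullPathName(Node inode) {
    StringBuilder sb = new StringBuilder();
    for (Node n = inode; n != null; n = n.parent) {
      sb.insert(0, "/" + n.name);
    }
    return sb.toString();
  }
}
{code}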

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4482) ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file.

2013-02-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-4482:
--

   Resolution: Fixed
Fix Version/s: 2.0.4-beta
   3.0.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-2.

> ReplicationMonitor thread can exit with NPE due to the race between delete 
> and replication of same file.
> 
>
> Key: HDFS-4482
> URL: https://issues.apache.org/jira/browse/HDFS-4482
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.0.1-alpha
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Blocker
> Fix For: 3.0.0, 2.0.4-beta
>
> Attachments: HDFS-4482-1.patch, HDFS-4482.patch, HDFS-4482.patch
>
>
> Trace:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1442)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:269)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:163)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:131)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1157)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1063)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3085)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3047)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}
> What I am seeing here is:
> 1) Create a file and write with 2 DNs.
> 2) Close the file.
> 3) Kill one DN.
> 4) Let replication start.
>   Info:
> {code}
>  // choose replication targets: NOT HOLDING THE GLOBAL LOCK
>   // It is costly to extract the filename for which chooseTargets is 
> called,
>   // so for now we pass in the block collection itself.
>   rw.targets = blockplacement.chooseTarget(rw.bc,
>   rw.additionalReplRequired, rw.srcNode, rw.liveReplicaNodes,
>   excludedNodes, rw.block.getNumBytes());{code}
> Here we are choosing the target outside the global lock. Inside, we try to 
> get the src path from the blockCollection (nothing but an INodeFile here).
> See the code for FSDirectory#getFullPathName:
> first it increments the depth until there is no parent, and later it 
> iterates and accesses the parents again in a second loop.
> 5) Before the second loop in FSDirectory#getFullPathName runs, if the file 
> is deleted by a client, its parent will have been set to null. So accessing 
> the parent here can cause an NPE, because it is not under the lock.
> [~brahmareddy] reported this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4087) Protocol changes for listSnapshots functionality

2013-02-21 Thread Hari Mankude (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583360#comment-13583360
 ] 

Hari Mankude commented on HDFS-4087:


Has the listSnap CLI call been added?

> Protocol changes for listSnapshots functionality
> 
>
> Key: HDFS-4087
> URL: https://issues.apache.org/jira/browse/HDFS-4087
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: Snapshot (HDFS-2802)
>Reporter: Brandon Li
>Assignee: Brandon Li
>  Labels: needs-test
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4087.patch, HDFS-4087.patch, HDFS-4087.patch, 
> HDFS-4087.patch
>
>
> SnapInfo saves information about a snapshot. This jira also updates the Java 
> protocol classes and translation for the listSnapshot operation.
> Given a snapshot root, the snapshots created under it can be listed.
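
A hedged sketch of what such a protocol surface could look like; all names 
here (SnapshotListingSketch, SnapInfo, listSnapshots) are illustrative, not 
the actual patch:

{code}
/** Hypothetical sketch of the protocol surface; all names are illustrative. */
class SnapshotListingSketch {
  static class SnapInfo {
    final String snapshotName;
    final long creationTime;
    SnapInfo(String name, long time) { snapshotName = name; creationTime = time; }
  }

  interface Protocol {
    /** One entry per snapshot created under the given snapshot root. */
    SnapInfo[] listSnapshots(String snapshotRoot);
  }
}
{code}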

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583373#comment-13583373
 ] 

Suresh Srinivas commented on HDFS-4222:
---

bq. If unconditionally forced early lookup is not acceptable...
The current patch could have been much simpler if FSPermissionChecker was 
always created. Let's do further optimizations in another patch.

If someone reviews this and +1s it, I will commit it. I plan on getting this 
into branch-1 as well.

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access issues caused by networking problems) while 
> holding the single FSNamesystem lock. That results in the NN becoming 
> unresponsive and losing the heartbeats of DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client will not block other threads by hogging the single lock. This is 
> especially helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. 
> So even if DFS clients hang due to LDAP issues, the NN will still be able to 
> process requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4497) commons-daemon 1.0.3 dependency has bad group id causing build issues

2013-02-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583394#comment-13583394
 ] 

Sangjin Lee commented on HDFS-4497:
---

HDFS folks, are you OK with the patch? Thanks!

> commons-daemon 1.0.3 dependency has bad group id causing build issues
> -
>
> Key: HDFS-4497
> URL: https://issues.apache.org/jira/browse/HDFS-4497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.0.2-alpha
>Reporter: Sangjin Lee
> Attachments: HDFS-4497.patch
>
>
> The commons-daemon dependency of the hadoop-hdfs module has been at version 
> 1.0.3 for a while. However, 1.0.3 has a pretty well-known groupId error in 
> its pom ("org.apache.commons" as opposed to "commons-daemon"). This problem 
> has since been corrected in commons-daemon, starting with 1.0.4.
> This causes build problems for many who depend on hadoop-hdfs directly or 
> indirectly, however. Maven can skip over this metadata inconsistency, but 
> other, less forgiving build systems such as Ivy and Gradle have a much 
> harder time working around this problem. For example, in Gradle, pretty much 
> the only obvious way to work around it is to override the dependency version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4479) In branch-1, logSync() may be called with the FSNamesystem lock held in commitBlockSynchronization

2013-02-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583398#comment-13583398
 ] 

Suresh Srinivas commented on HDFS-4479:
---

Jing, after the place where logSync is removed, there is a return. Should we 
do logSync only once, possibly in a try-finally?
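
A sketch of the shape being suggested, under made-up types; logSync runs 
exactly once, after the lock is released, on both the early-return and 
fall-through paths:

{code}
/** Hypothetical sketch only; not the branch-1 patch. */
class LogSyncSketch {
  interface Namesystem {
    boolean earlyReturn();
    void logSync();  // must not run while the namesystem lock is held
  }

  static void commitBlockSynchronization(Namesystem ns) {
    try {
      synchronized (ns) {
        // ... mutate namespace state, queue edit-log records ...
        if (ns.earlyReturn()) {
          return;  // the lock is released before the finally block runs
        }
        // ... more mutations ...
      }
    } finally {
      ns.logSync();  // single sync point for every exit path
    }
  }
}
{code}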

> In branch-1, logSync() may be called with the FSNamesystem lock held in 
> commitBlockSynchronization
> --
>
> Key: HDFS-4479
> URL: https://issues.apache.org/jira/browse/HDFS-4479
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4479.b1.001.patch
>
>
> In FSNamesystem#commitBlockSynchronization of branch-1, logSync() may be 
> called while the FSNamesystem lock is held. Similar to HDFS-4186, this may 
> cause a performance issue.
> Since logSync is called right after the synchronization section, we can 
> simply remove the logSync call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4497) commons-daemon 1.0.3 dependency has bad group id causing build issues

2013-02-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583416#comment-13583416
 ] 

Chris Nauroth commented on HDFS-4497:
-

+1 for the current patch, upgrading to 1.0.13.

BTW, I've been doing some testing of my own with 1.0.13 for an unrelated 
reason, and it's working fine.


> commons-daemon 1.0.3 dependency has bad group id causing build issues
> -
>
> Key: HDFS-4497
> URL: https://issues.apache.org/jira/browse/HDFS-4497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.0.2-alpha
>Reporter: Sangjin Lee
> Attachments: HDFS-4497.patch
>
>
> The commons-daemon dependency of the hadoop-hdfs module has been at version 
> 1.0.3 for a while. However, 1.0.3 has a pretty well-known groupId error in 
> its pom ("org.apache.commons" as opposed to "commons-daemon"). This problem 
> has since been corrected in commons-daemon, starting with 1.0.4.
> This causes build problems for many who depend on hadoop-hdfs directly or 
> indirectly, however. Maven can skip over this metadata inconsistency, but 
> other, less forgiving build systems such as Ivy and Gradle have a much 
> harder time working around this problem. For example, in Gradle, pretty much 
> the only obvious way to work around it is to override the dependency version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4479) In branch-1, logSync() may be called with the FSNamesystem lock held in commitBlockSynchronization

2013-02-21 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4479:


Attachment: HDFS-4479.b1.002.patch

Thanks for the comments Suresh! Update the patch based on your comments.

> In branch-1, logSync() may be called with the FSNamesystem lock held in 
> commitBlockSynchronization
> --
>
> Key: HDFS-4479
> URL: https://issues.apache.org/jira/browse/HDFS-4479
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4479.b1.001.patch, HDFS-4479.b1.002.patch
>
>
> In FSNamesystem#commitBlockSynchronization of branch-1, logSync() may be 
> called while the FSNamesystem lock is held. Similar to HDFS-4186, this may 
> cause a performance issue.
> Since logSync is called right after the synchronization section, we can 
> simply remove the logSync call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-02-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583513#comment-13583513
 ] 

Chris Nauroth commented on HDFS-4510:
-

Thank you for addressing the feedback, Vadim.  I tested the new patch 
successfully.  Please disregard my earlier comments about not creating conf as 
a static in {{TestNameNodeJspHelper}}.  I can see now that it's safe for this 
test.

Here are just a few more really minor things:

{code}
  static final class ConfigurationForTestClusterJspHelper
      extends Configuration {
    static {
      addDefaultResource("testClusterJspHelperProp.xml");
    }
  }
{code}

Is the subclass necessary?  I think calling 
{{Configuration#addDefaultResource}} from a static initialization block in 
{{TestClusterJspHelper}} and then using {{new Configuration()}} would have the 
same effect.
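
For instance, the static initialization block described above might look like 
this sketch (the resource name is taken from the patch; the surrounding class 
is illustrative):

{code}
import org.apache.hadoop.conf.Configuration;

public class TestClusterJspHelper {
  static {
    // Registers the test resource for every Configuration created afterwards,
    // making the Configuration subclass unnecessary.
    Configuration.addDefaultResource("testClusterJspHelperProp.xml");
  }

  private final Configuration conf = new Configuration();
}
{code}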

{code}
--- hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hdfs-site.xml
+++ hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hdfs-site.xml
@@ -24,6 +24,5 @@
   <property>
     <name>hadoop.security.authentication</name>
     <value>simple</value>
-  </property>
-
+  </property>
 </configuration>
{code}

Is there actually a change in hdfs-site.xml?  If not, could you remove it from 
the patch?

{code}
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost.localdomain:45541</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
</configuration>
{code}

Minor nitpicks: could you remove the empty element and indent the last 
closing tag 2 spaces instead of 4?

{quote}
-1 one of tests included doesn't have a timeout
{quote}

There was a change submitted just this morning to test-patch.sh to enforce that 
all new tests must specify a timeout in the annotation, i.e. 
@Test(timeout=3), so let's add that to the new tests.
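
For example, in JUnit 4 (the 30-second value below is arbitrary, not a project 
requirement):

{code}
import org.junit.Test;

public class TimeoutExampleTest {
  @Test(timeout = 30000)  // fail the test if it runs longer than 30 seconds
  public void testSomething() {
    // test body elided
  }
}
{code}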

Thanks again!


> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-2-a.patch, 
> HADOOP-4510-branch-2-b.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4518) Finer grained metrics for HDFS capacity

2013-02-21 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-4518:
---

 Summary: Finer grained metrics for HDFS capacity
 Key: HDFS-4518
 URL: https://issues.apache.org/jira/browse/HDFS-4518
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Namenode should export disk usage metrics in bytes via FSNamesystemMetrics.
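
A hedged, API-free sketch of the granularity argument, with made-up names; 
reporting raw bytes avoids the precision lost when values are rounded to 
whole GB:

{code}
/** Hypothetical sketch only; not FSNamesystemMetrics code. */
class CapacityMetricsSketch {
  static final long GB = 1024L * 1024L * 1024L;
  private final long capacityTotalBytes;

  CapacityMetricsSketch(long capacityTotalBytes) {
    this.capacityTotalBytes = capacityTotalBytes;
  }

  /** Byte-granularity gauge, as requested here. */
  long getCapacityTotalBytes() { return capacityTotalBytes; }

  /** The coarser GB view loses everything below one gigabyte. */
  int getCapacityTotalGB() { return (int) (capacityTotalBytes / GB); }
}
{code}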

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583525#comment-13583525
 ] 

Kihwal Lee commented on HDFS-4222:
--

+1 The latest patch looks good to me. Future-proofing can be done separately if 
needed.

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access issues caused by networking problems) while 
> holding the single FSNamesystem lock. That results in the NN becoming 
> unresponsive and losing the heartbeats of DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client will not block other threads by hogging the single lock. This is 
> especially helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. 
> So even if DFS clients hang due to LDAP issues, the NN will still be able to 
> process requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4491) Parallel testing HDFS

2013-02-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583545#comment-13583545
 ] 

Chris Nauroth commented on HDFS-4491:
-

I'm still working out the best way to share full build instructions for 
Windows.  (Maybe some combination of wiki page + a Windows section in 
BUILDING.txt?)

Meanwhile, I'll paste some additional notes right here about the gotchas that I 
expect are most relevant to what you need right now.  The error that you're 
currently experiencing makes me think that you need to set the Platform 
environment variable to x64 (assuming that you're running a 64-bit Windows VM).

Hope this helps,
--Chris

# Make sure your initial directory is very short, or else you will end up 
with some files missing from the code because the file path length is too 
long, e.g., set up Hadoop in c:\hdp\.
# Make sure Windows platform SDKs/compilers are in the path.  Visual Studio 
Professional works, but make sure it's Visual Studio 2010.  The build won't 
work with Visual Studio 2012.  Do not use Visual Studio Express.  It does not 
support compiling for 64-bit, which is problematic if you're running a 64-bit 
VM.  If you don't have Visual Studio or don't have a license, then you can also 
download the SDK for free here:  
http://www.microsoft.com/en-us/download/details.aspx?id=8279.  Then, run your 
builds from a Windows SDK Command Prompt.  (Start, All Programs, Microsoft 
Windows SDK v7.1, Windows SDK 7.1 Command Prompt.)
# Create new system environment variable VCTargetsPath = C:\Program Files 
(x86)\MSBuild\Microsoft.Cpp\v4.0\. You may have to add the following to the 
system PATH - C:\Windows\Microsoft.NET\Framework64\vXYZ.
# Due to a bug in the build, you may need to set an additional environment 
variable, Platform=x64 or Platform=Win32, depending on whether you're running 
a 64-bit or 32-bit VM.
# If you are using the VirtualBox Shared Folders feature to share a copy of the 
codebase between your host and the Windows VM, do NOT attempt a Windows build 
from that shared folder.  VirtualBox presents the volume as a file system that 
does not support Windows symbolic links, and this causes tests to fail.


> Parallel testing HDFS
> -
>
> Key: HDFS-4491
> URL: https://issues.apache.org/jira/browse/HDFS-4491
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Tsuyoshi OZAWA
>Assignee: Andrey Klochkov
> Attachments: HDFS-4491--n2.patch, HDFS-4491.patch
>
>
> Parallel execution of HDFS tests in multiple forks. See HADOOP-9287 for 
> details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4222:
--

Fix Version/s: 2.0.4-beta

> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Fix For: 2.0.4-beta
>
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583552#comment-13583552
 ] 

Suresh Srinivas commented on HDFS-4222:
---

I committed the patch to trunk and branch-2. Keeping this jira open to make 
this change available for branch-1.

Thank you [~teledriver] for the original patch. Kihwal and Daryn, thanks for 
the review.

Kihwal, do you want me to commit this to branch 0.23?



> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583556#comment-13583556
 ] 

Hudson commented on HDFS-4222:
--

Integrated in Hadoop-trunk-Commit #3375 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3375/])
HDFS-4222. NN is unresponsive and loses heartbeats from DNs when configured 
to use LDAP and LDAP has issues. Contributed by Xiaobo Peng and Suresh 
Srinivas. (Revision 1448801)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1448801
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java


> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Fix For: 2.0.4-beta
>
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583584#comment-13583584
 ] 

Kihwal Lee commented on HDFS-4222:
--

bq. Kihwal, do you want me to commit this to branch 0.23?

Yes. Thanks Suresh.

> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Fix For: 2.0.4-beta
>
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4258) Rename of Being Written Files

2013-02-21 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583638#comment-13583638
 ] 

Aaron T. Myers commented on HDFS-4258:
--

bq. I have a hard time understanding the relationship of that with rename in 
this jira. In fact, when recoverLease is done, an active writer will not be 
able to write whether or not a rename has occurred. That behavior is not 
being changed by the changes done as part of this.

I wasn't implying that this proposed change has anything to do with the 
recoverLease API. I was just using it as a counterpoint to Brandon's claim 
that clients would be overpowered since "for any client wants to gain a lease 
of any file, it could just rename it and then open it. Most likely it could 
get the lease immediately as long as it has write permission." I.e., that 
should not be a concern, since a client can already do that today with the 
recoverLease API.
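
For reference, a hedged sketch of that existing API (the path below is 
purely illustrative, and the cast assumes the default filesystem is HDFS):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverLeaseExample {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Returns true once the lease has been recovered and the file is closed,
    // fencing out whichever client was writing it.
    boolean recovered = dfs.recoverLease(new Path("/some/open/file"));
    System.out.println("lease recovered: " + recovered);
  }
}
{code}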

{quote}
There are other ideas that have been proposed about revoking the lease on 
rename. I am -1 on it for the following reasons:
# The current behavior is that when a rename occurs, the current writer 
continues to write to the currently allocated block but fails to allocate new 
blocks.
# The new rename behavior, where the current writer is fenced from writing, 
would be incompatible. Given that a lot of files in HDFS are less than a 
block in length, this could result in strange behaviors for some applications.
# I agree with the point Brandon raised above. Renaming a directory means 
walking through all the files open under it and revoking the leases. Rename 
already is a complicated operation. Doing this additional work during rename 
makes it even heavier and unpredictably large.

I actually like the direction Brandon and Nicholas have taken. We can continue 
the existing behavior. In fact, with this change we could allow the current 
writer to continue allocating new blocks (based on file ID) and continue to 
write, if we want. But that could be done in another jira.
{quote}

I agree with Daryn's reasoning as stated in HDFS-4437:

{quote}
I think supporting file descriptor behavior is a great idea (we've internally 
talked about this). Until we do, I think the lease should be revoked. My 
concerns with fd behavior would be the ever-pervasive "two wrongs make a 
right", where users unintentionally rely on renames breaking writers, and 
ensuring we get the security right to avoid attacks probing for fileids.
{quote}

Regardless of whether or not we implement HDFS-4437 as Daryn proposed, I still 
think we should move the INode ID stuff to a branch. It's a fairly involved 
change with several sub-task JIRAs, which indicates to me that it would be 
better done incrementally on a branch and then merged to trunk once it's a 
completely functional whole.

> Rename of Being Written Files
> -
>
> Key: HDFS-4258
> URL: https://issues.apache.org/jira/browse/HDFS-4258
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 3.0.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Brandon Li
> Attachments: HDFS-4258.patch, HDFS-4258.patch, HDFS-4258.patch, 
> HDFS-4258.patch
>
>
> When a file being written, or one of its ancestor directories, is renamed, 
> the path in the file lease is also renamed.  Then the writer of the file 
> will usually fail since the file path in the writer is not updated.
> Moreover, I think there is a bug, as follows:
> # A client writes 0's to F_0="/foo/file" and writes 1's to F_1="/bar/file" 
> at the same time.
> # Rename /bar to /baz.
> # Rename /foo to /bar.
> Then, writing to F_0 will fail since /foo/file does not exist anymore, but 
> writing to F_1 may succeed since /bar/file exists as a different file.  In 
> such a case, the content of /bar/file could be partly 0's and partly 1's.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4439) umask-mode does not support 4-digit umask value

2013-02-21 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HDFS-4439:
---

Attachment: HDFS-4439.patch

> umask-mode does not support 4-digit umask value
> ---
>
> Key: HDFS-4439
> URL: https://issues.apache.org/jira/browse/HDFS-4439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Andy Isaacson
> Attachments: HDFS-4439.patch
>
>
> Best practice for specifying file permissions using the legacy octal format 
> is to always add a leading "0" to ensure the value is treated as octal.  
> However the {{fs.permissions.umask-mode}} parsing code throws an error when 
> given a 4-digit string:
> {code}
> $ hdfs dfs -Dfs.permissions.umask-mode= -touchz foo.txt
> 2013-01-24 12:49:02,352 WARN  permission.FsPermission 
> (FsPermission.java:getUMask(245)) - Unable to parse configuration 
> fs.permissions.umask-mode with value  as octal or symbolic umask.
> -touchz: Unable to parse configuration fs.permissions.umask-mode with value 
>  as octal or symbolic umask.
> Usage: hadoop fs [generic options] -touchz  ...
> {code}
> There's no downside to supporting {{}}, so hdfs should handle it 
> gracefully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4439) umask-mode does not support 4-digit umask value

2013-02-21 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HDFS-4439:
---

Assignee: Chu Tong
  Status: Patch Available  (was: Open)

I also tested this change locally on my dev cluster.
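
For context, a minimal sketch (hypothetical helper, not the attached patch) 
of how a parser can accept both the 3-digit and the leading-zero 4-digit 
octal forms:

{code}
public class UmaskParser {
  public static short parseOctalUmask(String value) {
    if (value == null || value.isEmpty() || value.length() > 4) {
      throw new IllegalArgumentException(
          "Unable to parse umask '" + value + "' as octal");
    }
    // Radix-8 parsing treats a leading zero as just another digit, so
    // "022" and "0022" both yield 18 decimal (i.e. 022 octal).
    return Short.parseShort(value, 8);
  }

  public static void main(String[] args) {
    System.out.println(parseOctalUmask("022"));  // 18
    System.out.println(parseOctalUmask("0022")); // 18
  }
}
{code}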

> umask-mode does not support 4-digit umask value
> ---
>
> Key: HDFS-4439
> URL: https://issues.apache.org/jira/browse/HDFS-4439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Andy Isaacson
>Assignee: Chu Tong
> Attachments: HDFS-4439.patch
>
>
> Best practice for specifying file permissions using the legacy octal format 
> is to always add a leading "0" to ensure the value is treated as octal.  
> However the {{fs.permissions.umask-mode}} parsing code throws an error when 
> given a 4-digit string:
> {code}
> $ hdfs dfs -Dfs.permissions.umask-mode= -touchz foo.txt
> 2013-01-24 12:49:02,352 WARN  permission.FsPermission 
> (FsPermission.java:getUMask(245)) - Unable to parse configuration 
> fs.permissions.umask-mode with value  as octal or symbolic umask.
> -touchz: Unable to parse configuration fs.permissions.umask-mode with value 
>  as octal or symbolic umask.
> Usage: hadoop fs [generic options] -touchz  ...
> {code}
> There's no downside to supporting {{}}, so hdfs should handle it 
> gracefully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4519) Support override of jsvc binary and log file locations when launching secure datanode.

2013-02-21 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-4519:
---

 Summary: Support override of jsvc binary and log file locations 
when launching secure datanode.
 Key: HDFS-4519
 URL: https://issues.apache.org/jira/browse/HDFS-4519
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, scripts
Affects Versions: 1.2.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth


Currently, builds based on branch-1 bundle a specific version of jsvc pre-built 
for Linux, and the startup scripts hard-code the location of the output and 
error files.  Some deployments may prefer to upgrade to a different version of 
jsvc, independent of the version bundled in Hadoop, and redirect its output 
elsewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4519) Support override of jsvc binary and log file locations when launching secure datanode.

2013-02-21 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583684#comment-13583684
 ] 

Aaron T. Myers commented on HDFS-4519:
--

Hey Chris, is there not a similar issue in branch-2? There may very well not be 
- just checking.

> Support override of jsvc binary and log file locations when launching secure 
> datanode.
> --
>
> Key: HDFS-4519
> URL: https://issues.apache.org/jira/browse/HDFS-4519
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, scripts
>Affects Versions: 1.2.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>
> Currently, builds based on branch-1 bundle a specific version of jsvc 
> pre-built for Linux, and the startup scripts hard-code the location of the 
> output and error files.  Some deployments may prefer to upgrade to a 
> different version of jsvc, independent of the version bundled in Hadoop, and 
> redirect its output elsewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4519) Support override of jsvc binary and log file locations when launching secure datanode.

2013-02-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583687#comment-13583687
 ] 

Chris Nauroth commented on HDFS-4519:
-

trunk already has the capability to set the JSVC_HOME environment variable to 
point at the operator's preferred version of jsvc.  Part of this change will 
back-port support for the JSVC_HOME environment variable to branch-1.  Unlike 
trunk, we will maintain the behavior of defaulting to Hadoop's bundled version 
of jsvc for backwards-compatibility if JSVC_HOME is not specified.  
Additionally, we will introduce JSVC_OUTFILE and JSVC_ERRFILE for redirecting 
the output and error streams of jsvc.

One specific usage of this is to allow use of a new feature added in recent 
jsvc versions to redirect outfile and errfile to stdout and stderr via the 
special arguments '&1' and '&2' respectively.  This feature is not supported in 
the current version bundled in Hadoop, and commons-daemon has stopped providing 
pre-built binary distributions, so we cannot upgrade the bundled version.


> Support override of jsvc binary and log file locations when launching secure 
> datanode.
> --
>
> Key: HDFS-4519
> URL: https://issues.apache.org/jira/browse/HDFS-4519
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, scripts
>Affects Versions: 1.2.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>
> Currently, builds based on branch-1 bundle a specific version of jsvc 
> pre-built for Linux, and the startup scripts hard-code the location of the 
> output and error files.  Some deployments may prefer to upgrade to a 
> different version of jsvc, independent of the version bundled in Hadoop, and 
> redirect its output elsewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4519) Support override of jsvc binary and log file locations when launching secure datanode.

2013-02-21 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-4519:


Attachment: HDFS-4519-branch-1.1.patch

The attached patch introduces JSVC_HOME, JSVC_OUTFILE, and JSVC_ERRFILE 
environment variables in the hadoop script on branch-1.  I've successfully 
tested this change manually on a secure cluster.  I tested overriding to a 
manually built commons-daemon-1.0.13, and I also tested that the default 
behavior of using the bundled version of jsvc is still intact.

Here is an example of launching a secure data node using this feature:

{code}
JSVC_HOME=/home/cnauroth/commons-daemon-1.0.13-src/src/native/unix \
  HADOOP_CLASSPATH=/home/cnauroth/commons-daemon-1.0.13-src/dist/commons-daemon-1.0.13.jar \
  HADOOP_USER_CLASSPATH_FIRST=true \
  JSVC_OUTFILE='&1' \
  JSVC_ERRFILE='&2' \
  HADOOP_SECURE_DN_USER=cnauroth \
  sudo -E bin/hadoop datanode
{code}


> Support override of jsvc binary and log file locations when launching secure 
> datanode.
> --
>
> Key: HDFS-4519
> URL: https://issues.apache.org/jira/browse/HDFS-4519
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, scripts
>Affects Versions: 1.2.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-4519-branch-1.1.patch
>
>
> Currently, builds based on branch-1 bundle a specific version of jsvc 
> pre-built for Linux, and the startup scripts hard-code the location of the 
> output and error files.  Some deployments may prefer to upgrade to a 
> different version of jsvc, independent of the version bundled in Hadoop, and 
> redirect its output elsewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-2350) Secure DN doesn't print output to console when started interactively

2013-02-21 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth reassigned HDFS-2350:
---

Assignee: Chris Nauroth

> Secure DN doesn't print output to console when started interactively
> 
>
> Key: HDFS-2350
> URL: https://issues.apache.org/jira/browse/HDFS-2350
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Chris Nauroth
> Fix For: 0.24.0
>
>
> If one starts a secure DN (using jsvc) interactively, the output is not 
> printed to the console, but instead ends up in {{$HADOOP_LOG_DIR/jsvc.err}} 
> and {{$HADOOP_LOG_DIR/jsvc.out}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4519) Support override of jsvc binary and log file locations when launching secure datanode.

2013-02-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583708#comment-13583708
 ] 

Chris Nauroth commented on HDFS-4519:
-

Hi, Aaron.

I don't see an exact match for this issue, based on the comments I gave above.  
I see HDFS-4497, which upgrades the commons-daemon version dependency on the 
Java side to resolve a build issue.  I've been watching and code reviewing that 
one.  I also found HDFS-2350, an old jira you filed about how secure data nodes 
don't print to console.  I reassigned HDFS-2350 to myself, because the patch I 
posted here will give us the capability to send jsvc outfile/errfile to 
stdout/stderr.

If you see an existing issue that I missed though, please let me know.  Thanks!


> Support override of jsvc binary and log file locations when launching secure 
> datanode.
> --
>
> Key: HDFS-4519
> URL: https://issues.apache.org/jira/browse/HDFS-4519
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, scripts
>Affects Versions: 1.2.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-4519-branch-1.1.patch
>
>
> Currently, builds based on branch-1 bundle a specific version of jsvc 
> pre-built for Linux, and the startup scripts hard-code the location of the 
> output and error files.  Some deployments may prefer to upgrade to a 
> different version of jsvc, independent of the version bundled in Hadoop, and 
> redirect its output elsewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4439) umask-mode does not support 4-digit umask value

2013-02-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583722#comment-13583722
 ] 

Hadoop QA commented on HDFS-4439:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12570374/HDFS-4439.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3992//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3992//console

This message is automatically generated.

> umask-mode does not support 4-digit umask value
> ---
>
> Key: HDFS-4439
> URL: https://issues.apache.org/jira/browse/HDFS-4439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Andy Isaacson
>Assignee: Chu Tong
> Attachments: HDFS-4439.patch
>
>
> Best practice for specifying file permissions using the legacy octal format 
> is to always add a leading "0" to ensure the value is treated as octal.  
> However the {{fs.permissions.umask-mode}} parsing code throws an error when 
> given a 4-digit string:
> {code}
> $ hdfs dfs -Dfs.permissions.umask-mode= -touchz foo.txt
> 2013-01-24 12:49:02,352 WARN  permission.FsPermission 
> (FsPermission.java:getUMask(245)) - Unable to parse configuration 
> fs.permissions.umask-mode with value  as octal or symbolic umask.
> -touchz: Unable to parse configuration fs.permissions.umask-mode with value 
>  as octal or symbolic umask.
> Usage: hadoop fs [generic options] -touchz  ...
> {code}
> There's no downside to supporting {{}}, so hdfs should handle it 
> gracefully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4520) Support listing snapshots under a snapshottable directory using "ls"

2013-02-21 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-4520:
---

 Summary: Support listing snapshots under a snapshottable directory 
using "ls"
 Key: HDFS-4520
 URL: https://issues.apache.org/jira/browse/HDFS-4520
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


Before developing a specific command/tool for listing snapshots with different 
criteria, we can first support snapshot listing using the normal ls command. 
Users can list all the snapshots under a snapshottable directory using "ls 
/snapshottable_dir_path/.snapshot".
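
A minimal sketch of the intended behavior through the public {{FileSystem}} 
API (the path matches the example above and is illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListSnapshots {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // After this change, listing the ".snapshot" path of a snapshottable
    // directory should return one entry per snapshot.
    Path snapshotDir = new Path("/snapshottable_dir_path/.snapshot");
    for (FileStatus status : fs.listStatus(snapshotDir)) {
      System.out.println(status.getPath());
    }
  }
}
{code}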

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4520) Support listing snapshots under a snapshottable directory using "ls"

2013-02-21 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4520:


Attachment: HDFS-4520.000.patch

A simple initial patch.

> Support listing snapshots under a snapshottable directory using "ls"
> 
>
> Key: HDFS-4520
> URL: https://issues.apache.org/jira/browse/HDFS-4520
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4520.000.patch
>
>
> Before developing a specific command/tool for listing snapshots with 
> different criteria, we can first support snapshot listing using the normal 
> ls command. Users can list all the snapshots under a snapshottable directory 
> using "ls /snapshottable_dir_path/.snapshot".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HDFS-4269) DatanodeManager#registerDatanode rejects all datanode registrations from localhost in single-node developer setup

2013-02-21 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik reopened HDFS-4269:
--


Reopening to merge into branch-2

> DatanodeManager#registerDatanode rejects all datanode registrations from 
> localhost in single-node developer setup
> -
>
> Key: HDFS-4269
> URL: https://issues.apache.org/jira/browse/HDFS-4269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, trunk-win
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4269.1.patch, HDFS-4269.2.patch, HDFS-4269.3.patch
>
>
> HDFS-3990 is a change that optimized some redundant DNS lookups.  As part of 
> that change, {{DatanodeManager#registerDatanode}} now rejects attempts to 
> register a datanode for which the name has not been resolved.  Unfortunately, 
> this broke single-node developer setups on Windows, because Windows does not 
> resolve 127.0.0.1 to "localhost".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4269) DatanodeManager#registerDatanode rejects all datanode registrations from localhost in single-node developer setup

2013-02-21 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583783#comment-13583783
 ] 

Konstantin Boudnik commented on HDFS-4269:
--

This original "feature" also breaks Linux especially CentOS with its silly way 
of configuring {{/etc/hosts}}

> DatanodeManager#registerDatanode rejects all datanode registrations from 
> localhost in single-node developer setup
> -
>
> Key: HDFS-4269
> URL: https://issues.apache.org/jira/browse/HDFS-4269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, trunk-win
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4269.1.patch, HDFS-4269.2.patch, HDFS-4269.3.patch
>
>
> HDFS-3990 is a change that optimized some redundant DNS lookups.  As part of 
> that change, {{DatanodeManager#registerDatanode}} now rejects attempts to 
> register a datanode for which the name has not been resolved.  Unfortunately, 
> this broke single-node developer setups on Windows, because Windows does not 
> resolve 127.0.0.1 to "localhost".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-4521:
--

 Summary: invalid network topologies should not be cached
 Key: HDFS-4521
 URL: https://issues.apache.org/jira/browse/HDFS-4521
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.4-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


When the network topology is invalid, the DataNode refuses to start with a 
message such as this:

{quote}
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
172.29.122.23:55886: error:
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network 
topology. You cannot have a rack and a non-rack node at the same level of the 
network topology.
{quote}

This is expected if you specify a topology file or script which puts leaf nodes 
at two different depths.  However, one problem we have now is that this 
incorrect topology is cached forever.  Once the NameNode sees it, this DataNode 
can never be added to the cluster, since this exception will be rethrown each 
time.  The NameNode will not check to see if the topology file or script has 
changed.  We should clear the topology mappings when there is an 
InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4521:
---

Status: Patch Available  (was: Open)

> invalid network topologies should not be cached
> ---
>
> Key: HDFS-4521
> URL: https://issues.apache.org/jira/browse/HDFS-4521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.4-beta
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4521.001.patch
>
>
> When the network topology is invalid, the DataNode refuses to start with a 
> message such as this:
> {quote}
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
> 172.29.122.23:55886: error:
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid 
> network topology. You cannot have a rack and a non-rack node at the same 
> level of the network topology.
> {quote}
> This is expected if you specify a topology file or script which puts leaf 
> nodes at two different depths.  However, one problem we have now is that this 
> incorrect topology is cached forever.  Once the NameNode sees it, this 
> DataNode can never be added to the cluster, since this exception will be 
> rethrown each time.  The NameNode will not check to see if the topology file 
> or script has changed.  We should clear the topology mappings when there is 
> an InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4521:
---

Attachment: HDFS-4521.001.patch

This patch refactors the {{DNSToSwitchMapping}} interface somewhat.  Utility 
methods are now in {{DNSToSwitchManager}}, rather than in 
{{AbstractDNSToSwitchMapping}}.  It makes more sense to use composition here.

This also eliminates the "double caching" that was formerly going on, where we 
had a map in {{CachedDNSToSwitchMapping}} storing the exact same data as was 
present in {{TableMapping}}.

I used the visitor pattern to avoid exposing the data structure that the 
mappings were stored in.

I added the {{clearCachedMappings}} function and implemented it for all 
{{DNSToSwitchMapping}} subclasses. The {{DatanodeManager}} invokes this 
function when there is an {{InvalidTopologyException}}.
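
As a rough illustration of the caching problem and the fix (the class below 
is a simplified stand-in, not the patched Hadoop code):

{code}
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a cached DNS-to-switch mapping.
class CachedRackMapping {
  private final Map<String, String> hostToRack =
      new HashMap<String, String>();

  String resolve(String host) {
    String rack = hostToRack.get(host);
    if (rack == null) {
      // The first resolution is cached; before this fix, an answer derived
      // from a bad topology script was then returned forever.
      rack = runTopologyScript(host);
      hostToRack.put(host, rack);
    }
    return rack;
  }

  void clearCachedMappings() {
    // Called when registration throws InvalidTopologyException, so that a
    // corrected topology script or file is consulted on the next attempt.
    hostToRack.clear();
  }

  private String runTopologyScript(String host) {
    return "/default-rack"; // placeholder for the real script/file lookup
  }
}
{code}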

> invalid network topologies should not be cached
> ---
>
> Key: HDFS-4521
> URL: https://issues.apache.org/jira/browse/HDFS-4521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.4-beta
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4521.001.patch
>
>
> When the network topology is invalid, the DataNode refuses to start with a 
> message such as this:
> {quote}
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
> 172.29.122.23:55886: error:
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid 
> network topology. You cannot have a rack and a non-rack node at the same 
> level of the network topology.
> {quote}
> This is expected if you specify a topology file or script which puts leaf 
> nodes at two different depths.  However, one problem we have now is that this 
> incorrect topology is cached forever.  Once the NameNode sees it, this 
> DataNode can never be added to the cluster, since this exception will be 
> rethrown each time.  The NameNode will not check to see if the topology file 
> or script has changed.  We should clear the topology mappings when there is 
> an InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583812#comment-13583812
 ] 

Hadoop QA commented on HDFS-4521:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12570403/HDFS-4521.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

  {color:red}-1 one of tests included doesn't have a timeout.{color}

{color:red}-1 javac{color}.  The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3993//console

This message is automatically generated.

> invalid network topologies should not be cached
> ---
>
> Key: HDFS-4521
> URL: https://issues.apache.org/jira/browse/HDFS-4521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.4-beta
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4521.001.patch
>
>
> When the network topology is invalid, the DataNode refuses to start with a 
> message such as this:
> {quote}
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
> 172.29.122.23:55886: error:
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid 
> network topology. You cannot have a rack and a non-rack node at the same 
> level of the network topology.
> {quote}
> This is expected if you specify a topology file or script which puts leaf 
> nodes at two different depths.  However, one problem we have now is that this 
> incorrect topology is cached forever.  Once the NameNode sees it, this 
> DataNode can never be added to the cluster, since this exception will be 
> rethrown each time.  The NameNode will not check to see if the topology file 
> or script has changed.  We should clear the topology mappings when there is 
> an InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4521:
---

Attachment: HDFS-4521.002.patch

Fix some build errors, doxygen errors, etc.

> invalid network topologies should not be cached
> ---
>
> Key: HDFS-4521
> URL: https://issues.apache.org/jira/browse/HDFS-4521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.4-beta
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4521.001.patch, HDFS-4521.002.patch
>
>
> When the network topology is invalid, the DataNode refuses to start with a 
> message such as this:
> {quote}
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
> 172.29.122.23:55886: error:
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid 
> network topology. You cannot have a rack and a non-rack node at the same 
> level of the network topology.
> {quote}
> This is expected if you specify a topology file or script which puts leaf 
> nodes at two different depths.  However, one problem we have now is that this 
> incorrect topology is cached forever.  Once the NameNode sees it, this 
> DataNode can never be added to the cluster, since this exception will be 
> rethrown each time.  The NameNode will not check to see if the topology file 
> or script has changed.  We should clear the topology mappings when there is 
> an InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4222:
--

Status: Open  (was: Patch Available)

Canceling the patch to attach the 0.23 patch.

> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 0.23.3, 1.0.0
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Fix For: 2.0.4-beta
>
> Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4222) NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

2013-02-21 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4222:
--

Attachment: HDFS-4222.23.patch

The merge was not straightforward. Kihwal, can you also do a quick review of 
the 0.23 version of the patch?

> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> ---
>
> Key: HDFS-4222
> URL: https://issues.apache.org/jira/browse/HDFS-4222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
>Priority: Minor
> Fix For: 2.0.4-beta
>
> Attachments: HDFS-4222.23.patch, hdfs-4222-branch-0.23.3.patch, 
> HDFS-4222.patch, HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP 
> issues (including LDAP access problems caused by networking issues) while 
> holding the single FSNamesystem lock. That leaves the NN unresponsive and 
> causes it to lose the heartbeats from DNs.
> The place where FSNamesystem calls access LDAP is the instantiation of 
> FSPermissionChecker, which could be moved out of the lock scope since the 
> instantiation does not need the FSNamesystem lock. After the move, a hung 
> DFS client call will not affect other threads by hogging the single lock. 
> This is especially helpful when we use separate RPC servers for 
> ClientProtocol and DatanodeProtocol, since the DatanodeProtocol calls do not 
> need to access LDAP. So even if DFS client calls hang due to LDAP issues, 
> the NN will still be able to process requests (including heartbeats) from 
> DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4521) invalid network topologies should not be cached

2013-02-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583928#comment-13583928
 ] 

Hadoop QA commented on HDFS-4521:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12570411/HDFS-4521.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified test files.

  {color:red}-1 one of tests included doesn't have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing
  org.apache.hadoop.yarn.util.TestRackResolver
  
org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
  org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
  org.apache.hadoop.hdfs.TestDatanodeBlockScanner
  org.apache.hadoop.cli.TestHDFSCLI
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
  org.apache.hadoop.hdfs.server.namenode.TestHostsFiles
  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
  org.apache.hadoop.hdfs.TestReplication

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3994//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3994//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3994//console

This message is automatically generated.

> invalid network topologies should not be cached
> ---
>
> Key: HDFS-4521
> URL: https://issues.apache.org/jira/browse/HDFS-4521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.4-beta
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-4521.001.patch, HDFS-4521.002.patch
>
>
> When the network topology is invalid, the DataNode refuses to start with a 
> message such as this:
> {quote}
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
> 172.29.122.23:55886: error:
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid 
> network topology. You cannot have a rack and a non-rack node at the same 
> level of the network topology.
> {quote}
> This is expected if you specify a topology file or script which puts leaf 
> nodes at two different depths.  However, one problem we have now is that this 
> incorrect topology is cached forever.  Once the NameNode sees it, this 
> DataNode can never be added to the cluster, since this exception will be 
> rethrown each time.  The NameNode will not check to see if the topology file 
> or script has changed.  We should clear the topology mappings when there is 
> an InvalidTopologyException, to prevent this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4269) DatanodeManager#registerDatanode rejects all datanode registrations from localhost in single-node developer setup

2013-02-21 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-4269:


 Target Version/s: 3.0.0, trunk-win, 2.0.4-beta  (was: 3.0.0, trunk-win)
Affects Version/s: 2.0.4-beta

Thanks, Konstantin.  I've confirmed that the same patch applies cleanly to 
branch-2.  I applied it and ran {{TestHDFSFileContextMainOperations}} as a 
quick test.

Adding 2.0.4-beta to Affects Versions and Target Versions.


> DatanodeManager#registerDatanode rejects all datanode registrations from 
> localhost in single-node developer setup
> -
>
> Key: HDFS-4269
> URL: https://issues.apache.org/jira/browse/HDFS-4269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, trunk-win, 2.0.4-beta
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4269.1.patch, HDFS-4269.2.patch, HDFS-4269.3.patch
>
>
> HDFS-3990 is a change that optimized some redundant DNS lookups.  As part of 
> that change, {{DatanodeManager#registerDatanode}} now rejects attempts to 
> register a datanode for which the name has not been resolved.  Unfortunately, 
> this broke single-node developer setups on Windows, because Windows does not 
> resolve 127.0.0.1 to "localhost".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4269) DatanodeManager#registerDatanode rejects all datanode registrations from localhost in single-node developer setup

2013-02-21 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584075#comment-13584075
 ] 

Konstantin Boudnik commented on HDFS-4269:
--

This is great, Chris! I will commit it in the morning, then!

> DatanodeManager#registerDatanode rejects all datanode registrations from 
> localhost in single-node developer setup
> -
>
> Key: HDFS-4269
> URL: https://issues.apache.org/jira/browse/HDFS-4269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, trunk-win, 2.0.4-beta
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4269.1.patch, HDFS-4269.2.patch, HDFS-4269.3.patch
>
>
> HDFS-3990 is a change that optimized some redundant DNS lookups.  As part of 
> that change, {{DatanodeManager#registerDatanode}} now rejects attempts to 
> register a datanode for which the name has not been resolved.  Unfortunately, 
> this broke single-node developer setups on Windows, because Windows does not 
> resolve 127.0.0.1 to "localhost".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira