[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401197#comment-13401197 ]

Hadoop QA commented on HDFS-3535:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533399/hdfs-3535-2.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2700//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2700//console

This message is automatically generated.

> audit logging should log denied accesses as well as permitted ones
> ------------------------------------------------------------------
>
> Key: HDFS-3535
> URL: https://issues.apache.org/jira/browse/HDFS-3535
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Attachments: hdfs-3535-1.txt, hdfs-3535-2.txt, hdfs-3535.txt
>
> FSNamesystem.java logs an audit log entry when a user successfully accesses the filesystem:
> {code}
> logAuditEvent(UserGroupInformation.getLoginUser(),
>               Server.getRemoteIp(),
>               "concat", Arrays.toString(srcs), target, resultingStat);
> {code}
> but there is no similar log when a user attempts to access the filesystem and is denied due to permissions. Competing systems do provide such logging of denied access attempts; we should too.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
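The pattern requested in HDFS-3535 can be sketched as follows: audit on success, and catch the permission failure to audit the denial before rethrowing. This is a minimal self-contained illustration only; the class, the `checkPermission` stand-in, and the log format here are hypothetical, not the actual FSNamesystem API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: audit both permitted and denied accesses.
// Names (AuditSketch, checkPermission, logAuditEvent) are illustrative.
public class AuditSketch {
    static final List<String> auditLog = new ArrayList<>();

    // Stand-in for a permission check that throws on denial.
    static void checkPermission(String user, String path) {
        if (!"alice".equals(user)) {
            throw new SecurityException("Permission denied: " + user + " on " + path);
        }
    }

    static void logAuditEvent(boolean allowed, String user, String cmd, String path) {
        auditLog.add("allowed=" + allowed + " ugi=" + user + " cmd=" + cmd + " src=" + path);
    }

    // Audit on success AND on denial, rethrowing the denial to the caller.
    static void concat(String user, String path) {
        try {
            checkPermission(user, path);
            logAuditEvent(true, user, "concat", path);
        } catch (SecurityException e) {
            logAuditEvent(false, user, "concat", path);
            throw e;
        }
    }

    public static void main(String[] args) {
        concat("alice", "/data/a");
        try {
            concat("bob", "/data/a");
        } catch (SecurityException expected) {
            // denial surfaced to the caller, but audited first
        }
        System.out.println(auditLog.get(0)); // allowed=true ugi=alice cmd=concat src=/data/a
        System.out.println(auditLog.get(1)); // allowed=false ugi=bob cmd=concat src=/data/a
    }
}
```

The key point is that the denial is logged on the exception path, so an audit trail exists even when the operation never completes.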
[jira] [Updated] (HDFS-3559) DFSTestUtil: use Builder class to construct DFSTestUtil instances
[ https://issues.apache.org/jira/browse/HDFS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3559: --- Attachment: HDFS-3559.002.patch * "public static" instead of "static public" class * make some of the instance variables of DFSTestUtil final > DFSTestUtil: use Builder class to construct DFSTestUtil instances > - > > Key: HDFS-3559 > URL: https://issues.apache.org/jira/browse/HDFS-3559 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.1-alpha > > Attachments: HDFS-3559.001.patch, HDFS-3559.002.patch > > > The number of parameters in DFSTestUtil's constructor has grown over time. > It would be nice to have a Builder class similar to MiniDFSClusterBuilder, > which could construct an instance of DFSTestUtil. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
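The Builder idiom described above (replacing a constructor whose parameter list keeps growing) can be sketched like this. The field names and defaults are hypothetical stand-ins, not the actual DFSTestUtil fields:

```java
// Sketch of the Builder idiom HDFS-3559 introduces for DFSTestUtil.
// Fields and defaults here are illustrative, not the actual class.
public class TestUtilSketch {
    // final instance fields, set exactly once by the Builder
    private final String testName;
    private final int numFiles;
    private final int maxLevels;

    private TestUtilSketch(Builder b) {
        this.testName = b.testName;
        this.numFiles = b.numFiles;
        this.maxLevels = b.maxLevels;
    }

    public String describe() {
        return testName + ": files=" + numFiles + ", levels=" + maxLevels;
    }

    // Callers set only the parameters they care about; new parameters
    // can be added without breaking existing call sites.
    public static class Builder {
        private String testName = "test";
        private int numFiles = 10;
        private int maxLevels = 3;

        public Builder setName(String name) { this.testName = name; return this; }
        public Builder setNumFiles(int n) { this.numFiles = n; return this; }
        public Builder setMaxLevels(int l) { this.maxLevels = l; return this; }
        public TestUtilSketch build() { return new TestUtilSketch(this); }
    }

    public static void main(String[] args) {
        TestUtilSketch util = new Builder().setName("myTest").setNumFiles(5).build();
        System.out.println(util.describe()); // myTest: files=5, levels=3
    }
}
```

This also pairs naturally with the patch's other change: once construction goes through a Builder, the instance fields can be made `final`.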
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401190#comment-13401190 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Common-trunk-Commit #2388 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2388/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. 
However, the replica removal policy is still embedded in BlockManager; it needs to be
> separated out into ReplicaPlacementPolicy so that it can be overridden later. The
> Hadoop unit tests also lack coverage of the replica removal policy, so we add it here.
> On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
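The private-to-protected change described in this issue enables the template-method shape sketched below: a subclass reuses the default placement flow and overrides only the topology-specific hook. The classes and methods here are hypothetical illustrations, not the actual HDFS BlockPlacementPolicy API:

```java
// Why changing some private methods to protected matters: a subclass can
// reuse the default logic and override only the topology-specific piece.
// Class and method names here are illustrative, not the actual HDFS API.
public class PlacementSketch {
    static class DefaultPolicy {
        // protected (not private) so subclasses can override or call it
        protected String chooseRack(int replicaIndex) {
            return "/default-rack";
        }

        // the shared placement flow stays in the base class
        public String place(int replicaIndex) {
            return chooseRack(replicaIndex) + "/node" + replicaIndex;
        }
    }

    // A virtualization-aware policy reuses place() and swaps only the hook.
    static class VirtualizationPolicy extends DefaultPolicy {
        @Override
        protected String chooseRack(int replicaIndex) {
            return "/hypervisor-" + (replicaIndex % 2);
        }
    }

    public static void main(String[] args) {
        System.out.println(new DefaultPolicy().place(0));        // /default-rack/node0
        System.out.println(new VirtualizationPolicy().place(1)); // /hypervisor-1/node1
    }
}
```

Had `chooseRack` stayed private, the subclass would have to copy the whole `place` flow instead of overriding one method.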
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401189#comment-13401189 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Common-trunk-Commit #2388 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2388/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
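The check HDFS-3516 adds can be sketched as follows: verify the response's content type looks like JSON before parsing the body, so a bad response fails fast with a useful error instead of a confusing JSON parse failure. The method names and error message here are illustrative, not the actual WebHdfsFileSystem code:

```java
// Sketch of validating the content type before JSON parsing.
// Names here are illustrative, not the actual WebHdfsFileSystem code.
public class ContentTypeSketch {
    static boolean isJson(String contentType) {
        // content types may carry parameters, e.g. "application/json; charset=utf-8"
        return contentType != null
                && contentType.toLowerCase().startsWith("application/json");
    }

    static String parseResponse(String contentType, String body) {
        if (!isJson(contentType)) {
            // e.g. an HTML error page from a proxy would be rejected here
            throw new IllegalStateException("Unexpected content type: " + contentType);
        }
        return body.trim(); // stand-in for the actual JSON parsing step
    }

    public static void main(String[] args) {
        System.out.println(parseResponse("application/json; charset=utf-8", " {\"ok\":true} "));
        try {
            parseResponse("text/html", "<html>error page</html>");
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage()); // Unexpected content type: text/html
        }
    }
}
```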
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401185#comment-13401185 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Hdfs-trunk-Commit #2457 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2457/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401186#comment-13401186 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Hdfs-trunk-Commit #2457 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2457/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. 
Also, the Hadoop unit tests lack coverage of the replica removal policy, so
> we add it here. On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401184#comment-13401184 ] Uma Maheswara Rao G commented on HDFS-3541: --- For the comment: {quote} 2. This chunk of code confuses me, since you don't use written again after the loop, and there doesn't seem to be any need to call write(...) many times: {quote} try using the util APIs already available for writing data. @Kihwal, good point, worth asserting block finalization. > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block recovery was initiated while a write was in progress on the Datanode side. Found a deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401175#comment-13401175 ] Hadoop QA commented on HDFS-2617: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533408/hdfs-2617-1.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2699//console This message is automatically generated. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3551: - Status: Patch Available (was: Open) > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 2.0.0-alpha, 1.0.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch, h3551_20120625.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3551: - Attachment: h3551_20120625.patch h3551_20120625: adds a test > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch, h3551_20120625.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401166#comment-13401166 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2407 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2407/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. 
However, the replica removal policy is still embedded in BlockManager; it needs to be
> separated out into ReplicaPlacementPolicy so that it can be overridden later. The
> Hadoop unit tests also lack coverage of the replica removal policy, so we add it here.
> On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401163#comment-13401163 ] Vinay commented on HDFS-3507: - Thanks Aaron, I really did not know about it. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the active NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3516: - Resolution: Fixed Fix Version/s: 2.0.1-alpha Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed this. > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3498: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Junping! > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401138#comment-13401138 ] Tsz Wo (Nicholas), SZE commented on HDFS-3498: -- +1 The v5 patch looks good. Since the changes are minor, I will commit it without waiting for Jenkins again. > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401136#comment-13401136 ] Junping Du commented on HDFS-3498: -- Thanks. Nicholas. I add a few comments of javadoc in new patch (without code change). > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3498: - Attachment: HDFS-3498-v5.patch > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401135#comment-13401135 ] Kihwal Lee commented on HDFS-3541: -- The patch looks okay but I was wondering whether the test can be improved. The test in the current patch does not directly recreate the original race condition. Probably an artificial deadlock can be created by creating a thread which does sleep and then kills the writer inside a {{synchronized(datanode.data)}} block. While it's sleeping, another thread could try closing the {{DFSOutputStream}}. This should fail when the writer (i.e. the {{DataXceiver}} thread) is killed and streams get closed. After this we could verify the block is not finalized. Then we know the {{PacketResponder}} thread didn't finalize the block. Does it make sense? > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block Recovery initiated while write in progress at Datanode side. Found a > lock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
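The contention shape Kihwal describes can be illustrated generically: one thread holds a coarse lock (standing in for the `synchronized(datanode.data)` block) while it sleeps, and a second thread that needs the same lock to close the stream cannot make progress. This is a hedged, self-contained sketch, not the DataNode code; it uses `tryLock` with a timeout and latches so the blockage can be observed deterministically instead of actually deadlocking:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustration of the lock-contention shape in the proposed test: a
// "recovery" thread holds the coarse lock while a "close" path needs it.
// All names here are illustrative, not the actual DataNode internals.
public class DeadlockSketch {
    public static boolean closeBlockedWhileLockHeld() throws InterruptedException {
        ReentrantLock dataLock = new ReentrantLock();  // stands in for datanode.data
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread recovery = new Thread(() -> {
            dataLock.lock();           // "recovery" takes the coarse lock...
            try {
                lockHeld.countDown();
                release.await();       // ...and holds it (the sleep in the scenario)
            } catch (InterruptedException ignored) {
            } finally {
                dataLock.unlock();
            }
        });
        recovery.start();
        lockHeld.await();

        // The "close" path needs the same lock; the timeout lets us observe
        // the blockage rather than hanging the test forever.
        boolean acquired = dataLock.tryLock(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            dataLock.unlock();
        }
        release.countDown();
        recovery.join();
        return !acquired;              // true: close was blocked while lock was held
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("close blocked: " + closeBlockedWhileLockHeld());
    }
}
```

In a real regression test, the step after observing the blocked close would be the assertion Kihwal suggests: verify the block was not finalized by the PacketResponder.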
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401133#comment-13401133 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2406 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2406/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it.
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3498: - Component/s: (was: data-node) name-node Hadoop Flags: Reviewed +1 patch looks good. > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so that it can be overridden later. > The Hadoop unit tests also lack coverage of the replica removal policy, so > we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic and applies to other topology > cases such as virtualization. We want the code in > ReplicaPlacementPolicyDefault to be reusable as much as possible, so a few of > its methods were changed from private to protected.
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3498: - Summary: Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass (was: Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass) > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so that it can be overridden later. > The Hadoop unit tests also lack coverage of the replica removal policy, so > we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic and applies to other topology > cases such as virtualization. We want the code in > ReplicaPlacementPolicyDefault to be reusable as much as possible, so a few of > its methods were changed from private to protected.
[jira] [Commented] (HDFS-2988) Improve error message when storage directory lock fails
[ https://issues.apache.org/jira/browse/HDFS-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401131#comment-13401131 ] Hadoop QA commented on HDFS-2988: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533385/HDFS-2988.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2697//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2697//console This message is automatically generated. > Improve error message when storage directory lock fails > --- > > Key: HDFS-2988 > URL: https://issues.apache.org/jira/browse/HDFS-2988 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Todd Lipcon >Priority: Minor > Labels: newbie > Attachments: HDFS-2988.patch, HDFS-2988.patch, HDFS-2988.patch > > > Currently, the error message is fairly opaque to a non-developer ("Cannot > lock storage" or something). 
Instead, we should have some improvements: > - when we create the in_use.lock file, we should write the hostname/PID that > locked the file > - if the lock fails, and in_use.lock exists, the error message should say > something like "It appears that another namenode (pid 23423 on host > foo.example.com) has already locked the storage directory." > - if the lock fails, and no lock file exists, the error message should say > something like "if this storage directory is mounted via NFS, ensure that the > appropriate nfs lock services are running."
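The in_use.lock proposal above can be sketched in plain Java. This is only an illustration of the idea under stated assumptions, not the actual NameNode code; the class and method names (StorageLock, tryLockStorage) are hypothetical, and the real patch would live inside the Storage/StorageDirectory classes:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Illustrative sketch of the in_use.lock proposal; names are hypothetical. */
public class StorageLock {

  /**
   * Tries to lock dir/in_use.lock. On success, writes "pid@hostname" into the
   * file and returns null; on failure, returns the friendlier error message
   * proposed in the JIRA.
   */
  public static String tryLockStorage(File dir) throws IOException {
    File lockFile = new File(dir, "in_use.lock");
    RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
    FileChannel channel = raf.getChannel();
    FileLock lock = channel.tryLock();
    if (lock != null) {
      // Record who holds the lock so a later failure can name the holder.
      // On HotSpot JVMs this returns "pid@hostname".
      String holder = ManagementFactory.getRuntimeMXBean().getName();
      channel.truncate(0);
      channel.write(ByteBuffer.wrap(holder.getBytes(StandardCharsets.UTF_8)));
      channel.force(true);
      return null; // channel stays open on purpose: closing it releases the lock
    }
    raf.close();
    String holder =
        new String(Files.readAllBytes(lockFile.toPath()), StandardCharsets.UTF_8).trim();
    if (!holder.isEmpty()) {
      return "It appears that another namenode (" + holder
          + ") has already locked the storage directory " + dir + ".";
    }
    return "Cannot lock storage " + dir + ". If this storage directory is"
        + " mounted via NFS, ensure that the appropriate nfs lock services are running.";
  }

  /** Usage example: a fresh directory should lock cleanly. */
  public static boolean selfTest() {
    try {
      File dir = Files.createTempDirectory("storage").toFile();
      return tryLockStorage(dir) == null;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "locked" : "lock failed");
  }
}
```

One caveat of a FileLock-based sketch: within a single JVM, a second tryLock on the same file throws OverlappingFileLockException rather than returning null, so the "another namenode" path is only exercised across processes.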
[jira] [Created] (HDFS-3566) Custom Replication Policy for Azure
Sumadhur Reddy Bolli created HDFS-3566: -- Summary: Custom Replication Policy for Azure Key: HDFS-3566 URL: https://issues.apache.org/jira/browse/HDFS-3566 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli Azure has logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned, and the possibility of data loss is high. An upgrade domain can be taken down by Azure periodically for maintenance. Each time an upgrade domain is taken down, a small percentage of the machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 datanodes going down at the same time would mean potential data loss. So, it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, while the default policy in Hadoop is one-dimensional. This policy would spread the replicas across at least 2 fault domains and three upgrade domains to prevent data loss.
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401078#comment-13401078 ] Hadoop QA commented on HDFS-3507: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532364/HDFS-3507.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2698//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2698//console This message is automatically generated. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the Active NN.
[jira] [Created] (HDFS-3565) Fix streaming job failures with WindowsResourceCalculatorPlugin
Bikas Saha created HDFS-3565: Summary: Fix streaming job failures with WindowsResourceCalculatorPlugin Key: HDFS-3565 URL: https://issues.apache.org/jira/browse/HDFS-3565 Project: Hadoop HDFS Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Some streaming jobs use local mode job runs that do not start task trackers. In these cases, the jvm context is not set up, and hence local mode execution causes the code to crash. The fix is either to not use ResourceCalculatorPlugin in such cases, or to make local job runs create dummy jvm contexts. We choose the first option because that is the current implicit behavior on Linux: the ProcfsBasedProcessTree (used inside the LinuxResourceCalculatorPlugin) does no real work when the process pid is not set up correctly, which is what happens in local job mode runs.
[jira] [Resolved] (HDFS-2386) with security enabled fsck calls lead to handshake_failure and hftp fails throwing the same exception in the logs
[ https://issues.apache.org/jira/browse/HDFS-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-2386. - Resolution: Invalid Fixed via HDFS-2617. > with security enabled fsck calls lead to handshake_failure and hftp fails > throwing the same exception in the logs > - > > Key: HDFS-2386 > URL: https://issues.apache.org/jira/browse/HDFS-2386 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.20.205.0 >Reporter: Arpit Gupta >
[jira] [Updated] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-2617: Attachment: hdfs-2617-1.1.patch Here's the same patch resolving some conflicts for branch-1. This compiles, but I still need to test it out. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386).
[jira] [Created] (HDFS-3564) Make the replication policy pluggable to allow custom replication policies
Sumadhur Reddy Bolli created HDFS-3564: -- Summary: Make the replication policy pluggable to allow custom replication policies Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli ReplicationTargetChooser currently determines the placement of replicas in hadoop. Making the replication policy pluggable would help in having custom replication policies that suit the environment. Eg1: Enabling placing replicas across different datacenters (not just racks) Eg2: Enabling placing replicas across multiple (more than 2) racks Eg3: Cloud environments like Azure have logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned, and the possibility of data loss is high. An upgrade domain can be taken down by Azure periodically for maintenance. Each time an upgrade domain is taken down, a small percentage of the machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 datanodes going down at the same time would mean potential data loss. So, it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, while the default policy in Hadoop is one-dimensional. Custom policies to address issues like these can be written if we make the policy pluggable.
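As a rough illustration of what a custom two-dimensional policy could do, here is a self-contained Java sketch. The Node and chooseTargets names are made up for this example; a real implementation would plug into whatever placement-policy interface this JIRA ends up exposing. The greedy rule simply prefers candidates that add a fault domain or upgrade domain not yet covered by the chosen set:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of a two-dimensional placement policy; API is hypothetical. */
public class TwoDimensionalPlacement {

  public static class Node {
    final String name;
    final int faultDomain;
    final int upgradeDomain;

    public Node(String name, int fd, int ud) {
      this.name = name;
      this.faultDomain = fd;
      this.upgradeDomain = ud;
    }
  }

  /**
   * Greedily picks {@code replicas} nodes, preferring nodes that add a new
   * fault domain or a new upgrade domain to the set chosen so far.
   */
  public static List<Node> chooseTargets(List<Node> candidates, int replicas) {
    List<Node> chosen = new ArrayList<>();
    Set<Integer> faultDomains = new HashSet<>();
    Set<Integer> upgradeDomains = new HashSet<>();
    while (chosen.size() < replicas) {
      Node best = null;
      int bestScore = -1;
      for (Node n : candidates) {
        if (chosen.contains(n)) continue;
        // Score = number of new domains (0..2) this node would add.
        int score = (faultDomains.contains(n.faultDomain) ? 0 : 1)
                  + (upgradeDomains.contains(n.upgradeDomain) ? 0 : 1);
        if (score > bestScore) {
          best = n;
          bestScore = score;
        }
      }
      if (best == null) break; // fewer candidates than requested replicas
      chosen.add(best);
      faultDomains.add(best.faultDomain);
      upgradeDomains.add(best.upgradeDomain);
    }
    return chosen;
  }

  /** Usage example: 6 nodes spread over 2 fault domains and 3 upgrade domains. */
  public static boolean selfTest() {
    List<Node> nodes = new ArrayList<>();
    for (int fd = 0; fd < 2; fd++)
      for (int ud = 0; ud < 3; ud++)
        nodes.add(new Node("n" + fd + ud, fd, ud));
    List<Node> picked = chooseTargets(nodes, 3);
    Set<Integer> fds = new HashSet<>();
    Set<Integer> uds = new HashSet<>();
    for (Node n : picked) {
      fds.add(n.faultDomain);
      uds.add(n.upgradeDomain);
    }
    // The goal from the description: at least 2 fault domains, 3 upgrade domains.
    return picked.size() == 3 && fds.size() >= 2 && uds.size() == 3;
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "placement spread ok" : "placement spread failed");
  }
}
```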
[jira] [Commented] (HDFS-3559) DFSTestUtil: use Builder class to construct DFSTestUtil instances
[ https://issues.apache.org/jira/browse/HDFS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401037#comment-13401037 ] Aaron T. Myers commented on HDFS-3559: -- Patch looks really good to me. Just a few little nits: # I think doing "public static class" is a little more common than "static public class" throughout the project. # Seems like the instance vars in DFSTestUtil can reasonably be made final. +1 once these are addressed. > DFSTestUtil: use Builder class to construct DFSTestUtil instances > - > > Key: HDFS-3559 > URL: https://issues.apache.org/jira/browse/HDFS-3559 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.1-alpha > > Attachments: HDFS-3559.001.patch > > > The number of parameters in DFSTestUtil's constructor has grown over time. > It would be nice to have a Builder class similar to MiniDFSClusterBuilder, > which could construct an instance of DFSTestUtil.
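The Builder shape being reviewed here, with both nits applied ("public static class" and final instance variables), might look roughly like this standalone sketch. The field and setter names are illustrative, not the real DFSTestUtil API:

```java
/** Hypothetical sketch of the Builder pattern discussed in the review. */
public class TestUtil {
  // Final instance vars, per the review comment.
  private final String testName;
  private final int numFiles;
  private final int maxLevels;
  private final int maxSize;

  // Private constructor: instances are only created via the Builder.
  private TestUtil(Builder b) {
    this.testName = b.testName;
    this.numFiles = b.numFiles;
    this.maxLevels = b.maxLevels;
    this.maxSize = b.maxSize;
  }

  public String describe() {
    return testName + ": " + numFiles + " files, depth " + maxLevels
        + ", maxSize " + maxSize;
  }

  /** "public static class" ordering, per the review comment. */
  public static class Builder {
    private String testName = "test";
    private int numFiles = 1;
    private int maxLevels = 3;
    private int maxSize = 8192;

    public Builder setName(String name) { this.testName = name; return this; }
    public Builder setNumFiles(int n) { this.numFiles = n; return this; }
    public Builder setMaxLevels(int n) { this.maxLevels = n; return this; }
    public Builder setMaxSize(int n) { this.maxSize = n; return this; }

    public TestUtil build() { return new TestUtil(this); }
  }

  public static void main(String[] args) {
    // Usage example: override only the parameters a test cares about.
    TestUtil util = new TestUtil.Builder().setName("testCase").setNumFiles(10).build();
    System.out.println(util.describe());
  }
}
```

The appeal over a many-argument constructor is that each test names only the parameters it overrides, so adding a new parameter later does not break existing call sites.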
[jira] [Updated] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HDFS-3535: Attachment: hdfs-3535-2.txt Attaching hdfs-3535-2.txt adopting @Before/@After annotations. > audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535-2.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too.
[jira] [Commented] (HDFS-3526) Standy NameNode is entering into Safemode even after HDFS-2914 due to resources low
[ https://issues.apache.org/jira/browse/HDFS-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401022#comment-13401022 ] Aaron T. Myers commented on HDFS-3526: -- Hi Vinay, I'm not sure I agree with the premise of this JIRA. The issue that precipitated HDFS-2914 was that if the shared edits dir temporarily disappeared, the Standby NN should not enter safemode. If the standby starts up fresh and its disks are full, I see no reason it shouldn't go into safemode. Thoughts? > Standy NameNode is entering into Safemode even after HDFS-2914 due to > resources low > --- > > Key: HDFS-3526 > URL: https://issues.apache.org/jira/browse/HDFS-3526 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Vinay > Attachments: HDFS-3526.patch > > > Scenario: > = > Start ANN and SNN with one DN > Make the SNN's disk 100% full > Now restart the SNN. > The SNN enters safemode, but it should not, according to HDFS-2914.
[jira] [Commented] (HDFS-3461) HFTP should use the same port & protocol for getting the delegation token
[ https://issues.apache.org/jira/browse/HDFS-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401013#comment-13401013 ] Owen O'Malley commented on HDFS-3461: - This is the 1.1 branch with HDFS-2617 applied. > HFTP should use the same port & protocol for getting the delegation token > - > > Key: HDFS-3461 > URL: https://issues.apache.org/jira/browse/HDFS-3461 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 1.1.0 > > > Currently, hftp uses http to the Namenode's https port, which doesn't work.
[jira] [Commented] (HDFS-3554) TestRaidNode is failing
[ https://issues.apache.org/jira/browse/HDFS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401009#comment-13401009 ] Weiyan Wang commented on HDFS-3554: --- Do you mean I should use MiniMRYarnCluster instead of MiniMRCluster? Is there any example I could follow to start a job history server? > TestRaidNode is failing > --- > > Key: HDFS-3554 > URL: https://issues.apache.org/jira/browse/HDFS-3554 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid, test >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > After MAPREDUCE-3868 re-enabled raid, TestRaidNode has been failing in > Jenkins builds.
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401006#comment-13401006 ] Aaron T. Myers commented on HDFS-3507: -- Hi Vinay, merely marking a patch open/PA again won't trigger another build. You either need to attach another file (it could have the same content) or get someone to kick the HDFS pre-commit build. I've just done the latter for you. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the Active NN.
[jira] [Resolved] (HDFS-1469) TestBlockTokenWithDFS fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1469. --- Resolution: Cannot Reproduce > TestBlockTokenWithDFS fails on trunk > > > Key: HDFS-1469 > URL: https://issues.apache.org/jira/browse/HDFS-1469 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Konstantin Boudnik >Priority: Blocker > Attachments: failed-TestBlockTokenWithDFS.txt, log.gz > > > TestBlockTokenWithDFS is failing on trunk: > Testcase: testAppend took 31.569 sec > FAILED > null > junit.framework.AssertionFailedError: null > at > org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS.testAppend(TestBlockTokenWithDFS.java:223)
[jira] [Commented] (HDFS-3170) Add more useful metrics for write latency
[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400985#comment-13400985 ] Hadoop QA commented on HDFS-3170: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533205/hdfs-3170.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//console This message is automatically generated. > Add more useful metrics for write latency > - > > Key: HDFS-3170 > URL: https://issues.apache.org/jira/browse/HDFS-3170 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Matthew Jacobs > Attachments: hdfs-3170.txt > > > Currently, the only write-latency related metric we expose is the total > amount of time taken by opWriteBlock. This is practically useless, since (a) > different blocks may be wildly different sizes, and (b) if the writer is only > generating data slowly, it will make a block write take longer by no fault of > the DN. 
I would like to propose two new metrics: > 1) *flush-to-disk time*: count how long it takes for each call to flush an > incoming packet to disk (including the checksums). In most cases this will be > close to 0, as it only flushes to buffer cache, but if the backing block > device enters congested writeback, it can take much longer, which provides an > interesting metric. > 2) *round trip to downstream pipeline node*: track the round trip latency for > the part of the pipeline between the local node and its downstream neighbors. > When we add a new packet to the ack queue, save the current timestamp. When > we receive an ack, update the metric based on how long since we sent the > original packet. This gives a metric of the total RTT through the pipeline. > If we also include this metric in the ack to upstream, we can subtract the > amount of time due to the later stages in the pipeline and have an accurate > count of this particular link.
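Metric (2) above can be sketched in plain Java: record a timestamp when a packet joins the ack queue, and compute the elapsed time when the matching ack arrives. This is an illustration of the bookkeeping only, with made-up names; the real change would live in the DataNode's PacketResponder and its metrics registry:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of the ack round-trip metric idea; names are illustrative, not the DN's real API. */
public class AckRttMetric {
  // Send timestamps for packets awaiting acks; acks come back in order.
  private final Deque<Long> sendTimesNanos = new ArrayDeque<>();
  private long totalRttNanos = 0;
  private long ackCount = 0;

  /** Called when a packet is appended to the ack queue. */
  public synchronized void packetSent() {
    sendTimesNanos.addLast(System.nanoTime());
  }

  /** Called when the corresponding ack arrives from downstream. */
  public synchronized void ackReceived() {
    Long sent = sendTimesNanos.pollFirst();
    if (sent == null) return; // ack with no recorded send; ignore
    totalRttNanos += System.nanoTime() - sent;
    ackCount++;
  }

  public synchronized double avgRttMillis() {
    return ackCount == 0 ? 0.0 : (totalRttNanos / 1e6) / ackCount;
  }

  /** Usage example: one packet with ~20 ms of simulated pipeline delay. */
  public static boolean selfTest() {
    AckRttMetric m = new AckRttMetric();
    m.packetSent();
    try {
      Thread.sleep(20); // stand-in for downstream pipeline latency
    } catch (InterruptedException e) {
      return false;
    }
    m.ackReceived();
    return m.avgRttMillis() >= 15.0;
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "rtt metric ok" : "rtt metric failed");
  }
}
```

Subtracting the downstream link's reported RTT, as the description suggests, would then isolate the latency of just the local hop.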
[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400963#comment-13400963 ] Andy Isaacson commented on HDFS-3535: - {quote} Forgot to mention, in TestAuditLogs use @Before and @After to setup/teardown cluster and fs in one place (see other tests for an example) {quote} Thanks, that makes the tests a lot nicer! I'll post a new patch using that. > audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too.
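The shape of the change under discussion is to thread the request's outcome through the audit call, so a denied access (e.g. a caught permission failure) logs a line just like a permitted one. The sketch below is a standalone illustration with a hypothetical class, not the actual FSNamesystem patch; the tab-separated key=value layout merely mirrors the general audit-log style:

```java
/** Sketch of audit logging that records denied accesses too; class is hypothetical. */
public class AuditLogger {
  private final StringBuilder log = new StringBuilder();

  /**
   * One audit line per request. 'succeeded' distinguishes permitted accesses
   * (true) from ones denied due to permissions (false).
   */
  public void logAuditEvent(boolean succeeded, String user, String ip,
                            String cmd, String src, String dst) {
    log.append("allowed=").append(succeeded)
       .append("\tugi=").append(user)
       .append("\tip=").append(ip)
       .append("\tcmd=").append(cmd)
       .append("\tsrc=").append(src)
       .append("\tdst=").append(dst)
       .append('\n');
  }

  public String contents() {
    return log.toString();
  }

  public static void main(String[] args) {
    AuditLogger audit = new AuditLogger();
    // Permitted access, as in the existing code path.
    audit.logAuditEvent(true, "hdfs", "10.0.0.1", "concat", "[/a,/b]", "/c");
    // Denied access: same call, made from the permission-failure path.
    audit.logAuditEvent(false, "guest", "10.0.0.2", "concat", "[/a,/b]", "/c");
    System.out.print(audit.contents());
  }
}
```

In FSNamesystem terms, the denied case would be invoked from wherever the permission check fails, before the exception is rethrown to the client.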
[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400948#comment-13400948 ] Owen O'Malley commented on HDFS-2617: - The patch in HDP-1 is just the one above. I have a variant of it for Hadoop 1.1 that I'll upload shortly. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386).
[jira] [Updated] (HDFS-2988) Improve error message when storage directory lock fails
[ https://issues.apache.org/jira/browse/HDFS-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miomir Boljanovic updated HDFS-2988: Attachment: HDFS-2988.patch Supposedly, the previous patch caused org.apache.hadoop.hdfs.TestDatanodeBlockScanner to fail because I wrongly instantiated StorageDirectory. I realized afterwards that MiniDFSCluster should be used to instantiate StorageDirectory. > Improve error message when storage directory lock fails > --- > > Key: HDFS-2988 > URL: https://issues.apache.org/jira/browse/HDFS-2988 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Todd Lipcon >Priority: Minor > Labels: newbie > Attachments: HDFS-2988.patch, HDFS-2988.patch, HDFS-2988.patch > > > Currently, the error message is fairly opaque to a non-developer ("Cannot > lock storage" or something). Instead, we should have some improvements: > - when we create the in_use.lock file, we should write the hostname/PID that > locked the file > - if the lock fails, and in_use.lock exists, the error message should say > something like "It appears that another namenode (pid 23423 on host > foo.example.com) has already locked the storage directory." > - if the lock fails, and no lock file exists, the error message should say > something like "if this storage directory is mounted via NFS, ensure that the > appropriate nfs lock services are running." -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
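The first improvement suggested in HDFS-2988, writing the hostname/PID of the lock holder into in_use.lock, can be sketched as follows. This is purely a hypothetical illustration (LockInfoSketch and its method names are invented, not the actual Storage code); it relies on the HotSpot convention that RuntimeMXBean.getName() returns "pid@hostname":

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LockInfoSketch {
    // On HotSpot JVMs, RuntimeMXBean.getName() conventionally returns
    // "pid@hostname", which is exactly the identity the proposed error
    // message wants to report.
    static String lockHolderInfo() {
        return ManagementFactory.getRuntimeMXBean().getName();
    }

    // Record the holder in the lock file so a later, failed locker can say
    // "another namenode (pid ... on host ...) has already locked this directory".
    static void writeLockInfo(Path lockFile) throws IOException {
        Files.write(lockFile, lockHolderInfo().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempFile("in_use", ".lock");
        writeLockInfo(lock);
        System.out.println("lock holder: "
                + new String(Files.readAllBytes(lock), StandardCharsets.UTF_8));
    }
}
```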
[jira] [Updated] (HDFS-3170) Add more useful metrics for write latency
[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Jacobs updated HDFS-3170: - Status: Patch Available (was: Open) The attached patch adds the write-latency related metrics described in this JIRA. The tests verify that the metrics are added. I manually checked that the averaged latency values were reasonable. For example, I added a sleep before taking the ack end time and then verified that the resulting metric (via jmx) was greater than the sleep time. > Add more useful metrics for write latency > - > > Key: HDFS-3170 > URL: https://issues.apache.org/jira/browse/HDFS-3170 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Matthew Jacobs > Attachments: hdfs-3170.txt > > > Currently, the only write-latency related metric we expose is the total > amount of time taken by opWriteBlock. This is practically useless, since (a) > different blocks may be wildly different sizes, and (b) if the writer is only > generating data slowly, it will make a block write take longer by no fault of > the DN. I would like to propose two new metrics: > 1) *flush-to-disk time*: count how long it takes for each call to flush an > incoming packet to disk (including the checksums). In most cases this will be > close to 0, as it only flushes to buffer cache, but if the backing block > device enters congested writeback, it can take much longer, which provides an > interesting metric. > 2) *round trip to downstream pipeline node*: track the round trip latency for > the part of the pipeline between the local node and its downstream neighbors. > When we add a new packet to the ack queue, save the current timestamp. When > we receive an ack, update the metric based on how long since we sent the > original packet. This gives a metric of the total RTT through the pipeline. 
> If we also include this metric in the ack to upstream, we can subtract the > amount of time due to the later stages in the pipeline and have an accurate > count of this particular link. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
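The bookkeeping for the proposed round-trip metric, save a timestamp when a packet joins the ack queue and subtract it when the ack returns, can be sketched with a simple FIFO of send times. This is an illustrative data structure only, not the actual PacketResponder code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PipelineRttSketch {
    // Send timestamps for packets still awaiting acks; acks return in FIFO
    // order, so the head of the queue always matches the next ack.
    private final Deque<Long> sendTimesNanos = new ArrayDeque<>();
    private long lastRttNanos = -1;

    // Packet added to the ack queue: remember when it was sent downstream.
    public void packetSent(long nowNanos) {
        sendTimesNanos.addLast(nowNanos);
    }

    // Ack received from downstream: RTT is "now" minus the saved send time.
    public void ackReceived(long nowNanos) {
        lastRttNanos = nowNanos - sendTimesNanos.removeFirst();
    }

    public long lastRttNanos() {
        return lastRttNanos;
    }

    public static void main(String[] args) {
        PipelineRttSketch rtt = new PipelineRttSketch();
        rtt.packetSent(1_000L);
        rtt.ackReceived(6_000L);
        System.out.println("pipeline RTT: " + rtt.lastRttNanos() + " ns");
    }
}
```

As the description notes, if each downstream node also reports its own measured RTT in the ack, subtracting that from lastRttNanos isolates the latency of just the local link.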
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400888#comment-13400888 ] Aaron T. Myers commented on HDFS-3541: -- Patch looks pretty good to me. Just two small comments: # Misspelled "interrupted": "Finalizing block from Inturrupted thread should fail" # This chunk of code confuses me, since you don't use {{written}} again after the loop, and there doesn't seem to be any need to call {{write(...)}} many times: {code} + int written = 0; + for (; written < 512;) { +out.writeBytes(data); +written += 4; + } {code} Kihwal, how does this patch look to you? > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block Recovery initiated while write in progress at Datanode side. Found a > deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
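If the test merely needs at least 512 bytes written, and {{data}} is a 4-byte string as the {{written += 4}} implies, the quoted loop can indeed be collapsed into a single write. A hypothetical, self-contained version of that simplification (WriteLoopSketch is invented for illustration, not the test's actual code):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteLoopSketch {
    // Build 512 bytes from a 4-byte marker with one write call, instead of
    // 128 write calls tracking a counter that is never read again.
    static byte[] fillChunk(String data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            StringBuilder sb = new StringBuilder(512);
            for (int i = 0; i < 512 / data.length(); i++) {
                sb.append(data);
            }
            out.writeBytes(sb.toString()); // single write of the whole chunk
        } catch (IOException impossible) {
            throw new AssertionError(impossible); // in-memory stream cannot fail
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(fillChunk("abcd").length + " bytes written");
    }
}
```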
[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400859#comment-13400859 ] Andy Isaacson commented on HDFS-3535: - {quote} The one audit log that doesn't have a corresponding log for failure is logFsckEvent, though given that we get the ugi from the request it seems like that case could result in an ACE as well right? {quote} the fsck audit event is logged before the fsck command is run, so it can't fail to generate the audit event. Also fsck is special in that it's implemented as a URL fetch, so I don't think the UGI is enforced. This is probably a bug, and the audit logging will need to be fixed when that bug is fixed. {quote} Let's use fooInternal vs fooInt to match the existing "fooInternal" methods {quote} That would collide with several existing uses: concatInternal, createSymlinkInternal, startFileInternal, renameToInternal, etc. I specifically chose a suffix not previously used to avoid code churn. Perhaps a different suffix than "Int" would convey this better, LMK if you have any good ideas. {quote} Normally the checks are used before the method invocation if we're doing expensive things to create the args (eg lots of string concatenation) not to save the cost of the method invocation. Doesn't look like that's the case here (we're not constructing args) so we could just call logAuditEvent directly everywhere. {quote} There are a bunch of uses of logAuditEvent that do need to check if audit logging is enabled before constructing log messages, etc. I considered refactoring them all and concluded that it was out of scope for this change. I decided not to change the existing idiom (verbose though it is) before refactoring all users of the interface, which should be a separate change. 
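The guard idiom being discussed, checking whether the audit log is enabled before paying for message construction, looks roughly like this. This is a generic sketch; AuditGuardSketch and its names are illustrative, not FSNamesystem's actual API:

```java
public class AuditGuardSketch {
    interface AuditLog {
        boolean isEnabled();
        void log(String message);
    }

    // Simple recording implementation, handy for demonstration.
    static class RecordingLog implements AuditLog {
        String last = "";
        public boolean isEnabled() { return true; }
        public void log(String message) { last = message; }
    }

    // The isEnabled() guard skips the string concatenation entirely when
    // auditing is off; calling log(...) unconditionally would always pay it.
    // Note the event is logged for denied accesses too, per this JIRA.
    static void auditRename(AuditLog audit, String src, String dst, boolean allowed) {
        if (audit.isEnabled()) {
            audit.log("cmd=rename src=" + src + " dst=" + dst
                    + " result=" + (allowed ? "allowed" : "denied"));
        }
    }

    public static void main(String[] args) {
        RecordingLog log = new RecordingLog();
        auditRename(log, "/user/a", "/user/b", false);
        System.out.println(log.last);
    }
}
```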
> audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400851#comment-13400851 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2403 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2403/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400847#comment-13400847 ] Todd Lipcon commented on HDFS-3561: --- +1 for setting it to 0 or 1 for the graceful fence attempt. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). 
> 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400833#comment-13400833 ] Aaron T. Myers commented on HDFS-3561: -- bq. Suggestion: If ZKFC is not able to reach other NN for specified time/no of retries it can consider that NN as dead and instruct the other NN to take over as active as there is no chance of the other NN (NN1) retaining its state as active after zk session timeout when its isolated from network This isn't acceptable. The point of fencing is to ensure that if the previously-active NN returns from appearing to have been down, it doesn't start writing to the shared directory again while the new active is also writing to that directory. bq. I think we can set retries to 1/2 for avoiding unnecessary actions on small nw fluctuations? or we can set it to 0 as we are already setting the same values in ConfiguredFailoverProxyProvider for failover clients. We set it to 0 in ConfiguredFailoverProxyProvider because we want to trying failing over immediately as the retry mechanism, instead of repeatedly trying to contact a machine that may in fact be completely down. I agree, though, that setting it to a lower number than 45 makes sense in the case of the client in the ZKFC, and perhaps making it configurable separately. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. 
> As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). 
> 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
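The log excerpt above shows retries landing roughly 21 seconds apart, so the default of 45 attempts keeps fencing, and hence failover, blocked for about 15 minutes. A quick back-of-the-envelope check (class and numbers are illustrative, taken from the log excerpt):

```java
public class FencingDelaySketch {
    // Worst-case time the ZKFC spends retrying the unreachable NN before
    // the graceful fence attempt finally fails.
    static long totalRetrySeconds(int retries, int gapSeconds) {
        return (long) retries * gapSeconds;
    }

    public static void main(String[] args) {
        long secs = totalRetrySeconds(45, 21); // values observed in the log
        System.out.println(secs + " s, i.e. about " + (secs / 60) + " minutes");
    }
}
```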
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400819#comment-13400819 ] Daryn Sharp commented on HDFS-3557: --- Erg, forgot the escapes: {code}hadoop -ls '/foobar/\{18,19,20\}'{code} > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400816#comment-13400816 ] Daryn Sharp commented on HDFS-3557: --- Try placing quotes around the paths, otherwise your unix shell is expanding the glob instead of hadoop expanding the glob. Ie. {code}hadoop -ls '/foobar/{18,19,20}'{code} > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
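The two behaviours under discussion, glob expansion versus matching a literal name that happens to contain glob metacharacters, can be demonstrated with the JDK's own glob matcher. This is shown purely as an analogy; `hadoop fs` globbing is implemented by Hadoop itself, not java.nio:

```java
import java.nio.file.FileSystems;
import java.nio.file.Paths;

public class GlobEscapeSketch {
    // Unescaped braces form an alternation: matches "18", "19" or "20".
    static boolean matchesAlternation(String name) {
        return FileSystems.getDefault()
                .getPathMatcher("glob:{18,19,20}")
                .matches(Paths.get(name));
    }

    // Backslash-escaped braces match the literal name "{18,19,20}".
    static boolean matchesLiteral(String name) {
        return FileSystems.getDefault()
                .getPathMatcher("glob:\\{18,19,20\\}")
                .matches(Paths.get(name));
    }

    public static void main(String[] args) {
        System.out.println(matchesAlternation("18"));          // alternation hits plain "18"
        System.out.println(matchesLiteral("{18,19,20}"));      // escape hits the odd dir name
        System.out.println(matchesAlternation("{18,19,20}"));  // alternation does not
    }
}
```

The JIRA's request amounts to the same thing: a documented escape so that a path component literally named {18,19,20} can be addressed without glob interpretation.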
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400792#comment-13400792 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Hdfs-trunk-Commit #2455 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2455/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3563) Fix findbug warnings in raid
[ https://issues.apache.org/jira/browse/HDFS-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HDFS-3563: Assignee: Weiyan Wang Weiyan, could you look into the findbugs warnings at some point? Thanks! > Fix findbug warnings in raid > > > Key: HDFS-3563 > URL: https://issues.apache.org/jira/browse/HDFS-3563 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > MAPREDUCE-3868 re-enabled raid but introduced 31 new findbugs warnings. > Those warnings should be fixed or appropriate items placed in an exclude file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400774#comment-13400774 ] Jeff Hodges commented on HDFS-3557: --- The version is cdhu3.2 and the commands are (where /foobar/{18,19,20} is the name of a real directory created by a mapreduce job, and not multiple ones) {code} hadoop -ls /foobar/{18,19,20} hadoop -mv /foobar/{18,19,20} /foobar/new hadoop -rmr /foobar/{18,19,20} {code} All fail with errors that say the directories don't exist or that selecting multiple directories does not work. > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400771#comment-13400771 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Common-trunk-Commit #2385 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2385/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3563) Fix findbug warnings in raid
Jason Lowe created HDFS-3563: Summary: Fix findbug warnings in raid Key: HDFS-3563 URL: https://issues.apache.org/jira/browse/HDFS-3563 Project: Hadoop HDFS Issue Type: Bug Components: contrib/raid Affects Versions: 3.0.0 Reporter: Jason Lowe MAPREDUCE-3868 re-enabled raid but introduced 31 new findbugs warnings. Those warnings should be fixed or appropriate items placed in an exclude file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400757#comment-13400757 ] Jason Lowe commented on HDFS-3549: -- Thanks Daryn! Filed HDFS-3563 to track fixing the 31 findbugs warnings. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3549: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, thanks Jason. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400699#comment-13400699 ] Suresh Srinivas commented on HDFS-3551: --- Nicholas, took a quick look at the patch. It looks good. Can you please add some tests? > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3481) Refactor HttpFS handling of JAX-RS query string parameters
[ https://issues.apache.org/jira/browse/HDFS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400697#comment-13400697 ] Hadoop QA commented on HDFS-3481: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1258/HDFS-3481.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. -1 javac. The applied patch generated 2070 javac compiler warnings (more than the trunk's current 2053 warnings). +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//console This message is automatically generated. > Refactor HttpFS handling of JAX-RS query string parameters > -- > > Key: HDFS-3481 > URL: https://issues.apache.org/jira/browse/HDFS-3481 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.1-alpha >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.0.1-alpha > > Attachments: HDFS-3481.patch, HDFS-3481.patch, HDFS-3481.patch > > > Explicit parameters in the HttpFSServer became quite messy as they are the > union of all possible parameters for all operations. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HDFS-3491) HttpFs does not set permissions correctly
[ https://issues.apache.org/jira/browse/HDFS-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-3491: - Attachment: HDFS-3491.patch updated patch adds testcase for octal shorts. > HttpFs does not set permissions correctly > - > > Key: HDFS-3491 > URL: https://issues.apache.org/jira/browse/HDFS-3491 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Romain Rigaux >Assignee: Alejandro Abdelnur > Attachments: HDFS-3491.patch, HDFS-3491.patch > > > HttpFs seems to have these problems: > # can't set permissions to 777 at file creation or 1777 with setpermission > # does not accept 01777 permissions (which is valid in WebHdfs) > WebHdfs > curl -X PUT > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?permission=1777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > curl > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"accessTime":0,"blockSize":0,"group":"supergroup","length":0,"modificationTime":1338581075040,"owner":"hue","pathSuffix":"","permission":"1777","replication":0,"type":"DIRECTORY"}} > curl -X PUT > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?permission=01777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > HttpFs > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?permission=1777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > curl > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"pathSuffix":"","type":"DIRECTORY","length":0,"owner":"hue","group":"supergroup","permission":"755","accessTime":0,"modificationTime":1338580912205,"blockSize":0,"replication":0}} > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=SETPERMISSION&PERMISSION=1777&user.name=hue&doas=hue"; > curl > 
"http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"pathSuffix":"","type":"DIRECTORY","length":0,"owner":"hue","group":"supergroup","permission":"777","accessTime":0,"modificationTime":1338581075040,"blockSize":0,"replication":0}} > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?permission=01777&op=MKDIRS&user.name=hue&doas=hue"; > {"RemoteException":{"message":"java.lang.IllegalArgumentException: Parameter > [permission], invalid value [01777], value must be > [default|[0-1]?[0-7][0-7][0-7]]","exception":"QueryParamException","javaClassName":"com.sun.jersey.api.ParamException$QueryParamException"}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3481) Refactor HttpFS handling of JAX-RS query string parameters
[ https://issues.apache.org/jira/browse/HDFS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-3481: - Attachment: HDFS-3481.patch Thx Eli. The attached patch removes the commented return in the testcase (there was only one occurrence of this). The Parameter class, while a simple wrapper over Map, has a generic method that simplifies access to parameter values significantly, making the code cleaner. For example: Using Parameters: {code} String doAs = params.get(DoAsParam.NAME, DoAsParam.class); {code} Using Map: {code} String doAs = ((DoAsParam)map.get(DoAsParam.NAME)).value(); {code} It also hides the rest of the Map API, which is not relevant for this use (if we used a Map directly, we would have to wrap it in an unmodifiable map to avoid modification). Regarding using Guava ImmutableMap.of(), I'm getting similar warnings. Finally, regarding sharing Param code with webhdfs: the idea is, once webhdfs and httpfs are 100% equivalent from a functional perspective (HDFS-3113 & HDFS-3509 would achieve that), we can tackle unifying the code (HDFS-2645). > Refactor HttpFS handling of JAX-RS query string parameters > -- > > Key: HDFS-3481 > URL: https://issues.apache.org/jira/browse/HDFS-3481 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.1-alpha >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.0.1-alpha > > Attachments: HDFS-3481.patch, HDFS-3481.patch, HDFS-3481.patch > > > Explicit parameters in the HttpFSServer became quite messy as they are the > union of all possible parameters for all operations.
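The typed-parameter access described in the comment above can be sketched minimally as follows; all class names here are illustrative stand-ins, not the actual HttpFS implementation. The generic accessor does the cast once, so call sites stay clean and the rest of the Map API is not exposed:

```java
import java.util.HashMap;
import java.util.Map;

// Base class for a typed query-string parameter (hypothetical, minimal).
abstract class Param<T> {
    private final T value;
    Param(T value) { this.value = value; }
    T value() { return value; }
}

// One concrete parameter, mirroring the DoAsParam example from the comment.
class DoAsParam extends Param<String> {
    static final String NAME = "doas";
    DoAsParam(String v) { super(v); }
}

// Thin wrapper over Map: only put() and a generic typed get() are exposed.
class Parameters {
    private final Map<String, Param<?>> map = new HashMap<>();
    void put(String name, Param<?> param) { map.put(name, param); }

    // The cast lives here instead of at every call site.
    <V, P extends Param<V>> V get(String name, Class<P> klass) {
        return klass.cast(map.get(name)).value();
    }
}

public class ParametersDemo {
    public static void main(String[] args) {
        Parameters params = new Parameters();
        params.put(DoAsParam.NAME, new DoAsParam("hue"));
        String doAs = params.get(DoAsParam.NAME, DoAsParam.class);
        System.out.println(doAs);
    }
}
```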
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400658#comment-13400658 ] Sanjay Radia commented on HDFS-3370: Konstantine * How can one implement hard links in a library? If you have an alternate library implementation in mind, please explain. * I am fine with having hard links and renames restricted to volumes; this should then give you the freedom to implement a distributed NN. > HDFS hardlink > - > > Key: HDFS-3370 > URL: https://issues.apache.org/jira/browse/HDFS-3370 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Hairong Kuang >Assignee: Liyin Tang > Attachments: HDFS-HardLink.pdf > > > We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files > to share data without copying. Currently we will support hardlinking only > closed files, but it could be extended to unclosed files as well. > Among many potential use cases of the feature, the following two are > primarily used at Facebook: > 1. This provides a lightweight way for applications like HBase to create a > snapshot; > 2. This also allows an application like Hive to move a table to a different > directory without breaking currently running Hive queries.
[jira] [Commented] (HDFS-2881) org.apache.hadoop.hdfs.TestDatanodeBlockScanner Fails Intermittently
[ https://issues.apache.org/jira/browse/HDFS-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400637#comment-13400637 ] Kihwal Lee commented on HDFS-2881: -- It failed in one of the precommit builds. It looks different this time. https://builds.apache.org/job/PreCommit-HDFS-Build/2683//testReport/ While waiting for the two bad replicas, the blocks were fixed. So waitCorruptReplicas() only saw < 2 in each loop. Moreover, the first corrupt block was reported by DFSClient while this method was reading the file and got fixed (rereplicate/invalidate) before it looped, without involving BlockScanner. > org.apache.hadoop.hdfs.TestDatanodeBlockScanner Fails Intermittently > > > Key: HDFS-2881 > URL: https://issues.apache.org/jira/browse/HDFS-2881 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 >Reporter: Robert Joseph Evans > Attachments: > TEST-org.apache.hadoop.hdfs.TestDatanodeBlockScanner.xml, > org.apache.hadoop.hdfs.TestDatanodeBlockScanner-output.txt, > org.apache.hadoop.hdfs.TestDatanodeBlockScanner.txt > > > org.apache.hadoop.hdfs.TestDatanodeBlockScanner fails intermittently durring > test-patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
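The race Kihwal describes — the corrupt replicas were repaired between polls, so the waiter never observed the expected count — can be illustrated with a hypothetical polling helper. The names below are illustrative, not the actual test code:

```java
// Sketch of a waitCorruptReplicas-style polling loop and the race it is subject to:
// if a corrupt replica is re-replicated and invalidated between polls, the observed
// count never reaches the expected value and the wait fails even though the
// cluster healed itself.
public class WaitCorruptReplicasSketch {
    interface CorruptReplicaCounter { int count(); }

    static boolean waitCorruptReplicas(CorruptReplicaCounter counter,
                                       int expected, int retries, long sleepMs)
            throws InterruptedException {
        for (int i = 0; i < retries; i++) {
            if (counter.count() >= expected) {
                return true;           // saw the expected number of corrupt replicas
            }
            Thread.sleep(sleepMs);     // replicas may be fixed during this window
        }
        return false;                  // raced: corruption repaired between polls
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated counter: one replica is briefly corrupt, then "fixed",
        // so a waiter expecting 2 corrupt replicas never sees them.
        int[] calls = {0};
        CorruptReplicaCounter healed = () -> (calls[0]++ == 0) ? 1 : 0;
        System.out.println(waitCorruptReplicas(healed, 2, 3, 1L));
    }
}
```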
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400600#comment-13400600 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2400 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2400/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA,
[jira] [Commented] (HDFS-3475) Make the replication monitor multipliers configurable
[ https://issues.apache.org/jira/browse/HDFS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400599#comment-13400599 ] Aaron T. Myers commented on HDFS-3475: -- One small comment: I think you should add some info to the hdfs-default.xml description for "{{dfs.namenode.invalidate.work.pct.per.iteration}}" saying that the value should be between 0-100, or whatever's appropriate. For that matter, since this is a brand new config, you might want to change it to be in the range 0 - 1.0, which I think is a more common way in the Hadoop code base to represent percentages. Other than that the patch looks good. +1 pending a fix for the above and an explanation of the two test failures. > Make the replication monitor multipliers configurable > - > > Key: HDFS-3475 > URL: https://issues.apache.org/jira/browse/HDFS-3475 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Harsh J >Priority: Trivial > Attachments: HDFS-3475.patch, HDFS-3475.patch > > > BlockManager currently hardcodes the following two constants: > {code} > private static final int INVALIDATE_WORK_PCT_PER_ITERATION = 32; > private static final int REPLICATION_WORK_MULTIPLIER_PER_ITERATION = 2; > {code} > These are used to throttle/limit the amount of deletion and > replication-to-other-DN work done per heartbeat interval of a live DN. > Not many have had reasons to want these changed so far but there have been a > few requests I've faced over the past year from a variety of clusters I've > helped maintain. I think with the improvements in disks and network thats > already started to be rolled out in production environments out there, > changing these may start making sense to some. > Lets at least make it advanced-configurable with proper docs that warn > adequately, with the defaults being what they are today. With hardcodes, it > comes down to a recompile for admins, which is not something they may like. 
> Please let me know your thoughts.
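A hedged sketch of the range check Aaron suggests above, treating the new knob as a fraction in (0, 1.0] rather than an integer percentage. The property name is the one under discussion, but the validation logic is illustrative, not the committed patch:

```java
public class InvalidateWorkPctCheck {
    // Property name from the discussion above.
    static final String KEY = "dfs.namenode.invalidate.work.pct.per.iteration";

    // Validate a fraction-style percentage in the range (0, 1.0].
    static float checkFraction(String key, float value) {
        if (value <= 0f || value > 1.0f) {
            throw new IllegalArgumentException(
                key + " = " + value + " must be in the range (0, 1.0]");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(checkFraction(KEY, 0.32f)); // the old 32% default as a fraction
        try {
            checkFraction(KEY, 32f);                   // an integer percentage is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```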
[jira] [Resolved] (HDFS-3464) BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
[ https://issues.apache.org/jira/browse/HDFS-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G resolved HDFS-3464. --- Resolution: Fixed > BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can > throw BKNoSuchLedgerExistsException later. > - > > Key: HDFS-3464 > URL: https://issues.apache.org/jira/browse/HDFS-3464 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > > HDFS-3058 will clean currentLedgers on exception. > In BookKeeperJournalManager, startLogSegment() is deleting the corresponding > 'inprogress_ledger' ledger on exception. Here leaving the 'inprogress_x' > ledger metadata in ZooKeeper. When the other node becomes active, he will see > the 'inprogress_x' znode and tries to recoverLastTxId() it would throw > exception, since there is no 'inprogress_ledger' exists. > {noformat} > Caused by: > org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException > at > org.apache.bookkeeper.client.BookKeeper.openLedger(BookKeeper.java:393) > at > org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:493) > {noformat} > As per the discussion in HDFS-3058, we will handle the coment as part of this > JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400550#comment-13400550 ] Daryn Sharp commented on HDFS-3549: --- +1 TestRaidNode appears to fail due to race condition with querying the job status. Findbugs warnings are of course due to making findbugs work again. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400547#comment-13400547 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Common-trunk-Commit #2381 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2381/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see:
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400543#comment-13400543 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Hdfs-trunk-Commit #2451 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2451/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http
[jira] [Commented] (HDFS-3464) BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
[ https://issues.apache.org/jira/browse/HDFS-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400541#comment-13400541 ] Uma Maheswara Rao G commented on HDFS-3464: --- Yep, We should not get this situation now after handling the specialized exceptions. I will close this JIRA. > BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can > throw BKNoSuchLedgerExistsException later. > - > > Key: HDFS-3464 > URL: https://issues.apache.org/jira/browse/HDFS-3464 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > > HDFS-3058 will clean currentLedgers on exception. > In BookKeeperJournalManager, startLogSegment() is deleting the corresponding > 'inprogress_ledger' ledger on exception. Here leaving the 'inprogress_x' > ledger metadata in ZooKeeper. When the other node becomes active, he will see > the 'inprogress_x' znode and tries to recoverLastTxId() it would throw > exception, since there is no 'inprogress_ledger' exists. > {noformat} > Caused by: > org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException > at > org.apache.bookkeeper.client.BookKeeper.openLedger(BookKeeper.java:393) > at > org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:493) > {noformat} > As per the discussion in HDFS-3058, we will handle the coment as part of this > JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
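One way to avoid the orphaned 'inprogress_x' metadata described in this issue is to remove the znode and the ledger together on failure, so a later recoverLastTxId() never finds metadata pointing at a ledger that no longer exists. The sketch below uses hypothetical interfaces, not the BKJM or BookKeeper API:

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentCleanupSketch {
    interface LedgerStore { void deleteLedger(long ledgerId) throws Exception; }
    interface MetadataStore { void deleteZnode(String path) throws Exception; }

    static void abortSegment(LedgerStore bk, MetadataStore zk,
                             long ledgerId, String inprogressZnode) {
        try {
            zk.deleteZnode(inprogressZnode);  // remove the metadata first ...
        } catch (Exception e) {
            return;  // keep the ledger too, so metadata and data stay consistent
        }
        try {
            bk.deleteLedger(ledgerId);        // ... then the now-unreferenced ledger
        } catch (Exception e) {
            // an orphaned ledger is mere garbage; recovery no longer trips over it
        }
    }

    public static void main(String[] args) {
        List<String> ops = new ArrayList<>();
        abortSegment(id -> ops.add("ledger:" + id),
                     path -> ops.add("znode:" + path),
                     7L, "/ledgers/inprogress_7");
        System.out.println(ops); // the znode is removed before the ledger
    }
}
```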
[jira] [Commented] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400534#comment-13400534 ] Uma Maheswara Rao G commented on HDFS-3562: --- Just moved this issue to verify an INFRA bug with moving issues from BK to here. > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3562: - Assignee: Vinay > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-3562: -- Issue Type: Sub-task (was: Bug) Parent: HDFS-3399 > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3550: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed to trunk, thanks Jason! > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G moved BOOKKEEPER-316 to HDFS-3562: -- Target Version/s: 2.0.1-alpha, 3.0.0 Key: HDFS-3562 (was: BOOKKEEPER-316) Project: Hadoop HDFS (was: Bookkeeper) > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3530) TestFileAppend2.testComplexAppend occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400532#comment-13400532 ] Tomohiko Kinebuchi commented on HDFS-3530: -- It seems that this test has failed only once, as far as I can tell from the Jenkins test result history. I have been trying to reproduce the failure but have not succeeded so far, so I am now inspecting the log messages. Does anyone know what this test case tests? > TestFileAppend2.testComplexAppend occasionally fails > > > Key: HDFS-3530 > URL: https://issues.apache.org/jira/browse/HDFS-3530 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Eli Collins >Assignee: Tomohiko Kinebuchi > Attachments: HDFS-3530-for-debug.txt, PreCommit-HADOOP-Build #1116 > test - testComplexAppend.html.gz > > > TestFileAppend2.testComplexAppend occasionally fails with the following: > junit.framework.AssertionFailedError: testComplexAppend Worker encountered > exceptions. > at junit.framework.Assert.fail(Assert.java:47) > at junit.framework.Assert.assertTrue(Assert.java:20) > at > org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:385) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3547) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-3547: -- Issue Type: Bug (was: Sub-task) Parent: (was: HDFS-3399) > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3547 > URL: https://issues.apache.org/jira/browse/HDFS-3547 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400504#comment-13400504 ] Uma Maheswara Rao G commented on HDFS-3561: --- I think we can set retries to 1 or 2 to avoid unnecessary actions on small network fluctuations, or we can set it to 0, since we already set the same values in ConfiguredFailoverProxyProvider for failover clients. {code} public static final String DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_KEY = "dfs.client.failover.connection.retries"; public static final int DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_DEFAULT = 0; public static final String DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_KEY = "dfs.client.failover.connection.retries.on.timeouts"; public static final int DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_DEFAULT = 0; {code} > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). 
> Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
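The log excerpt in HDFS-3561 shows retry attempts roughly 21 seconds apart. A quick back-of-the-envelope sketch (the interval is read off the log excerpt, not taken from the actual IPC client code) shows why 45 retries stalls failover for roughly a quarter of an hour:

```java
public class FencingRetryBudget {
    // Worst-case time the ZKFC spends trying to gracefully fence an
    // unreachable NameNode: number of retries times the interval between
    // attempts (no backoff is visible in the log excerpt).
    static long worstCaseSeconds(int retries, long intervalSeconds) {
        return retries * intervalSeconds;
    }

    public static void main(String[] args) {
        long total = worstCaseSeconds(45, 21);
        // 45 retries at ~21s apart: 945s, i.e. more than 15 minutes during
        // which the standby NameNode cannot complete the failover.
        System.out.println(total + "s (~" + (total / 60) + " minutes)");
    }
}
```

This is the delay the commenters propose cutting by lowering the retry count for the ZKFC's graceful-fence attempt, mirroring the 0-retry defaults already used for failover clients in ConfiguredFailoverProxyProvider.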
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400501#comment-13400501 ] Daryn Sharp commented on HDFS-3550: --- +1 Thanks for removing the warnings introduced by raid. > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3554) TestRaidNode is failing
[ https://issues.apache.org/jira/browse/HDFS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400500#comment-13400500 ] Robert Joseph Evans commented on HDFS-3554: --- It looks like there is no history server up and running. In YARN there is a race in the client: if the client asks for status while the AM is still up and running, it will talk to the AM. If the AM has exited, which it tends to do when the MR job has completed, the client falls over to the history server. It looks like while you are running under the minicluster there is no corresponding history server to fulfill the request. > TestRaidNode is failing > --- > > Key: HDFS-3554 > URL: https://issues.apache.org/jira/browse/HDFS-3554 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid, test >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > After MAPREDUCE-3868 re-enabled raid, TestRaidNode has been failing in > Jenkins builds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400487#comment-13400487 ] Vinay commented on HDFS-3561: - During transition, the old active will be fenced. Before actually using the configured fencing method, graceful fencing is tried: the ZKFC tries to get a proxy to the other machine's NameNode. Since the network is down, it cannot get a connection and retries 45 times, as configured by *ipc.client.connect.max.retries.on.timeouts* {code}LOG.info("Should fence: " + target); boolean gracefulWorked = new FailoverController(conf, RequestSource.REQUEST_BY_ZKFC).tryGracefulFence(target); if (gracefulWorked) { // It's possible that it's in standby but just about to go into active, // no? Is there some race here? LOG.info("Successfully transitioned " + target + " to standby " + "state without fencing"); return; }{code} I think in the ZKFC case we can reduce the number of retries. Any thoughts? > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). 
> Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3561: - Assignee: Vinay Good catch Suja. Thanks for filing the JIRA. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). 
> 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3558) OfflineImageViewer throws an NPE
[ https://issues.apache.org/jira/browse/HDFS-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400461#comment-13400461 ] Daryn Sharp commented on HDFS-3558: --- +1 Although I'd consider putting the annotation {{@VisibleForTesting}} on the method {{processDelegationTokens}} whose scope was relaxed for testing. > OfflineImageViewer throws an NPE > > > Key: HDFS-3558 > URL: https://issues.apache.org/jira/browse/HDFS-3558 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.3 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-3558.branch-23.patch > > > Courtesy [~mithun] > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:371) > at org.apache.hadoop.security.User.(User.java:48) > at org.apache.hadoop.security.User.(User.java:43) > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:857) > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier.getUser(AbstractDelegationTokenIdentifier.java:91) > at > org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier.toString(DelegationTokenIdentifier.java:61) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.processDelegationTokens(ImageLoaderCurrent.java:222) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.loadImage(ImageLoaderCurrent.java:185) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:129) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:250) -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
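Daryn's suggestion on HDFS-3558 — annotating a method whose visibility was widened only for tests — can be illustrated with a minimal, self-contained sketch. The real annotation is Guava's com.google.common.annotations.VisibleForTesting; a local stand-in is declared here so the example compiles without the Guava dependency, and the method name is a hypothetical placeholder rather than the actual ImageLoaderCurrent signature:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class VisibleForTestingSketch {
    // Local stand-in for Guava's @VisibleForTesting marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface VisibleForTesting {}

    // Scope widened only so tests can call it; the annotation documents
    // that the wider visibility is not part of the public contract.
    @VisibleForTesting
    static int processDelegationTokensForTest() {
        return 0; // placeholder body; the real method parses fsimage tokens
    }

    // True if the method above carries the marker annotation.
    static boolean isMarked() {
        try {
            return VisibleForTestingSketch.class
                .getDeclaredMethod("processDelegationTokensForTest")
                .isAnnotationPresent(VisibleForTesting.class);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("annotated: " + isMarked());
    }
}
```

The annotation has no runtime effect; its value is that readers (and static-analysis tools that know the Guava annotation) can tell the relaxed scope exists only for testing.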
[jira] [Commented] (HDFS-1469) TestBlockTokenWithDFS fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400377#comment-13400377 ] Junping Du commented on HDFS-1469: -- I just checked on trunk that the unit test TestBlockTokenWithDFS passes. Can somebody mark this resolved and close it? > TestBlockTokenWithDFS fails on trunk > > > Key: HDFS-1469 > URL: https://issues.apache.org/jira/browse/HDFS-1469 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Konstantin Boudnik >Priority: Blocker > Attachments: failed-TestBlockTokenWithDFS.txt, log.gz > > > TestBlockTokenWithDFS is failing on trunk: > Testcase: testAppend took 31.569 sec > FAILED > null > junit.framework.AssertionFailedError: null > at > org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS.testAppend(TestBlockTokenWithDFS.java:223) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400363#comment-13400363 ] Junping Du commented on HDFS-3498: -- Filed a separate JIRA, HADOOP-8526, to fix this issue. > Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass > > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the replica removal policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so it can be overridden later. Also, > the Hadoop unit tests lack coverage of the replica removal policy, so we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic across other topology cases > like virtualization, and we want its code to be reusable as much as possible, > so a few of its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3561) ZKFC retries 45 times to connect to the other NN during fencing when the network between NNs is broken, and the standby NN will not take over as active
suja s created HDFS-3561: Summary: ZKFC retries 45 times to connect to the other NN during fencing when the network between NNs is broken, and the standby NN will not take over as active Key: HDFS-3561 URL: https://issues.apache.org/jira/browse/HDFS-3561 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Reporter: suja s Scenario: Active NN on machine1, standby NN on machine2. Machine1 is isolated from the network (machine1 network cable unplugged). After the ZK session timeout, the ZKFC on machine2 is notified that NN1 is gone and tries to fail over, making NN2 active. As part of fencing it tries to connect to machine1 and kill NN1 (sshfence technique configured). This connection is retried 45 times (as governed by ipc.client.connect.max.socket.retries). After that, the standby NN is still unable to take over as active (because of the fencing failure). Suggestion: if ZKFC cannot reach the other NN within a specified time or number of retries, it can consider that NN dead and instruct the local NN to take over as active, since there is no chance of the other NN (NN1) retaining its active state after the ZK session timeout while it is isolated from the network. From the ZKFC log: {noformat}
2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s).
2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s).
2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s).
2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s).
2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s).
2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s).
2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s).
2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s).
2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s).
2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). {noformat}
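The retry count the reporter traces to ipc.client.connect.max.socket.retries is a client-side setting, so one mitigation implied by the report is bounding it so fencing gives up on an unreachable NameNode sooner. A minimal, illustrative fragment (the value shown is an example, not a tested recommendation):

```xml
<!-- core-site.xml (illustrative fragment, assuming this is the property
     driving the ~45 fencing retries observed in the ZKFC log above).
     Lowering it bounds how long sshfence spends retrying a connection
     to an unreachable NameNode; 10 is an example value only. -->
<property>
  <name>ipc.client.connect.max.socket.retries</name>
  <value>10</value>
</property>
```

This only shortens the stall; the issue's actual suggestion, letting ZKFC declare the isolated NN dead after a bounded number of retries, would require a code change rather than configuration.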
[jira] [Commented] (HDFS-3498) Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400322#comment-13400322 ] Junping Du commented on HDFS-3498: -- I took a look at the javadoc check log: https://builds.apache.org/job/PreCommit-HDFS-Build/2693/artifact/trunk/patchprocess/patchJavadocWarnings.txt and it looks like all 13 javadoc warnings are in hadoop-hdfs-raid. Is there anything I should fix in this patch?