[jira] [Commented] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster

2012-02-18 Thread Mahadev konar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211020#comment-13211020
 ] 

Mahadev konar commented on HDFS-2904:
-

bq. I was thinking it wouldn't, necessarily. Our current client side code has 
no notion of being able to list all of the potential NNs, only of being able to 
get a proxy to the active one. With ZK, you wouldn't necessarily have a list of 
the standbys. For example, it might be that when you submit the job, the SBN 
happened to be down for maintenance. Then when you bring it back up and do a 
failover to it, it wouldn't have been listed for DT renewal.

I was thinking more along the lines that all the NNs would be listed as 
persistent nodes on ZK (unless the NNs unregister explicitly), so you would 
probably be able to get a list of NN's even when some are down, but again I am 
not an expert on what is going on in HA. I will let you folks decide on the 
right model. As far as MR is concerned I think it should be ok to deploy all NN 
configs on the RM as a stop gap solution. Longer term we should be able to get 
rid of that.


> HA: Client support for getting delegation tokens to an HA cluster
> -
>
> Key: HDFS-2904
> URL: https://issues.apache.org/jira/browse/HDFS-2904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client, name-node, security
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh
>
>
> Currently we have server-side support for delegation tokens in HA, and some 
> tests to verify it, but the client throws NPEs when trying to fetch a DT. 
> This is because the cluster doesn't have a single hostname, but instead a 
> logical nameservice name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster

2012-02-18 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated HDFS-2904:


Comment: was deleted

(was: bq. I was thinking it wouldn't, necessarily. Our current client side code 
has no notion of being able to list all of the potential NNs, only of being 
able to get a proxy to the active one.
bq. With ZK, you wouldn't necessarily have a list of the standbys. For example, 
it might be that when you submit the job, the SBN happened to be down for 
maintenance. Then when you bring it back up and do a failover to it, it 
wouldn't have been listed for DT renewal.)

> HA: Client support for getting delegation tokens to an HA cluster
> -
>
> Key: HDFS-2904
> URL: https://issues.apache.org/jira/browse/HDFS-2904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client, name-node, security
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh
>
>
> Currently we have server-side support for delegation tokens in HA, and some 
> tests to verify it, but the client throws NPEs when trying to fetch a DT. 
> This is because the cluster doesn't have a single hostname, but instead a 
> logical nameservice name.





[jira] [Commented] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster

2012-02-18 Thread Mahadev konar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211019#comment-13211019
 ] 

Mahadev konar commented on HDFS-2904:
-

bq. I was thinking it wouldn't, necessarily. Our current client side code has 
no notion of being able to list all of the potential NNs, only of being able to 
get a proxy to the active one.
bq. With ZK, you wouldn't necessarily have a list of the standbys. For example, 
it might be that when you submit the job, the SBN happened to be down for 
maintenance. Then when you bring it back up and do a failover to it, it 
wouldn't have been listed for DT renewal.

> HA: Client support for getting delegation tokens to an HA cluster
> -
>
> Key: HDFS-2904
> URL: https://issues.apache.org/jira/browse/HDFS-2904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client, name-node, security
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh
>
>
> Currently we have server-side support for delegation tokens in HA, and some 
> tests to verify it, but the client throws NPEs when trying to fetch a DT. 
> This is because the cluster doesn't have a single hostname, but instead a 
> logical nameservice name.





[jira] [Resolved] (HDFS-326) Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.

2012-02-18 Thread Steve Loughran (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-326.
-

Resolution: Won't Fix

Superseded by YARN, though a lot of the work here needs to be pulled in there, 
which can be done on a case by case basis

> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -
>
> Key: HDFS-326
> URL: https://issues.apache.org/jira/browse/HDFS-326
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: AbstractHadoopComponent.java, HADOOP-3628-18.patch, 
> HADOOP-3628-19.patch, HDFS-326.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-lifecycle-tomw.sxw, hadoop-lifecycle.pdf, hadoop-lifecycle.pdf, 
> hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for hadoop components, the 
> things that get started or stopped when you bring up a namenode. Currently, 
> some of these classes have a stop() or shutdown() method, with no standard 
> name/interface, and there is no way of seeing if they are live, checking 
> their health, or shutting them down reliably. Indeed, there is a tendency for the spawned 
> threads to not want to die; to require the entire process to be killed to 
> stop the workers. 
> Having a standard interface would make it easier for 
>  * management tools to manage the different things
>  * monitoring the state of things
>  * subclassing
> The latter is interesting as right now TaskTracker and JobTracker start up 
> threads in their constructor; that's very dangerous as subclasses may have 
> their methods called before they are fully initialised. Adding this interface 
> would be the right time to clean up the startup process so that subclassing 
> is less risky.
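
The proposal above could be sketched roughly as follows. This is a minimal illustration, not the interface from the attached patches: `Lifecycle`, `WorkerComponent`, and every method name here are hypothetical. It shows the two points the description raises: a common way to start, stop, and probe a component, and a constructor that starts no threads, so a subclass is fully constructed before any of its methods can be called from a worker.

```java
/** Hypothetical lifecycle contract; names are illustrative only. */
interface Lifecycle {
    enum State { CREATED, STARTED, STOPPED }

    void start();

    void stop();

    State getState();

    boolean isLive();
}

/** Hypothetical component that owns one worker thread. */
class WorkerComponent implements Lifecycle {
    private volatile State state = State.CREATED;
    private Thread worker;

    // The constructor starts no threads; all thread creation happens in start().
    WorkerComponent() {
    }

    @Override
    public synchronized void start() {
        if (state != State.CREATED) {
            throw new IllegalStateException("cannot start from state " + state);
        }
        worker = new Thread(this::runLoop, "worker");
        worker.setDaemon(true);
        worker.start();
        state = State.STARTED;
    }

    // Stand-in for real work: idle until interrupted by stop().
    private void runLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            // Interrupted by stop(): fall through and let the thread exit.
        }
    }

    @Override
    public synchronized void stop() {
        if (state == State.STOPPED) {
            return; // stop() is safe to call more than once
        }
        if (worker != null) {
            worker.interrupt();
            try {
                worker.join(1000); // wait up to one second for the worker to exit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        state = State.STOPPED;
    }

    @Override
    public State getState() {
        return state;
    }

    @Override
    public boolean isLive() {
        Thread t = worker;
        return state == State.STARTED && t != null && t.isAlive();
    }
}
```

A management or monitoring tool would then only need the `Lifecycle` type to start a component, poll `isLive()`, and shut it down, whatever the concrete class is.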





[jira] [Resolved] (HDFS-545) Add service lifecycle to the HDFS classes: NameNode, Datanode, etc

2012-02-18 Thread Steve Loughran (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-545.
-

Resolution: Won't Fix

Closing; re-open at some point in future w/ making the HDFS components adopt 
the YARN service class and lifecycle

> Add service lifecycle to the HDFS classes: NameNode, Datanode, etc
> --
>
> Key: HDFS-545
> URL: https://issues.apache.org/jira/browse/HDFS-545
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 0.21.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> This is the HDFS portion of the service lifecycle changes: integrating the 
> HDFS services: Namenode (and subclasses) and the Datanode with the Service 
> base class. 





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211006#comment-13211006
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1765 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1765/])
HDFS-2725 script to mention dfs command (Revision 1245943)

 Result = ABORTED
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211005#comment-13211005
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #577 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/577/])
HDFS-2725 script to mention dfs command (Revision 1245944)

 Result = ABORTED
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210998#comment-13210998
 ] 

Hadoop QA commented on HDFS-2966:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12515090/HDFS-2966.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1885//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1885//console

This message is automatically generated.

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.
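
The "better fix" above could be sketched as a poll-until-true helper with a timeout, used in place of a fixed sleep. This is a minimal sketch and not the contents of the attached HDFS-2966.patch: `WaitUtil`, `waitFor`, and its parameters are hypothetical names, and the condition passed in stands for whatever check the test can actually make (for example, that the expected deletion has been observed).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; not part of Hadoop or of the attached patch. */
final class WaitUtil {
    private WaitUtil() {
    }

    /**
     * Polls the condition every pollMillis until it returns true.
     * Throws IllegalStateException if it is still false after timeoutMillis,
     * or if the waiting thread is interrupted.
     */
    static void waitFor(BooleanSupplier condition, long pollMillis, long timeoutMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "condition not met within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```

On a lightly loaded machine the helper returns as soon as the condition holds; on a heavily loaded one it keeps waiting up to the timeout and then fails with a clear message, rather than letting the test proceed against stale state.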





[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2492:
-

Status: Open  (was: Patch Available)

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch
>
>
> The BlockManager cross-rack replication checks only work if script files are 
> used for replication, not if alternate plugins provide the topology 
> information.
> This is because the BlockManager sets its rack checking flag if there is a 
> filename key
> {code}
> shouldCheckForEnoughRacks = 
> conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by 
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}
> If any other mapper is used, the system may be multi rack, but the Block 
> Manager will not be aware of this fact unless the filename key is set to 
> something non-null
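
The mismatch described above can be reproduced in isolation. The sketch below is an illustration only: a plain `Map` stands in for the Hadoop configuration object, the two key strings are placeholders rather than the real values behind the `DFSConfigKeys` constants, and `com.example.StaticRackMapping` is a made-up mapper class name. It shows that a deployment using a non-script mapper leaves the flag false, and that setting the filename key to any non-null value (the workaround implied by the report) turns it on.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the flag logic quoted in the report. */
final class RackCheckSketch {
    // Placeholder key names, NOT the real Hadoop configuration keys.
    static final String SCRIPT_FILE_KEY = "example.topology.script.file.name";
    static final String MAPPING_IMPL_KEY = "example.topology.node.switch.mapping.impl";

    private RackCheckSketch() {
    }

    // Mirrors the quoted expression: the flag depends only on the script filename key.
    static boolean shouldCheckForEnoughRacks(Map<String, String> conf) {
        return conf.get(SCRIPT_FILE_KEY) != null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // A cluster that gets its topology from a custom mapper, with no script.
        conf.put(MAPPING_IMPL_KEY, "com.example.StaticRackMapping");
        System.out.println("custom mapper only: " + shouldCheckForEnoughRacks(conf));

        // Workaround from the report: set the filename key to something non-null.
        conf.put(SCRIPT_FILE_KEY, "unused");
        System.out.println("with dummy script key: " + shouldCheckForEnoughRacks(conf));
    }
}
```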





[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2492:
-

Status: Patch Available  (was: Open)

retrying to see if TestDistributedUpgrade works this time ; it seems brittle

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch
>
>
> The BlockManager cross-rack replication checks only work if script files are 
> used for replication, not if alternate plugins provide the topology 
> information.
> This is because the BlockManager sets its rack checking flag if there is a 
> filename key
> {code}
> shouldCheckForEnoughRacks = 
> conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by 
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}
> If any other mapper is used, the system may be multi rack, but the Block 
> Manager will not be aware of this fact unless the filename key is set to 
> something non-null





[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2966:
-

Status: Open  (was: Patch Available)

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2966:
-

Status: Patch Available  (was: Open)

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210992#comment-13210992
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1828 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1828/])
HDFS-2725 script to mention dfs command (Revision 1245943)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210991#comment-13210991
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Common-0.23-Commit #575 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/575/])
HDFS-2725 script to mention dfs command (Revision 1245944)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210988#comment-13210988
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Hdfs-0.23-Commit #562 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/562/])
HDFS-2725 script to mention dfs command (Revision 1245944)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210990#comment-13210990
 ] 

Hudson commented on HDFS-2725:
--

Integrated in Hadoop-Common-trunk-Commit #1754 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1754/])
HDFS-2725 script to mention dfs command (Revision 1245943)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs


> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Updated] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2725:
-

  Resolution: Fixed
   Fix Version/s: (was: 0.23.0)
  0.23.2
Target Version/s: 0.23.0, 0.24.0  (was: 0.24.0, 0.23.0)
  Status: Resolved  (was: Patch Available)

fixed in SVN -thanks!

> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.24.0, 0.23.2
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command

2012-02-18 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210981#comment-13210981
 ] 

Steve Loughran commented on HDFS-2725:
--

failures are clearly spurious, as this patch alters a script that is never used 
in testing.

> hdfs script usage information is missing the information about "dfs" command
> 
>
> Key: HDFS-2725
> URL: https://issues.apache.org/jira/browse/HDFS-2725
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Prashant Sharma
>  Labels: hdfs
> Fix For: 0.23.0, 0.24.0
>
> Attachments: HDFS-2725.patch, HDFS-2725.patch
>
>
> hdfs script does not print the command "dfs" in the usage.





[jira] [Commented] (HDFS-2674) hdfs script does not work out of the box.

2012-02-18 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210979#comment-13210979
 ] 

Steve Loughran commented on HDFS-2674:
--

-failing test is (clearly) independent.
-patch needs review by someone who understands the scripts

> hdfs script does not work out of the box.
> -
>
> Key: HDFS-2674
> URL: https://issues.apache.org/jira/browse/HDFS-2674
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
>Priority: Critical
> Fix For: 0.24.0
>
> Attachments: HDFS-2674.diff
>
>
> As the title says, hadoop-config.sh doesn't add the hadoop-common jars, which 
> makes the hdfs script fail.
> To repro, follow the instructions from 
> http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
> {code}
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:24 [0 jobs] [hist 
> 1889] 
> $ export HADOOP_COMMON_HOME=$(pwd)/$(ls -d 
> hadoop-common-project/hadoop-common/target/hadoop-common-*-SNAPSHOT)
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:29 [0 jobs] [hist 
> 1890] 
> $ export HADOOP_HDFS_HOME=$(pwd)/$(ls -d 
> hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*-SNAPSHOT)
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:36 [0 jobs] [hist 
> 1891] 
> $ export PATH=$HADOOP_COMMON_HOME/bin:$HADOOP_HDFS_HOME/bin:$PATH
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:42 [0 jobs] [hist 
> 1892] 
> $ cat > $HADOOP_COMMON_HOME/etc/hadoop/core-site.xml  << EOF
> > 
> > <configuration>
> >   <property>
> >     <name>fs.default.name</name>
> >     <value>hdfs://localhost/</value>
> >   </property>
> > </configuration>
> > EOF
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:51 [0 jobs] [hist 
> 1893] 
> $ hdfs namenode -format
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/HadoopIllegalArgumentException
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.HadoopIllegalArgumentException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> {code}





[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2966:
-

Status: Patch Available  (was: In Progress)

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210972#comment-13210972
 ] 

Steve Loughran commented on HDFS-2966:
--

The patch applies a sleep (for the same delay as before), then polls and 
sleeps for a limited number of retries before giving up.

Provided the assertions are failing on the exit of the wait cycle, rather than 
on the initial state of the tests, this polling should significantly reduce the 
probability of failure under load. 

For anyone worried that this polling would slow down the test run on a machine 
not under load: this does not appear to be the case.

Before
{code}
Running org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 47.042 sec
{code}

On two runs after making changes, the elapsed times were 46.709 sec and 42.995 
sec. This implies it takes about the same time. 
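
The sleep-then-poll approach described in this comment can be sketched roughly as below. This is a minimal illustration, not the contents of HDFS-2966.patch: the class name `SleepThenPoll`, the method `await`, and all parameter names are hypothetical, and the condition being polled (for example, a metric reaching an expected value) is left to the caller.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: fixed initial sleep, then a bounded poll+sleep cycle. */
public final class SleepThenPoll {
    private SleepThenPoll() {}

    /**
     * Sleeps for initialDelayMs, then checks the condition up to maxRetries
     * times, sleeping pollIntervalMs after each failed check, and finally
     * checks once more before giving up.
     *
     * @return true if the condition held before the retries ran out;
     *         false on timeout or if the thread was interrupted
     */
    public static boolean await(BooleanSupplier condition, long initialDelayMs,
                                long pollIntervalMs, int maxRetries) {
        try {
            // Same up-front delay as a plain sleep-based wait.
            Thread.sleep(initialDelayMs);
            for (int attempt = 0; attempt < maxRetries; attempt++) {
                if (condition.getAsBoolean()) {
                    return true;
                }
                Thread.sleep(pollIntervalMs);
            }
            // Last look after the final sleep.
            return condition.getAsBoolean();
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller and give up.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

On an unloaded machine the condition is usually true at the first check, so the cost should be close to that of the original fixed sleep, which is consistent with the timings reported above.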



> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2966:
-

Attachment: HDFS-2966.patch

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2966.patch
>
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-02-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210971#comment-13210971
 ] 

Hadoop QA commented on HDFS-2492:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12515086/HDFS-2492-blockmanager.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1884//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1884//console

This message is automatically generated.

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch
>
>
> The BlockManager cross-rack replication checks only work if script files are 
> used for replication, not if alternate plugins provide the topology 
> information.
> This is because the BlockManager sets its rack checking flag if there is a 
> filename key
> {code}
> shouldCheckForEnoughRacks = 
> conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by 
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}
> If any other mapper is used, the system may be multi rack, but the Block 
> Manager will not be aware of this fact unless the filename key is set to 
> something non-null
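
The mismatch described in the issue can be illustrated with a small stand-alone sketch. It uses a plain `Map` in place of a Hadoop `Configuration`, and the two key strings are assumptions standing in for the `DFSConfigKeys` constants quoted above; the class name `RackCheckSketch`, the plugin class name, and the script path are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the rack-check flag depending on the wrong key. */
public final class RackCheckSketch {
    // Assumed key strings, standing in for the DFSConfigKeys constants.
    public static final String SCRIPT_FILE_KEY = "net.topology.script.file.name";
    public static final String MAPPING_IMPL_KEY = "net.topology.node.switch.mapping.impl";

    private RackCheckSketch() {}

    /** Mirrors the quoted check: only the script filename key is consulted. */
    public static boolean shouldCheckForEnoughRacks(Map<String, String> conf) {
        return conf.get(SCRIPT_FILE_KEY) != null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // A custom topology plugin is configured, but no script file.
        conf.put(MAPPING_IMPL_KEY, "com.example.CustomRackMapping");
        System.out.println(shouldCheckForEnoughRacks(conf)); // prints "false"

        // Setting the (otherwise unused) script key flips the flag.
        conf.put(SCRIPT_FILE_KEY, "/path/to/topology-script");
        System.out.println(shouldCheckForEnoughRacks(conf)); // prints "true"
    }
}
```

The first case is the one the issue complains about: the cluster may be multi-rack through the custom mapper, yet the flag stays false.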





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210969#comment-13210969
 ] 

Steve Loughran commented on HDFS-2966:
--

A problem here is that the tests are not independent: the fs events from the 
previous test can still be trickling through the filesystem when the next test 
starts running.

A simple poll/sleep cycle actually behaves worse, because it can exit too 
early; the state of the previous test is still there and the more recent 
changes aren't yet in the metrics. 

A sleep followed by a poll cycle would appear to be a better process, though it 
may still have problems under load that only a move to per-test mini HDFS 
clusters would fix.

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Assigned] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HDFS-2966:


Assignee: Steve Loughran

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Work started] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Work started) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-2966 started by Steve Loughran.

> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210959#comment-13210959
 ] 

Hudson commented on HDFS-2969:
--

Integrated in Hadoop-Mapreduce-trunk #994 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/])
HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by 
Todd Lipcon. (Revision 1245830)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245830
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java


> ExtendedBlock.equals is incorrectly implemented
> ---
>
> Key: HDFS-2969
> URL: https://issues.apache.org/jira/browse/HDFS-2969
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.24.0, 0.23.2
>
> Attachments: hdfs-2969.txt
>
>
> The {{ExtendedBlock.equals}} method incorrectly returns true for any two 
> blocks in the same block pool, regardless of block ID.
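
For illustration, an `equals` that avoids the problem described above has to compare both the block pool and the block ID. The sketch below uses a hypothetical class `PoolScopedBlock` with assumed field names; it is not the actual `ExtendedBlock` source or the committed fix.

```java
import java.util.Objects;

/** Hypothetical stand-in for a block identified by pool ID plus block ID. */
public final class PoolScopedBlock {
    private final String poolId;
    private final long blockId;

    public PoolScopedBlock(String poolId, long blockId) {
        this.poolId = poolId;
        this.blockId = blockId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PoolScopedBlock)) {
            return false;
        }
        PoolScopedBlock other = (PoolScopedBlock) o;
        // Both parts of the identity must match; comparing only poolId
        // would make every block in a pool equal to every other.
        return blockId == other.blockId && Objects.equals(poolId, other.poolId);
    }

    @Override
    public int hashCode() {
        // Kept consistent with equals: built from the same two fields.
        return Objects.hash(poolId, blockId);
    }
}
```

A regression test for this kind of bug only needs two blocks that share a pool but differ in ID, and an assertion that they are not equal.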





[jira] [Commented] (HDFS-2968) Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210958#comment-13210958
 ] 

Hudson commented on HDFS-2968:
--

Integrated in Hadoop-Mapreduce-trunk #994 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/])
HDFS-2968. Protocol translator for BlockRecoveryCommand broken when 
multiple blocks need recovery. Contributed by Todd Lipcon. (Revision 1245832)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245832
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockRecoveryCommand.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java


> Protocol translator for BlockRecoveryCommand broken when multiple blocks need 
> recovery
> --
>
> Key: HDFS-2968
> URL: https://issues.apache.org/jira/browse/HDFS-2968
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.24.0, 0.23.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.24.0
>
> Attachments: hdfs-2968.txt
>
>
> If there are multiple blocks to be recovered, it ends up translating to N 
> copies of the first block instead of the N different blocks.
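
The symptom described above (N copies of the first block) is typical of a translation loop that reads a fixed element instead of the current one. The thread does not show the actual `PBHelper` code, so the sketch below is only one plausible shape of such a bug, with hypothetical names throughout (`BlockListTranslator`, `translateBuggy`, `translate`) and blocks reduced to plain IDs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a list translator that repeats the first element. */
public final class BlockListTranslator {
    private BlockListTranslator() {}

    /** Buggy shape: always reads index 0, so every output entry is the first block. */
    public static List<String> translateBuggy(List<Long> wireBlocks) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < wireBlocks.size(); i++) {
            out.add("blk_" + wireBlocks.get(0));
        }
        return out;
    }

    /** Corrected shape: converts each element in turn. */
    public static List<String> translate(List<Long> wireBlocks) {
        List<String> out = new ArrayList<>();
        for (Long id : wireBlocks) {
            out.add("blk_" + id);
        }
        return out;
    }
}
```

A single-block test cannot tell the two versions apart, which is why a test with at least two distinct blocks is needed to catch this class of error.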





[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210950#comment-13210950
 ] 

Hudson commented on HDFS-2969:
--

Integrated in Hadoop-Mapreduce-0.23-Build #200 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/200/])
HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by 
Todd Lipcon. (Revision 1245829)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245829
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java


> ExtendedBlock.equals is incorrectly implemented
> ---
>
> Key: HDFS-2969
> URL: https://issues.apache.org/jira/browse/HDFS-2969
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.24.0, 0.23.2
>
> Attachments: hdfs-2969.txt
>
>
> The {{ExtendedBlock.equals}} method incorrectly returns true for any two 
> blocks in the same block pool, regardless of block ID.





[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2492:
-

Attachment: HDFS-2492-blockmanager.patch

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch
>
>
> The BlockManager cross-rack replication checks only work if script files are 
> used for replication, not if alternate plugins provide the topology 
> information.
> This is because the BlockManager sets its rack checking flag if there is a 
> filename key
> {code}
> shouldCheckForEnoughRacks = 
> conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by 
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}
> If any other mapper is used, the system may be multi rack, but the Block 
> Manager will not be aware of this fact unless the filename key is set to 
> something non-null





[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-02-18 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2492:
-

Status: Patch Available  (was: In Progress)

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
> HDFS-2492-blockmanager.patch
>
>
> The BlockManager cross-rack replication checks only work if script files are 
> used for replication, not if alternate plugins provide the topology 
> information.
> This is because the BlockManager sets its rack checking flag if there is a 
> filename key
> {code}
> shouldCheckForEnoughRacks = 
> conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by 
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}
> If any other mapper is used, the system may be multi rack, but the Block 
> Manager will not be aware of this fact unless the filename key is set to 
> something non-null





[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210921#comment-13210921
 ] 

Hudson commented on HDFS-2969:
--

Integrated in Hadoop-Hdfs-0.23-Build #172 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/172/])
HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by 
Todd Lipcon. (Revision 1245829)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245829
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java


> ExtendedBlock.equals is incorrectly implemented
> ---
>
> Key: HDFS-2969
> URL: https://issues.apache.org/jira/browse/HDFS-2969
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.24.0, 0.23.2
>
> Attachments: hdfs-2969.txt
>
>
> The {{ExtendedBlock.equals}} method incorrectly returns true for any two 
> blocks in the same block pool, regardless of block ID.





[jira] [Commented] (HDFS-2952) HA: NN should not start with upgrade option or with a pending an unfinalized upgrade

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210916#comment-13210916
 ] 

Hudson commented on HDFS-2952:
--

Integrated in Hadoop-Hdfs-HAbranch-build #81 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/81/])
HDFS-2952. NN should not start with upgrade option or with a pending an 
unfinalized upgrade. Contributed by Aaron T. Myers. (Revision 1245875)

 Result = UNSTABLE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245875
Files : 
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDFSUpgradeWithHA.java


> HA: NN should not start with upgrade option or with a pending an unfinalized 
> upgrade
> 
>
> Key: HDFS-2952
> URL: https://issues.apache.org/jira/browse/HDFS-2952
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: HA branch (HDFS-1623)
>
> Attachments: HDFS-2952-HDFS-1623.patch, HDFS-2952-HDFS-1623.patch
>
>
> For simplicity, we should require that upgrades be done with HA disabled. We 
> might support this in future versions.





[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210912#comment-13210912
 ] 

Hudson commented on HDFS-2969:
--

Integrated in Hadoop-Hdfs-trunk #959 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/])
HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by 
Todd Lipcon. (Revision 1245830)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245830
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java


> ExtendedBlock.equals is incorrectly implemented
> ---
>
> Key: HDFS-2969
> URL: https://issues.apache.org/jira/browse/HDFS-2969
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.24.0, 0.23.2
>
> Attachments: hdfs-2969.txt
>
>
> The {{ExtendedBlock.equals}} method incorrectly returns true for any two 
> blocks in the same block pool, regardless of block ID.





[jira] [Commented] (HDFS-2968) Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery

2012-02-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210911#comment-13210911
 ] 

Hudson commented on HDFS-2968:
--

Integrated in Hadoop-Hdfs-trunk #959 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/])
HDFS-2968. Protocol translator for BlockRecoveryCommand broken when 
multiple blocks need recovery. Contributed by Todd Lipcon. (Revision 1245832)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245832
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockRecoveryCommand.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java


> Protocol translator for BlockRecoveryCommand broken when multiple blocks need 
> recovery
> --
>
> Key: HDFS-2968
> URL: https://issues.apache.org/jira/browse/HDFS-2968
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node, name-node
> Affects Versions: 0.24.0, 0.23.2
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 0.24.0
>
> Attachments: hdfs-2968.txt
>
>
> If there are multiple blocks to be recovered, the translator ends up 
> producing N copies of the first block instead of the N different blocks.
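
As a rough sketch of this class of bug (the issue text does not show the actual cause in {{PBHelper}}, so the loop below is an assumption about one way the symptom can arise), compare a translation loop that always reads the first element with one that reads each element in turn. {{RecoveryTranslation}} and the use of plain block IDs are hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the real PBHelper or BlockRecoveryCommand code.
final class RecoveryTranslation {

    // Buggy shape (assumed): iterates N times but always reads index 0,
    // so the result is N copies of the first block.
    static List<Long> translateBuggy(List<Long> wireBlockIds) {
        List<Long> out = new ArrayList<>(wireBlockIds.size());
        for (int i = 0; i < wireBlockIds.size(); i++) {
            out.add(wireBlockIds.get(0));
        }
        return out;
    }

    // Corrected shape: each input block is translated exactly once, in order.
    static List<Long> translate(List<Long> wireBlockIds) {
        List<Long> out = new ArrayList<>(wireBlockIds.size());
        for (Long id : wireBlockIds) {
            out.add(id);
        }
        return out;
    }
}
```

A test with a single block cannot tell the two versions apart, so a round-trip test for this kind of translator needs at least two distinct blocks.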





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-02-18 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210888#comment-13210888
 ] 

Steve Loughran commented on HDFS-2966:
--

My planned solution to this is to move from sleep-then-assert to 
sleep-poll-repeat over a longer period of time. If the state is reached sooner, 
the test finishes earlier, but if the machine is overloaded the test will 
stretch out. This may make it faster on some machines, as well as less brittle 
on others.
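
A minimal sketch of the sleep-poll-repeat idea, assuming a generic helper rather than any existing Hadoop test utility: {{PollingWait.waitFor}}, its parameter names, and the interval and timeout values in the usage note are all hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not an existing Hadoop test API.
final class PollingWait {

    /**
     * Polls the condition until it holds or the timeout expires.
     * Returns true as soon as the condition is observed to hold,
     * false if the timeout passes first or the thread is interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long pollIntervalMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;   // state reached: finish early
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;  // gave the system the full timeout; report failure
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve interrupt status
                return false;
            }
        }
    }
}
```

A test would then assert on the result, for example {{assertTrue(PollingWait.waitFor(() -> deletionComplete(), 100, 60000))}}, where {{deletionComplete()}} is a placeholder for whatever check the test needs. On a lightly loaded machine the call returns after the first successful poll; on an overloaded one it can use the whole timeout before failing.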


> TestNameNodeMetrics tests can fail under load
> -
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a 
> virtualbox.
> Reporter: Steve Loughran
> Priority: Minor
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
> running the HDFS tests on a desktop without enough memory for all the 
> programs trying to run. Things got swapped out and the tests failed as the DN 
> heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the 
> delete operation has completed, but all it does is sleep for the same number 
> of seconds as there are datanodes. This is too brittle: it may work on a 
> lightly-loaded system, but not on a system under heavy load where it is 
> taking longer to replicate than expected.
> Immediate fix: double or triple the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.
