[jira] [Commented] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster
[ https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211020#comment-13211020 ] Mahadev konar commented on HDFS-2904: - bq. I was thinking it wouldn't, necessarily. Our current client side code has no notion of being able to list all of the potential NNs, only of being able to get a proxy to the active one. With ZK, you wouldn't necessarily have a list of the standbys. For example, it might be that when you submit the job, the SBN happened to be down for maintenance. Then when you bring it back up and do a failover to it, it wouldn't have been listed for DT renewal. I was thinking more along the lines of all the NN's would be listed as a persistent nodes on ZK (unless NN's unregister explicitly), so you would probably be able to get a list of NN's even when some are down, but again I am not an expert on what is going on in HA. I will let you folks decide on the right model. As far as MR is concerned I think it should be ok to deploy all NN configs on the RM as a stop gap solution. Longer term we should be able to get rid of that. > HA: Client support for getting delegation tokens to an HA cluster > - > > Key: HDFS-2904 > URL: https://issues.apache.org/jira/browse/HDFS-2904 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, hdfs client, name-node, security >Affects Versions: HA branch (HDFS-1623) >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh > > > Currently we have server-side support for delegation tokens in HA, and some > tests to verify it, but the client throws NPEs when trying to fetch a DT. > This is because the cluster doesn't have a single hostname, but instead a > logical nameservice name. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster
[ https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated HDFS-2904: Comment: was deleted (was: bq. I was thinking it wouldn't, necessarily. Our current client side code has no notion of being able to list all of the potential NNs, only of being able to get a proxy to the active one. bq. With ZK, you wouldn't necessarily have a list of the standbys. For example, it might be that when you submit the job, the SBN happened to be down for maintenance. Then when you bring it back up and do a failover to it, it wouldn't have been listed for DT renewal.) > HA: Client support for getting delegation tokens to an HA cluster > - > > Key: HDFS-2904 > URL: https://issues.apache.org/jira/browse/HDFS-2904 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, hdfs client, name-node, security >Affects Versions: HA branch (HDFS-1623) >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh > > > Currently we have server-side support for delegation tokens in HA, and some > tests to verify it, but the client throws NPEs when trying to fetch a DT. > This is because the cluster doesn't have a single hostname, but instead a > logical nameservice name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2904) HA: Client support for getting delegation tokens to an HA cluster
[ https://issues.apache.org/jira/browse/HDFS-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211019#comment-13211019 ] Mahadev konar commented on HDFS-2904: - bq. I was thinking it wouldn't, necessarily. Our current client side code has no notion of being able to list all of the potential NNs, only of being able to get a proxy to the active one. bq. With ZK, you wouldn't necessarily have a list of the standbys. For example, it might be that when you submit the job, the SBN happened to be down for maintenance. Then when you bring it back up and do a failover to it, it wouldn't have been listed for DT renewal. > HA: Client support for getting delegation tokens to an HA cluster > - > > Key: HDFS-2904 > URL: https://issues.apache.org/jira/browse/HDFS-2904 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, hdfs client, name-node, security >Affects Versions: HA branch (HDFS-1623) >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-2904.txt, hdfs-2904.txt, hdfs-2904.txt, test-dt.sh > > > Currently we have server-side support for delegation tokens in HA, and some > tests to verify it, but the client throws NPEs when trying to fetch a DT. > This is because the cluster doesn't have a single hostname, but instead a > logical nameservice name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-326) Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
[ https://issues.apache.org/jira/browse/HDFS-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HDFS-326. - Resolution: Won't Fix Superceded by YARN, though a lot of the work here needs to be pulled in there, which can be done on a case by case basis > Add a lifecycle interface for Hadoop components: namenodes, job clients, etc. > - > > Key: HDFS-326 > URL: https://issues.apache.org/jira/browse/HDFS-326 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: AbstractHadoopComponent.java, HADOOP-3628-18.patch, > HADOOP-3628-19.patch, HDFS-326.patch, hadoop-3628.patch, hadoop-3628.patch, > hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, > hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, > hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, > hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, > hadoop-lifecycle-tomw.sxw, hadoop-lifecycle.pdf, hadoop-lifecycle.pdf, > hadoop-lifecycle.sxw > > > I'd like to propose we have a standard interface for hadoop components, the > things that get started or stopped when you bring up a namenode. currently, > some of these classes have a stop() or shutdown() method, with no standard > name/interface, but no way of seeing if they are live, checking their health > of shutting them down reliably. Indeed, there is a tendency for the spawned > threads to not want to die; to require the entire process to be killed to > stop the workers. > Having a standard interface would make it easier for > * management tools to manage the different things > * monitoring the state of things > * subclassing > The latter is interesting as right now TaskTracker and JobTracker start up > threads in their constructor; that's very dangerous as subclasses may have > their methods called before they are full initialised. 
Adding this interface > would be the right time to clean up the startup process so that subclassing > is less risky. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
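The constructor problem described above can be sketched as follows. This is only a minimal illustration of the idea, not the interface from the attached patches: `Lifecycle`, `AbstractComponent`, and `HeartbeatComponent` are hypothetical names. The point is that the constructor only records state, and the worker thread is created in an explicit start(), so subclass code never runs before the subclass has finished initialising.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical lifecycle contract; an assumption for illustration only. */
interface Lifecycle {
    enum State { CREATED, STARTED, STOPPED }
    void start();
    void stop();
    State getState();
}

/** Base class: no threads are started in the constructor, only in start(). */
abstract class AbstractComponent implements Lifecycle {
    private final AtomicReference<State> state = new AtomicReference<>(State.CREATED);
    private Thread worker;

    /** Subclass hook; runs on the worker thread, after construction has completed. */
    protected abstract void runWorker() throws InterruptedException;

    @Override
    public synchronized void start() {
        if (!state.compareAndSet(State.CREATED, State.STARTED)) {
            throw new IllegalStateException("cannot start from state " + state.get());
        }
        worker = new Thread(() -> {
            try {
                runWorker();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // normal shutdown path
            }
        }, getClass().getSimpleName() + "-worker");
        worker.start();
    }

    @Override
    public synchronized void stop() {
        if (state.getAndSet(State.STOPPED) != State.STARTED) {
            return; // never started, or already stopped
        }
        worker.interrupt();
        try {
            worker.join(5000); // bounded wait for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public State getState() {
        return state.get();
    }

    /** True only while the component is started and its worker thread is alive. */
    public synchronized boolean isLive() {
        return state.get() == State.STARTED && worker != null && worker.isAlive();
    }
}

/** Example subclass: its field is guaranteed to be set before the worker runs. */
class HeartbeatComponent extends AbstractComponent {
    private final long intervalMillis;

    HeartbeatComponent(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    @Override
    protected void runWorker() throws InterruptedException {
        while (true) {
            Thread.sleep(intervalMillis); // stand-in for periodic work
        }
    }
}
```

A management or monitoring tool could then treat every component uniformly: call start(), poll isLive(), and call stop() without knowing the concrete class.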
[jira] [Resolved] (HDFS-545) Add service lifecycle to the HDFS classes: NameNode, Datanode, etc
[ https://issues.apache.org/jira/browse/HDFS-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HDFS-545. - Resolution: Won't Fix Closing; re-open at some point in future w/ making the HDFS components adopt the YARN service class and lifecycle > Add service lifecycle to the HDFS classes: NameNode, Datanode, etc > -- > > Key: HDFS-545 > URL: https://issues.apache.org/jira/browse/HDFS-545 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, name-node >Affects Versions: 0.21.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > > This is the HDFS portion of the service lifecycle changes: integrating the > HDFS services: Namenode (and subclasses) and the Datanode with the Service > base class. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211006#comment-13211006 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1765 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1765/]) HDFS-2725 script to mention dfs command (Revision 1245943) Result = ABORTED stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211005#comment-13211005 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Mapreduce-0.23-Commit #577 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/577/]) HDFS-2725 script to mention dfs command (Revision 1245944) Result = ABORTED stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210998#comment-13210998 ]

Hadoop QA commented on HDFS-2966:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515090/HDFS-2966.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1885//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1885//console

This message is automatically generated.

> TestNameNodeMetrics tests can fail under load
> ---------------------------------------------
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox.
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HDFS-2966.patch
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop without enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle: it may work on a lightly loaded system, but not on a system under heavy load where it is taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping
[ https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HDFS-2492:
---------------------------------
    Status: Open (was: Patch Available)

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---------------------------------------------------------------------------
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.0, 0.24.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch
>
> The BlockManager cross-rack replication checks only work if script files are used for replication, not if alternate plugins provide the topology information.
> This is because the BlockManager sets its rack checking flag if there is a filename key
> {code}
> shouldCheckForEnoughRacks =
>     conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}.
> If any other mapper is used, the system may be multi-rack, but the BlockManager will not be aware of this fact unless the filename key is set to something non-null.
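The mismatch in the description can be reproduced in miniature. The sketch below deliberately avoids the real Hadoop classes: the configuration is a plain map, the two key strings are assumed to be the values behind the DFSConfigKeys constants quoted above, `com.example.MyRackMapping` is a made-up plugin name, and `checkFromResolvedRacks` is a hypothetical alternative rather than what the attached patches do.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;

final class RackCheckSketch {
    // Assumed string values of the two configuration keys named in the report.
    static final String SCRIPT_FILE_KEY = "net.topology.script.file.name";
    static final String MAPPING_IMPL_KEY = "net.topology.node.switch.mapping.impl";

    private RackCheckSketch() {}

    /** Behaviour described in the report: the flag depends only on the script key. */
    static boolean checkFromScriptKey(Map<String, String> conf) {
        return conf.get(SCRIPT_FILE_KEY) != null;
    }

    /**
     * Hypothetical alternative: decide from the racks that the configured mapper
     * actually resolved, whatever implementation produced them.
     */
    static boolean checkFromResolvedRacks(Collection<String> rackOfEachNode) {
        return new HashSet<>(rackOfEachNode).size() > 1;
    }
}
```

With only the mapping implementation key set, `checkFromScriptKey` returns false even when the plugin reports several racks, which is the symptom reported here.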
[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping
[ https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HDFS-2492:
---------------------------------
    Status: Patch Available (was: Open)

Retrying to see if TestDistributedUpgrade works this time; it seems brittle.

> BlockManager cross-rack replication checks only work for ScriptBasedMapping
> ---------------------------------------------------------------------------
>
> Key: HDFS-2492
> URL: https://issues.apache.org/jira/browse/HDFS-2492
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.0, 0.24.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch
>
> The BlockManager cross-rack replication checks only work if script files are used for replication, not if alternate plugins provide the topology information.
> This is because the BlockManager sets its rack checking flag if there is a filename key
> {code}
> shouldCheckForEnoughRacks =
>     conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
> {code}
> yet this filename key is only used if the topology mapper defined by
> {code}
> DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
> {code}
> is an instance of {{ScriptBasedMapping}}.
> If any other mapper is used, the system may be multi-rack, but the BlockManager will not be aware of this fact unless the filename key is set to something non-null.
[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HDFS-2966:
---------------------------------
    Status: Open (was: Patch Available)

> TestNameNodeMetrics tests can fail under load
> ---------------------------------------------
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox.
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HDFS-2966.patch
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop without enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle: it may work on a lightly loaded system, but not on a system under heavy load where it is taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HDFS-2966:
---------------------------------
    Status: Patch Available (was: Open)

> TestNameNodeMetrics tests can fail under load
> ---------------------------------------------
>
> Key: HDFS-2966
> URL: https://issues.apache.org/jira/browse/HDFS-2966
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.24.0
> Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox.
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HDFS-2966.patch
>
> I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop without enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time.
> The tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle: it may work on a lightly loaded system, but not on a system under heavy load where it is taking longer to replicate than expected.
> Immediate fix: double, triple, the sleep time?
> Better fix: have the thread block until all the DN heartbeats have finished.
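The difference between a fixed sleep and a bounded wait, as discussed in the description, can be sketched as below. This is not the code in HDFS-2966.patch: `PollingWait`, `waitFor`, and its parameters are hypothetical, and the condition is an arbitrary supplier rather than a real check on datanode state.

```java
import java.util.function.BooleanSupplier;

final class PollingWait {
    private PollingWait() {}

    /**
     * Checks the condition up to maxAttempts times, sleeping sleepMillis between
     * checks, and returns as soon as it holds. Returns false if the condition
     * never became true or the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true; // fast path: no extra delay on an unloaded machine
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return condition.getAsBoolean(); // one last check after the final sleep
    }
}
```

A test would then assert on the return value, so a slow machine fails only after the whole retry budget is spent, while a fast machine pays no more than the time the operation actually takes.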
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210992#comment-13210992 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Hdfs-trunk-Commit #1828 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1828/]) HDFS-2725 script to mention dfs command (Revision 1245943) Result = SUCCESS stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210991#comment-13210991 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Common-0.23-Commit #575 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/575/]) HDFS-2725 script to mention dfs command (Revision 1245944) Result = SUCCESS stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210988#comment-13210988 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Hdfs-0.23-Commit #562 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/562/]) HDFS-2725 script to mention dfs command (Revision 1245944) Result = SUCCESS stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245944 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210990#comment-13210990 ] Hudson commented on HDFS-2725: -- Integrated in Hadoop-Common-trunk-Commit #1754 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1754/]) HDFS-2725 script to mention dfs command (Revision 1245943) Result = SUCCESS stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-2725: - Resolution: Fixed Fix Version/s: (was: 0.23.0) 0.23.2 Target Version/s: 0.23.0, 0.24.0 (was: 0.24.0, 0.23.0) Status: Resolved (was: Patch Available) fixed in SVN -thanks! > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.24.0, 0.23.2 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2725) hdfs script usage information is missing the information about "dfs" command
[ https://issues.apache.org/jira/browse/HDFS-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210981#comment-13210981 ] Steve Loughran commented on HDFS-2725: -- failures are clearly spurious, as this patch alters a script that is never used in testing. > hdfs script usage information is missing the information about "dfs" command > > > Key: HDFS-2725 > URL: https://issues.apache.org/jira/browse/HDFS-2725 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.0, 0.24.0 >Reporter: Prashant Sharma > Labels: hdfs > Fix For: 0.23.0, 0.24.0 > > Attachments: HDFS-2725.patch, HDFS-2725.patch > > > hdfs script does not print the command "dfs" in the usage. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2674) hdfs script does not work out of the box.
[ https://issues.apache.org/jira/browse/HDFS-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210979#comment-13210979 ]

Steve Loughran commented on HDFS-2674:
--------------------------------------

- The failing test is (clearly) independent.
- The patch needs review by someone who understands the scripts.

> hdfs script does not work out of the box.
> -----------------------------------------
>
> Key: HDFS-2674
> URL: https://issues.apache.org/jira/browse/HDFS-2674
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Priority: Critical
> Fix For: 0.24.0
> Attachments: HDFS-2674.diff
>
> As the title says, hadoop-config.sh doesn't add the hadoop-common jars, which makes the hdfs script fail.
> To repro, follow the instructions from http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
> {code}
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:24 [0 jobs] [hist 1889]
> $ export HADOOP_COMMON_HOME=$(pwd)/$(ls -d hadoop-common-project/hadoop-common/target/hadoop-common-*-SNAPSHOT)
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:29 [0 jobs] [hist 1890]
> $ export HADOOP_HDFS_HOME=$(pwd)/$(ls -d hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*-SNAPSHOT)
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:36 [0 jobs] [hist 1891]
> $ export PATH=$HADOOP_COMMON_HOME/bin:$HADOOP_HDFS_HOME/bin:$PATH
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:42 [0 jobs] [hist 1892]
> $ cat > $HADOOP_COMMON_HOME/etc/hadoop/core-site.xml << EOF
> >
> > <configuration>
> > <property>
> > <name>fs.default.name</name>
> > <value>hdfs://localhost/</value>
> > </property>
> > </configuration>
> > EOF
> ivank@spokegrown-lm ~/src/hadoop-common Tue Dec 13 19:14:51 [0 jobs] [hist 1893]
> $ hdfs namenode -format
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/HadoopIllegalArgumentException
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.HadoopIllegalArgumentException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> {code}
[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-2966: - Status: Patch Available (was: In Progress) > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2966.patch > > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210972#comment-13210972 ] Steve Loughran commented on HDFS-2966: -- The patch applies a sleep (for the same delay as before), then a poll+sleep for a limited set of retries before giving up. Provided the assertions are failing on the exit of the wait cycle, rather than on the initial state of the tests, this polling should significantly reduce the probability of failure under load. To reassure anyone worried that this polling would slow down the test run on a machine not under load, this does not appear to be the case. Before: {code} Running org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 47.042 sec {code} On two runs after making changes, the elapsed times were 46.709 sec and 42.995 sec. This implies it takes about the same time. > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2966.patch > > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. 
> Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
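The sleep-then-poll approach described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of HDFS-2966.patch: the class name SleepThenPoll, the method await, and all of its parameters are hypothetical, and the condition stands in for whatever metric check the test performs.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; an illustration only, not the code in HDFS-2966.patch. */
final class SleepThenPoll {
    private SleepThenPoll() {}

    /**
     * Sleeps for initialDelayMs, then checks the condition up to maxRetries
     * times, sleeping pollIntervalMs after each failed check.
     *
     * @return true if the condition held before the retries ran out,
     *         false on timeout or if the thread was interrupted
     */
    static boolean await(BooleanSupplier condition, long initialDelayMs,
                         long pollIntervalMs, int maxRetries) {
        try {
            // Fixed up-front delay, so events left over from a previous
            // test have time to drain before the first check.
            Thread.sleep(initialDelayMs);
            for (int attempt = 0; attempt < maxRetries; attempt++) {
                if (condition.getAsBoolean()) {
                    return true;            // state reached, finish early
                }
                Thread.sleep(pollIntervalMs);
            }
            // One last check after the final sleep.
            return condition.getAsBoolean();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve interrupt status
            return false;
        }
    }
}
```

A test would call something like SleepThenPoll.await(() -> metricLooksRight(), 1000, 500, 10) and assert on the result, where metricLooksRight() is again a placeholder. On a lightly loaded machine the loop exits on the first check, so the run time stays close to the original fixed sleep.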
[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-2966: - Attachment: HDFS-2966.patch > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2966.patch > > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Commented] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping
[ https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210971#comment-13210971 ] Hadoop QA commented on HDFS-2492: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515086/HDFS-2492-blockmanager.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1884//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1884//console This message is automatically generated. 
> BlockManager cross-rack replication checks only work for ScriptBasedMapping > --- > > Key: HDFS-2492 > URL: https://issues.apache.org/jira/browse/HDFS-2492 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0, 0.24.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch > > > The BlockManager cross-rack replication checks only work if script files are > used for replication, not if alternate plugins provide the topology > information. > This is because the BlockManager sets its rack checking flag if there is a > filename key > {code} > shouldCheckForEnoughRacks = > conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null; > {code} > yet this filename key is only used if the topology mapper defined by > {code} > DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY > {code} > is an instance of {{ScriptBasedMapping}}. > If any other mapper is used, the system may be multi-rack, but the > BlockManager will not be aware of this fact unless the filename key is set to > something non-null.
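The behaviour described in the issue can be reproduced in isolation with a small sketch. Everything here is an assumption made for illustration: a plain java.util.Map stands in for the Hadoop configuration object, the class RackCheckSketch is hypothetical, the two key strings are assumed values for the DFSConfigKeys constants quoted above, and com.example.MyTopologyPlugin is an invented class name. The method only mirrors the quoted check; it does not show the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the check quoted in HDFS-2492; not Hadoop code. */
final class RackCheckSketch {
    // Assumed string values of the two DFSConfigKeys constants.
    static final String SCRIPT_FILE_NAME_KEY = "net.topology.script.file.name";
    static final String MAPPING_IMPL_KEY = "net.topology.node.switch.mapping.impl";

    private RackCheckSketch() {}

    /** Mirrors the quoted logic: the flag depends only on the script filename key. */
    static boolean shouldCheckForEnoughRacks(Map<String, String> conf) {
        return conf.get(SCRIPT_FILE_NAME_KEY) != null;
    }

    public static void main(String[] args) {
        // A cluster whose topology comes from a plugin, with no script configured.
        Map<String, String> pluginConf = new HashMap<>();
        pluginConf.put(MAPPING_IMPL_KEY, "com.example.MyTopologyPlugin");
        // The plugin may well report several racks, but the flag stays off.
        System.out.println("plugin only: " + shouldCheckForEnoughRacks(pluginConf));

        // Setting the script key turns the flag on, whatever the mapper is.
        pluginConf.put(SCRIPT_FILE_NAME_KEY, "/etc/hadoop/topology.sh");
        System.out.println("script key set: " + shouldCheckForEnoughRacks(pluginConf));
    }
}
```

The point of the sketch is that the mapper implementation key never takes part in the decision, which is why a non-script mapper leaves rack checking disabled.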
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210969#comment-13210969 ] Steve Loughran commented on HDFS-2966: -- A problem here is that the tests are not independent - the fs events from the previous test can still be trickling through the filesystem when the next test starts running. A simple poll/sleep cycle actually behaves worse, because it can exit too early; the state of the previous test is still there and the more recent changes aren't yet in the metrics. A sleep followed by a poll cycle would appear to be a better process, though it may still have problems under load that only moving to per-test mini HDFS clusters would fix. > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished. -- This message is automatically generated by JIRA. 
[jira] [Assigned] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HDFS-2966: Assignee: Steve Loughran > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Work started] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-2966 started by Steve Loughran. > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished.
[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented
[ https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210959#comment-13210959 ] Hudson commented on HDFS-2969: -- Integrated in Hadoop-Mapreduce-trunk #994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/]) HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by Todd Lipcon. (Revision 1245830) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245830 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java > ExtendedBlock.equals is incorrectly implemented > --- > > Key: HDFS-2969 > URL: https://issues.apache.org/jira/browse/HDFS-2969 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Fix For: 0.24.0, 0.23.2 > > Attachments: hdfs-2969.txt > > > The {{ExtendedBlock.equals}} method incorrectly returns true for any two > blocks in the same block pool, regardless of block ID. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
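For readers unfamiliar with this kind of bug, a minimal sketch of an equals/hashCode pair that takes both the block pool and the block ID into account is shown below. The class BlockKey and its two fields are hypothetical and much simpler than the real ExtendedBlock; this is not the committed fix, only an illustration of the property the issue says was violated.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-in for a (block pool, block ID) pair. */
final class BlockKey {
    private final String poolId;
    private final long blockId;

    BlockKey(String poolId, long blockId) {
        this.poolId = poolId;
        this.blockId = blockId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BlockKey)) {
            return false;
        }
        BlockKey other = (BlockKey) o;
        // Both parts must match; comparing the pool alone would make every
        // block in the same pool look equal, which is the reported symptom.
        return blockId == other.blockId && Objects.equals(poolId, other.poolId);
    }

    @Override
    public int hashCode() {
        // Keep hashCode consistent with equals by hashing the same fields.
        return Objects.hash(poolId, blockId);
    }
}
```

With this definition, two keys in the same pool with different block IDs compare unequal, while two keys with the same pool and ID are equal and share a hash code, so they behave correctly as map keys or set members.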
[jira] [Commented] (HDFS-2968) Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery
[ https://issues.apache.org/jira/browse/HDFS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210958#comment-13210958 ] Hudson commented on HDFS-2968: -- Integrated in Hadoop-Mapreduce-trunk #994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/]) HDFS-2968. Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery. Contributed by Todd Lipcon. (Revision 1245832) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245832 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockRecoveryCommand.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java > Protocol translator for BlockRecoveryCommand broken when multiple blocks need > recovery > -- > > Key: HDFS-2968 > URL: https://issues.apache.org/jira/browse/HDFS-2968 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, name-node >Affects Versions: 0.24.0, 0.23.2 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.24.0 > > Attachments: hdfs-2968.txt > > > If there are multiple blocks to be recovered, it ends up translating to N > copies of the first block instead of the N different blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
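The symptom in the description, N copies of the first block instead of N different blocks, is typical of a translation loop that reads the wrong element. The sketch below shows one way such a bug can arise and how a per-element loop avoids it. It is an assumption-laden illustration: the class TranslateSketch, both methods, and the "blk_" string form are hypothetical, and the notification does not say what the actual defect in the translator was.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the symptom in HDFS-2968; not Hadoop code. */
final class TranslateSketch {
    private TranslateSketch() {}

    /** Buggy variant: loops N times but always converts element 0. */
    static List<String> translateBuggy(List<Long> blockIds) {
        List<String> out = new ArrayList<>(blockIds.size());
        for (int i = 0; i < blockIds.size(); i++) {
            out.add("blk_" + blockIds.get(0));   // wrong index
        }
        return out;
    }

    /** Corrected variant: converts each element exactly once, in order. */
    static List<String> translate(List<Long> blockIds) {
        List<String> out = new ArrayList<>(blockIds.size());
        for (Long id : blockIds) {
            out.add("blk_" + id);
        }
        return out;
    }
}
```

A test that only ever passes a single block cannot tell the two variants apart, so a regression test for this kind of translator needs at least two distinct blocks.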
[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented
[ https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210950#comment-13210950 ] Hudson commented on HDFS-2969: -- Integrated in Hadoop-Mapreduce-0.23-Build #200 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/200/]) HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by Todd Lipcon. (Revision 1245829) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245829 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java > ExtendedBlock.equals is incorrectly implemented > --- > > Key: HDFS-2969 > URL: https://issues.apache.org/jira/browse/HDFS-2969 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Fix For: 0.24.0, 0.23.2 > > Attachments: hdfs-2969.txt > > > The {{ExtendedBlock.equals}} method incorrectly returns true for any two > blocks in the same block pool, regardless of block ID. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping
[ https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-2492: - Attachment: HDFS-2492-blockmanager.patch > BlockManager cross-rack replication checks only work for ScriptBasedMapping > --- > > Key: HDFS-2492 > URL: https://issues.apache.org/jira/browse/HDFS-2492 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0, 0.24.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch > > > The BlockManager cross-rack replication checks only work if script files are > used for replication, not if alternate plugins provide the topology > information. > This is because the BlockManager sets its rack checking flag if there is a > filename key > {code} > shouldCheckForEnoughRacks = > conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null; > {code} > yet this filename key is only used if the topology mapper defined by > {code} > DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY > {code} > is an instance of {{ScriptBasedMapping}}. > If any other mapper is used, the system may be multi-rack, but the > BlockManager will not be aware of this fact unless the filename key is set to > something non-null.
[jira] [Updated] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping
[ https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-2492: - Status: Patch Available (was: In Progress) > BlockManager cross-rack replication checks only work for ScriptBasedMapping > --- > > Key: HDFS-2492 > URL: https://issues.apache.org/jira/browse/HDFS-2492 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0, 0.24.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, > HDFS-2492-blockmanager.patch > > > The BlockManager cross-rack replication checks only work if script files are > used for replication, not if alternate plugins provide the topology > information. > This is because the BlockManager sets its rack checking flag if there is a > filename key > {code} > shouldCheckForEnoughRacks = > conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null; > {code} > yet this filename key is only used if the topology mapper defined by > {code} > DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY > {code} > is an instance of {{ScriptBasedMapping}}. > If any other mapper is used, the system may be multi-rack, but the > BlockManager will not be aware of this fact unless the filename key is set to > something non-null.
[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented
[ https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210921#comment-13210921 ] Hudson commented on HDFS-2969: -- Integrated in Hadoop-Hdfs-0.23-Build #172 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/172/]) HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by Todd Lipcon. (Revision 1245829) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245829 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java > ExtendedBlock.equals is incorrectly implemented > --- > > Key: HDFS-2969 > URL: https://issues.apache.org/jira/browse/HDFS-2969 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Fix For: 0.24.0, 0.23.2 > > Attachments: hdfs-2969.txt > > > The {{ExtendedBlock.equals}} method incorrectly returns true for any two > blocks in the same block pool, regardless of block ID. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2952) HA: NN should not start with upgrade option or with a pending an unfinalized upgrade
[ https://issues.apache.org/jira/browse/HDFS-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210916#comment-13210916 ] Hudson commented on HDFS-2952: -- Integrated in Hadoop-Hdfs-HAbranch-build #81 (See [https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/81/]) HDFS-2952. NN should not start with upgrade option or with a pending an unfinalized upgrade. Contributed by Aaron T. Myers. (Revision 1245875) Result = UNSTABLE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245875 Files : * /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDFSUpgradeWithHA.java > HA: NN should not start with upgrade option or with a pending an unfinalized > upgrade > > > Key: HDFS-2952 > URL: https://issues.apache.org/jira/browse/HDFS-2952 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, name-node >Affects Versions: HA branch (HDFS-1623) >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: HA branch (HDFS-1623) > > Attachments: HDFS-2952-HDFS-1623.patch, HDFS-2952-HDFS-1623.patch > > > For simplicity, we should require that upgrades be done with HA disabled. We > might support this in future versions. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-2969) ExtendedBlock.equals is incorrectly implemented
[ https://issues.apache.org/jira/browse/HDFS-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210912#comment-13210912 ] Hudson commented on HDFS-2969: -- Integrated in Hadoop-Hdfs-trunk #959 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/]) HDFS-2969. ExtendedBlock.equals is incorrectly implemented. Contributed by Todd Lipcon. (Revision 1245830) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245830 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestExtendedBlock.java > ExtendedBlock.equals is incorrectly implemented > --- > > Key: HDFS-2969 > URL: https://issues.apache.org/jira/browse/HDFS-2969 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Critical > Fix For: 0.24.0, 0.23.2 > > Attachments: hdfs-2969.txt > > > The {{ExtendedBlock.equals}} method incorrectly returns true for any two > blocks in the same block pool, regardless of block ID. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2968) Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery
[ https://issues.apache.org/jira/browse/HDFS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210911#comment-13210911 ] Hudson commented on HDFS-2968: -- Integrated in Hadoop-Hdfs-trunk #959 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/]) HDFS-2968. Protocol translator for BlockRecoveryCommand broken when multiple blocks need recovery. Contributed by Todd Lipcon. (Revision 1245832) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245832 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockRecoveryCommand.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java > Protocol translator for BlockRecoveryCommand broken when multiple blocks need > recovery > -- > > Key: HDFS-2968 > URL: https://issues.apache.org/jira/browse/HDFS-2968 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node, name-node >Affects Versions: 0.24.0, 0.23.2 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.24.0 > > Attachments: hdfs-2968.txt > > > If there are multiple blocks to be recovered, it ends up translating to N > copies of the first block instead of the N different blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210888#comment-13210888 ] Steve Loughran commented on HDFS-2966: -- My planned solution to this is to move from sleep-then-assert to sleep-poll-repeat for a longer period of time. If the state is reached sooner, the test finishes earlier, but if the machine is overloaded the test will stretch out. This may make it faster on some machines, as well as less brittle on others. > TestNameNodeMetrics tests can fail under load > - > > Key: HDFS-2966 > URL: https://issues.apache.org/jira/browse/HDFS-2966 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 > Environment: OS/X running intellij IDEA, firefox, winxp in a > virtualbox. >Reporter: Steve Loughran >Priority: Minor > > I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of > running the HDFS tests on a desktop without enough memory for all the > programs trying to run. Things got swapped out and the tests failed as the DN > heartbeats didn't come in on time. > The tests both rely on {{waitForDeletion()}} to block the tests until the > delete operation has completed, but all it does is sleep for the same number > of seconds as there are datanodes. This is too brittle - it may work on a > lightly-loaded system, but not on a system under heavy load where it is > taking longer to replicate than expected. > Immediate fix: double, triple, the sleep time? > Better fix: have the thread block until all the DN heartbeats have finished.