[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605950#comment-14605950 ]

Jian Fang commented on HDFS-1623:
---------------------------------

Created a JIRA HDFS-8693 to track the issue.

> High Availability Framework for HDFS NN
> ---------------------------------------
>
>                 Key: HDFS-1623
>                 URL: https://issues.apache.org/jira/browse/HDFS-1623
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Sanjay Radia
>             Fix For: 2.0.0-alpha
>
>         Attachments: HA-tests.pdf, HDFS-1623.rel23.patch,
> HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf,
> NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv,
> ha-testplan.pdf, ha-testplan.tex
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Auto-Re: [jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
Your email has been received! Thank you!
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605949#comment-14605949 ]

Jian Fang commented on HDFS-1623:
---------------------------------

Added a new JIRA to track the issue.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605934#comment-14605934 ]

Jian Fang commented on HDFS-1623:
---------------------------------

Could someone please respond to this issue? Supporting a new name node on a replacement instance is critical for auto-provisioning a Hadoop cluster with HDFS HA support in the cloud. Without this support, the HA feature cannot really be used.

I also observed that the new standby name node on the replacement instance can get stuck in safe mode because no data nodes check in with it. Even with a rolling restart, it may take quite some time to restart all data nodes in a big cluster, for example one with 4000 data nodes. Besides, restarting DNs is far too intrusive and is not a preferred operation in production. It also increases the chance of a double failure, because the standby name node is not really ready for a failover if the current active name node fails.

This is really a big issue. Please at least give us some pointers on why it is difficult to support adding a new standby to a running DN, and on what we need to pay attention to if we have to implement this ourselves.

Thanks again.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604495#comment-14604495 ]

Jian Fang commented on HDFS-1623:
---------------------------------

Sorry for commenting on a resolved JIRA. After replacing one name node with a new one, I tried to run the command

    hdfs dfsadmin -refreshNamenodes datanode-host:port

to refresh the name nodes on the data nodes, so that I would not need to restart the data nodes. However, I got the following error:

    refreshNamenodes: HA does not currently support adding a new standby to a running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.

I checked the 2.6.0 code, and the error is thrown by the following code snippet, which led me to this JIRA.

    void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException {
      Set<InetSocketAddress> oldAddrs = Sets.newHashSet();
      for (BPServiceActor actor : bpServices) {
        oldAddrs.add(actor.getNNSocketAddress());
      }
      Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs);

      if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) {
        // Keep things simple for now -- we can implement this at a later date.
        throw new IOException(
            "HA does not currently support adding a new standby to a running DN. " +
            "Please do a rolling restart of DNs to reconfigure the list of NNs.");
      }
    }

It looks like this JIRA should not have been closed, because some work here is still incomplete. Is there another JIRA that tracks this issue, and how can I work around this problem?

Thanks in advance.
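For illustration, the check quoted in the comment above can be reproduced outside Hadoop. This is only a minimal sketch under stated assumptions, not the DataNode code: the class name RefreshCheckSketch, the helper symmetricDifference, and the host names are hypothetical, and plain java.util sets stand in for Guava's Sets.symmetricDifference. It shows why replacing one name node, which removes one address and adds another, makes the symmetric difference non-empty and so triggers the exception.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; not part of Hadoop.
class RefreshCheckSketch {

    // Returns the elements that are in exactly one of the two sets.
    static <T> Set<T> symmetricDifference(Set<T> a, Set<T> b) {
        Set<T> onlyInOne = new HashSet<>(a);
        onlyInOne.addAll(b);              // union of both sets
        Set<T> inBoth = new HashSet<>(a);
        inBoth.retainAll(b);              // intersection of both sets
        onlyInOne.removeAll(inBoth);      // union minus intersection
        return onlyInOne;
    }

    // Same rule as the quoted snippet: reject any change to the NN address set.
    static void refreshNNList(Set<InetSocketAddress> oldAddrs,
                              List<InetSocketAddress> addrs) throws IOException {
        Set<InetSocketAddress> newAddrs = new HashSet<>(addrs);
        if (!symmetricDifference(oldAddrs, newAddrs).isEmpty()) {
            throw new IOException(
                "HA does not currently support adding a new standby to a running DN. "
                + "Please do a rolling restart of DNs to reconfigure the list of NNs.");
        }
    }

    public static void main(String[] args) {
        // Made-up host names; createUnresolved avoids any DNS lookup.
        InetSocketAddress nn1 = InetSocketAddress.createUnresolved("nn1.example.com", 8020);
        InetSocketAddress nn2 = InetSocketAddress.createUnresolved("nn2.example.com", 8020);
        InetSocketAddress nn3 = InetSocketAddress.createUnresolved("nn3.example.com", 8020);
        Set<InetSocketAddress> current = new HashSet<>(Arrays.asList(nn1, nn2));

        try {
            refreshNNList(current, Arrays.asList(nn2, nn1)); // same set, different order
            System.out.println("unchanged list: accepted");
            refreshNNList(current, Arrays.asList(nn1, nn3)); // nn2 replaced by nn3
            System.out.println("replaced standby: accepted");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the first call passes because only the order differs, while the second call is rejected because nn2 and nn3 each appear in exactly one of the two sets.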
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229715#comment-13229715 ] Suresh Srinivas commented on HDFS-1623: --- BTW in the meeting minutes, in list of attendees, I left out Konstantin Shvachko, Colin McCabe and Mayank Bansal. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229647#comment-13229647 ]

Todd Lipcon commented on HDFS-1623:
-----------------------------------

bq. From experience, I suspect getting the code for IP based automated failover working in a multi-SBN approach to where the community is ready to see it as a stable codebase/release will probably take some more investment.

Yep, I didn't mean to indicate that the whole project is trivial. Just that the majority of issues we had to solve for the current failover implementation also apply to the IP-failover implementation. The remaining work to do IP-failover is much smaller than the amount already behind us.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229632#comment-13229632 ]

Avik Dey commented on HDFS-1623:
--------------------------------

Hi Todd, I am sure some of the pieces exist. From experience, I suspect getting the code for IP based automated failover working in a multi-SBN approach to where the community is ready to see it as a stable codebase/release will probably take some more investment. :-)

When we are there, it would be good to have both the attached design documents and the test plans and test cases updated to reflect the use of multi-SBN for IP based automated HA failover.
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229534#comment-13229534 ]

Todd Lipcon commented on HDFS-1623:
-----------------------------------

Hi Avik. The multi-SBN approach is actually very nearly implemented by what's in trunk. The main missing pieces are actually in the more trivial bits -- for example, we have a few bits of the code that look up "the other node" for operations like triggering checkpoints. Those would have to be modified a bit to "look up the active node" instead. But nothing fundamental or really difficult.
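The difference between looking up "the other node" and looking up "the active node" can be sketched in a few lines. This is a hypothetical illustration only, not code from trunk: the class ActiveLookupSketch, the State enum, and the node IDs are made up for the example. The first helper only works when there are exactly two NNs, while the second works for any number of standbys.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-alone sketch; not part of Hadoop.
class ActiveLookupSketch {

    // Simplified HA state for the example.
    enum State { ACTIVE, STANDBY }

    // Two-node assumption: "the other node" is whichever ID is not ours.
    static String otherNode(List<String> nnIds, String self) {
        if (nnIds.size() != 2 || !nnIds.contains(self)) {
            throw new IllegalStateException("expected exactly two NNs including " + self);
        }
        return nnIds.get(0).equals(self) ? nnIds.get(1) : nnIds.get(0);
    }

    // N-node version: find whichever node currently reports ACTIVE, if any.
    static Optional<String> activeNode(Map<String, State> states) {
        return states.entrySet().stream()
                .filter(e -> e.getValue() == State.ACTIVE)
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        // With two NNs, a standby can simply pick the other one.
        System.out.println(otherNode(Arrays.asList("nn1", "nn2"), "nn1"));

        // With several standbys, it has to ask which node is active.
        Map<String, State> states = new LinkedHashMap<>();
        states.put("nn1", State.STANDBY);
        states.put("nn2", State.ACTIVE);
        states.put("nn3", State.STANDBY);
        System.out.println(activeNode(states).orElse("no active NN"));
    }
}
```

Returning an Optional makes the case where no node is active (for example, in the middle of a failover) explicit for the caller.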
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229499#comment-13229499 ]

Avik Dey commented on HDFS-1623:
--------------------------------

Suresh - Thanks for sharing the notes from this discussion on HDFS Namenode HA. It is good to see that we are embracing both the IP-based NN failover solution and a replacement for the NFS-based solution through the BN implementation.

From this thread, as well as from reviewing the JIRAs, most of the current thinking on failover seems to be around a single SBN. Keeping in mind typical goals for HA in an enterprise environment and the redundancy requirements around it, I would like to suggest that the community consider at least a dual-SBN solution.

Ideally, of course, the HA solution would allow for N SBNs (N = 0 to x), driven by the HA needs of the specific deployment. However, that would possibly require an additional design and implementation for a leader election mechanism, or some predetermined and configured ordering of SBNs. Implementing such a generic N-SBN solution would probably add some design complexity and push the release of this HA solution further into the future.

The dual-SBN solution, however, should be relatively less involved and easier to implement, leveraging the same approach of VIPs and HDFS-3077 + HDFS-3092, while providing significantly higher redundancy in an HA failover scenario.

Best Regards.
Avik

On Wed, Mar 14, 2012 at 9:40 AM, Suresh Srinivas (Commented) (JIRA) <
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229325#comment-13229325 ]

Suresh Srinivas commented on HDFS-1623:
---------------------------------------

Minutes from HDFS Namenode HA - Next Steps meeting:

We had a meeting to discuss the status of Namenode HA and the remaining work items. Attendees included Aaron Myers, Eli Collins, Hairong Kuang, Jitendra Pandey, Hari Mankude, Pritam Damania, Suresh Srinivas, Sanjay Radia, Tomasz Nykiel, and Todd Lipcon.

The following topics were discussed:

*Clientside failover*
# DFSClient failover - Currently, configuration-based failover is available. We decided that we will consider including ZK- and DNS-based failover, along the lines of configuration-based failover. See HDFS-2839 for details.
# IP failover was discussed, and we decided that it is an option that will be added.
# There was discussion around using NameServiceID as the logical URL. We need to use an appropriate abstraction here. This discussion will continue on HDFS-2839.

*Failover Controller*
In the HDFS-1623 design, the failover controller is a separate process. We discussed whether we should incorporate it within the NN for now. The decision was to continue with the design from HDFS-1623 and keep it as a separate process.

*Use of BackupNode and Journal protocol*
The current HA implementation uses NFS shared storage. To eliminate this need, daemons based on the Journal protocol that receive streaming edits from the active namenode could be used. Some activity around using this in the standby namenode, and around running such standalone daemons, is starting. See HDFS-3077 and HDFS-3092 for details.

Folks, please add anything I missed.
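The idea behind the configuration-based client failover mentioned in the minutes can be sketched as follows. This is a simplified illustration under stated assumptions, not Hadoop's FailoverProxyProvider or retry machinery: the class ConfiguredFailoverSketch, the Call interface, and the addresses are hypothetical, and a real client would also apply retry and back-off policies. The client holds a fixed, configured list of NameNode addresses and moves on to the next one when a call fails.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of Hadoop.
class ConfiguredFailoverSketch {

    // One remote call against a single NameNode address.
    interface Call<T> {
        T invoke(String address) throws IOException;
    }

    // Tries each configured address in order and returns the first successful result.
    static <T> T callWithFailover(List<String> addresses, Call<T> call) throws IOException {
        IOException last = null;
        for (String address : addresses) {
            try {
                return call.invoke(address);
            } catch (IOException e) {
                last = e; // remember the failure and fail over to the next address
            }
        }
        throw new IOException("no configured NameNode answered", last);
    }

    public static void main(String[] args) throws IOException {
        // Made-up addresses standing in for the configured list of NNs.
        List<String> configured = Arrays.asList("nn1.example.com:8020", "nn2.example.com:8020");
        String reply = callWithFailover(configured, address -> {
            if (address.startsWith("nn1")) {
                throw new IOException(address + " is in standby state");
            }
            return "served by " + address;
        });
        System.out.println(reply);
    }
}
```

ZK- or DNS-based failover, as discussed in the meeting, would differ mainly in where the list of candidate addresses comes from, not in this try-the-next-one loop.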
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227535#comment-13227535 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk #1017 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1017/]) Moving HDFS-1623 and HADOOP-7454 to 0.23.3 section in CHANGES.txt files (Revision 1299417) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299417 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227498#comment-13227498 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-0.23-Build #223 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/223/]) HDFS-1623. Merging change r1296534 from trunk to 0.23 (Revision 1299412) Result = FAILURE suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299412 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtocolTranslator.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/h
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227475#comment-13227475 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-0.23-Build #195 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/195/]) HDFS-1623. Merging change r1296534 from trunk to 0.23 (Revision 1299412) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299412 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtocolTranslator.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-comm
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227469#comment-13227469 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk #982 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/982/]) Moving HDFS-1623 and HADOOP-7454 to 0.23.3 section in CHANGES.txt files (Revision 1299417) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299417 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227200#comment-13227200 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1873 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1873/]) Moving HDFS-1623 and HADOOP-7454 to 0.23.3 section in CHANGES.txt files (Revision 1299417) Result = ABORTED suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299417 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227196#comment-13227196 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-0.23-Commit #678 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/678/]) HDFS-1623. Merging change r1296534 from trunk to 0.23 (Revision 1299412) Result = ABORTED suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299412 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java 
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtocolTranslator.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/branches/branch-0.23/hadoop-common-project
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227182#comment-13227182 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk-Commit #1939 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1939/]) Moving HDFS-1623 and HADOOP-7454 to 0.23.3 section in CHANGES.txt files (Revision 1299417) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299417 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227184#comment-13227184 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Common-trunk-Commit #1864 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1864/]) Moving HDFS-1623 and HADOOP-7454 to 0.23.3 section in CHANGES.txt files (Revision 1299417) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299417 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0, 0.23.3 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227123#comment-13227123 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Common-0.23-Commit #670 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/670/]) HDFS-1623. Merging change r1296534 from trunk to 0.23 (Revision 1299412) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299412 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtocolTranslator.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoo
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227122#comment-13227122 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-0.23-Commit #661 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/661/]) HDFS-1623. Merging change r1296534 from trunk to 0.23 (Revision 1299412) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299412 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtocolTranslator.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-co
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226273#comment-13226273 ] Eli Collins commented on HDFS-1623: --- +1 branch 23 patch looks good to me > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, > HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, > NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, > ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224305#comment-13224305 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk #1012 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1012/]) MAPREDUCE-3974. TestSubmitJob in MR1 tests doesn't compile after HDFS-1623 merge. Contributed by Aaron T. Myers. (Revision 1297662) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297662 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestSubmitJob.java > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224266#comment-13224266 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk #977 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/977/]) MAPREDUCE-3974. TestSubmitJob in MR1 tests doesn't compile after HDFS-1623 merge. Contributed by Aaron T. Myers. (Revision 1297662) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297662 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestSubmitJob.java > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223603#comment-13223603 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1849 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1849/]) MAPREDUCE-3974. TestSubmitJob in MR1 tests doesn't compile after HDFS-1623 merge. Contributed by Aaron T. Myers. (Revision 1297662) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297662 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestSubmitJob.java > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223594#comment-13223594 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Common-trunk-Commit #1842 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1842/]) MAPREDUCE-3974. TestSubmitJob in MR1 tests doesn't compile after HDFS-1623 merge. Contributed by Aaron T. Myers. (Revision 1297662) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297662 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestSubmitJob.java > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223592#comment-13223592 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk-Commit #1916 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1916/]) MAPREDUCE-3974. TestSubmitJob in MR1 tests doesn't compile after HDFS-1623 merge. Contributed by Aaron T. Myers. (Revision 1297662) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297662 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestSubmitJob.java > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia > Fix For: 0.24.0 > > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221585#comment-13221585 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk #1008 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1008/]) Fix up CHANGES.txt files post commit of HDFS-1623 and HADOOP-7454. (Revision 1296540) HDFS-1623. High Availability Framework for HDFS NN. Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan (Revision 1296534) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296540 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296534 Files : * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPo
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221569#comment-13221569 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk #973 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/973/]) Fix up CHANGES.txt files post commit of HDFS-1623 and HADOOP-7454. (Revision 1296540) HDFS-1623. High Availability Framework for HDFS NN. Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan (Revision 1296534) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296540 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296534 Files : * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicy.java *
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221456#comment-13221456 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1831 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1831/]) Fix up CHANGES.txt files post commit of HDFS-1623 and HADOOP-7454. (Revision 1296540) HDFS-1623. High Availability Framework for HDFS NN. Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan (Revision 1296534) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296540 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296534 Files : * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221450#comment-13221450 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Hdfs-trunk-Commit #1898 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1898/]) Fix up CHANGES.txt files post commit of HDFS-1623 and HADOOP-7454. (Revision 1296540) HDFS-1623. High Availability Framework for HDFS NN. Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan (Revision 1296534) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296540 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296534 Files : * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/Ret
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221440#comment-13221440 ] Hudson commented on HDFS-1623: -- Integrated in Hadoop-Common-trunk-Commit #1824 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1824/]) Fix up CHANGES.txt files post commit of HDFS-1623 and HADOOP-7454. (Revision 1296540) HDFS-1623. High Availability Framework for HDFS NN. Contributed by Todd Lipcon, Aaron T. Myers, Eli Collins, Uma Maheswara Rao G, Bikas Saha, Suresh Srinivas, Jitendra Nath Pandey, Hari Mankude, Brandon Li, Sanjay Radia, Mingjie Lai, and Gregory Chanan (Revision 1296534) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296540 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1296534 Files : * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.HDFS-1623.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/dev-support/findbugsExcludeFile.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/BadFencingConfigurationException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FailoverFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/FenceMethod.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocolHelper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HealthCheckFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/NodeFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ServiceFailedException.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ShellCommandFencer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/StreamPumper.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/protocolPB/HAServiceProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/DefaultFailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/FailoverProxyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218813#comment-13218813 ] Todd Lipcon commented on HDFS-1623: --- With HDFS-3020, HDFS-3023, HDFS-3024, and HDFS-3025, I got the following timings:
{code}
teragen-4MB-block-ha-off-1.txt: Total time spent by all maps in occupied slots (ms)=255005244
teragen-4MB-block-ha-off-2.txt: Total time spent by all maps in occupied slots (ms)=248092620
teragen-4MB-block-ha-off-3.txt: Total time spent by all maps in occupied slots (ms)=256926353
teragen-4MB-block-ha-off-4.txt: Total time spent by all maps in occupied slots (ms)=244320729
teragen-4MB-block-ha-off-5.txt: Total time spent by all maps in occupied slots (ms)=248901067
teragen-4MB-block-ha-off-6.txt: Total time spent by all maps in occupied slots (ms)=234409970
teragen-4MB-block-ha-off-7.txt: Total time spent by all maps in occupied slots (ms)=224624077
teragen-4MB-block-ha-off-8.txt: Total time spent by all maps in occupied slots (ms)=235166437
teragen-4MB-block-trunk-1.txt: Total time spent by all maps in occupied slots (ms)=247575318
teragen-4MB-block-trunk-2.txt: Total time spent by all maps in occupied slots (ms)=234090512
teragen-4MB-block-trunk-3.txt: Total time spent by all maps in occupied slots (ms)=241264032
teragen-4MB-block-trunk-4.txt: Total time spent by all maps in occupied slots (ms)=242941073
teragen-4MB-block-trunk-5.txt: Total time spent by all maps in occupied slots (ms)=236123386
teragen-4MB-block-trunk-6.txt: Total time spent by all maps in occupied slots (ms)=243662148
teragen-4MB-block-trunk-7.txt: Total time spent by all maps in occupied slots (ms)=240128084
teragen-4MB-block-trunk-8.txt: Total time spent by all maps in occupied slots (ms)=220212020
{code}
I ran a t-test, which says that the difference in means isn't statistically significant. I'm also running the 256M-block teragen just to be safe. It's not complete yet, but so far the results look good. The optimizations also reduced the edit log size for the 4MB-block case by a factor of two.
So I think once the above JIRAs are committed, we should be fine to merge to trunk. I'll also continue to work on the performance with HA on, but the important issue for the merge is to make sure we don't regress the non-HA case.
> High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
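The significance check mentioned in this comment can be approximated from the posted slot times. The comment does not say which t-test was run, so the sketch below assumes Welch's unequal-variance t-test; the variable and function names are made up for illustration, and only the sixteen timings come from the comment.

```python
from math import sqrt
from statistics import mean, variance

# Slot times (ms) copied from the comment above.
ha_off_ms = [255005244, 248092620, 256926353, 244320729,
             248901067, 234409970, 224624077, 235166437]
trunk_ms = [247575318, 234090512, 241264032, 242941073,
            236123386, 243662148, 240128084, 220212020]


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test.

    Assumption: this is one plausible choice of test, not necessarily
    the one used in the original comment.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, dof = welch_t(ha_off_ms, trunk_ms)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

With about 13 degrees of freedom, the two-sided 5% critical value of the t distribution is roughly 2.16. The statistic here comes out at roughly 1, which agrees with the statement that the difference in means is not statistically significant.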
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218402#comment-13218402 ] Todd Lipcon commented on HDFS-1623: --- Yep, I plan to re-run these benchmarks hopefully today/tomorrow with HDFS-3020, HDFS-3023, HDFS-3024. My guess is that the bug fix in HDFS-3020 will make HA Off a bit slower (we were "cheating" before), but the other optimizations will make HA On a bit faster. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218397#comment-13218397 ] Suresh Srinivas commented on HDFS-1623: --- > The difference between HA-on and HA-off is that the HA-on mode actually > fsyncs all of these block allocations. Should the benchmark be re-run with HDFS-3020? It might bring HA On close to HA Off. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, dfsio-results.tsv, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215309#comment-13215309 ] Todd Lipcon commented on HDFS-1623: --- I've completed a round of preliminary performance testing on a 100-node cluster, each node with 48G RAM, gigabit ethernet, 6 disks, and dual quad core with hyperthreading. The three builds were trunk (rev 5b8bdf3c11fc69a9076e2b1e3385ecd179975c7d), HA branch (tip), and the same HA branch but with HA turned off. The HA NN is configured to log to an NFS mount, which unfortunately is just a normal Linux NFS box in this case, in the same rack. It was also unfortunately running a DN, though I didn't realize it until I was halfway through the benchmarks, and didn't want to start over from scratch.
I ran the TestDFSIO benchmarks as suggested by Konstantin, first with 95 files and then with 380 files. Each file is 5GB. I will attach the results as a TSV file momentarily. The overall summary is that there seems to be a ~4% hit for turning on HA for the write benchmark. The read benchmark has too high variability to really be sure. Even the write one doesn't seem that consistent, given that the HA branch, when HA was disabled, actually went some 2.5% faster than trunk. (These numbers are from the 380-file case.)
I also ran teragen in two different scenarios. The first scenario is a realistic workload (256M blocks):
|| Build || Runtime || Slot time ||
|| HA On || 8m5s | 3682m |
|| HA Off || 8m12s | 3756m |
|| Trunk || 7m10s | 3163m |
Here there seems to be a bad performance degradation from the HA branch (about 20%).
The second workload was to try to stress the system by setting block size to 4MB (resulting in ~800-1000 block allocations/second):
|| Build || Runtime || Slot time || Edit log size ||
|| HA On || 12m24s | 6655m | 1.2GB |
|| HA Off || 8m40s | 4469m | 1.2GB |
|| Trunk || 7m4s | 3375m | 6.2MB |
Note here the much bigger degradation. I also included the size of the edit logs in this latter benchmark.
I'm pretty confident from looking at jstacks while this was running that the bad performance is due to the new "persistBlocks" calls done in the HA branch. We used to be sloppy about persisting blocks, whereas now we actually write down all of the block allocations as they proceed. The 4MB block case was much worse, since each file being written by teragen consisted of 423 blocks. The persistBlocks calls towards the end of the file were logging several KB worth of data, and this resulted in a very large edit log, as you can see in the table above. The difference between HA-on and HA-off is that the HA-on mode actually fsyncs all of these block allocations.
So before we merge I think we should do a bit of optimization in this area. I will file a JIRA this evening or tomorrow with a couple of easy wins.
> High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, ha-testplan.pdf, ha-testplan.tex
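The relative slowdowns in the two teragen tables above can be worked out directly from the posted numbers. This is a small arithmetic sketch only: the dictionary layout and the helper names `seconds` and `overhead_pct` are illustrative, and the figures are copied from the tables.

```python
import re

# (runtime, slot time in minutes) per build, copied from the tables above.
teragen = {
    "256M blocks": {"HA On": ("8m5s", 3682), "HA Off": ("8m12s", 3756),
                    "Trunk": ("7m10s", 3163)},
    "4MB blocks": {"HA On": ("12m24s", 6655), "HA Off": ("8m40s", 4469),
                   "Trunk": ("7m4s", 3375)},
}


def seconds(runtime: str) -> int:
    """Convert a runtime such as '8m5s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", runtime)
    if match is None:
        raise ValueError(f"unexpected runtime format: {runtime!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def overhead_pct(value: float, baseline: float) -> float:
    """Percentage by which value exceeds baseline."""
    return 100.0 * (value - baseline) / baseline


for workload, builds in teragen.items():
    trunk_runtime, trunk_slot = builds["Trunk"]
    for build in ("HA On", "HA Off"):
        runtime, slot = builds[build]
        print(f"{workload}, {build}: "
              f"runtime +{overhead_pct(seconds(runtime), seconds(trunk_runtime)):.1f}%, "
              f"slot time +{overhead_pct(slot, trunk_slot):.1f}% vs trunk")
```

Comparing slot time as well as wall-clock runtime is a deliberate choice here: slot time sums the work of all map tasks, so it is less sensitive to a single slow task than the job's end-to-end runtime.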
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214778#comment-13214778 ] Aaron T. Myers commented on HDFS-1623: -- bq. Here is a patch that merges current HDFS-1623 branch to trunk. Thanks a lot, Jitendra. This might make it easier for some to review the merge. For anyone else who wants to review it, note that we've been merging from trunk -> HA branch almost daily, so just looking at a diff between trunk and the HA branch should mostly highlight the salient changes for HA. I also strongly recommend that any reviewers who are just now looking at the branch take a look at [the testing plan document attached to this JIRA by Todd|https://issues.apache.org/jira/secure/attachment/12514080/ha-testplan.pdf], which discusses which areas of the code have the most changes in section 1.5. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-1623.trunk.patch, > HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf, > Namenode HA Framework.pdf, ha-testplan.pdf, ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210132#comment-13210132 ] Konstantin Shvachko commented on HDFS-1623: --- I'd recommend 2 series of DFSIO consisting of -write -read and -append in each series and -fileSize = 1 to 10GB. Pick one value for all runs. We want files with multiple blocks. Series 1. -nrFiles = 95 Series 2. -nrFiles = 95*4 I chose 95, which is a bit less than # of nodes (100). And 95*4 - intended to spin 4 drives on most of the nodes if you have 4 drives or more. Don't forget to turn off speculation. And please watch std deviation in the results. In my experience Throughput values don't make sense if std deviation is high. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex
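The two recommended series can be written out as a run matrix. The sketch below only builds the TestDFSIO argument lists using the option names that appear in the comment (-write, -read, -append, -nrFiles, -fileSize). The function name `dfsio_runs`, the default of 5GB, and the way the file size is spelled are assumptions: the accepted -fileSize format depends on the Hadoop version, and the jar path and the setting that disables speculative execution are deliberately left out.

```python
def dfsio_runs(base_files=95, drives_factor=4, file_size="5GB"):
    """Build TestDFSIO argument lists for the two recommended series.

    Series 1 uses base_files files (a bit fewer than the 100 nodes);
    series 2 uses base_files * drives_factor to keep several drives busy
    per node. file_size is a hypothetical value in the suggested 1-10GB
    range; the same value is used for every run, as recommended.
    """
    runs = []
    for nr_files in (base_files, base_files * drives_factor):
        # Write first so that the read and append runs have files to work on.
        for operation in ("-write", "-read", "-append"):
            runs.append([operation, "-nrFiles", str(nr_files),
                         "-fileSize", file_size])
    return runs


for args in dfsio_runs():
    print("TestDFSIO", " ".join(args))
```

Keeping the matrix in one place makes it harder to vary the file size by accident between runs, which would make the series incomparable.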
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209027#comment-13209027 ] Brandon Li commented on HDFS-1623: -- I just ran some benchmark (TestDFSIO) tests on a Yahoo cluster for the HA branch and trunk. The following are the configuration and initial test results.
h3. Test configuration
# 5 nodes in total: 2 namenodes (the non-HA config uses just one) and 3 datanodes.
# The edits log is NFS mounted for both the HA and non-HA setups.
# TestDFSIO is started on one datanode. The resource manager and node manager are also on the same datanode.
h3. Using TestDFSIO write/read 10 files with filesize as 1MB:
||HA (w/r MB/sec)||trunk (w/r MB/sec)||
|9.2/26.5|8.8/25.6|
|8.4/26.4|8.3/25.4|
h3. Using TestDFSIO write/read 10 files with filesize as 1000MB:
||HA (w/r MB/sec)||trunk (w/r MB/sec)||
|15.3/206.4|16.5/205.5|
|15.9/209.4|14.2/204.9|
|15.6/207.2|14.9/209.5|
> High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex
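One way to read the 1000MB write results above is to compare the gap between the two builds with the run-to-run spread. The sketch below uses only the three write throughput values per build from the table; the variable names are illustrative, and the spread computed here is across runs, which is not the same thing as the per-run standard deviation that TestDFSIO itself reports.

```python
from statistics import mean, stdev

# Write throughput (MB/sec) for the 1000MB-file runs, copied from the table above.
ha_write = [15.3, 15.9, 15.6]
trunk_write = [16.5, 14.2, 14.9]

ha_mean, trunk_mean = mean(ha_write), mean(trunk_write)
ha_sd, trunk_sd = stdev(ha_write), stdev(trunk_write)

print(f"HA:    mean {ha_mean:.2f} MB/sec, stdev {ha_sd:.2f}")
print(f"trunk: mean {trunk_mean:.2f} MB/sec, stdev {trunk_sd:.2f}")

# If the spread of either build is larger than the gap between the means,
# three runs are too few to say which build is faster.
gap = abs(ha_mean - trunk_mean)
print("difference within noise:", gap < max(ha_sd, trunk_sd))
```

On these numbers the spread of the trunk runs is larger than the gap between the two means, so the runs do not separate the builds on write throughput.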
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208613#comment-13208613 ] Aaron T. Myers commented on HDFS-1623: -- bq. I think for the first cut we should disallow the NN to start with HA enabled if the upgrade flag is passed, or if there is a pending unfinalized upgrade. Filed: HDFS-2952 > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208599#comment-13208599 ] dhruba borthakur commented on HDFS-1623: > I think for the first cut we should disallow the NN to start with HA enabled > if the upgrade flag is passed, or if there is a pending unfinalized upgrade. Sounds like a fine thing to do, in the interest of making HA part of Apache HDFS trunk sooner rather than later. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208299#comment-13208299 ] Todd Lipcon commented on HDFS-1623: --- Good point about upgrade failover. I think for the first cut we should disallow the NN to start with HA enabled if the upgrade flag is passed, or if there is a pending unfinalized upgrade. Then we can circle back on this as an improvement. Does that seem reasonable? Regarding DFSIO: So far we are doing cluster testing on ~100 node clusters here at Cloudera. If you can recommend a good DFSIO setting we'll be sure to try it (and also try with reduced block size to increase the block-related load on the NN and DNs) > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208269#comment-13208269 ] Konstantin Shvachko commented on HDFS-1623: --- Both test plans look good and complementary. Very glad that benchmarking is a part of it. Will be glad to advise on DFSIO; let me know the cluster size. Pretty good unit test coverage. Liked the idea of testing failover under mixed load of SLive, Terasort, etc. Good plan for testing HBase survival after NN failover. One suggestion: we should test failover during upgrade. We don't want anybody doing it, but somebody will anyway. I see a section on testing upgrade in Hari's document; could you please add a failover in the middle? > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208152#comment-13208152 ] Todd Lipcon commented on HDFS-1623: --- Hi Mingjie, Yes, the first cut doesn't include automatic failover. We plan to implement automatic failover as a follow-up after this is merged to trunk (manual failover addresses a lot of important use cases). > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208138#comment-13208138 ] Mingjie Lai commented on HDFS-1623: --- Regarding the latest 2 test plan documents, I didn't see automatic fail-over getting covered by any case. In the docs, a fail-over will only be triggered by a ``dfs -haadmin'' command. Is it the plan for the first cut? Did I miss anything? > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207293#comment-13207293 ] Hari Mankude commented on HDFS-1623: Initial test plan for HA testing done at Hortonworks included. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HA-tests.pdf, HDFS-High-Availability.pdf, NameNode > HA_v2.pdf, NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, ha-testplan.pdf, > ha-testplan.tex > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204755#comment-13204755 ] Todd Lipcon commented on HDFS-1623: --- Hey Konstantin. What specific performance tests would you like to be run prior to the merge? If you can enumerate some we can be sure to run them. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204739#comment-13204739 ] Konstantin Shvachko commented on HDFS-1623: --- # What test plan has been executed for testing this branch implementing HA? Besides unit tests. # Have you done any benchmarks, comparing current cluster performance against the branch? Would be good to have numbers for both cases with HA off and HA on. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104271#comment-13104271 ] Uma Maheswara Rao G commented on HDFS-1623: --- @Konstantin, I posted the second scenario in HDFS-1971 before (without HDFS-1108). Below is Sanjay's reply. {quote} However your scenario is a good one: a standby processing BRs from DNs and edits from Active NN may get them out of order. This scenario applies to HDFS-1975 that reads from a shared editsLog file system and also for Backup NN. {quote} Is this the scenario (or a similar one) that you are asking about? Please correct me if I have misunderstood. Thanks Uma > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093787#comment-13093787 ] Eli Collins commented on HDFS-1623: --- Hi Konstantin, HDFS-1975 (sharing the namenode state from active to standby) was created back in May and the description clearly states that the "proposed solution in this jira is to use shared storage". It's probably the most appropriate place for this discussion. Thanks, Eli > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093571#comment-13093571 ] Konstantin Shvachko commented on HDFS-1623: --- The discussion in HDFS-1108 revealed that Todd, Suresh and Eli (and probably others) are building an HA approach based on shared storage (NFS filers) journal synchronization. The motivation for this is claimed to be the simplicity of the approach, compared to the direct streaming of edits to the StandbyNode. I think there are 2 main questions that need to be addressed with respect to this: # _Why do you introduce a dependency on enterprise hardware when you run a commodity hardware cluster?_ *People running a 20-node Hadoop cluster will have to spend probably the same amount extra on a filer.* # _How do you address the race condition between NN addBlock and DN blockReceived?_ Explanation: When an HDFS client needs to create a new block, it sends an addBlock() command to the NameNode. The NN (assuming HDFS-1108 is fixed) writes the addBlock transaction to the shared storage. The client writes data to the allocated DataNodes. Each DataNode confirms that it got the replica by sending a blockReceived() message to the NN and the SBN. If blockReceived() is sent to the StandbyNode before it has consumed the addBlock() transaction for this block from shared storage, blockReceived() will be rejected, since the SBN still does not know the block exists. The SBN will eventually learn about that same replica from the next block report, but this can be one hour later. *SBN will be one hour behind the active NN, which is not hot.* > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
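Editorial sketch of the addBlock/blockReceived race described in the comment above. It is a toy model only: the class and method names (`StandbyRaceSketch`, `applyAddBlockEdit`) are hypothetical and are not taken from the HDFS code base. It assumes the standby learns of new blocks solely by consuming edits from shared storage, as the comment describes.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (hypothetical names) of a standby that learns blocks only from the shared edit log. */
class StandbyRaceSketch {
    // Blocks the standby knows about because it has consumed their addBlock edit.
    private final Set<Long> knownBlocks = new HashSet<>();

    /** Called when the standby consumes an addBlock transaction from shared storage. */
    void applyAddBlockEdit(long blockId) {
        knownBlocks.add(blockId);
    }

    /** Called when a DataNode reports a replica; returns false (rejected) if the block is unknown. */
    boolean blockReceived(long blockId) {
        return knownBlocks.contains(blockId);
    }

    public static void main(String[] args) {
        StandbyRaceSketch standby = new StandbyRaceSketch();
        long blockId = 42L;

        // The DataNode's message arrives before the standby has read the edit: rejected.
        boolean acceptedEarly = standby.blockReceived(blockId);

        // The standby catches up on the shared edit log; the same message would now be accepted.
        standby.applyAddBlockEdit(blockId);
        boolean acceptedLate = standby.blockReceived(blockId);

        System.out.println("accepted before edit: " + acceptedEarly);
        System.out.println("accepted after edit:  " + acceptedLate);
    }
}
```

In this model the rejected report is simply lost, which is why the comment argues the standby would lag until the next full block report.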
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080281#comment-13080281 ] Suresh Srinivas commented on HDFS-1623: --- bq. I've created a Fix Version for these tasks called "HA Branch (HDFS-1623)". I also added a CHANGES.HDFS-1623.txt file to the hdfs/ directory in the branch. When you commit things to the branch, please add them there rather than the main CHANGES.txt - this makes the merge work a lot easier! Ditto done for the hadoop-common directory. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079488#comment-13079488 ] Todd Lipcon commented on HDFS-1623: --- I've created a Fix Version for these tasks called "HA Branch (HDFS-1623)". I also added a CHANGES.HDFS-1623.txt file to the {{hdfs/}} directory in the branch. When you commit things to the branch, please add them there rather than the main CHANGES.txt - this makes the merge work a lot easier! > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070652#comment-13070652 ] Todd Lipcon commented on HDFS-1623: --- Hey Dhruba. Check out HDFS-2179 - looks like we're thinking the same thing. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070643#comment-13070643 ] dhruba borthakur commented on HDFS-1623: Hi Yang, The ZK heartbeats and the delivery of notifications are not inline with the HDFS writes to its transaction logs. Assume a scenario where ZK delivers a disconnected event to the NameNode, but the NameNode is already in the midst of flushing a long list of transactions to its transaction logs. This could potentially take a non-trivial amount of time (process scheduling, GC issues, etc). Todd: what is our proposed solution for doing IO fencing on transaction logs that reside on an NFS filer? Here we are proposing that we do the following before we can do an auto failover: 1. If the original NameNode is reachable, kill the original NameNode process and verify that it is killed. 2. If step 1 fails (because of a network connectivity issue), then issue a power-cycle event to the original NameNode machine via its configured console port. Verify that the machine is power-cycled. 3. If step 2 fails, then abort auto failover. Otherwise continue the failover sequence. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
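Editorial sketch of the three-step fencing escalation proposed in the comment above: kill and verify, else power-cycle and verify, else abort. All names here (`FencingSketch`, `Outcome`, `fence`) are hypothetical, and the two fencing actions are passed in as callbacks that are assumed to return true only when the action was both performed and verified. This is not the fencing API of any Hadoop release.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the kill / power-cycle / abort escalation described in the thread. */
class FencingSketch {
    enum Outcome { FENCED_BY_KILL, FENCED_BY_POWER_CYCLE, ABORT_FAILOVER }

    /**
     * Each supplier is assumed to return true only if its action succeeded AND was verified.
     * The power cycle is attempted only when the kill could not be verified.
     */
    static Outcome fence(BooleanSupplier killAndVerify, BooleanSupplier powerCycleAndVerify) {
        // Step 1: try to kill the old NameNode process and confirm that it is gone.
        if (killAndVerify.getAsBoolean()) {
            return Outcome.FENCED_BY_KILL;
        }
        // Step 2: fall back to power-cycling the machine via its console port.
        if (powerCycleAndVerify.getAsBoolean()) {
            return Outcome.FENCED_BY_POWER_CYCLE;
        }
        // Step 3: neither fencing step could be verified, so automatic failover must not proceed.
        return Outcome.ABORT_FAILOVER;
    }

    public static void main(String[] args) {
        // Simulated network partition: the kill cannot be verified, but the power cycle can.
        System.out.println(fence(() -> false, () -> true));
        // Simulated total loss of connectivity: failover is aborted rather than risking two actives.
        System.out.println(fence(() -> false, () -> false));
    }
}
```

The design point the sketch preserves is that failover continues only after a verified fencing step; an unverifiable state leads to an abort.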
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070622#comment-13070622 ] Suresh Srinivas commented on HDFS-1623: --- The problem is not with the timing of delivery of disconnect event. During fail over, the standby taking over as active may not be able to communicate with previous active (directly/indirectly) to assert that the previous active has relinquished the role of active. This could be due to network partition, active not functional due to GC, OS issues etc. In such a scenario, the only way for new active to ensure shared resource is not controlled by two actives is to fence the shared resource. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070611#comment-13070611 ] Ted Dunning commented on HDFS-1623: --- {quote} I guess it helps to quantify how "small" this window needs to be and whether current ZK is able to provide the notification fast enough. if not, maybe implementing the ZAB protocol as part of the namenode/backup-nodenode communication ? (in that case it would be nice if ZK exports its protocol as a library) {quote} The problem isn't on the Zookeeper side, really. The problem is that you have to be able to detect failures quickly, but not capriciously. Pretty much the only valid way to do this is with heartbeats of some kind, which means that the time between heartbeats is the shortest response time possible. Moreover, the monitored program has to be sure that it can meet the real-time constraint imposed by the heartbeat rate. With the NameNode and possible GC, this is really, really hard to do for short heartbeat periods. There is perhaps some mileage to be had from synchronizing any fencing with the heartbeat so that you at least get rid of that uncertainty. Overall, this is a hard problem, which is the reason for protocols like two-phase commit. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
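Editorial sketch of the heartbeat trade-off raised in the comment above: a timeout short enough to detect failures quickly also misreads a long pause (for example a GC pause) as a failure. The class name, the timeout, and the timestamps are illustrative assumptions, not values from the thread or from Hadoop.

```java
/** Hypothetical timeout-based failure detector; timestamps are passed in to keep it deterministic. */
class HeartbeatDetectorSketch {
    private final long timeoutMillis;
    private long lastHeartbeatMillis;

    HeartbeatDetectorSketch(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = startMillis;
    }

    /** Records a heartbeat received at the given time. */
    void onHeartbeat(long nowMillis) {
        lastHeartbeatMillis = nowMillis;
    }

    /** The peer is suspected dead once more than timeoutMillis has passed since its last heartbeat. */
    boolean isSuspectedDead(long nowMillis) {
        return nowMillis - lastHeartbeatMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        // Assumed values: 5 s timeout, heartbeats every 1 s.
        HeartbeatDetectorSketch detector = new HeartbeatDetectorSketch(5_000, 0);
        detector.onHeartbeat(1_000);
        detector.onHeartbeat(2_000);

        // The monitored process is alive but stalls for 8 s (say, in GC) and sends nothing.
        boolean suspectedDuringPause = detector.isSuspectedDead(10_000);
        System.out.println("suspected dead during pause: " + suspectedDuringPause);

        // When the pause ends and a heartbeat arrives, the suspicion clears,
        // but a failover may already have been triggered in the meantime.
        detector.onHeartbeat(10_500);
        System.out.println("suspected dead after resume: " + detector.isSuspectedDead(11_000));
    }
}
```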
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070584#comment-13070584 ] Yang Yang commented on HDFS-1623: - bq. Yang: the delivery of the "disconnected" event from ZK is asynchronous. So, there is a small window of time when the old NN still "thinks" it owns the znode while the new node may have taken over. I agree, I asked this question on ZK recently: http://zookeeper-user.578899.n2.nabble.com/help-on-Zookeeper-code-walk-through-tp6589163p6595469.html I guess it helps to quantify how "small" this window needs to be and whether current ZK is able to provide the notification fast enough. if not, maybe implementing the ZAB protocol as part of the namenode/backup-nodenode communication ? (in that case it would be nice if ZK exports its protocol as a library) > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070567#comment-13070567 ] Todd Lipcon commented on HDFS-1623: --- Yang: the delivery of the "disconnected" event from ZK is asynchronous. So, there is a small window of time when the old NN still "thinks" it owns the znode while the new node may have taken over. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
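Editorial sketch of the window described in the comment above: the coordination service's view of who holds the znode changes at once, while the old NameNode updates its own belief only when it gets around to processing the queued event. This is a plain-Java model with hypothetical names (`AsyncDisconnectSketch`, `nn1`, `nn2`); it does not use the ZooKeeper client API.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical model of asynchronous "disconnected" delivery; not the ZooKeeper API. */
class AsyncDisconnectSketch {
    // Authoritative view held by the coordination service.
    private String znodeOwner = "nn1";
    // Events queued for nn1 that it has not processed yet.
    private final Queue<String> nn1PendingEvents = new ArrayDeque<>();
    // What nn1 itself currently believes.
    private boolean nn1BelievesActive = true;

    /** nn1's session is lost and nn2 takes the znode; nn1 is only notified asynchronously. */
    void sessionLostAndTakeover() {
        znodeOwner = "nn2";
        nn1PendingEvents.add("DISCONNECTED");
    }

    /** nn1 eventually drains its event queue (possibly after a GC pause or a long log flush). */
    void nn1ProcessEvents() {
        while (!nn1PendingEvents.isEmpty()) {
            if ("DISCONNECTED".equals(nn1PendingEvents.poll())) {
                nn1BelievesActive = false;
            }
        }
    }

    /** True while nn1 still thinks it is active although it no longer owns the znode. */
    boolean staleActiveWindowOpen() {
        return nn1BelievesActive && !"nn1".equals(znodeOwner);
    }

    public static void main(String[] args) {
        AsyncDisconnectSketch cluster = new AsyncDisconnectSketch();
        cluster.sessionLostAndTakeover();
        System.out.println("window open before nn1 sees the event: " + cluster.staleActiveWindowOpen());
        cluster.nn1ProcessEvents();
        System.out.println("window open after nn1 sees the event:  " + cluster.staleActiveWindowOpen());
    }
}
```

Anything nn1 writes to shared storage while the window is open is what fencing is meant to prevent.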
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070557#comment-13070557 ] Yang Yang commented on HDFS-1623: - In the design docs, why is "fencing" needed? If the namenode uses ZK for leader election, it needs to have a Watcher on the ephemeral node. If the namenode loses its leader status, the Watcher should receive a "disconnect" or "node removed" event, so that the namenode can naturally shut itself down. Doesn't this achieve what is mentioned in the "fencing" paragraph? > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039891#comment-13039891 ] Eli Collins commented on HDFS-1623: --- Thanks for incorporating the feedback. New doc looks good. Some comments: * Section 8.1 - I think the BN approach is to run *multiple* BNs; this way the 3f use case is not a problem as long as you have at least one BN alive, and you don't need shared storage to address 3f. This is similar to GFS' multiple shadow masters. * Section 8.3 - fail-over time doesn't need to be longer if the client is notified when there's a new primary. One idea: clients could watch an ephemeral ZK node, though there's an open question as to whether ZK can support as many observers as we have clients. * Section 8.5 - We need to figure out where the FailoverController (FC) runs. If it lives in the same failure domain as the primary, then you've still got a single point of failure. If it lives in a different failure domain, then it may not be able to tell if the primary has failed, or be able to take the appropriate action if it has (eg due to lack of connectivity). Obviously the FC needs to be HA itself too (eg leader elected, new FC is spawned if the primary FC fails). * Section 9.9.1 - Todd and I have investigated fencing in NFS some. In v3, locking (NLM) doesn't work because dead clients maintain the lock. We'll need to have a pluggable shell command (eg some vendors provide a perl module that can ssh in and fence a particular IP) if we don't want to require IPMI, ILO, etc for stonith. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode > HA_v2_1.pdf, Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039022#comment-13039022 ] Sanjay Radia commented on HDFS-1623: >Wrt requirement #2, ... fail over to new version of HDFS ... is out of scope >for this framework. For failover during upgrades, the DNs need to be able to communicate to both versions of the NN. This is true for dot releases and will only be true if we support rolling upgrades. I will clarify this on the next version of the document. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, Namenode > HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039019#comment-13039019 ] Sanjay Radia commented on HDFS-1623: > How does NN fail-over impact federation? Eg does viewfs have any special > requirements as a client? In case of federation, each NN needs its own warm or hot failover. Cannot do N+k failover because of the large memory requirements for a NN unless one wants to limit it to cold failover. Viewfs is separate layer and not required for federation; one could choose to use symbolic links or one could choose to not provide a global namespace. But if one were to use viewfs, I am not aware of any *additional* failover issues. Viewfs has bindings like: /user ->hdfs://nnOfUser/ The failover issues are the same as if hdfs://nnofUser was your default file system. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, Namenode > HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
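Editorial sketch of the viewfs binding mentioned in the comment above (`/user -> hdfs://nnOfUser/`). It models a client-side mount table as a longest-prefix lookup and follows the binding exactly as written in the comment. The class and method names are hypothetical and this is not the ViewFs implementation. The point it illustrates is the one made in the comment: after resolution the client talks to `hdfs://nnOfUser` like any other client of that NameNode, so failover behaves as it would without viewfs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical client-side mount table; illustrates the binding from the thread, not ViewFs itself. */
class MountTableSketch {
    private final Map<String, String> links = new LinkedHashMap<>();

    /** Binds a path prefix such as "/user" to a target URI such as "hdfs://nnOfUser/". */
    void addLink(String prefix, String targetUri) {
        links.put(prefix, targetUri);
    }

    /** Resolves a path through the longest matching prefix, or throws if nothing matches. */
    String resolve(String path) {
        String best = null;
        for (String prefix : links.keySet()) {
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no mount point for " + path);
        }
        String target = links.get(best);
        // Drop one trailing slash from the target so the joined result has no double slash.
        String base = target.endsWith("/") ? target.substring(0, target.length() - 1) : target;
        String remainder = path.substring(best.length());
        return remainder.isEmpty() ? base + "/" : base + remainder;
    }

    public static void main(String[] args) {
        MountTableSketch mounts = new MountTableSketch();
        mounts.addLink("/user", "hdfs://nnOfUser/");
        System.out.println(mounts.resolve("/user/alice/data"));
    }
}
```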
[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009130#comment-13009130 ] Ivan Kelly commented on HDFS-1623: -- {quote} > How does heartbeat deal with network partition? My understanding of it is > that it sends packets at intervals to the other node, and if they don't get > through it considers the other dead. This could create a situation where both > active and standby think that the other is dead, and both become active, > leading to divergent filesystem states on each machine. This is discussed in the document as split brain and fencing requirements right?{quote} Ah, missed this. The fencing section does cover this. {quote} > Also, the design indicates that more than 2 NN is out of scope. Why? Surely > it's as easy to design for N namenodes as it is for 2 namenodes. Why do you need more than 2 NNs? Having more than 2NNs could solve need for outside quorum service. But number of NNs could be huge, especially in federated clusters. {quote} It's mentioned as "Out of scope", but serving read operations from a standby could be a use case for this. Read throughput could be increased by adding more standby nodes. While this is out of scope, it would be good to keep it in mind now so that the design doesn't end up being tied to just 2 nodes, which may be hard to rectify later. {quote} > If you want manual failover, from the server perspective you need to do > nothing. Operators can have 2 namenode machines, with the namenode only > running on one, writing to shared storage. When the want to failover to the > standby they just have to ensure that the active is down and start the > namenode daemon on the standby. Not sure what you are getting at here, in reference to the attached document? {quote} My point was that one of your requirements is "First class support for manual failover" and that this doesn't need any changes to implement. 
It's available now provided you are logging to shared storage. {quote} > I proposed a design last week for streaming updates from an active to a > standby, it may be interesting to you (ZOOKEEPER-1016). It does have some > mentions of active/standby detection, which I should remove. It occurs to me > now, that this functionallity should be separated out completely from the > WALing and should live at the level of NameNode.java. I do not understand this point. Will take a look at your proposal. But as regards to this jira, BookKeeper could be a component in the solution and not the only component.{quote} Just thought it would be useful for you guys to be aware of it. It tackles a different problem, but in a related area. > High Availability Framework for HDFS NN > --- > > Key: HDFS-1623 > URL: https://issues.apache.org/jira/browse/HDFS-1623 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Sanjay Radia >Assignee: Sanjay Radia > Attachments: Namenode HA Framework.pdf > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008607#comment-13008607 ]

Suresh Srinivas commented on HDFS-1623:

> with the only difference being whether an active switch is enabled.

Yes. Active/standby will be the state of the namenode, and the code should be identical.

> It would be good if the code for active/standby detection were pluggable, so that different options for failover could be provided. It wouldn't be good to require that a zookeeper ensemble be set up just to run a namenode.

The document does not state that ZooKeeper is needed. The HA solution requires a quorum service, which should be pluggable. ZK is one of the options.

> How does heartbeat deal with network partition? My understanding of it is that it sends packets at intervals to the other node, and if they don't get through it considers the other dead. This could create a situation where both active and standby think that the other is dead, and both become active, leading to divergent filesystem states on each machine.

This is discussed in the document under the split brain and fencing requirements, right?

> Also, the design indicates that more than 2 NNs is out of scope. Why? Surely it's as easy to design for N namenodes as it is for 2 namenodes.

Why do you need more than 2 NNs? Having more than 2 NNs could remove the need for an outside quorum service, but the number of NNs could be huge, especially in federated clusters.

> If you want manual failover, from the server perspective you need to do nothing. Operators can have 2 namenode machines, with the namenode only running on one, writing to shared storage. When they want to fail over to the standby they just have to ensure that the active is down and start the namenode daemon on the standby.

Not sure what you are getting at here, in reference to the attached document?

> I proposed a design last week for streaming updates from an active to a standby; it may be interesting to you (ZOOKEEPER-1016). It does have some mentions of active/standby detection, which I should remove. It occurs to me now that this functionality should be separated out completely from the WALing and should live at the level of NameNode.java.

I do not understand this point. Will take a look at your proposal. But as regards this jira, BookKeeper could be a component in the solution and not the only component.
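The "pluggable" active/standby detection discussed in this exchange can be illustrated with a small sketch. This is Python for brevity and is not HDFS code: `ActiveElector`, `ManualElector`, `LockElector`, and `NameNodeSketch` are hypothetical names, and `LockElector` is only an in-memory stand-in for whatever quorum service (ZooKeeper being one option mentioned above) a real deployment would use.

```python
from abc import ABC, abstractmethod


class ActiveElector(ABC):
    """Hypothetical plug-in point: decides which namenode should be active."""

    @abstractmethod
    def is_active(self, node_id):
        """Return True if node_id should currently be the active namenode."""


class ManualElector(ActiveElector):
    """The operator names the active node; no quorum service is involved."""

    def __init__(self, active_id):
        self.active_id = active_id

    def is_active(self, node_id):
        return node_id == self.active_id


class LockElector(ActiveElector):
    """In-memory stand-in for a quorum service: the lock holder is active."""

    def __init__(self):
        self.holder = None

    def is_active(self, node_id):
        if self.holder is None:
            self.holder = node_id  # first caller takes the lock
        return self.holder == node_id

    def release(self, node_id):
        # Called once the holder is known to be down (assumed to be detected elsewhere).
        if self.holder == node_id:
            self.holder = None


class NameNodeSketch:
    """Same code on both machines; only the elector's answer differs."""

    def __init__(self, node_id, elector):
        self.node_id = node_id
        self.elector = elector

    def state(self):
        return "active" if self.elector.is_active(self.node_id) else "standby"


elector = LockElector()
nn1 = NameNodeSketch("nn1", elector)
nn2 = NameNodeSketch("nn2", elector)
print(nn1.state(), nn2.state())  # active standby
elector.release("nn1")           # nn1 fails and its lock is released
print(nn2.state())               # active
```

Swapping `LockElector` for `ManualElector("nn2")` changes how the active node is chosen without touching `NameNodeSketch`, which is the point of keeping detection separate from the namenode code.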
[jira] Commented: (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008395#comment-13008395 ]

Ivan Kelly commented on HDFS-1623:

I assume that in what is envisioned, the code running on active and standby will be identical, with the only difference being whether an active switch is enabled.

It would be good if the code for active/standby detection were pluggable, so that different options for failover could be provided. It wouldn't be good to require that a zookeeper ensemble be set up just to run a namenode.

How does heartbeat deal with network partition? My understanding of it is that it sends packets at intervals to the other node, and if they don't get through it considers the other dead. This could create a situation where both active and standby think that the other is dead, and both become active, leading to divergent filesystem states on each machine.

Also, the design indicates that more than 2 NNs is out of scope. Why? Surely it's as easy to design for N namenodes as it is for 2 namenodes.

If you want manual failover, from the server perspective you need to do nothing. Operators can have 2 namenode machines, with the namenode only running on one, writing to shared storage. When they want to fail over to the standby they just have to ensure that the active is down and start the namenode daemon on the standby.

I proposed a design last week for streaming updates from an active to a standby; it may be interesting to you (ZOOKEEPER-1016). It does have some mentions of active/standby detection, which I should remove. It occurs to me now that this functionality should be separated out completely from the WALing and should live at the level of NameNode.java.
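The partition scenario described in this comment can be made concrete with a small simulation. The following is a minimal Python sketch, not HDFS code: `HeartbeatNode`, `simulate_partition`, and `FencedLog` are hypothetical names, the threshold of three missed heartbeats is an arbitrary assumption, and the epoch check is just one common way to fence a stale writer. The thread does not say which fencing mechanism the design document actually specifies.

```python
class HeartbeatNode:
    """Hypothetical namenode that judges its peer's liveness from heartbeats alone."""

    def __init__(self, name, active, max_missed=3):
        self.name = name
        self.active = active
        self.max_missed = max_missed  # assumed threshold, not an HDFS setting
        self.missed = 0

    def tick(self, heartbeat_received):
        # Reset the counter on a heartbeat, otherwise count the miss.
        self.missed = 0 if heartbeat_received else self.missed + 1
        if self.missed >= self.max_missed:
            # The peer is presumed dead, so this node makes itself active.
            self.active = True


def simulate_partition(ticks):
    """Both nodes stay alive, but no heartbeat crosses the partition."""
    nn1 = HeartbeatNode("nn1", active=True)
    nn2 = HeartbeatNode("nn2", active=False)
    for _ in range(ticks):
        nn1.tick(heartbeat_received=False)
        nn2.tick(heartbeat_received=False)
    return nn1, nn2


class FencedLog:
    """Hypothetical shared edit log that accepts writes only from the newest epoch."""

    def __init__(self):
        self.epoch = 0
        self.entries = []

    def claim(self):
        # A node that becomes active claims a new, higher epoch.
        self.epoch += 1
        return self.epoch

    def append(self, writer_epoch, entry):
        if writer_epoch != self.epoch:
            raise PermissionError("writer epoch %d is fenced" % writer_epoch)
        self.entries.append(entry)


nn1, nn2 = simulate_partition(ticks=3)
print(nn1.active, nn2.active)  # both True: split brain

log = FencedLog()
old_epoch = log.claim()  # nn1 was active first
new_epoch = log.claim()  # nn2 promotes itself during the partition
log.append(new_epoch, "mkdir /a")
try:
    log.append(old_epoch, "mkdir /b")  # the stale active is rejected
except PermissionError as err:
    print("rejected:", err)
```

With heartbeats alone, both nodes end up active. With the epoch check on shared storage, the old active may still believe it is active, but its writes no longer reach the log, so the filesystem state cannot diverge.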
[jira] Commented: (HDFS-1623) High Availability Framework for HDFS NN
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993315#comment-12993315 ]

Sanjay Radia commented on HDFS-1623:

There are various steps one can take to improve the availability of the HDFS Namenode (NN), including reducing startup time, refreshing configuration without restarting the cluster, reducing upgrade time, and providing manual or automatic failover of the NN. This jira focuses on failover of the NN to address the issue of the NN as a single point of failure.

There are various ways to provide failover of the NN. Some of these include the use of shared storage while others do not. Some include the use of IP failover while others use a smart client-side library. One could use Zookeeper for leader election or some other off-the-shelf scheme such as Linux HA. These different solutions can share some framework components as building blocks.

The purpose of this jira is to define these framework components for building failover solutions that improve availability. The avatar node could use some of these framework components, and some of the code from avatar node can be used towards building them. Sub-jiras will be created to address each of the framework components.