[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3077:
------------------------------

    Attachment: hdfs-3077.txt

Here is an initial patch with the implementation of this design. It is not 
complete, but I'm posting it here as it's already grown large, and I'd like to 
start the review process while I continue to add test coverage and iron out 
various TODOs which are littered around the code.

As it is, the code can be run, and I can successfully start/restart NNs, fail 
JNs, etc., and it mostly "works as advertised". There are known deficiencies 
that I'm working on addressing, and these should mostly be marked by TODOs.

This patch is on top of the following:

ffcfc55 HDFS-3190. 1: Extract code to atomically write a file containing a long
025759c HDFS-3571. Add URL support to EditLogFileInputStream
707a309 HDFS-3572. Clean up init of SPNEGO
d84516f HDFS-3573. Change instantiation of journal managers to have NSInfo
f61dc7d HDFS-3574. Fix race in GetImageServlet where file is removed during header-setting
(and those on top of trunk).

I did not end up basing this on the HDFS-3092 branch as I originally planned, 
though there's a bunch of code borrowed from the early work done on that branch 
by Brandon and Hari. I would have liked to use the code exactly as it was, but 
the differences in design made it too difficult to try to reconcile, and I 
ended up copy-pasting and modifying rather than patching against that branch. 
(For example, all of the RPCs in this design go through an async queue in order 
to do quorum writes.)
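
To illustrate the parenthetical above, here is a minimal sketch of how RPCs 
might be pushed through per-node async queues so that a write completes once a 
quorum has acknowledged it. This is not code from the attached patch: the 
names QuorumWriter and callAll are hypothetical, and each JN is stood in for 
by a plain object with a function playing the role of the RPC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch only; these are not the classes in hdfs-3077.txt. */
class QuorumWriter<N> implements AutoCloseable {
    private final List<N> nodes;
    private final List<ExecutorService> queues = new ArrayList<>();

    QuorumWriter(List<N> nodes) {
        this.nodes = nodes;
        // One single-threaded executor per node: calls to a given node are
        // queued and run in submission order, independently of other nodes.
        for (int i = 0; i < nodes.size(); i++) {
            queues.add(Executors.newSingleThreadExecutor());
        }
    }

    /**
     * Queues the call on every node. The returned future completes with true
     * as soon as a majority of nodes succeed, or with false as soon as enough
     * nodes fail that a majority can no longer be reached.
     */
    CompletableFuture<Boolean> callAll(Function<N, Boolean> call) {
        int majority = nodes.size() / 2 + 1;
        AtomicInteger ok = new AtomicInteger();
        AtomicInteger failed = new AtomicInteger();
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        for (int i = 0; i < nodes.size(); i++) {
            N node = nodes.get(i);
            CompletableFuture.supplyAsync(() -> call.apply(node), queues.get(i))
                .whenComplete((success, err) -> {
                    if (err == null && Boolean.TRUE.equals(success)) {
                        if (ok.incrementAndGet() >= majority) {
                            result.complete(true);
                        }
                    } else if (failed.incrementAndGet() > nodes.size() - majority) {
                        result.complete(false);
                    }
                });
        }
        return result;
    }

    @Override
    public void close() {
        queues.forEach(ExecutorService::shutdown);
    }
}
```

With three nodes, a caller would wait on callAll(...).join() and proceed once 
any two have acknowledged, so a single slow or failed node does not block the 
write.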

Of course there will be follow-up work to create a test plan, add substantially 
more tests, add docs, etc. But my hope is that, after review, we can commit 
this (and the prereq patches) either to trunk or a branch and work from there 
to fix the remaining work items, test, etc.
                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, hdfs-3077.txt, 
> qjournal-design.pdf, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
