[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229527#comment-13229527
 ] 

Todd Lipcon commented on HDFS-3077:
-----------------------------------

bq. These arguments seem very much to be a case of NIH.
No, they're an argument for uniformity of code base. Hadoop's already a large 
project. Briefly skimming the BK code, I see:
- A new NIO server which we'll have to understand and probably bugfix (we've 
spent literally years working on our own NIO server for IPC)
- A bunch of ad-hoc serialization code (e.g. in BookieServer.java). We just spent 
a long time making Hadoop wire-compatible using protobufs. We don't want to 
inherit more code which uses ad-hoc serialization.
- No metrics subsystem at all - we want to continue to make use of the existing 
metrics implementation in Hadoop
- No SASL or SSL implementation. On-the-wire encryption is a requirement we're 
hearing more and more in Hadoop. Hadoop IPC already gives us SASL-based 
encryption
- Password-based authentication instead of Kerberos-based. One more password to 
configure
- Its own on-disk format for logs. So if you take a backup from a bookie, you 
can't use tools like the OEV to view them
- A different file format, etc

Certainly, it's a "small matter of code" to add all of these things to 
BookKeeper. But given that BK is primarily a project maintained by a research 
organization, and none of the above are at all interesting from a research 
perspective, I don't think it's likely to happen any time soon.

Then, there is a valid NIH concern -- or really not-maintained-here. As I said 
above, if we have a bug in BK, we need to (a) convince someone on the BK team 
to fix it, (b) get it into ZK trunk, (c) get the ZK team to make a new release, 
(d) check Hadoop against any _other_ new changes in that release, (e) convince 
an operations team which may be distinct from the Hadoop ops team to update the 
ZooKeeper installation. That's really painful. If BK were a mature project with 
tons of production users, I'd agree we should just depend on it, since the 
number of bugs we'd likely find would be very low.

Anyway, this JIRA isn't to argue against BookKeeper. If you want to keep 
exploring it, please go ahead - the advantage of a pluggable interface here is 
that different implementations may coexist.

bq. Also, I don't think ZAB is the right tool for this in any case. You have a 
single writer, which can therefore act as a sequencer on the entries. You just 
need to broadcast to an ensemble, and wait for quorum responses, as I outlined 
above for BookKeeper.

We have a single writer, except for when we don't. During a failover, without a 
STONITH capability, we may have overlapping writers. Please see the examples 
above for why we need sequencing of multiple writers.
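
To make the overlapping-writer concern concrete, here is a minimal sketch of one 
common way to sequence writers: each writer claims a strictly increasing epoch 
number from a majority of replicas, and replicas reject appends from any older 
epoch. This is an illustration only, not the HDFS-3077 design and not 
BookKeeper's API. The names QuorumLogSketch, Replica, Writer, newEpoch and 
append are all hypothetical, replicas are in-process objects rather than remote 
nodes, and recovery of entries that reached only a minority is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; every name here is hypothetical. */
final class QuorumLogSketch {

    /** A log replica that remembers the highest writer epoch it has promised. */
    static final class Replica {
        private long promisedEpoch = 0;
        private final List<String> entries = new ArrayList<>();

        /** A new writer claims an epoch; epochs already seen are refused. */
        synchronized boolean newEpoch(long epoch) {
            if (epoch <= promisedEpoch) {
                return false;
            }
            promisedEpoch = epoch;
            return true;
        }

        /** Accept an entry only from the writer holding the latest epoch. */
        synchronized boolean append(long epoch, String entry) {
            if (epoch != promisedEpoch) {
                return false; // stale writer, e.g. the old active after failover
            }
            entries.add(entry);
            return true;
        }

        /** Returns a copy of the accepted entries. */
        synchronized List<String> entries() {
            return new ArrayList<>(entries);
        }
    }

    /** A writer that broadcasts to all replicas and waits for a majority. */
    static final class Writer {
        private final long epoch;
        private final List<Replica> replicas;

        Writer(long epoch, List<Replica> replicas) {
            this.epoch = epoch;
            this.replicas = replicas;
        }

        private int majority() {
            return replicas.size() / 2 + 1;
        }

        /** Claim this writer's epoch on a majority of replicas. */
        boolean start() {
            int acks = 0;
            for (Replica r : replicas) {
                if (r.newEpoch(epoch)) {
                    acks++;
                }
            }
            return acks >= majority();
        }

        /** An entry is committed once a majority of replicas accept it. */
        boolean write(String entry) {
            int acks = 0;
            for (Replica r : replicas) {
                if (r.append(epoch, entry)) {
                    acks++;
                }
            }
            return acks >= majority();
        }
    }
}
```

In this sketch, if an old writer with epoch 1 keeps writing after a new writer 
has started with epoch 2, its appends are refused by every replica that saw the 
newer epoch, so it cannot reach a quorum. That is the property a plain 
"broadcast and wait for quorum" scheme lacks when two writers overlap.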

                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
