Thanks for sharing this, Josh. I have benefited greatly from the doc. Just curious: there is another related issue about WAL-less HBase, https://issues.apache.org/jira/browse/HBASE-20003. I can't quite picture how we would integrate both features into HBase. To be exact, do they have a similar aim, namely removing the dependency on HDFS from HBase's write path?

Best regards,
Chia-Ping

On 2018/05/03 16:04:00, Josh Elser <els...@apache.org> wrote:
> Hi,
>
> I'm pleased to finally be able to share this design document with you
> all. It's the result of internal review from half a dozen or so from
> within our community (Enis, Devaraj, Artem, and Clay easily come to
> mind) after multiple months of review and iteration.
>
> Abstract:
>
> <quote>
> Infrastructure as a service (IaaS) via public cloud infrastructure
> offerings (Cloud IaaS) has grown dramatically in popularity through
> services like Amazon EC2, Google Compute Engine, and Microsoft Azure
> Compute. Across Apache HBase users, the majority of new system
> architectures include some form of Cloud IaaS as a means to increase the
> capabilities and/or decrease the cost of operation of their system.
> However, deploying HBase on these platforms comes with difficulties as
> HBase has a non-optional dependency on Apache Hadoop HDFS to guarantee
> the durability of data written to HBase. This document outlines a
> proposal to remove HBase’s dependency on HDFS by replacing the current
> Write-Ahead-Log (WAL) implementation using Apache Ratis (incubating). It
> covers why the HDFS dependency is a problem on Cloud IaaS, how Ratis can
> be used to replace HDFS-based WALs, and a high-level development plan to
> effectively implement the replacement of this extremely critical HBase
> internal component without becoming tied to a single Cloud IaaS offering.
> </quote>
>
> The document is available on Google Docs[1] and there is also a PDF
> available [2] of the current version. I'm happy to assist those who do
> not want to use the copy on a Google service (e.g. transcribe
> mailing-list chatter onto the Doc).
>
> Thanks to some of the same folks who helped with this document, I also
> have a fairly in-depth analysis of what we think the required work will
> entail. For the HBase specific changes, I'd like to avoid the pitfall we
> commonly face and work towards frequent merges into master that do not
> destabilize the build (keep things "Green") to avoid stalling our
> forward momentum after 2.0. If people are curious/interested, I'm happy
> to delve some more into how I think we can implement this.
>
> - Josh
>
> [1]
> https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#
> [2] https://home.apache.org/~elserj/Effective%20HBase%20in%20the%20Cloud.pdf