[ https://issues.apache.org/jira/browse/HBASE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Busbey updated HBASE-22149:
--------------------------------
    Release Note:
<!-- markdown -->
Initial implementation of the hbase-oss module. Defines a wrapper implementation of Apache Hadoop's FileSystem interface that bridges the gap between Apache HBase, which assumes that many operations are atomic, and object-store implementations of FileSystem (such as s3a) which inherently cannot provide atomic semantics to those operations natively.

The implementation can be used e.g. with the s3a filesystem by using a root fs like `s3a://bucket/` and defining

* `fs.s3a.impl` set to `org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics`
* `fs.hboss.fs.s3a.impl` set to `org.apache.hadoop.fs.s3a.S3AFileSystem`

more details in the module's README.md

NOTE: This module is labeled with an ALPHA version. It is not considered production ready and makes no promises about compatibility between versions.

  was:
<!-- markdown -->
Initial implementation of the hbase-oss module. Defines a wrapper FileSystem implementation of Apache Hadoop's FileSystem interface that bridges the gap between Apache HBase, which assumes that many operations are atomic, and object-store implementations of FileSystem (such as s3a) which inherently cannot provide atomic semantics to those operations natively.

The implementation can be used e.g. with the s3a filesystem by using a root fs like `s3a://bucket/` and defining

* `fs.s3a.impl` set to `org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics`
* `fs.hboss.fs.s3a.impl` set to `org.apache.hadoop.fs.s3a.S3AFileSystem`

more details in the module's README.md

NOTE: This module is labeled with an ALPHA version. It is not considered production ready and makes no promises about compatibility between versions.

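As a rough illustration of the settings named in the release note, the sketch below collects them in a map and renders them in the `<property>` XML layout that Hadoop-style site files use. The class and method names are hypothetical, and mapping the "root fs" to the `hbase.rootdir` property is an assumption; the module's README.md remains the authority on which file each property belongs in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of hbase-oss: gathers the properties from the
// release note and prints them as Hadoop-style <property> entries.
class HbossConfigSketch {

    static Map<String, String> hbossProperties(String rootDir) {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: the "root fs" in the release note corresponds to hbase.rootdir.
        props.put("hbase.rootdir", rootDir);
        // The s3a scheme is routed to the HBOSS wrapper...
        props.put("fs.s3a.impl", "org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics");
        // ...and the wrapper is told which FileSystem it should wrap.
        props.put("fs.hboss.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
        return props;
    }

    // No XML escaping is done; the values used here contain no special characters.
    static String toHadoopXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(e.getKey()).append("</name>\n")
              .append("    <value>").append(e.getValue()).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toHadoopXml(hbossProperties("s3a://bucket/")));
    }
}
```

The two `impl` keys show the indirection: the scheme resolves to the wrapper, and the wrapper resolves to the real s3a client.
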
> HBOSS: A FileSystem implementation to provide HBase's required semantics on object stores
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-22149
>                 URL: https://issues.apache.org/jira/browse/HBASE-22149
>             Project: HBase
>          Issue Type: New Feature
>          Components: Filesystem Integration
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Critical
>             Fix For: hbase-filesystem-1.0.0-alpha1
>
>         Attachments: HBASE-22149-hadoop.patch, HBASE-22149-hbase-2.patch,
> HBASE-22149-hbase-3.patch, HBASE-22149-hbase-4.patch,
> HBASE-22149-hbase-5.patch, HBASE-22149-hbase-filesystem-1.patch,
> HBASE-22149-hbase-filesystem-1.patch, HBASE-22149-hbase.patch
>
>
> (Have been using the name HBOSS for HBase / Object Store Semantics)
>
> I've had some thoughts about how to solve the problem of running HBase on
> object stores. There has been some thought in the past about adding the
> required semantics to S3Guard, but I have some concerns about that. First,
> it's mixing complicated solutions to different problems (bridging the gap
> between a flat namespace and a hierarchical namespace vs. solving
> inconsistency). Second, it's S3-specific, whereas other object stores could
> use virtually identical solutions. And third, we can't do things like atomic
> renames in a true sense. There would have to be some trade-offs specific to
> HBase's needs and it's better if we can solve that in an HBase-specific
> module without mixing all that logic in with the rest of S3A.
>
> Ideas to solve this above the FileSystem layer have been proposed and
> considered (HBASE-20431, for one), and maybe that's the right way forward
> long-term, but it certainly seems to be a hard problem and hasn't been done
> yet. But I don't know enough of all the internal considerations to make much
> of a judgment on that myself.
>
> I propose a FileSystem implementation that wraps another FileSystem instance
> and provides locking of FileSystem operations to ensure correct semantics.
> Locking could quite possibly be done on the same ZooKeeper ensemble as an
> HBase cluster already uses (I'm sure there are some performance
> considerations here that deserve more attention). I've put together a
> proof-of-concept on which I've tested some aspects of atomic renames and
> atomic file creates. Both of these tests fail reliably on a naked s3a
> instance. I've also done a small YCSB run against a small cluster to sanity
> check other functionality and was successful. I will post the patch, and my
> laundry list of things that still need work. The WAL is still placed on HDFS,
> but the HBase root directory is otherwise on S3.
>
> Note that my prototype is built on Hadoop's source tree right now. That's
> purely for my convenience in putting it together quickly, as that's where I
> mostly work. I actually think long-term, if this is accepted as a good
> solution, it makes sense to live in HBase (or its own repository). It only
> depends on stable, public APIs in Hadoop and is targeted entirely at HBase's
> needs, so it should be able to iterate on the HBase community's terms alone.
>
> Another idea [~ste...@apache.org] proposed to me is that of an inode-based
> FileSystem that keeps hierarchical metadata in a more appropriate store that
> would allow the required transactions (maybe a special table in HBase could
> provide that store itself for other tables), and stores the underlying files
> with unique identifiers on S3. This allows renames to actually become fast
> instead of just large atomic operations. It does however place a strong
> dependency on the metadata store. I have not explored this idea much. My
> current proof-of-concept has been pleasantly simple, so I think it's the
> right solution unless it proves unable to provide the required performance
> characteristics.

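The wrap-and-lock proposal in the description can be sketched roughly as follows. This is a toy model under stated assumptions, not the HBOSS code: `SimpleStore`, `FlatObjectStore`, `LockingStore`, and the paths are all hypothetical names, the interface is a deliberately tiny stand-in rather than Hadoop's FileSystem API, and a single in-process `ReentrantLock` stands in for the ZooKeeper-based locking the proposal mentions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, minimal stand-in for the operations discussed; NOT Hadoop's FileSystem API.
interface SimpleStore {
    boolean createNew(String path);            // true only if this call created the path
    boolean exists(String path);
    void renamePrefix(String src, String dst); // "directory" rename over a flat key space
}

// Toy object store: create is check-then-put, rename is copy-then-delete per key.
class FlatObjectStore implements SimpleStore {
    private final Map<String, byte[]> objects = new ConcurrentSkipListMap<>();

    public boolean createNew(String path) {
        if (objects.containsKey(path)) return false; // racy without external locking
        objects.put(path, new byte[0]);
        return true;
    }

    public boolean exists(String path) { return objects.containsKey(path); }

    public void renamePrefix(String src, String dst) {
        List<String> keys = new ArrayList<>();
        for (String k : objects.keySet()) if (k.startsWith(src)) keys.add(k);
        for (String k : keys) {                      // observers could see a half-moved tree
            byte[] data = objects.get(k);
            if (data == null) continue;
            objects.put(dst + k.substring(src.length()), data);
            objects.remove(k);
        }
    }
}

// Wrapper that serializes every operation behind one lock (assumption: one
// in-process lock replaces the distributed lock a real deployment would need).
class LockingStore implements SimpleStore {
    private final SimpleStore delegate;
    private final ReentrantLock lock = new ReentrantLock();

    LockingStore(SimpleStore delegate) { this.delegate = delegate; }

    public boolean createNew(String path) {
        lock.lock();
        try { return delegate.createNew(path); } finally { lock.unlock(); }
    }
    public boolean exists(String path) {
        lock.lock();
        try { return delegate.exists(path); } finally { lock.unlock(); }
    }
    public void renamePrefix(String src, String dst) {
        lock.lock();
        try { delegate.renamePrefix(src, dst); } finally { lock.unlock(); }
    }
}

class LockingDemo {
    // Races `threads` callers on createNew(path) and returns how many of them "won".
    static int countCreateWinners(SimpleStore store, String path, int threads) {
        AtomicInteger winners = new AtomicInteger();
        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> { if (store.createNew(path)) winners.incrementAndGet(); });
            pool.add(t);
            t.start();
        }
        try {
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        SimpleStore store = new LockingStore(new FlatObjectStore());
        System.out.println("create winners: " + countCreateWinners(store, "/hbase/lockfile", 8));
        store.createNew("/hbase/tmp/a");
        store.renamePrefix("/hbase/tmp/", "/hbase/data/");
        System.out.println("renamed: " + store.exists("/hbase/data/a"));
    }
}
```

Because every call goes through the same lock, the check-then-put in `createNew` and the multi-key copy in `renamePrefix` each appear as one step to other callers of the wrapper. A single global lock is the simplest choice for a sketch; a real implementation would presumably want finer-grained, per-path locking to limit contention.
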