#1 looks similar to what MapR has done.

On Sat, Jan 22, 2011 at 5:18 PM, Tatsuya Kawano <[email protected]> wrote:
> Hi,
>
> I wanted to let you know that I'm planning to contribute the following
> items to the HBase community. These are my spare time projects and I'll only
> be able to spend about 7 hours a week on them, so the progress will be very
> slow. I want some feedback from you guys to prioritize them. Also, if
> someone/team wants to work on them (with me or alone), I'll be happy to
> provide more details.
>
>
> 1. RADOS integration
>
> Run HBase not only on HDFS but also on the RADOS distributed object store
> (the lower layer of Ceph), so that the following options will become
> available to HBase users:
>
> -- No SPOF (RADOS has no name node(s), only ZK-like monitors and data
> nodes)
> -- Instant backup of HBase tables (RADOS provides copy-on-write snapshots
> per object pool)
> -- Extra durability option on WAL (RADOS can do both synchronous and
> asynchronous disk flush. HDFS doesn't have the former option)
>
> Note:
> RADOS object = HFile, WAL
> object pool = group of HFiles or WAL
>
> Current status: Design phase
>
>
> 2. mapreduce.HFileInputFormat
>
> MR library to read data directly from HFiles. (Roughly 2.5 times faster
> than TableInputFormat in my tests)
>
> Current status: Completed a proof-of-concept prototype and measured
> performance.
>
>
> 3. Enhance Get/Scan performance of RS
>
> Add a hash code and a couple of flags to HFile at flush time and
> change the scanner implementation so that:
>
> -- Get/Scan operations will get faster. (Fewer key comparisons for
> reconstructing a row: O(h * c) -> O(h). [h = number of HFiles for the row,
> c = number of columns in an HFile])
> -- The size of HFiles will become a bit smaller. (The flags will eliminate
> duplicate bytes in keys (row, column family and qualifier) from HFiles.)
>
> Current status: Completed a proof-of-concept prototype and measured
> performance.
>
> Details:
> https://github.com/tatsuya6502/hbase-mr-pof/
> (I meant "poc" not "pof"...)
>
>
> 4. Writing Japanese books and documents
>
> -- Currently I'm authoring a book chapter about HBase for a Japanese NOSQL
> book
> -- I'll translate The Apache HBase Book to Japanese
>
>
> Thank you,
>
>
> --
> Tatsuya Kawano (Mr.)
> Tokyo, Japan
>
> http://twitter.com/#!/tatsuya6502
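
On #3, here is a rough sketch of one way to read the O(h * c) -> O(h) claim. This is only a toy model and not taken from the prototype: the "same row as previous key" flag, the data layout, and both function names are assumptions made for illustration. It simply counts row-key comparisons for a row spread over h HFiles with c columns each.

```python
# Hypothetical model, not the actual prototype: each "HFile" is a sorted
# list of (row_key, qualifier) cells belonging to a single row.

def comparisons_naive(hfiles, row):
    """Compare the row key of every cell: h * c comparisons."""
    count = 0
    for cells in hfiles:
        for row_key, _qualifier in cells:
            count += 1              # one row-key comparison per cell
            if row_key != row:
                break
    return count


def comparisons_with_flag(hfiles, row):
    """Compare the row key once per HFile, then rely on an assumed flag."""
    count = 0
    for cells in hfiles:
        if not cells:
            continue
        count += 1                  # one comparison for the first cell
        if cells[0][0] != row:
            continue
        # The remaining cells are assumed to carry a "same row as previous
        # key" flag, so they are consumed without another comparison.
    return count


h, c = 5, 100
hfiles = [[("row1", f"q{i:03d}") for i in range(c)] for _ in range(h)]
print(comparisons_naive(hfiles, "row1"))      # 500
print(comparisons_with_flag(hfiles, "row1"))  # 5
```

With h = 5 and c = 100 the naive count is h * c = 500 and the flagged count is h = 5. The same flag would also let the repeated row bytes be dropped from each key, which fits the "a bit smaller" point in the second bullet.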
