Hi Yifeng,

> #4. Writing Japanese books and documents
> I am glad if I can work on this one with you.

Thanks for your offer. Let me explain a bit more about them.

>> -- Currently I'm authoring a book chapter about HBase for a Japanese NOSQL
>> book

This one is a commercial book from a Japanese publisher, so I'll do this by myself.

>> -- I'll translate The Apache HBase Book to Japanese

This one ships with HBase, and I'm looking for some people (like you) to work with.
http://hbase.apache.org/book.html

I created a Jira entry to track this task:
https://issues.apache.org/jira/browse/HBASE-3391

Are you working at Rakuten in Tokyo? Maybe we can meet at the next Hadoop Source Code Reading at Rakuten Tower. Do you know this event?

Thanks,
Tatsuya

--
Tatsuya Kawano (Mr.)
Tokyo, Japan

On Jan 25, 2011, at 11:03 AM, Yifeng Jiang <[email protected]> wrote:

> #4. Writing Japanese books and documents
> I am glad if I can work on this one with you.
>
>
> On 01/23/2011 10:18 AM, Tatsuya Kawano wrote:
>> Hi,
>>
>> I wanted to let you know that I'm planning to contribute the following items
>> to the HBase community. These are my spare time projects and I'll only be
>> able to spend about 7 hours a week on them, so the progress will be very
>> slow. I want some feedback from you guys to prioritize them. Also, if
>> someone/team wants to work on them (with me or alone), I'll be happy to
>> provide more details.
>>
>>
>> 1. RADOS integration
>>
>> Run HBase not only on HDFS but also on the RADOS distributed object store (the
>> lower layer of Ceph), so that the following options will become available to
>> HBase users:
>>
>> -- No SPOF (RADOS doesn't have the name node(s), but only ZK-like monitors
>> and data nodes)
>> -- Instant backup of HBase tables (RADOS provides copy-on-write snapshot per
>> object pool)
>> -- Extra durability option on WAL (RADOS can do both synchronous and
>> asynchronous disk flush. HDFS doesn't have the former option)
>>
>> Note:
>> RADOS object = HFile, WAL
>> object pool = group of HFiles or WAL
>>
>> Current status: Design phase
>>
>>
>> 2. mapreduce.HFileInputFormat
>>
>> MR library to read data directly from HFiles. (Roughly 2.5 times faster than
>> TableInputFormat in my tests)
>>
>> Current status: Completed a proof-of-concept prototype and measured
>> performance.
>>
>>
>> 3. Enhance Get/Scan performance of RS
>>
>> Add a hash code and a couple of flags to HFile at flush time and change the
>> scanner implementation so that:
>>
>> -- Get/Scan operations will get faster. (fewer key comparisons for
>> reconstructing a row: O(h * c) -> O(h). [h = number of HFiles for the row,
>> c = number of columns in an HFile])
>> -- The size of HFiles will become a bit smaller. (The flags will eliminate
>> duplicate bytes in keys (row, column family and qualifier) from HFiles.)
>>
>> Current status: Completed a proof-of-concept prototype and measured
>> performance.
>>
>> Details:
>> https://github.com/tatsuya6502/hbase-mr-pof/
>> (I meant "poc" not "pof"...)
>>
>>
>> 4. Writing Japanese books and documents
>>
>> -- Currently I'm authoring a book chapter about HBase for a Japanese NOSQL
>> book
>> -- I'll translate The Apache HBase Book to Japanese
>>
>>
>> Thank you,
>>
>>
>> --
>> Tatsuya Kawano (Mr.)
>> Tokyo, Japan
>>
>> http://twitter.com/#!/tatsuya6502
>>
>>
>>
>
>
> --
> Yifeng Jiang
>
