Inline comments.

Thanks,

2016-04-29 10:57 GMT+08:00 Sean Busbey <[email protected]>:
> I am nervous about having default out-of-the-box new HBase users reliant on
> a bespoke HDFS client, especially given Hadoop's compatibility
> promises and history. Answers for these questions would make me more
> confident:
>
> 1) Where are we on getting the client-side changes to HDFS pushed back
> upstream?
>
No progress yet... Here I want to be able to tell a good story: that HBase already uses it as the default :)

>
> 2) How well do we detect when our FS is not HDFS and what does
> fallback look like?
>
The fallback just wraps FSDataOutputStream to make it act like an asynchronous output (it calls hflush in a separate thread). I don't think the performance is good.

>
> 3) Will this mean altering the versions of Hadoop we label as
> supported for HBase 2.y+?
>
I have tested with Hadoop versions from 2.4.x to 2.7.x, so I don't think we need to change the supported versions.

>
> 4) How are we going to ensure our client remains compatible with newer
> Hadoop releases?
>
We cannot ensure that; HDFS always breaks HBase at a new release... I need to test AsyncFSWAL on every new 2.x release and make it compatible with that version.

And back to #1, I think we should make sure that the AsyncFSOutput will be in HDFS-3.0. Then in HBase-3.0 we can introduce a new 'AsyncFSWAL' that uses the AsyncFSOutput in HDFS.

>
> On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang <[email protected]> wrote:
> > Six month after I filed HBASE-14790...
> >
> > Now the AsyncFSWAL is ready. The WALPE result shows that it is *1.4x~3.7x*
> > faster than FSHLog. The ITBLL result turns out that it is *not bad* than
> > FSHLog(the master branch is not that stable itself...).
> >
> > More details can be found on HBASE-15536.
> >
> > So here we propose to change the default WAL from FSHLog to AsyncFSWAL.
> > Suggestions are welcomed.
> >
> > Thanks.
>
> --
> busbey
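
To make the fallback in answer 2 concrete, here is a rough sketch of the idea of wrapping a blocking stream so it looks asynchronous. It is not the actual HBase code: the class name WrappedAsyncOutput is made up for illustration, it uses a plain java.io.OutputStream as a stand-in for FSDataOutputStream, and OutputStream.flush() stands in for hflush(), so that it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Hypothetical sketch: makes a blocking stream look asynchronous by doing the
 * write + flush on one dedicated thread. Not the real HBase implementation.
 */
class WrappedAsyncOutput implements AutoCloseable {
  private final OutputStream out; // stand-in for FSDataOutputStream
  private final ExecutorService flusher = Executors.newSingleThreadExecutor();
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private long flushedLength; // only touched on the flusher thread

  WrappedAsyncOutput(OutputStream out) {
    this.out = out;
  }

  /** Buffers data in memory; nothing reaches the underlying stream yet. */
  synchronized void write(byte[] data) {
    buffer.write(data, 0, data.length);
  }

  /**
   * Hands the buffered bytes to the flusher thread and returns immediately.
   * The future completes with the total number of bytes flushed so far.
   */
  synchronized CompletableFuture<Long> flush() {
    byte[] pending = buffer.toByteArray();
    buffer.reset();
    return CompletableFuture.supplyAsync(() -> {
      try {
        out.write(pending);
        out.flush(); // the real fallback would call hflush() here
        flushedLength += pending.length;
        return flushedLength;
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }, flusher);
  }

  /** Stops the flusher thread; the caller still owns the underlying stream. */
  @Override
  public void close() {
    flusher.shutdown();
  }
}
```

Every flush still pays for a full blocking call plus a thread hand-off, and the single flusher thread serializes all flushes, which is roughly why I do not expect this path to perform well.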
