On Thu, Apr 28, 2016 at 8:34 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:

> The performance is quite good, but I think we should collect some
> experience on a real production cluster before we make it the default.

Yeah. It would be nice to have a production deploy before we make the switch,
but in the absence of that, let's get it enabled by default early in the
master/2.0 branch.

I've done testing on a cluster using ITBLL, trying to break it. I've found
that the asyncfs WAL is no worse than our FSHLog; it can handle at least the
same scale. It's hard to test master in its current state, but I was able to
do runs of billions over many hours of chaos on a cluster of 9 nodes (see the
issue for detail).

In fact, asyncfswal can only be better. It is a massive simplification of
the disruptor+5 syncing threads+opaque dfsclient internal mess we currently
run with. If there is an issue, we're more likely to be able to figure it out
with asyncfswal in place.

St.Ack



> 2016-04-29 11:30 GMT+08:00 张铎 <palomino...@gmail.com>:
>
> > Inline comments.
> > Thanks,
> >
> > 2016-04-29 10:57 GMT+08:00 Sean Busbey <bus...@cloudera.com>:
> >
> > > I am nervous about having default out-of-the-box new HBase users reliant
> > > on a bespoke HDFS client, especially given Hadoop's compatibility
> > > promises and history. Answers for these questions would make me more
> > > confident:
> > >
> > > 1) Where are we on getting the client-side changes to HDFS pushed back
> > > upstream?
> > >
> > No progress yet... I want to be able to tell a good story here: that HBase
> > already uses it as the default :)
> >
> > >
> > > 2) How well do we detect when our FS is not HDFS and what does
> > > fallback look like?
> > >
> > We just wrap FSDataOutputStream to make it act like an asynchronous
> > output (calling hflush in a separate thread). I don't think the
> > performance is good.
> >
> > >
> > > 3) Will this mean altering the versions of Hadoop we label as
> > > supported for HBase 2.y+?
> > >
> > I have tested with Hadoop versions from 2.4.x to 2.7.x, so I don't think
> > we need to change the supported versions?
> >
> > >
> > > 4) How are we going to ensure our client remains compatible with newer
> > > Hadoop releases?
> > >
> > We cannot ensure that; HDFS always breaks HBase at a new release...
> > I need to test AsyncFSWAL on every new 2.x release and make it compatible
> > with that version. And back to #1, I think we should make sure that
> > AsyncFSOutput will be in HDFS-3.0. Then in HBase-3.0, we can introduce a new
> > 'AsyncFSWAL' that uses the AsyncFSOutput in HDFS.
> >
> > >
> > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang <zhang...@apache.org> wrote:
> > > > Six months after I filed HBASE-14790...
> > > >
> > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > *1.4x~3.7x* faster than FSHLog. The ITBLL result shows that it is
> > > > *no worse* than FSHLog (the master branch is not that stable itself...).
> > > >
> > > > More details can be found on HBASE-15536.
> > > >
> > > > So here we propose to change the default WAL from FSHLog to AsyncFSWAL.
> > > > Suggestions are welcome.
> > > >
> > > > Thanks.
> > >
> > >
> > >
> > > --
> > > busbey
> > >
> >
>
