Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-12 Thread Sean Busbey
On Wed, May 11, 2016 at 7:53 PM, 张铎  wrote:
> I think at that time I will start a new project called AsyncDFSClient which
> will implement the whole client side logic of HDFS without using reflection
> :)
>

If we end up in this dystopian future, then please have that project
live as a subproject of HBase.

-- 
busbey


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-12 Thread Nick Dimiduk
On Wed, May 11, 2016 at 10:28 PM, Andrew Purtell wrote:

> All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)
>

*palm-all-the-faces*



Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-11 Thread Andrew Purtell
All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)

> On May 11, 2016, at 9:46 PM, Stack  wrote:
> 
>> On Wed, May 11, 2016 at 7:53 PM, 张铎  wrote:
>> 
>> I think at that time I will start a new project called AsyncDFSClient which
>> will implement the whole client side logic of HDFS without using reflection
>> :)
> Haven't I seen this movie before? (smile)
> St.Ack


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-11 Thread Stack
On Wed, May 11, 2016 at 7:53 PM, 张铎  wrote:

> I think at that time I will start a new project called AsyncDFSClient which
> will implement the whole client side logic of HDFS without using reflection
> :)
>
>
Haven't I seen this movie before? (smile)
St.Ack



> 2016-05-12 10:27 GMT+08:00 Andrew Purtell :
>
> > If Hadoop refuses the changes before we release, we can change the
> default
> > back.
> >
> >
> > On May 11, 2016, at 6:50 PM, Gary Helmling  wrote:
> >
> > >>
> > >>
> > >> I was trying to avoid the below oft-repeated pattern at least for the
> > case
> > >> of critical developments:
> > >>
> > >> + New feature arrives after much work by developer, reviewers and
> > testers
> > >> accompanied by fanfare (blog, talks).
> > >> + Developers and reviewers move on after getting it committed or it
> gets
> > >> hacked into a deploy so it works in a frankenstein form
> > >> + It sits in our code base across one or more releases marked as
> > optional,
> > >> 'experimental'
> > >> + The 'experimental' blemish discourages its exercise by users
> > >> + The feature lags, rots
> > >> + Or, the odd time, we go ahead and enable it as default in spite of
> the
> > >> fact it was never tried when experimental.
> > >>
> > >> Distributed Log Replay sat in hbase across a few major versions. Only
> > when
> > >> the threat of our making an actual release with it on by default did
> it
> > get
> > >> serious attention where it was found flawed and is now being actively
> > >> purged. This was after it made it past reviews, multiple attempts at
> > >> testing at scale, and so on; i.e. we'd done it all by the book. The
> > time in
> > >> an 'experimental' state added nothing.
> > > Those are all valid concerns as well. It's certainly a pattern that
> we've
> > > seen repeated. That's also a broader concern I have about the farther
> we
> > > push out 2.0, then the less exercised master is.
> > >
> > > I don't really know how best to balance this with concerns about user
> > > stability.  Enabling by default in master would certainly be a forcing
> > > function and would help it get more testing before release.  I hear
> that
> > > argument.  But I'm worried about the impact after release, where
> > something
> > > as simple as a bug-fix point release upgrade of Hadoop could result in
> > > runtime breakage of an HBase install.  Will this happen in practice?  I
> > > don't know.  It seems unlikely that the private variable names being
> used
> > > for example would change in a point release.  But we're violating the
> > > abstraction that Hadoop provides us which guarantees such breakage
> won't
> > > occur.
> > >
> > >
> > >>> Yes. 2.0 is a bit out there so we have some time to iron out issues
> is
> > >> the
> > >> thought. Yes, it could push out delivery of 2.0.
> > > Having this on by default in an unreleased master doesn't actually
> worry
> > me
> > > that much.  It's just the question of what happens when we do release.
> > At
> > > that point, this discussion will be ancient history and I don't think
> > we'll
> > > give any renewed consideration to what the impact of this change might
> > be.
> > > Ideally it would be great to see this work in HDFS by that point and
> for
> > > that HDFS version this becomes a non-issue.
> > >
> > >
> > >>
> > >> I think the discussion here has been helpful. Holes have been found
> (and
> > >> plugged), the risk involved has gotten a good airing out here on dev,
> > and
> > >> in spite of the back and forth, one of our experts in good standing is
> > >> still against it being on by default.
> > >>
> > >> If you are not down w/ the arguments, I'd be fine not making it the
> > >> default.
> > >> St.Ack
> > >
> > > I don't think it's right to block this by myself, since I'm clearly in
> > the
> > > minority.  Since others clearly support this change, have at it.
> > >
> > > But let me pose an alternate question: what if HDFS flat out refuses to
> > > adopt this change?  What are our options then with this already
> shipping
> > as
> > > a default?  Would we continue to endure breakage due to the use of HDFS
> > > private internals?  Do we switch the default back?  Do we do something
> > else?
> > >
> > > Thanks for the discussion.
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-11 Thread 张铎
I think at that time I will start a new project called AsyncDFSClient which
will implement the whole client side logic of HDFS without using reflection
:)

2016-05-12 10:27 GMT+08:00 Andrew Purtell :

> If Hadoop refuses the changes before we release, we can change the default
> back.
>
>
> On May 11, 2016, at 6:50 PM, Gary Helmling  wrote:
>
> >>
> >>
> >> I was trying to avoid the below oft-repeated pattern at least for the
> case
> >> of critical developments:
> >>
> >> + New feature arrives after much work by developer, reviewers and
> testers
> >> accompanied by fanfare (blog, talks).
> >> + Developers and reviewers move on after getting it committed or it gets
> >> hacked into a deploy so it works in a frankenstein form
> >> + It sits in our code base across one or more releases marked as
> optional,
> >> 'experimental'
> >> + The 'experimental' blemish discourages its exercise by users
> >> + The feature lags, rots
> >> + Or, the odd time, we go ahead and enable it as default in spite of the
> >> fact it was never tried when experimental.
> >>
> >> Distributed Log Replay sat in hbase across a few major versions. Only
> when
> >> the threat of our making an actual release with it on by default did it
> get
> >> serious attention where it was found flawed and is now being actively
> >> purged. This was after it made it past reviews, multiple attempts at
> >> testing at scale, and so on; i.e. we'd done it all by the book. The
> time in
> >> an 'experimental' state added nothing.
> > Those are all valid concerns as well. It's certainly a pattern that we've
> > seen repeated. That's also a broader concern I have about the farther we
> > push out 2.0, then the less exercised master is.
> >
> > I don't really know how best to balance this with concerns about user
> > stability.  Enabling by default in master would certainly be a forcing
> > function and would help it get more testing before release.  I hear that
> > argument.  But I'm worried about the impact after release, where
> something
> > as simple as a bug-fix point release upgrade of Hadoop could result in
> > runtime breakage of an HBase install.  Will this happen in practice?  I
> > don't know.  It seems unlikely that the private variable names being used
> > for example would change in a point release.  But we're violating the
> > abstraction that Hadoop provides us which guarantees such breakage won't
> > occur.
> >
> >
> >>> Yes. 2.0 is a bit out there so we have some time to iron out issues is
> >> the
> >> thought. Yes, it could push out delivery of 2.0.
> > Having this on by default in an unreleased master doesn't actually worry
> me
> > that much.  It's just the question of what happens when we do release.
> At
> > that point, this discussion will be ancient history and I don't think
> we'll
> > give any renewed consideration to what the impact of this change might
> be.
> > Ideally it would be great to see this work in HDFS by that point and for
> > that HDFS version this becomes a non-issue.
> >
> >
> >>
> >> I think the discussion here has been helpful. Holes have been found (and
> >> plugged), the risk involved has gotten a good airing out here on dev,
> and
> >> in spite of the back and forth, one of our experts in good standing is
> >> still against it being on by default.
> >>
> >> If you are not down w/ the arguments, I'd be fine not making it the
> >> default.
> >> St.Ack
> >
> > I don't think it's right to block this by myself, since I'm clearly in
> the
> > minority.  Since others clearly support this change, have at it.
> >
> > But let me pose an alternate question: what if HDFS flat out refuses to
> > adopt this change?  What are our options then with this already shipping
> as
> > a default?  Would we continue to endure breakage due to the use of HDFS
> > private internals?  Do we switch the default back?  Do we do something
> else?
> >
> > Thanks for the discussion.
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-11 Thread Andrew Purtell
If Hadoop refuses the changes before we release, we can change the default 
back. 


On May 11, 2016, at 6:50 PM, Gary Helmling  wrote:

>> 
>> 
>> I was trying to avoid the below oft-repeated pattern at least for the case
>> of critical developments:
>> 
>> + New feature arrives after much work by developer, reviewers and testers
>> accompanied by fanfare (blog, talks).
>> + Developers and reviewers move on after getting it committed or it gets
>> hacked into a deploy so it works in a frankenstein form
>> + It sits in our code base across one or more releases marked as optional,
>> 'experimental'
>> + The 'experimental' blemish discourages its exercise by users
>> + The feature lags, rots
>> + Or, the odd time, we go ahead and enable it as default in spite of the
>> fact it was never tried when experimental.
>> 
>> Distributed Log Replay sat in hbase across a few major versions. Only when
>> the threat of our making an actual release with it on by default did it get
>> serious attention where it was found flawed and is now being actively
>> purged. This was after it made it past reviews, multiple attempts at
>> testing at scale, and so on; i.e. we'd done it all by the book. The time in
>> an 'experimental' state added nothing.
> Those are all valid concerns as well. It's certainly a pattern that we've
> seen repeated. That's also a broader concern I have about the farther we
> push out 2.0, then the less exercised master is.
> 
> I don't really know how best to balance this with concerns about user
> stability.  Enabling by default in master would certainly be a forcing
> function and would help it get more testing before release.  I hear that
> argument.  But I'm worried about the impact after release, where something
> as simple as a bug-fix point release upgrade of Hadoop could result in
> runtime breakage of an HBase install.  Will this happen in practice?  I
> don't know.  It seems unlikely that the private variable names being used
> for example would change in a point release.  But we're violating the
> abstraction that Hadoop provides us which guarantees such breakage won't
> occur.
> 
> 
>> Yes. 2.0 is a bit out there so we have some time to iron out issues is the
>> thought. Yes, it could push out delivery of 2.0.
> Having this on by default in an unreleased master doesn't actually worry me
> that much.  It's just the question of what happens when we do release.  At
> that point, this discussion will be ancient history and I don't think we'll
> give any renewed consideration to what the impact of this change might be.
> Ideally it would be great to see this work in HDFS by that point and for
> that HDFS version this becomes a non-issue.
> 
> 
>> 
>> I think the discussion here has been helpful. Holes have been found (and
>> plugged), the risk involved has gotten a good airing out here on dev, and
>> in spite of the back and forth, one of our experts in good standing is
>> still against it being on by default.
>> 
>> If you are not down w/ the arguments, I'd be fine not making it the
>> default.
>> St.Ack
> 
> I don't think it's right to block this by myself, since I'm clearly in the
> minority.  Since others clearly support this change, have at it.
> 
> But let me pose an alternate question: what if HDFS flat out refuses to
> adopt this change?  What are our options then with this already shipping as
> a default?  Would we continue to endure breakage due to the use of HDFS
> private internals?  Do we switch the default back?  Do we do something else?
> 
> Thanks for the discussion.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-11 Thread Gary Helmling
>
>
> I was trying to avoid the below oft-repeated pattern at least for the case
> of critical developments:
>
> + New feature arrives after much work by developer, reviewers and testers
> accompanied by fanfare (blog, talks).
> + Developers and reviewers move on after getting it committed or it gets
> hacked into a deploy so it works in a frankenstein form
> + It sits in our code base across one or more releases marked as optional,
> 'experimental'
> + The 'experimental' blemish discourages its exercise by users
> + The feature lags, rots
> + Or, the odd time, we go ahead and enable it as default in spite of the
> fact it was never tried when experimental.
>
> Distributed Log Replay sat in hbase across a few major versions. Only when
> the threat of our making an actual release with it on by default did it get
> serious attention where it was found flawed and is now being actively
> purged. This was after it made it past reviews, multiple attempts at
> testing at scale, and so on; i.e. we'd done it all by the book. The time in
> an 'experimental' state added nothing.
>
>
Those are all valid concerns as well. It's certainly a pattern that we've
seen repeated. It also ties into a broader concern I have: the farther we
push out 2.0, the less exercised master is.

I don't really know how best to balance this with concerns about user
stability.  Enabling by default in master would certainly be a forcing
function and would help it get more testing before release.  I hear that
argument.  But I'm worried about the impact after release, where something
as simple as a bug-fix point release upgrade of Hadoop could result in
runtime breakage of an HBase install.  Will this happen in practice?  I
don't know.  It seems unlikely that the private variable names being used
for example would change in a point release.  But we're violating the
abstraction that Hadoop provides us which guarantees such breakage won't
occur.


> Yes. 2.0 is a bit out there so we have some time to iron out issues is the
> thought. Yes, it could push out delivery of 2.0.
>
>
Having this on by default in an unreleased master doesn't actually worry me
that much.  It's just the question of what happens when we do release.  At
that point, this discussion will be ancient history and I don't think we'll
give any renewed consideration to what the impact of this change might be.
Ideally it would be great to see this work in HDFS by that point and for
that HDFS version this becomes a non-issue.


>
> I think the discussion here has been helpful. Holes have been found (and
> plugged), the risk involved has gotten a good airing out here on dev, and
> in spite of the back and forth, one of our experts in good standing is
> still against it being on by default.
>
> If you are not down w/ the arguments, I'd be fine not making it the
> default.
> St.Ack
>

I don't think it's right to block this by myself, since I'm clearly in the
minority.  Since others clearly support this change, have at it.

But let me pose an alternate question: what if HDFS flat out refuses to
adopt this change?  What are our options then with this already shipping as
a default?  Would we continue to endure breakage due to the use of HDFS
private internals?  Do we switch the default back?  Do we do something else?

Thanks for the discussion.
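
The concern in this message about depending on private variable names can be
made concrete with a minimal sketch. The Java below is not HBase or HDFS code:
the classes UpstreamClientV1 and UpstreamClientV2, the field names, and the
helper readPrivateString are all hypothetical stand-ins, assumed only for
illustration. It shows how a reflective lookup that works against one version
of a class silently stops working once the private field is renamed, which the
compiler cannot catch because the name is only a string.

```java
import java.lang.reflect.Field;
import java.util.Optional;

public class PrivateFieldProbe {

    // Hypothetical stand-in for an upstream class; the private field is not
    // part of any public contract.
    static class UpstreamClientV1 {
        private final String leaseHolder = "client-1";
    }

    // The same hypothetical class after an upstream refactor renamed the field.
    static class UpstreamClientV2 {
        private final String clientName = "client-1";
    }

    // Reads a private String field by name. Returns Optional.empty() when the
    // field is missing or inaccessible, so the caller can fall back to another
    // code path instead of failing at runtime.
    static Optional<String> readPrivateString(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            return Optional.ofNullable((String) field.get(target));
        } catch (NoSuchFieldException | IllegalAccessException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Lookup succeeds against the version the caller was written for.
        System.out.println(readPrivateString(new UpstreamClientV1(), "leaseHolder"));
        // The same lookup finds nothing after the rename.
        System.out.println(readPrivateString(new UpstreamClientV2(), "leaseHolder"));
    }
}
```

Returning an empty Optional rather than throwing is one possible design choice
here: it would let a caller detect the mismatch at startup and fall back to a
non-reflective implementation, which is roughly the "switch the default back"
option raised in the thread.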


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread 张铎
See HDFS-223 and HDFS-916. There are plenty of related issues. The most
important thing is that we need a suitable API, and there is an asynchronous
file system proposal in HADOOP-12910 which does not fit our requirements, so
I need to stop it from being committed first...

And making it the default choice in a later HBase version means that this
feature is definitely needed in HDFS and will be well tested even before it
goes into the HDFS code base. An experimental feature is always experimental.
Think of DLR. For per-CF flush, the data loss issue was found before
releasing 1.2, the version in which it is on by default. We also experienced
this recently when backporting the scan heartbeat feature into our
internal branch: we found lots of small bugs even though the code had been
there for months...

Thanks.

On Wednesday, May 11, 2016, Gary Helmling wrote:

> >
> > Yeah the 'push to upstream' work has been started already. See here
> >
> > https://issues.apache.org/jira/browse/HADOOP-12910
> >
> > But it is much harder to push code into HDFS than HBase. It is the core of
> > all hadoop systems and I do not have many contacts in the hdfs community...
> >
> >
> Yes, I'm familiar with the difficulty of getting even relatively small
> changes into HDFS.
>
> Is HADOOP-12910 really the right issue for this?  There is some good
> discussion there, but it looks like it's primarily motivated by doing large
> batches of async renames.  Isn't our issue more that we want a whole new
> OutputStream implementation doing fan out instead of the regular pipeline
> replication?
>
> HDFS-9924 seems like it might be the better umbrella for this.  Maybe it
> would be better to create a new top level issue and link it there and
> comment in HDFS-9924 to try to get some traction.
>
>
> > And it is more convincing if we make it default as it means that we will
> > keep maintaining the code rather than make it stale and unstable.
> >
> >
> I would agree with this reasoning if we were talking about making an
> implementation inside HDFS the default.  That would indicate commitment to
> contribute to and maintain the HDFS implementation.  Making a separate code
> copy living in HBase the default I don't think really means anything for
> HDFS.
>
> The fact that this already needs to be updated for 2.8 just reinforces that
> we're going to see maintainability issues with this.
>
> Again, I appreciate all of the work that has gone in to this feature and
> the big performance improvements it shows, but I think the sequencing of
> events here is going to ultimately cause us and our users more pain than is
> necessary.
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread Stack
On Tue, May 10, 2016 at 10:39 AM, Gary Helmling  wrote:

> >
> > The suggestion is that we make this new client the default now in the
> > master branch so we have plenty of time to find any issues with the
> > implementation. We'd also enable it as the default because the
> > improvement is dramatic (performance, fewer moving parts,
> > comprehensibility, etc.) and we think this async, lightweight,
> > WAL-purposed client is the way to go moving forward. We'd also spare our
> > users having to make a choice; if it is optional, and if they trip over
> > the feature at all, they'll be wary of enabling something so fundamental,
> > afraid that it is experimental.
> >
> >
> To me, having the opportunity to shake out issues sounds like an argument
> for making it experimental, not making it the default.



I was trying to avoid the below oft-repeated pattern at least for the case
of critical developments:

+ New feature arrives after much work by developer, reviewers and testers
accompanied by fanfare (blog, talks).
+ Developers and reviewers move on after getting it committed or it gets
hacked into a deploy so it works in a frankenstein form
+ It sits in our code base across one or more releases marked as optional,
'experimental'
+ The 'experimental' blemish discourages its exercise by users
+ The feature lags, rots
+ Or, the odd time, we go ahead and enable it as default in spite of the
fact it was never tried when experimental.

Distributed Log Replay sat in hbase across a few major versions. Only when
the threat of our making an actual release with it on by default did it get
serious attention where it was found flawed and is now being actively
purged. This was after it made it past reviews, multiple attempts at
testing at scale, and so on; i.e. we'd done it all by the book. The time in
an 'experimental' state added nothing.



> I think that
> features we're rolling out to our users actively enabled should have some
> level of confidence that the issues are already shaken out.  I'm interested
> in testing this out myself, but personally I would want to test this by
> actively enabling it and not just having it show up unexpectedly in a new
> release.
>
> Maybe this is mitigated by the reality that a 2.0 release is not going to
> happen for quite a while.  But this also becomes a self-fulfilling prophecy
> if we continue to make disruptive changes in master.
>

Yes. 2.0 is a bit out there, so the thought is that we have some time to iron
out issues. Yes, it could push out delivery of 2.0.


> > Going this route we are taking on risk as you call out Gary but the
> > suggestion is that the benefits far outweigh the downsides (In mitigation,
> > I don't think we've ever run against HDFS free of reflection code though,
> > true, we are into a new level of violation with this client).
> >
> >
> I'm not trying to argue for a perfect world. :)
>
> But I do think there is a big difference in some of our other past use of
> reflection to ride over changes in public APIs vs. reaching in to private
> fields in private annotated classes.  What would we say to a coprocessor
> that did the same with HBase?
>

You have a point. We'd break the dependent w/o regard as HDFS will do to
this feature until it gets pushed upstream.



> > We are not arguing that it needs to be default to help get the async client
> > upstreamed into HDFS. Maybe it would help but going by the issue cited by
> > Duo below, it seems like there are a bunch of other concerns and dimensions
> > to be considered first; it may take a while for an async DFS client to land
> > (if at all). We should push on the upstreaming yes to close down the risk
> > you note, but let us not predicate our use of async WAL on it first being
> > committed to HDFS.
> >
> >
> I don't think our ability to use the async WAL as an optional feature
> should be predicated on inclusion in HDFS.  But I do think our use of it as
> the default, and all the continuing support that that implies should be.
> That is where we disagree.
>
> I don't think we're really making progress with the discussion here.  I
> don't agree with the arguments being put forward, but it seems like I'm in
> the minority.
>

I think the discussion here has been helpful. Holes have been found (and
plugged), the risk involved has gotten a good airing out here on dev, and
in spite of the back and forth, one of our experts in good standing is
still against it being on by default.

If you are not down w/ the arguments, I'd be fine not making it the default.
St.Ack


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread Andrew Purtell
I'm not sure this should be the default for 2.0, but I'd definitely like to see
it as an option we're comfortable supporting for the duration of our
negotiations with HDFS. That would be one major reason why trying out a 2.0.0
release would be compelling.


On May 10, 2016, at 10:51 AM, Gary Helmling  wrote:

>> 
>> Yeah the 'push to upstream' work has been started already. See here
>> 
>> https://issues.apache.org/jira/browse/HADOOP-12910
>> 
>> But it is much harder to push code into HDFS than HBase. It is the core of
>> all hadoop systems and I do not have many contacts in the hdfs community...
> Yes, I'm familiar with the difficulty of getting even relatively small
> change in to HDFS.
> 
> Is HADOOP-12910 really the right issue for this?  There is some good
> discussion there, but it looks like it's primarily motivated by doing large
> batches of async renames.  Isn't our issue more that we want a whole new
> OutputStream implementation doing fan out instead of the regular pipeline
> replication?
> 
> HDFS-9924 seems like it might be the better umbrella for this.  Maybe it
> would be better to create a new top level issue and link it there and
> comment in HDFS-9924 to try to get some traction.
> 
> 
>> And it is more convincing if we make it default as it means that we will
>> keep maintaining the code rather than make it stale and unstable.
> I would agree with this reasoning if we were talking about making an
> implementation inside HDFS the default.  That would indicate commitment to
> contribute to and maintain the HDFS implementation.  Making a separate code
> copy living in HBase the default I don't think really means anything for
> HDFS.
> 
> The fact that this already needs to be updated for 2.8 just reinforces that
> we're going to see maintainability issues with this.
> 
> Again, I appreciate all of the work that has gone in to this feature and
> the big performance improvements it shows, but I think the sequencing of
> events here is going to ultimately cause us and our users more pain than is
> necessary.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread Gary Helmling
>
> Yeah the 'push to upstream' work has been started already. See here
>
> https://issues.apache.org/jira/browse/HADOOP-12910
>
> But it is much harder to push code into HDFS than HBase. It is the core of
> all hadoop systems and I do not have many contacts in the hdfs community...
>
>
Yes, I'm familiar with the difficulty of getting even relatively small
changes into HDFS.

Is HADOOP-12910 really the right issue for this?  There is some good
discussion there, but it looks like it's primarily motivated by doing large
batches of async renames.  Isn't our issue more that we want a whole new
OutputStream implementation doing fan out instead of the regular pipeline
replication?
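
To make the distinction concrete: in pipeline replication the client writes to
the first datanode, which forwards to the next, and so on; in fan out the
client sends to every replica itself and waits for all of the acks. A minimal
sketch of the fan-out side follows. The Replica class is a hypothetical
in-memory stand-in for a datanode connection, not the actual HBase or HDFS
code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class FanOutSketch {
    /** Hypothetical stand-in for one datanode connection; it only records packets. */
    static class Replica {
        private final List<byte[]> received = new ArrayList<>();

        /** Simulates an asynchronous send; the future completes when the "ack" arrives. */
        CompletableFuture<Void> write(byte[] packet) {
            return CompletableFuture.runAsync(() -> {
                synchronized (received) {
                    received.add(packet);
                }
            });
        }

        int packetCount() {
            synchronized (received) {
                return received.size();
            }
        }
    }

    /** Fan out: send the packet to every replica at once, finish when all have acked. */
    static CompletableFuture<Void> fanOutWrite(List<Replica> replicas, byte[] packet) {
        CompletableFuture<?>[] acks = replicas.stream()
            .map(r -> r.write(packet))
            .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(acks);
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(new Replica(), new Replica(), new Replica());
        // Block until every replica has acknowledged the packet.
        fanOutWrite(replicas, "wal-entry".getBytes()).join();
        for (Replica r : replicas) {
            System.out.println("packets received: " + r.packetCount());
        }
    }
}
```

The write latency here is bounded by the slowest replica rather than by the
sum of the hops, which is the appeal; the cost is that the client owns all of
the connections and the failure handling.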

HDFS-9924 seems like it might be the better umbrella for this.  Maybe it
would be better to create a new top level issue and link it there and
comment in HDFS-9924 to try to get some traction.


> And it is more convincing if we make it default as it means that we will
> keep maintaining the code rather than make it stale and unstable.
>
>
I would agree with this reasoning if we were talking about making an
implementation inside HDFS the default.  That would indicate commitment to
contribute to and maintain the HDFS implementation.  Making a separate code
copy living in HBase the default I don't think really means anything for
HDFS.

The fact that this already needs to be updated for 2.8 just reinforces that
we're going to see maintainability issues with this.

Again, I appreciate all of the work that has gone in to this feature and
the big performance improvements it shows, but I think the sequencing of
events here is going to ultimately cause us and our users more pain than is
necessary.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread Gary Helmling
>
> The suggestion is that we make this new client the default now in master
> branch so we have plenty of time to find any issues with the
> implementation. We'd also enable it as the default because the improvement
> is dramatic (performance, less moving parts, comprehensible, etc.) and we
> think this async, lightweight, WAL-purposed client the way to go moving
> forward. We'd also spare our users having to make a choice; if optional, if
> they trip over the feature at all, they'll be wary enabling such a
> fundamental afraid that it experimental.
>
>
To me, having the opportunity to shake out issues sounds like an argument
for making it experimental, not making it the default.  I think that
features we're rolling out to our users actively enabled should have some
level of confidence that the issues are already shaken out.  I'm interested
in testing this out myself, but personally I would want to test this by
actively enabling it and not just having it show up unexpectedly in a new
release.

Maybe this is mitigated by the reality that a 2.0 release is not going to
happen for quite a while.  But this also becomes a self-fulfilling prophecy
if we continue to make disruptive changes in master.


> Going this route we are taking on risk as you call out Gary but the
> suggestion is that the benefits far outweigh the downsides (In mitigation,
> I don't think we've ever run against HDFS free of reflection code though,
> true, we are into a new level of violation with this client).
>
>
I'm not trying to argue for a perfect world. :)

But I do think there is a big difference in some of our other past use of
reflection to ride over changes in public APIs vs. reaching in to private
fields in private annotated classes.  What would we say to a coprocessor
that did the same with HBase?


> We are not arguing that it needs to be default to help get the async client
> upstreamed into HDFS. Maybe it would help but going by the issue cited by
> Duo below, it seems like there are a bunch of other concerns and dimensions
> to be considered first; it may take a while for an async DFS client to land
> (if at all). We should push on the upstreaming yes to close down the risk
> you note, but let us not predicate our use of async WAL on it first being
> committed to HDFS.
>
>
I don't think our ability to use the async WAL as an optional feature
should be predicated on inclusion in HDFS.  But I do think our use of it as
the default, and all the continuing support that that implies should be.
That is where we disagree.

I don't think we're really making progress with the discussion here.  I
don't agree with the arguments being put forward, but it seems like I'm in
the minority.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread Stack
On Mon, May 9, 2016 at 11:59 PM, Gary Helmling  wrote:
...


> To me, it seems much safer to actively try to push this upstream into HDFS
> right now, and still pointing to its optional, non-default use in HBase as
> a compelling story.  I don't understand why making it the default in 2.0 is
> necessary for this.  Do you really think it will make that big a difference
> for upstreaming?  Once it's actually in Hadoop and maintained, it seems
> like a no-brainer to make it the default.
>
>
The suggestion is that we make this new client the default now in master
branch so we have plenty of time to find any issues with the
implementation. We'd also enable it as the default because the improvement
is dramatic (performance, fewer moving parts, comprehensibility, etc.) and we
think this async, lightweight, WAL-purposed client is the way to go moving
forward. We'd also spare our users having to make a choice; if it is optional
and they trip over the feature at all, they'll be wary of enabling something
so fundamental, afraid that it is experimental.

Going this route we are taking on risk as you call out Gary but the
suggestion is that the benefits far outweigh the downsides (In mitigation,
I don't think we've ever run against HDFS free of reflection code though,
true, we are into a new level of violation with this client).

We are not arguing that it needs to be default to help get the async client
upstreamed into HDFS. Maybe it would help but going by the issue cited by
Duo below, it seems like there are a bunch of other concerns and dimensions
to be considered first; it may take a while for an async DFS client to land
(if at all). We should push on the upstreaming yes to close down the risk
you note, but let us not predicate our use of async WAL on it first being
committed to HDFS.

Thanks,
St.Ack







> On Mon, May 9, 2016 at 5:09 PM Stack  wrote:
>
> > Any other suggestions/objections here? If not, will make the cut over in
> > next day or so.
> > Thanks,
> > St.Ack
> >
> > On Thu, May 5, 2016 at 10:02 PM, Stack  wrote:
> >
> > > On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:
> > >
> > >> Almost miss the party...
> > >>
> > >> bq. Do you think it worth to backport this feature to branch-1 and
> > release
> > >> it in the next 1.x release? This may introduce a compatibility issue
> as
> > >> said
> > >> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
> > >> upgrade
> > >> does not lose data...
> > >> From current perf data I think the effort is worthwhile, we already
> > >> started
> > >> some work here and will run it on production after some carefully
> > testing
> > >> (and of course, if the perf number confirmed, but I'm optimistic
> somehow
> > >> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will
> > make
> > >> it work, right? (And I guess this will also be a question when we
> > upgrade
> > >> from 1.x to 2.0 later?)
> > >>
> > >>
> > > Or a clean shutdown and restart? Or a fresh install? I'd think backport
> > > would be fine if you have to enable it and it has warnings and is clear
> > on
> > > circumstances under which there could be dataloss.
> > >
> > > St.Ack
> > >
> > >
> > >
> > >> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
> > >>
> > >> Best Regards,
> > >> Yu
> > >>
> > >> On 6 May 2016 at 09:49, Ted Yu  wrote:
> > >>
> > >> > Thanks for your effort, Duo.
> > >> >
> > >> > I am in favor of turning AsyncWAL as default in master branch.
> > >> >
> > >> > Cheers
> > >> >
> > >> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
> > >> >
> > >> > > Some progress.
> > >> > >
> > >> > > I have filed HBASE-15743 for the transparent encryption support,
> > >> > > and HBASE-15754 for the AES encryption UT. Now both of them are
> > >> resolved.
> > >> > > Let's resume the discussion here.
> > >> > >
> > >> > > Thanks.
> > >> > >
> > >> > > 2016-05-03 10:09 GMT+08:00 张铎 :
> > >> > >
> > >> > > > Fine, will add the testcase.
> > >> > > >
> > >> > > > And for the RPC, we only implement a new client side DTP here
> and
> > >> still
> > >> > > > use the original RPC.
> > >> > > >
> > >> > > > Thanks.
> > >> > > >
> > >> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> > >> > > >
> > >> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎 
> > wrote:
> > >> > > >>
> > >> > > >> > Yes, it does. There is testcase that enumerates all the
> > possible
> > >> > > >> protection
> > >> > > >> > level(authentication, integrity and privacy) and encryption
> > >> > > >> algorithm(none,
> > >> > > >> > 3des, rc4).
> > >> > > >> >
> > >> > > >> >
> > >> > > >> >
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> > >> > > >> >
> > >> > > >> > I have also tested it in a secure
> cluster(hbase-2.0.0-SNAPSHOT
> > >> and
> > >> > > >> > hadoop-2.4.0).
> > >> > > >> >
> > >> > > >>
> > >> > > >> Thanks.  Can you add in support for testing with AES
> > >> > > >> (dfs.en

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread 张铎
Some methods have moved from classes in hadoop-hdfs to classes in
hadoop-hdfs-client.
The ClientProtocol.addBlock method takes an extra parameter.
DFSClient.Conf has moved to a separate file and been renamed to DFSClientConf.

Not very hard. I promise that I can give a patch within 3 days after the
release of hadoop-2.8.0.
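
Changes like the ones listed above (a class that moved, a method that gained a
parameter) are the kind a reflection-based shim can usually ride over. Below
is a rough sketch of the idea only, not the actual patch: the helper names are
made up for illustration, and the demo uses JDK classes plus a deliberately
nonexistent class name in place of the real Hadoop ones.

```java
import java.lang.reflect.Method;
import java.util.Optional;

public class CompatLookupSketch {
    /** Returns the first candidate class name that can be loaded, if any. */
    static Optional<Class<?>> firstAvailable(String... candidates) {
        for (String name : candidates) {
            try {
                return Optional.of(Class.forName(name));
            } catch (ClassNotFoundException e) {
                // Not present in this release; try the next known location.
            }
        }
        return Optional.empty();
    }

    /** Finds a public method by name only, so callers can branch on its parameter count. */
    static Optional<Method> methodByName(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)) {
                return Optional.of(m);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "example.moved.NewLocation" is a placeholder that does not exist,
        // so the lookup falls back to the second candidate.
        Optional<Class<?>> cls = firstAvailable("example.moved.NewLocation", "java.util.ArrayList");
        System.out.println(cls.map(Class::getName).orElse("none"));

        // The caller can inspect the arity and pick the matching call shape.
        methodByName(String.class, "isEmpty")
            .ifPresent(m -> System.out.println(m.getName() + " takes " + m.getParameterCount() + " args"));
    }
}
```

In the real case the candidates would be the new and old Hadoop class names,
and the arity check would decide how many arguments to pass to addBlock.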

Thanks.

2016-05-10 16:23 GMT+08:00 张铎 :

> Yeah the 'push to upstream' work has been started already. See here
>
> https://issues.apache.org/jira/browse/HADOOP-12910
>
> But it is much harder to push code into HDFS than HBase. It is the core of
> all hadoop systems and I do not have many contacts in the hdfs community...
>
> And it is more convincing if we make it default as it means that we will
> keep maintaining the code rather than make it stale and unstable.
>
> And for the compatibility of different hadoop versions, I think I can deal
> with it. Let me check hadoop-2.8.0-SNAPSHOT first.
>
> Thanks.
>
> 2016-05-10 14:59 GMT+08:00 Gary Helmling :
>
>> Thanks for adding the tests and fixing up AES support.
>>
>> My only real concern is the maintainability of this code as our own
>> private
>> DFS client.  The SASL support, for example, is largely based on reflection
>> and reaches in to private fields of @InterfaceAudience.Private Hadoop
>> classes.  This seems bound to break with a future Hadoop release.  I
>> appreciate the parameterized testing wrapped around this because it
>> doesn't
>> seem like we'll have much else in the way of safety checking.  This is not
>> a knock on the code -- it's a pretty clean reach into the HDFS guts, but a
>> reach it is.  For a component at the core of our data integrity, this
>> seems
>> like a risk.
>>
>> To me, it seems much safer to actively try to push this upstream into HDFS
>> right now, and still pointing to its optional, non-default use in HBase as
>> a compelling story.  I don't understand why making it the default in 2.0
>> is
>> necessary for this.  Do you really think it will make that big a
>> difference
>> for upstreaming?  Once it's actually in Hadoop and maintained, it seems
>> like a no-brainer to make it the default.
>>
>> On Mon, May 9, 2016 at 5:09 PM Stack  wrote:
>>
>> > Any other suggestions/objections here? If not, will make the cut over in
>> > next day or so.
>> > Thanks,
>> > St.Ack
>> >
>> > On Thu, May 5, 2016 at 10:02 PM, Stack  wrote:
>> >
>> > > On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:
>> > >
>> > >> Almost miss the party...
>> > >>
>> > >> bq. Do you think it worth to backport this feature to branch-1 and
>> > release
>> > >> it in the next 1.x release? This may introduce a compatibility issue
>> as
>> > >> said
>> > >> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
>> > >> upgrade
>> > >> does not lose data...
>> > >> From current perf data I think the effort is worthwhile, we already
>> > >> started
>> > >> some work here and will run it on production after some carefully
>> > testing
>> > >> (and of course, if the perf number confirmed, but I'm optimistic
>> somehow
>> > >> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will
>> > make
>> > >> it work, right? (And I guess this will also be a question when we
>> > upgrade
>> > >> from 1.x to 2.0 later?)
>> > >>
>> > >>
>> > > Or a clean shutdown and restart? Or a fresh install? I'd think
>> backport
>> > > would be fine if you have to enable it and it has warnings and is
>> clear
>> > on
>> > > circumstances under which there could be dataloss.
>> > >
>> > > St.Ack
>> > >
>> > >
>> > >
>> > >> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
>> > >>
>> > >> Best Regards,
>> > >> Yu
>> > >>
>> > >> On 6 May 2016 at 09:49, Ted Yu  wrote:
>> > >>
>> > >> > Thanks for your effort, Duo.
>> > >> >
>> > >> > I am in favor of turning AsyncWAL as default in master branch.
>> > >> >
>> > >> > Cheers
>> > >> >
>> > >> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
>> > >> >
>> > >> > > Some progress.
>> > >> > >
>> > >> > > I have filed HBASE-15743 for the transparent encryption support,
>> > >> > > and HBASE-15754 for the AES encryption UT. Now both of them are
>> > >> resolved.
>> > >> > > Let's resume the discussion here.
>> > >> > >
>> > >> > > Thanks.
>> > >> > >
>> > >> > > 2016-05-03 10:09 GMT+08:00 张铎 :
>> > >> > >
>> > >> > > > Fine, will add the testcase.
>> > >> > > >
>> > >> > > > And for the RPC, we only implement a new client side DTP here
>> and
>> > >> still
>> > >> > > > use the original RPC.
>> > >> > > >
>> > >> > > > Thanks.
>> > >> > > >
>> > >> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
>> > >> > > >
>> > >> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎 
>> > wrote:
>> > >> > > >>
>> > >> > > >> > Yes, it does. There is testcase that enumerates all the
>> > possible
>> > >> > > >> protection
>> > >> > > >> > level(authentication, integrity and privacy) and encryption
>> > >> > > >> algorithm(none,
>> > >> > > >> > 3des, rc4).
>> > >> > > >> >
>> > >> > > >> >
>> > >> > > >> >
>> > >> > > >>
>> > >> > >
>> > >>

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-10 Thread 张铎
Yeah the 'push to upstream' work has been started already. See here

https://issues.apache.org/jira/browse/HADOOP-12910

But it is much harder to push code into HDFS than HBase. It is the core of
all hadoop systems and I do not have many contacts in the hdfs community...

And it is more convincing if we make it the default, as it means that we will
keep maintaining the code rather than letting it become stale and unstable.

And for the compatibility of different hadoop versions, I think I can deal
with it. Let me check hadoop-2.8.0-SNAPSHOT first.

Thanks.

2016-05-10 14:59 GMT+08:00 Gary Helmling :

> Thanks for adding the tests and fixing up AES support.
>
> My only real concern is the maintainability of this code as our own private
> DFS client.  The SASL support, for example, is largely based on reflection
> and reaches in to private fields of @InterfaceAudience.Private Hadoop
> classes.  This seems bound to break with a future Hadoop release.  I
> appreciate the parameterized testing wrapped around this because it doesn't
> seem like we'll have much else in the way of safety checking.  This is not
> a knock on the code -- it's a pretty clean reach into the HDFS guts, but a
> reach it is.  For a component at the core of our data integrity, this seems
> like a risk.
>
> To me, it seems much safer to actively try to push this upstream into HDFS
> right now, and still pointing to its optional, non-default use in HBase as
> a compelling story.  I don't understand why making it the default in 2.0 is
> necessary for this.  Do you really think it will make that big a difference
> for upstreaming?  Once it's actually in Hadoop and maintained, it seems
> like a no-brainer to make it the default.
>
> On Mon, May 9, 2016 at 5:09 PM Stack  wrote:
>
> > Any other suggestions/objections here? If not, will make the cut over in
> > next day or so.
> > Thanks,
> > St.Ack
> >
> > On Thu, May 5, 2016 at 10:02 PM, Stack  wrote:
> >
> > > On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:
> > >
> > >> Almost miss the party...
> > >>
> > >> bq. Do you think it worth to backport this feature to branch-1 and
> > release
> > >> it in the next 1.x release? This may introduce a compatibility issue
> as
> > >> said
> > >> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
> > >> upgrade
> > >> does not lose data...
> > >> From current perf data I think the effort is worthwhile, we already
> > >> started
> > >> some work here and will run it on production after some carefully
> > testing
> > >> (and of course, if the perf number confirmed, but I'm optimistic
> somehow
> > >> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will
> > make
> > >> it work, right? (And I guess this will also be a question when we
> > upgrade
> > >> from 1.x to 2.0 later?)
> > >>
> > >>
> > > Or a clean shutdown and restart? Or a fresh install? I'd think backport
> > > would be fine if you have to enable it and it has warnings and is clear
> > on
> > > circumstances under which there could be dataloss.
> > >
> > > St.Ack
> > >
> > >
> > >
> > >> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
> > >>
> > >> Best Regards,
> > >> Yu
> > >>
> > >> On 6 May 2016 at 09:49, Ted Yu  wrote:
> > >>
> > >> > Thanks for your effort, Duo.
> > >> >
> > >> > I am in favor of turning AsyncWAL as default in master branch.
> > >> >
> > >> > Cheers
> > >> >
> > >> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
> > >> >
> > >> > > Some progress.
> > >> > >
> > >> > > I have filed HBASE-15743 for the transparent encryption support,
> > >> > > and HBASE-15754 for the AES encryption UT. Now both of them are
> > >> resolved.
> > >> > > Let's resume the discussion here.
> > >> > >
> > >> > > Thanks.
> > >> > >
> > >> > > 2016-05-03 10:09 GMT+08:00 张铎 :
> > >> > >
> > >> > > > Fine, will add the testcase.
> > >> > > >
> > >> > > > And for the RPC, we only implement a new client side DTP here
> and
> > >> still
> > >> > > > use the original RPC.
> > >> > > >
> > >> > > > Thanks.
> > >> > > >
> > >> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> > >> > > >
> > >> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎 
> > wrote:
> > >> > > >>
> > >> > > >> > Yes, it does. There is testcase that enumerates all the
> > possible
> > >> > > >> protection
> > >> > > >> > level(authentication, integrity and privacy) and encryption
> > >> > > >> algorithm(none,
> > >> > > >> > 3des, rc4).
> > >> > > >> >
> > >> > > >> >
> > >> > > >> >
> > >> > > >>
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> > >> > > >> >
> > >> > > >> > I have also tested it in a secure
> cluster(hbase-2.0.0-SNAPSHOT
> > >> and
> > >> > > >> > hadoop-2.4.0).
> > >> > > >> >
> > >> > > >>
> > >> > > >> Thanks.  Can you add in support for testing with AES
> > >> > > >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?
> > This
> > >> is
> > >> > > only
> > >> > 

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-09 Thread Gary Helmling
Thanks for adding the tests and fixing up AES support.

My only real concern is the maintainability of this code as our own private
DFS client.  The SASL support, for example, is largely based on reflection
and reaches in to private fields of @InterfaceAudience.Private Hadoop
classes.  This seems bound to break with a future Hadoop release.  I
appreciate the parameterized testing wrapped around this because it doesn't
seem like we'll have much else in the way of safety checking.  This is not
a knock on the code -- it's a pretty clean reach into the HDFS guts, but a
reach it is.  For a component at the core of our data integrity, this seems
like a risk.
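
To illustrate the kind of breakage I mean, here is a small sketch using
made-up classes (VendorV1 and VendorV2 are hypothetical stand-ins for two
releases of a private third-party class, not real Hadoop types). The reflective
read compiles against either release, so a renamed private field only shows up
at runtime.

```java
import java.lang.reflect.Field;
import java.util.Optional;

public class PrivateReachSketch {
    /** Hypothetical "release 1" of a third-party class with a private field. */
    static class VendorV1 {
        private final String cipher = "AES";
    }

    /** Hypothetical "release 2": same data, but the private field was renamed. */
    static class VendorV2 {
        private final String cipherSuite = "AES";
    }

    /** Reads a private field by name; empty if the field is missing or inaccessible. */
    static Optional<Object> tryReadPrivate(Object target, String fieldName) {
        try {
            Field f = target.getClass().getDeclaredField(fieldName);
            f.setAccessible(true);
            return Optional.ofNullable(f.get(target));
        } catch (ReflectiveOperationException e) {
            // The compiler cannot warn about this; only a test run can catch it.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("v1: " + tryReadPrivate(new VendorV1(), "cipher"));
        System.out.println("v2: " + tryReadPrivate(new VendorV2(), "cipher"));
    }
}
```

Nothing flags the second lookup until it is exercised, which is why the
parameterized tests are about the only safety net we have here.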

To me, it seems much safer to actively try to push this upstream into HDFS
right now, and still pointing to its optional, non-default use in HBase as
a compelling story.  I don't understand why making it the default in 2.0 is
necessary for this.  Do you really think it will make that big a difference
for upstreaming?  Once it's actually in Hadoop and maintained, it seems
like a no-brainer to make it the default.

On Mon, May 9, 2016 at 5:09 PM Stack  wrote:

> Any other suggestions/objections here? If not, will make the cut over in
> next day or so.
> Thanks,
> St.Ack
>
> On Thu, May 5, 2016 at 10:02 PM, Stack  wrote:
>
> > On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:
> >
> >> Almost miss the party...
> >>
> >> bq. Do you think it worth to backport this feature to branch-1 and
> release
> >> it in the next 1.x release? This may introduce a compatibility issue as
> >> said
> >> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
> >> upgrade
> >> does not lose data...
> >> From current perf data I think the effort is worthwhile, we already
> >> started
> >> some work here and will run it on production after some carefully
> testing
> >> (and of course, if the perf number confirmed, but I'm optimistic somehow
> >> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will
> make
> >> it work, right? (And I guess this will also be a question when we
> upgrade
> >> from 1.x to 2.0 later?)
> >>
> >>
> > Or a clean shutdown and restart? Or a fresh install? I'd think backport
> > would be fine if you have to enable it and it has warnings and is clear
> on
> > circumstances under which there could be dataloss.
> >
> > St.Ack
> >
> >
> >
> >> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
> >>
> >> Best Regards,
> >> Yu
> >>
> >> On 6 May 2016 at 09:49, Ted Yu  wrote:
> >>
> >> > Thanks for your effort, Duo.
> >> >
> >> > I am in favor of turning AsyncWAL as default in master branch.
> >> >
> >> > Cheers
> >> >
> >> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
> >> >
> >> > > Some progress.
> >> > >
> >> > > I have filed HBASE-15743 for the transparent encryption support,
> >> > > and HBASE-15754 for the AES encryption UT. Now both of them are
> >> resolved.
> >> > > Let's resume the discussion here.
> >> > >
> >> > > Thanks.
> >> > >
> >> > > 2016-05-03 10:09 GMT+08:00 张铎 :
> >> > >
> >> > > > Fine, will add the testcase.
> >> > > >
> >> > > > And for the RPC, we only implement a new client side DTP here and
> >> still
> >> > > > use the original RPC.
> >> > > >
> >> > > > Thanks.
> >> > > >
> >> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> >> > > >
> >> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎 
> wrote:
> >> > > >>
> >> > > >> > Yes, it does. There is testcase that enumerates all the
> possible
> >> > > >> protection
> >> > > >> > level(authentication, integrity and privacy) and encryption
> >> > > >> algorithm(none,
> >> > > >> > 3des, rc4).
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > >
> >> >
> >>
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> >> > > >> >
> >> > > >> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT
> >> and
> >> > > >> > hadoop-2.4.0).
> >> > > >> >
> >> > > >>
> >> > > >> Thanks.  Can you add in support for testing with AES
> >> > > >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?
> This
> >> is
> >> > > only
> >> > > >> available in Hadoop 2.6.0+, but I think is far more likely to be
> >> used
> >> > in
> >> > > >> production than 3des or rc4.
> >> > > >
> >> > > >
> >> > > >> Also, have you been following HADOOP-10768?  That is changing
> >> Hadoop
> >> > RPC
> >> > > >> encryption negotiation to support more performant AES wrapping,
> >> > similar
> >> > > to
> >> > > >> what is now supported in the data transfer pipeline.
> >> > > >>
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-09 Thread Stack
Any other suggestions/objections here? If not, will make the cut over in
next day or so.
Thanks,
St.Ack

On Thu, May 5, 2016 at 10:02 PM, Stack  wrote:

> On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:
>
>> Almost miss the party...
>>
>> bq. Do you think it worth to backport this feature to branch-1 and release
>> it in the next 1.x release? This may introduce a compatibility issue as
>> said
>> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
>> upgrade
>> does not lose data...
>> From current perf data I think the effort is worthwhile, we already
>> started
>> some work here and will run it on production after some carefully testing
>> (and of course, if the perf number confirmed, but I'm optimistic somehow
>> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will make
>> it work, right? (And I guess this will also be a question when we upgrade
>> from 1.x to 2.0 later?)
>>
>>
> Or a clean shutdown and restart? Or a fresh install? I'd think backport
> would be fine if you have to enable it and it has warnings and is clear on
> circumstances under which there could be dataloss.
>
> St.Ack
>
>
>
>> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
>>
>> Best Regards,
>> Yu
>>
>> On 6 May 2016 at 09:49, Ted Yu  wrote:
>>
>> > Thanks for your effort, Duo.
>> >
>> > I am in favor of turning AsyncWAL as default in master branch.
>> >
>> > Cheers
>> >
>> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
>> >
>> > > Some progress.
>> > >
>> > > I have filed HBASE-15743 for the transparent encryption support,
>> > > and HBASE-15754 for the AES encryption UT. Now both of them are
>> resolved.
>> > > Let's resume the discussion here.
>> > >
>> > > Thanks.
>> > >
>> > > 2016-05-03 10:09 GMT+08:00 张铎 :
>> > >
>> > > > Fine, will add the testcase.
>> > > >
>> > > > And for the RPC, we only implement a new client side DTP here and
>> still
>> > > > use the original RPC.
>> > > >
>> > > > Thanks.
>> > > >
>> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
>> > > >
>> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
>> > > >>
>> > > >> > Yes, it does. There is testcase that enumerates all the possible
>> > > >> protection
>> > > >> > level(authentication, integrity and privacy) and encryption
>> > > >> algorithm(none,
>> > > >> > 3des, rc4).
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >>
>> > >
>> >
>> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
>> > > >> >
>> > > >> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT
>> and
>> > > >> > hadoop-2.4.0).
>> > > >> >
>> > > >>
>> > > >> Thanks.  Can you add in support for testing with AES
>> > > >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This
>> is
>> > > only
>> > > >> available in Hadoop 2.6.0+, but I think is far more likely to be
>> used
>> > in
>> > > >> production than 3des or rc4.
>> > > >
>> > > >
>> > > >> Also, have you been following HADOOP-10768?  That is changing
>> Hadoop
>> > RPC
>> > > >> encryption negotiation to support more performant AES wrapping,
>> > similar
>> > > to
>> > > >> what is now supported in the data transfer pipeline.
>> > > >>
>> > > >
>> > > >
>> > >
>> >
>>
>
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-05 Thread Stack
On Thu, May 5, 2016 at 7:39 PM, Yu Li  wrote:

> Almost miss the party...
>
> bq. Do you think it worth to backport this feature to branch-1 and release
> it in the next 1.x release? This may introduce a compatibility issue as
> said
> in HBASE-14949 that we need HBASE-14949 to make sure that the rolling
> upgrade
> does not lose data...
> From current perf data I think the effort is worthwhile, we already started
> some work here and will run it on production after some carefully testing
> (and of course, if the perf number confirmed, but I'm optimistic somehow
> :-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will make
> it work, right? (And I guess this will also be a question when we upgrade
> from 1.x to 2.0 later?)
>
>
Or a clean shutdown and restart? Or a fresh install? I'd think backport
would be fine if you have to enable it and it has warnings and is clear on
circumstances under which there could be dataloss.

St.Ack



> btw, I'm +1 about making asyncfswal as default in 2.0 :-)
>
> Best Regards,
> Yu
>
> On 6 May 2016 at 09:49, Ted Yu  wrote:
>
> > Thanks for your effort, Duo.
> >
> > I am in favor of turning AsyncWAL as default in master branch.
> >
> > Cheers
> >
> > On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
> >
> > > Some progress.
> > >
> > > I have filed HBASE-15743 for the transparent encryption support,
> > > and HBASE-15754 for the AES encryption UT. Now both of them are
> resolved.
> > > Let's resume the discussion here.
> > >
> > > Thanks.
> > >
> > > 2016-05-03 10:09 GMT+08:00 张铎 :
> > >
> > > > Fine, will add the testcase.
> > > >
> > > > And for the RPC, we only implement a new client side DTP here and
> still
> > > > use the original RPC.
> > > >
> > > > Thanks.
> > > >
> > > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> > > >
> > > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
> > > >>
> > > >> > Yes, it does. There is testcase that enumerates all the possible
> > > >> protection
> > > >> > level(authentication, integrity and privacy) and encryption
> > > >> algorithm(none,
> > > >> > 3des, rc4).
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > >
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> > > >> >
> > > >> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> > > >> > hadoop-2.4.0).
> > > >> >
> > > >>
> > > >> Thanks.  Can you add in support for testing with AES
> > > >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This
> is
> > > only
> > > >> available in Hadoop 2.6.0+, but I think is far more likely to be
> used
> > in
> > > >> production than 3des or rc4.
> > > >
> > > >
> > > >> Also, have you been following HADOOP-10768?  That is changing Hadoop
> > RPC
> > > >> encryption negotiation to support more performant AES wrapping,
> > similar
> > > to
> > > >> what is now supported in the data transfer pipeline.
> > > >>
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-05 Thread Yu Li
Almost missed the party...

bq. Do you think it worth to backport this feature to branch-1 and release
it in the next 1.x release? This may introduce a compatibility issue as said
in HBASE-14949 that we need HBASE-14949 to make sure that the rolling upgrade
does not lose data...
From the current perf data I think the effort is worthwhile. We have already
started some work here and will run it in production after some careful testing
(and of course, only if the perf numbers are confirmed, but I'm optimistic somehow
:-P). Regarding HBASE-14949, I guess a two-step rolling upgrade will make
it work, right? (And I guess this will also be a question when we upgrade
from 1.x to 2.0 later?)

btw, I'm +1 about making asyncfswal as default in 2.0 :-)

Best Regards,
Yu

On 6 May 2016 at 09:49, Ted Yu  wrote:

> Thanks for your effort, Duo.
>
> I am in favor of turning AsyncWAL as default in master branch.
>
> Cheers
>
> On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:
>
> > Some progress.
> >
> > I have filed HBASE-15743 for the transparent encryption support,
> > and HBASE-15754 for the AES encryption UT. Now both of them are resolved.
> > Let's resume the discussion here.
> >
> > Thanks.
> >
> > 2016-05-03 10:09 GMT+08:00 张铎 :
> >
> > > Fine, will add the testcase.
> > >
> > > And for the RPC, we only implement a new client side DTP here and still
> > > use the original RPC.
> > >
> > > Thanks.
> > >
> > > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> > >
> > >> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
> > >>
> > >> > Yes, it does. There is testcase that enumerates all the possible
> > >> protection
> > >> > level(authentication, integrity and privacy) and encryption
> > >> algorithm(none,
> > >> > 3des, rc4).
> > >> >
> > >> >
> > >> >
> > >>
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> > >> >
> > >> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> > >> > hadoop-2.4.0).
> > >> >
> > >>
> > >> Thanks.  Can you add in support for testing with AES
> > >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This is
> > only
> > >> available in Hadoop 2.6.0+, but I think is far more likely to be used
> in
> > >> production than 3des or rc4.
> > >
> > >
> > >> Also, have you been following HADOOP-10768?  That is changing Hadoop
> RPC
> > >> encryption negotiation to support more performant AES wrapping,
> similar
> > to
> > >> what is now supported in the data transfer pipeline.
> > >>
> > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-05 Thread Ted Yu
Thanks for your effort, Duo.

I am in favor of making AsyncWAL the default in the master branch.

Cheers

On Thu, May 5, 2016 at 6:03 PM, 张铎  wrote:

> Some progress.
>
> I have filed HBASE-15743 for the transparent encryption support,
> and HBASE-15754 for the AES encryption UT. Now both of them are resolved.
> Let's resume the discussion here.
>
> Thanks.
>
> 2016-05-03 10:09 GMT+08:00 张铎 :
>
> > Fine, will add the testcase.
> >
> > And for the RPC, we only implement a new client side DTP here and still
> > use the original RPC.
> >
> > Thanks.
> >
> > 2016-05-03 3:20 GMT+08:00 Gary Helmling :
> >
> >> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
> >>
> >> > Yes, it does. There is testcase that enumerates all the possible
> >> protection
> >> > level(authentication, integrity and privacy) and encryption
> >> algorithm(none,
> >> > 3des, rc4).
> >> >
> >> >
> >> >
> >>
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> >> >
> >> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> >> > hadoop-2.4.0).
> >> >
> >>
> >> Thanks.  Can you add in support for testing with AES
> >> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This is
> only
> >> available in Hadoop 2.6.0+, but I think is far more likely to be used in
> >> production than 3des or rc4.
> >
> >
> >> Also, have you been following HADOOP-10768?  That is changing Hadoop RPC
> >> encryption negotiation to support more performant AES wrapping, similar
> to
> >> what is now supported in the data transfer pipeline.
> >>
> >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-05 Thread 张铎
Some progress.

I have filed HBASE-15743 for the transparent encryption support,
and HBASE-15754 for the AES encryption UT. Now both of them are resolved.
Let's resume the discussion here.

Thanks.

2016-05-03 10:09 GMT+08:00 张铎 :

> Fine, will add the testcase.
>
> And for the RPC, we only implement a new client side DTP here and still
> use the original RPC.
>
> Thanks.
>
> 2016-05-03 3:20 GMT+08:00 Gary Helmling :
>
>> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
>>
>> > Yes, it does. There is testcase that enumerates all the possible
>> protection
>> > level(authentication, integrity and privacy) and encryption
>> algorithm(none,
>> > 3des, rc4).
>> >
>> >
>> >
>> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
>> >
>> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
>> > hadoop-2.4.0).
>> >
>>
>> Thanks.  Can you add in support for testing with AES
>> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This is only
>> available in Hadoop 2.6.0+, but I think is far more likely to be used in
>> production than 3des or rc4.
>
>
>> Also, have you been following HADOOP-10768?  That is changing Hadoop RPC
>> encryption negotiation to support more performant AES wrapping, similar to
>> what is now supported in the data transfer pipeline.
>>
>
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-02 Thread 张铎
Fine, will add the testcase.

And for the RPC, we only implement a new client side DTP here and still use
the original RPC.

Thanks.

2016-05-03 3:20 GMT+08:00 Gary Helmling :

> On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:
>
> > Yes, it does. There is testcase that enumerates all the possible
> protection
> > level(authentication, integrity and privacy) and encryption
> algorithm(none,
> > 3des, rc4).
> >
> >
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> >
> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> > hadoop-2.4.0).
> >
>
> Thanks.  Can you add in support for testing with AES
> (dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This is only
> available in Hadoop 2.6.0+, but I think is far more likely to be used in
> production than 3des or rc4.


> Also, have you been following HADOOP-10768?  That is changing Hadoop RPC
> encryption negotiation to support more performant AES wrapping, similar to
> what is now supported in the data transfer pipeline.
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-02 Thread Gary Helmling
On Fri, Apr 29, 2016 at 6:24 PM 张铎  wrote:

> Yes, it does. There is testcase that enumerates all the possible protection
> level(authentication, integrity and privacy) and encryption algorithm(none,
> 3des, rc4).
>
>
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
>
> I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> hadoop-2.4.0).
>

Thanks.  Can you add in support for testing with AES
(dfs.encrypt.data.transfer.cipher.suites=AES/CTR/NoPadding)?  This is only
available in Hadoop 2.6.0+, but I think is far more likely to be used in
production than 3des or rc4.

Also, have you been following HADOOP-10768?  That is changing Hadoop RPC
encryption negotiation to support more performant AES wrapping, similar to
what is now supported in the data transfer pipeline.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-01 Thread Stack
On Sat, Apr 30, 2016 at 10:06 PM, Sean Busbey  wrote:

> On Sat, Apr 30, 2016 at 1:34 PM, Stack  wrote:
> > On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu  wrote:
> >
> >> What about support for Transparent Data Encryption feature which was
> >> introduced in Apache Hadoop 2.6.0 ?
> >>
> >>
> > Transparent: "...(of a process or interface) functioning without the user
> > being aware of its presence."
> > St.Ack
> >
> >
> >
>
> It's only transparent when the application is making use of the
> libraries provided by the project. ;)
>
> HDFS Transparent Encryption is all on the client side:
>
>
> http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
>

Thanks Sean. Makes sense. Sounds easy enough to address going by Duo's later
remarks.

There'll be other teething problems and holes that need plugging after we make
the switch, I'd imagine.

St.Ack


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-05-01 Thread 张铎
I checked the code. The HdfsFileStatus returned when creating a file
inside an encryption zone will contain a FileEncryptionInfo. DFSClient will
create a CryptoOutputStream which wraps a DFSOutputStream based on the
FileEncryptionInfo.

Let me file an issue to implement it. Just another piece of reflection code...
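
The wrapping described above can be sketched with JDK classes only. This is
an illustration of the pattern, not Hadoop's code: DemoEncryptionInfo and
wrapIfEncrypted are hypothetical names, javax.crypto's CipherOutputStream
stands in for CryptoOutputStream, and the all-zero key and IV are for
demonstration only.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

class EncryptionAwareOutput {
    /** Hypothetical stand-in for per-file encryption info; not Hadoop's FileEncryptionInfo. */
    static final class DemoEncryptionInfo {
        final byte[] key;
        final byte[] iv;

        DemoEncryptionInfo(byte[] key, byte[] iv) {
            this.key = key;
            this.iv = iv;
        }
    }

    /** Returns the raw stream for plain files, or an encrypting wrapper when info is present. */
    static OutputStream wrapIfEncrypted(OutputStream raw, DemoEncryptionInfo info) {
        if (info == null) {
            return raw; // file is not in an encryption zone: nothing to wrap
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(info.key, "AES"),
                new IvParameterSpec(info.iv));
            return new CipherOutputStream(raw, cipher);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher setup failed", e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "wal entry".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Demo-only key material; a real client would obtain it from the key provider.
        DemoEncryptionInfo info = new DemoEncryptionInfo(new byte[16], new byte[16]);
        try (OutputStream out = wrapIfEncrypted(sink, info)) {
            out.write(payload);
        }
        System.out.println("bytes written to sink: " + sink.size());
    }
}
```

The caller writes to whatever stream comes back, so the rest of the write
path does not need to know whether the file is encrypted.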

Thanks.

2016-05-01 13:06 GMT+08:00 Sean Busbey :

> On Sat, Apr 30, 2016 at 1:34 PM, Stack  wrote:
> > On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu  wrote:
> >
> >> What about support for Transparent Data Encryption feature which was
> >> introduced in Apache Hadoop 2.6.0 ?
> >>
> >>
> > Transparent: "...(of a process or interface) functioning without the user
> > being aware of its presence."
> > St.Ack
> >
> >
> >
>
> It's only transparent when the application is making use of the
> libraries provided by the project. ;)
>
> HDFS Transparent Encryption is all on the client side:
>
>
> http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-30 Thread Sean Busbey
On Sat, Apr 30, 2016 at 1:34 PM, Stack  wrote:
> On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu  wrote:
>
>> What about support for Transparent Data Encryption feature which was
>> introduced in Apache Hadoop 2.6.0 ?
>>
>>
> Transparent: "...(of a process or interface) functioning without the user
> being aware of its presence."
> St.Ack
>
>
>

It's only transparent when the application is making use of the
libraries provided by the project. ;)

HDFS Transparent Encryption is all on the client side:

http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-30 Thread Stack
On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu  wrote:

> What about support for Transparent Data Encryption feature which was
> introduced in Apache Hadoop 2.6.0 ?
>
>
Transparent: "...(of a process or interface) functioning without the user
being aware of its presence."
St.Ack



> On Fri, Apr 29, 2016 at 6:24 PM, 张铎  wrote:
>
> > Yes, it does. There is testcase that enumerates all the possible
> protection
> > level(authentication, integrity and privacy) and encryption
> algorithm(none,
> > 3des, rc4).
> >
> >
> >
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
> >
> > I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> > hadoop-2.4.0).
> >
> > Thanks.
> >
> > 2016-04-30 2:32 GMT+08:00 Gary Helmling :
> >
> > > How well has this been tested on secure clusters?  I know SASL support
> > was
> > > lacking initially, but I believe it had been added?  Does AsyncFSWAL
> > > support all the HDFS transport encryption options?
> > >
> > >
> > > On Fri, Apr 29, 2016 at 12:05 AM Stack  wrote:
> > >
> > > > I'm +1 on enabling asyncfswal as default in 2.0:
> > > >
> > > > + We'll have plenty of time to figure issues if any if we get it in
> > now,
> > > > early.
> > > > + The improvement in throughput is substantial
> > > > + There are now less moving parts
> > > > + A critical piece of our write path is much less opaque in its
> > workings
> > > > and no longer (effectively) immutable
> > > >
> > > > St.Ack
> > > >
> > > >
> > > > On Thu, Apr 28, 2016 at 11:53 PM, 张铎  wrote:
> > > >
> > > > > I‘ve done dig in HDFS and HADOOP proejcts and found that there is
> an
> > > > active
> > > > > issue HADOOP-12910 that related to asynchronous FileSystem
> > > > implementation.
> > > > >
> > > > > I have left some comments on it, maybe we could start from there.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > 2016-04-29 14:42 GMT+08:00 Stack :
> > > > >
> > > > > > On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu 
> > wrote:
> > > > > >
> > > > > > > Last comment on HDFS-916 was from 2010.
> > > > > > >
> > > > > > > Suggest making a new issue or reviving discussion on HDFS-916
> > > > > (currently
> > > > > > > assigned to Todd).
> > > > > > >
> > > > > > >
> > > > > > Duo is on it. Some mileage and confidence in the new code would
> be
> > > good
> > > > > to
> > > > > > have before going to HDFS (Getting stuff into HDFS is a PITA at
> the
> > > > best
> > > > > of
> > > > > > times... lets have a good case when we go there).
> > > > > >
> > > > > >
> > > > > > > bq. The fallback implementation is not aim to get a good
> > > performance
> > > > > > >
> > > > > > > For more than two weeks, I have been working with Azure Data
> Lake
> > > > > > > developers so that all hbase system tests pass on ADLS - there
> > were
> > > > > > subtle
> > > > > > > differences between ADLS and hdfs.
> > > > > > >
> > > > > > > If switching to AsyncWAL gives either WASB or ADLS subpar
> > > > performance,
> > > > > it
> > > > > > > would make upgrading to hbase 2.x unacceptable for their users.
> > > > > > >
> > > > > > >
> > > > > > Just use FSHLog instead of asyncfswal when up on WASB. Its just a
> > > > config
> > > > > > change.
> > > > > >
> > > > > > St.Ack
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Apr 28, 2016 at 8:39 PM, 张铎 
> > wrote:
> > > > > > >
> > > > > > > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > > > > > > >
> > > > > > > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > > > > > > >
> > > > > > > > > Is there HDFS JIRA for the above ? Can you share the
> number ?
> > > > > > > > >
> > > > > > > > I have not filed a new one but there are bunch of related
> > issues
> > > > > > already,
> > > > > > > > such as this one
> > https://issues.apache.org/jira/browse/HDFS-916
> > > > > > > >
> > > > > > > > >
> > > > > > > > > bq. Just wrap FSDataOutputStream to make it act like an
> > > > > asynchronous
> > > > > > > > output
> > > > > > > > >
> > > > > > > > > Can you be a bit more specific ?
> > > > > > > > > HBase currently works with WASB and Azure Data Lake. Does
> the
> > > > above
> > > > > > > mean
> > > > > > > > > their performance would suffer ?
> > > > > > > > >
> > > > > > > > Yes, the performance will suffer...
> > > > > > > > The fallback implementation is not aim to get a good
> > performance,
> > > > > just
> > > > > > > for
> > > > > > > > compatibility with any FileSystem implementation.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Inline comments.
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey <
> > bus...@cloudera.com
> > > >:
> > > > > > > > > >
> > > > > > > > > > > I am nervous about having default out-of-the-box new
> > HBase
> > > > > users
> > > > > > > > > reliant
> > > > > > > > > > on
> > > > > > > > > > > a bespoke HDFS client, espe

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-30 Thread Ted Yu
What about support for Transparent Data Encryption feature which was
introduced in Apache Hadoop 2.6.0 ?

On Fri, Apr 29, 2016 at 6:24 PM, 张铎  wrote:

> Yes, it does. There is testcase that enumerates all the possible protection
> level(authentication, integrity and privacy) and encryption algorithm(none,
> 3des, rc4).
>
>
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java
>
> I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
> hadoop-2.4.0).
>
> Thanks.
>
> 2016-04-30 2:32 GMT+08:00 Gary Helmling :
>
> > How well has this been tested on secure clusters?  I know SASL support
> was
> > lacking initially, but I believe it had been added?  Does AsyncFSWAL
> > support all the HDFS transport encryption options?
> >
> >
> > On Fri, Apr 29, 2016 at 12:05 AM Stack  wrote:
> >
> > > I'm +1 on enabling asyncfswal as default in 2.0:
> > >
> > > + We'll have plenty of time to figure issues if any if we get it in
> now,
> > > early.
> > > + The improvement in throughput is substantial
> > > + There are now less moving parts
> > > + A critical piece of our write path is much less opaque in its
> workings
> > > and no longer (effectively) immutable
> > >
> > > St.Ack
> > >
> > >
> > > On Thu, Apr 28, 2016 at 11:53 PM, 张铎  wrote:
> > >
> > > > I‘ve done dig in HDFS and HADOOP proejcts and found that there is an
> > > active
> > > > issue HADOOP-12910 that related to asynchronous FileSystem
> > > implementation.
> > > >
> > > > I have left some comments on it, maybe we could start from there.
> > > >
> > > > Thanks.
> > > >
> > > > 2016-04-29 14:42 GMT+08:00 Stack :
> > > >
> > > > > On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > Last comment on HDFS-916 was from 2010.
> > > > > >
> > > > > > Suggest making a new issue or reviving discussion on HDFS-916
> > > > (currently
> > > > > > assigned to Todd).
> > > > > >
> > > > > >
> > > > > Duo is on it. Some mileage and confidence in the new code would be
> > good
> > > > to
> > > > > have before going to HDFS (Getting stuff into HDFS is a PITA at the
> > > best
> > > > of
> > > > > times... lets have a good case when we go there).
> > > > >
> > > > >
> > > > > > bq. The fallback implementation is not aim to get a good
> > performance
> > > > > >
> > > > > > For more than two weeks, I have been working with Azure Data Lake
> > > > > > developers so that all hbase system tests pass on ADLS - there
> were
> > > > > subtle
> > > > > > differences between ADLS and hdfs.
> > > > > >
> > > > > > If switching to AsyncWAL gives either WASB or ADLS subpar
> > > performance,
> > > > it
> > > > > > would make upgrading to hbase 2.x unacceptable for their users.
> > > > > >
> > > > > >
> > > > > Just use FSHLog instead of asyncfswal when up on WASB. Its just a
> > > config
> > > > > change.
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > > > On Thu, Apr 28, 2016 at 8:39 PM, 张铎 
> wrote:
> > > > > >
> > > > > > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > > > > > >
> > > > > > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > > > > > >
> > > > > > > > Is there HDFS JIRA for the above ? Can you share the number ?
> > > > > > > >
> > > > > > > I have not filed a new one but there are bunch of related
> issues
> > > > > already,
> > > > > > > such as this one
> https://issues.apache.org/jira/browse/HDFS-916
> > > > > > >
> > > > > > > >
> > > > > > > > bq. Just wrap FSDataOutputStream to make it act like an
> > > > asynchronous
> > > > > > > output
> > > > > > > >
> > > > > > > > Can you be a bit more specific ?
> > > > > > > > HBase currently works with WASB and Azure Data Lake. Does the
> > > above
> > > > > > mean
> > > > > > > > their performance would suffer ?
> > > > > > > >
> > > > > > > Yes, the performance will suffer...
> > > > > > > The fallback implementation is not aim to get a good
> performance,
> > > > just
> > > > > > for
> > > > > > > compatibility with any FileSystem implementation.
> > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎 
> > > wrote:
> > > > > > > >
> > > > > > > > > Inline comments.
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey <
> bus...@cloudera.com
> > >:
> > > > > > > > >
> > > > > > > > > > I am nervous about having default out-of-the-box new
> HBase
> > > > users
> > > > > > > > reliant
> > > > > > > > > on
> > > > > > > > > > a bespoke HDFS client, especially given Hadoop's
> > > compatibility
> > > > > > > > > > promises and history. Answers for these questions would
> > make
> > > me
> > > > > > more
> > > > > > > > > > confident:
> > > > > > > > > >
> > > > > > > > > > 1) Where are we on getting the client-side changes to
> HDFS
> > > > pushed
> > > > > > > back
> > > > > > > > > > upstream?
> > > > > > > > > >
> > > > > > > > > No progress yet... Here I want to tell a good story that
> > HBase
> > > is
> > > > > > 

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-29 Thread 张铎
Yes, it does. There is a testcase that enumerates all the possible protection
levels (authentication, integrity and privacy) and encryption algorithms (none,
3des, rc4).

https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncDFSOutput.java

I have also tested it in a secure cluster(hbase-2.0.0-SNAPSHOT and
hadoop-2.4.0).
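
For readers who do not want to open the test, the enumeration amounts to a
cartesian product of the two lists. A minimal sketch, assuming nothing about
how the real test is structured (SaslTestMatrix and combinations are
hypothetical names; only the values come from the message above):

```java
import java.util.ArrayList;
import java.util.List;

class SaslTestMatrix {
    // Values named in the message above.
    static final String[] PROTECTIONS = {"authentication", "integrity", "privacy"};
    static final String[] ALGORITHMS = {"none", "3des", "rc4"};

    /** Returns every (protection, algorithm) pair, i.e. the full cartesian product. */
    static List<String[]> combinations() {
        List<String[]> params = new ArrayList<>();
        for (String protection : PROTECTIONS) {
            for (String algorithm : ALGORITHMS) {
                params.add(new String[] {protection, algorithm});
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // Each pair would drive one run of a parameterized test.
        for (String[] p : combinations()) {
            System.out.println("protection=" + p[0] + ", encryption=" + p[1]);
        }
    }
}
```

Adding another algorithm to the list grows the matrix without touching the
loop.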

Thanks.

2016-04-30 2:32 GMT+08:00 Gary Helmling :

> How well has this been tested on secure clusters?  I know SASL support was
> lacking initially, but I believe it had been added?  Does AsyncFSWAL
> support all the HDFS transport encryption options?
>
>
> On Fri, Apr 29, 2016 at 12:05 AM Stack  wrote:
>
> > I'm +1 on enabling asyncfswal as default in 2.0:
> >
> > + We'll have plenty of time to figure issues if any if we get it in now,
> > early.
> > + The improvement in throughput is substantial
> > + There are now less moving parts
> > + A critical piece of our write path is much less opaque in its workings
> > and no longer (effectively) immutable
> >
> > St.Ack
> >
> >
> > On Thu, Apr 28, 2016 at 11:53 PM, 张铎  wrote:
> >
> > > I‘ve done dig in HDFS and HADOOP proejcts and found that there is an
> > active
> > > issue HADOOP-12910 that related to asynchronous FileSystem
> > implementation.
> > >
> > > I have left some comments on it, maybe we could start from there.
> > >
> > > Thanks.
> > >
> > > 2016-04-29 14:42 GMT+08:00 Stack :
> > >
> > > > On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu  wrote:
> > > >
> > > > > Last comment on HDFS-916 was from 2010.
> > > > >
> > > > > Suggest making a new issue or reviving discussion on HDFS-916
> > > (currently
> > > > > assigned to Todd).
> > > > >
> > > > >
> > > > Duo is on it. Some mileage and confidence in the new code would be
> good
> > > to
> > > > have before going to HDFS (Getting stuff into HDFS is a PITA at the
> > best
> > > of
> > > > times... lets have a good case when we go there).
> > > >
> > > >
> > > > > bq. The fallback implementation is not aim to get a good
> performance
> > > > >
> > > > > For more than two weeks, I have been working with Azure Data Lake
> > > > > developers so that all hbase system tests pass on ADLS - there were
> > > > subtle
> > > > > differences between ADLS and hdfs.
> > > > >
> > > > > If switching to AsyncWAL gives either WASB or ADLS subpar
> > performance,
> > > it
> > > > > would make upgrading to hbase 2.x unacceptable for their users.
> > > > >
> > > > >
> > > > Just use FSHLog instead of asyncfswal when up on WASB. Its just a
> > config
> > > > change.
> > > >
> > > > St.Ack
> > > >
> > > >
> > > >
> > > > > On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
> > > > >
> > > > > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > > > > >
> > > > > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > > > > >
> > > > > > > Is there HDFS JIRA for the above ? Can you share the number ?
> > > > > > >
> > > > > > I have not filed a new one but there are bunch of related issues
> > > > already,
> > > > > > such as this one https://issues.apache.org/jira/browse/HDFS-916
> > > > > >
> > > > > > >
> > > > > > > bq. Just wrap FSDataOutputStream to make it act like an
> > > asynchronous
> > > > > > output
> > > > > > >
> > > > > > > Can you be a bit more specific ?
> > > > > > > HBase currently works with WASB and Azure Data Lake. Does the
> > above
> > > > > mean
> > > > > > > their performance would suffer ?
> > > > > > >
> > > > > > Yes, the performance will suffer...
> > > > > > The fallback implementation is not aim to get a good performance,
> > > just
> > > > > for
> > > > > > compatibility with any FileSystem implementation.
> > > > > >
> > > > > > >
> > > > > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎 
> > wrote:
> > > > > > >
> > > > > > > > Inline comments.
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey  >:
> > > > > > > >
> > > > > > > > > I am nervous about having default out-of-the-box new HBase
> > > users
> > > > > > > reliant
> > > > > > > > on
> > > > > > > > > a bespoke HDFS client, especially given Hadoop's
> > compatibility
> > > > > > > > > promises and history. Answers for these questions would
> make
> > me
> > > > > more
> > > > > > > > > confident:
> > > > > > > > >
> > > > > > > > > 1) Where are we on getting the client-side changes to HDFS
> > > pushed
> > > > > > back
> > > > > > > > > upstream?
> > > > > > > > >
> > > > > > > > No progress yet... Here I want to tell a good story that
> HBase
> > is
> > > > > > already
> > > > > > > > use it as default :)
> > > > > > > >
> > > > > > > > >
> > > > > > > > > 2) How well do we detect when our FS is not HDFS and what
> > does
> > > > > > > > > fallback look like?
> > > > > > > > >
> > > > > > > > Just wrap FSDataOutputStream to make it act like an
> > asynchronous
> > > > > > > > output(call hflush in a separated thread). The performance is
> > not
> > > > > good
> > > > > > I
> > > > > > > > think.
> > > > > > > >
> > > > > > > > >
> > > 

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-29 Thread Gary Helmling
How well has this been tested on secure clusters?  I know SASL support was
lacking initially, but I believe it had been added?  Does AsyncFSWAL
support all the HDFS transport encryption options?


On Fri, Apr 29, 2016 at 12:05 AM Stack  wrote:

> I'm +1 on enabling asyncfswal as default in 2.0:
>
> + We'll have plenty of time to figure issues if any if we get it in now,
> early.
> + The improvement in throughput is substantial
> + There are now less moving parts
> + A critical piece of our write path is much less opaque in its workings
> and no longer (effectively) immutable
>
> St.Ack
>
>
> On Thu, Apr 28, 2016 at 11:53 PM, 张铎  wrote:
>
> > I‘ve done dig in HDFS and HADOOP proejcts and found that there is an
> active
> > issue HADOOP-12910 that related to asynchronous FileSystem
> implementation.
> >
> > I have left some comments on it, maybe we could start from there.
> >
> > Thanks.
> >
> > 2016-04-29 14:42 GMT+08:00 Stack :
> >
> > > On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu  wrote:
> > >
> > > > Last comment on HDFS-916 was from 2010.
> > > >
> > > > Suggest making a new issue or reviving discussion on HDFS-916
> > (currently
> > > > assigned to Todd).
> > > >
> > > >
> > > Duo is on it. Some mileage and confidence in the new code would be good
> > to
> > > have before going to HDFS (Getting stuff into HDFS is a PITA at the
> best
> > of
> > > times... lets have a good case when we go there).
> > >
> > >
> > > > bq. The fallback implementation is not aim to get a good performance
> > > >
> > > > For more than two weeks, I have been working with Azure Data Lake
> > > > developers so that all hbase system tests pass on ADLS - there were
> > > subtle
> > > > differences between ADLS and hdfs.
> > > >
> > > > If switching to AsyncWAL gives either WASB or ADLS subpar
> performance,
> > it
> > > > would make upgrading to hbase 2.x unacceptable for their users.
> > > >
> > > >
> > > Just use FSHLog instead of asyncfswal when up on WASB. Its just a
> config
> > > change.
> > >
> > > St.Ack
> > >
> > >
> > >
> > > > On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
> > > >
> > > > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > > > >
> > > > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > > > >
> > > > > > Is there HDFS JIRA for the above ? Can you share the number ?
> > > > > >
> > > > > I have not filed a new one but there are bunch of related issues
> > > already,
> > > > > such as this one https://issues.apache.org/jira/browse/HDFS-916
> > > > >
> > > > > >
> > > > > > bq. Just wrap FSDataOutputStream to make it act like an
> > asynchronous
> > > > > output
> > > > > >
> > > > > > Can you be a bit more specific ?
> > > > > > HBase currently works with WASB and Azure Data Lake. Does the
> above
> > > > mean
> > > > > > their performance would suffer ?
> > > > > >
> > > > > Yes, the performance will suffer...
> > > > > The fallback implementation is not aim to get a good performance,
> > just
> > > > for
> > > > > compatibility with any FileSystem implementation.
> > > > >
> > > > > >
> > > > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎 
> wrote:
> > > > > >
> > > > > > > Inline comments.
> > > > > > > Thanks,
> > > > > > >
> > > > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > > > > > >
> > > > > > > > I am nervous about having default out-of-the-box new HBase
> > users
> > > > > > reliant
> > > > > > > on
> > > > > > > > a bespoke HDFS client, especially given Hadoop's
> compatibility
> > > > > > > > promises and history. Answers for these questions would make
> me
> > > > more
> > > > > > > > confident:
> > > > > > > >
> > > > > > > > 1) Where are we on getting the client-side changes to HDFS
> > pushed
> > > > > back
> > > > > > > > upstream?
> > > > > > > >
> > > > > > > No progress yet... Here I want to tell a good story that HBase
> is
> > > > > already
> > > > > > > use it as default :)
> > > > > > >
> > > > > > > >
> > > > > > > > 2) How well do we detect when our FS is not HDFS and what
> does
> > > > > > > > fallback look like?
> > > > > > > >
> > > > > > > Just wrap FSDataOutputStream to make it act like an
> asynchronous
> > > > > > > output(call hflush in a separated thread). The performance is
> not
> > > > good
> > > > > I
> > > > > > > think.
> > > > > > >
> > > > > > > >
> > > > > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > > > > supported for HBase 2.y+?
> > > > > > > >
> > > > > > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I
> > don't
> > > > > think
> > > > > > we
> > > > > > > need to change the supported versions?
> > > > > > >
> > > > > > > >
> > > > > > > > 4) How are we going to ensure our client remains compatible
> > with
> > > > > newer
> > > > > > > > Hadoop releases?
> > > > > > > >
> > > > > > > We can not ensure, HDFS always breaks HBase at a new release...
> > > > > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > > > > compatible
> > > > > > > with that version. And back to #1, I think we should make sure
> > that
>

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-29 Thread Stack
I'm +1 on enabling asyncfswal as default in 2.0:

+ We'll have plenty of time to figure out issues, if any, if we get it in now,
early.
+ The improvement in throughput is substantial
+ There are now fewer moving parts
+ A critical piece of our write path is much less opaque in its workings
and no longer (effectively) immutable

St.Ack


On Thu, Apr 28, 2016 at 11:53 PM, 张铎  wrote:

> I‘ve done dig in HDFS and HADOOP proejcts and found that there is an active
> issue HADOOP-12910 that related to asynchronous FileSystem implementation.
>
> I have left some comments on it, maybe we could start from there.
>
> Thanks.
>
> 2016-04-29 14:42 GMT+08:00 Stack :
>
> > On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu  wrote:
> >
> > > Last comment on HDFS-916 was from 2010.
> > >
> > > Suggest making a new issue or reviving discussion on HDFS-916
> (currently
> > > assigned to Todd).
> > >
> > >
> > Duo is on it. Some mileage and confidence in the new code would be good
> to
> > have before going to HDFS (Getting stuff into HDFS is a PITA at the best
> of
> > times... lets have a good case when we go there).
> >
> >
> > > bq. The fallback implementation is not aim to get a good performance
> > >
> > > For more than two weeks, I have been working with Azure Data Lake
> > > developers so that all hbase system tests pass on ADLS - there were
> > subtle
> > > differences between ADLS and hdfs.
> > >
> > > If switching to AsyncWAL gives either WASB or ADLS subpar performance,
> it
> > > would make upgrading to hbase 2.x unacceptable for their users.
> > >
> > >
> > Just use FSHLog instead of asyncfswal when up on WASB. Its just a config
> > change.
> >
> > St.Ack
> >
> >
> >
> > > On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
> > >
> > > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > > >
> > > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > > >
> > > > > Is there HDFS JIRA for the above ? Can you share the number ?
> > > > >
> > > > I have not filed a new one but there are bunch of related issues
> > already,
> > > > such as this one https://issues.apache.org/jira/browse/HDFS-916
> > > >
> > > > >
> > > > > bq. Just wrap FSDataOutputStream to make it act like an
> asynchronous
> > > > output
> > > > >
> > > > > Can you be a bit more specific ?
> > > > > HBase currently works with WASB and Azure Data Lake. Does the above
> > > mean
> > > > > their performance would suffer ?
> > > > >
> > > > Yes, the performance will suffer...
> > > > The fallback implementation is not aim to get a good performance,
> just
> > > for
> > > > compatibility with any FileSystem implementation.
> > > >
> > > > >
> > > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
> > > > >
> > > > > > Inline comments.
> > > > > > Thanks,
> > > > > >
> > > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > > > > >
> > > > > > > I am nervous about having default out-of-the-box new HBase
> users
> > > > > reliant
> > > > > > on
> > > > > > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > > > > > promises and history. Answers for these questions would make me
> > > more
> > > > > > > confident:
> > > > > > >
> > > > > > > 1) Where are we on getting the client-side changes to HDFS
> pushed
> > > > back
> > > > > > > upstream?
> > > > > > >
> > > > > > No progress yet... Here I want to tell a good story that HBase is
> > > > already
> > > > > > use it as default :)
> > > > > >
> > > > > > >
> > > > > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > > > > fallback look like?
> > > > > > >
> > > > > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > > > > output(call hflush in a separated thread). The performance is not
> > > good
> > > > I
> > > > > > think.
> > > > > >
> > > > > > >
> > > > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > > > supported for HBase 2.y+?
> > > > > > >
> > > > > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I
> don't
> > > > think
> > > > > we
> > > > > > need to change the supported versions?
> > > > > >
> > > > > > >
> > > > > > > 4) How are we going to ensure our client remains compatible
> with
> > > > newer
> > > > > > > Hadoop releases?
> > > > > > >
> > > > > > We can not ensure, HDFS always breaks HBase at a new release...
> > > > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > > > compatible
> > > > > > with that version. And back to #1, I think we should make sure
> that
> > > the
> > > > > > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can
> > > introduce a
> > > > > new
> > > > > > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> > > > > >
> > > > > > >
> > > > > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang <
> zhang...@apache.org>
> > > > > wrote:
> > > > > > > > Six month after I filed HBASE-14790...
> > > > > > > >
> > > > > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it
> is
> > > > > > > *1.4x~3.7x*
> > > > > > > > faster than FSHLog. The ITBLL result turns out that it is
>

Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread 张铎
I've done some digging in the HDFS and HADOOP projects and found that there
is an active issue, HADOOP-12910, that is related to an asynchronous
FileSystem implementation.

I have left some comments on it, maybe we could start from there.

Thanks.

2016-04-29 14:42 GMT+08:00 Stack :

> On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu  wrote:
>
> > Last comment on HDFS-916 was from 2010.
> >
> > Suggest making a new issue or reviving discussion on HDFS-916 (currently
> > assigned to Todd).
> >
> >
> Duo is on it. Some mileage and confidence in the new code would be good to
> have before going to HDFS (Getting stuff into HDFS is a PITA at the best of
> times... lets have a good case when we go there).
>
>
> > bq. The fallback implementation is not aim to get a good performance
> >
> > For more than two weeks, I have been working with Azure Data Lake
> > developers so that all hbase system tests pass on ADLS - there were
> subtle
> > differences between ADLS and hdfs.
> >
> > If switching to AsyncWAL gives either WASB or ADLS subpar performance, it
> > would make upgrading to hbase 2.x unacceptable for their users.
> >
> >
> Just use FSHLog instead of asyncfswal when up on WASB. Its just a config
> change.
>
> St.Ack
>
>
>
> > On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
> >
> > > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> > >
> > > > bq. AsyncFSOutput will be in HDFS-3.0
> > > >
> > > > Is there HDFS JIRA for the above ? Can you share the number ?
> > > >
> > > I have not filed a new one but there are bunch of related issues
> already,
> > > such as this one https://issues.apache.org/jira/browse/HDFS-916
> > >
> > > >
> > > > bq. Just wrap FSDataOutputStream to make it act like an asynchronous
> > > output
> > > >
> > > > Can you be a bit more specific ?
> > > > HBase currently works with WASB and Azure Data Lake. Does the above
> > mean
> > > > their performance would suffer ?
> > > >
> > > Yes, the performance will suffer...
> > > The fallback implementation is not aim to get a good performance, just
> > for
> > > compatibility with any FileSystem implementation.
> > >
> > > >
> > > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
> > > >
> > > > > Inline comments.
> > > > > Thanks,
> > > > >
> > > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > > > >
> > > > > > I am nervous about having default out-of-the-box new HBase users
> > > > reliant
> > > > > on
> > > > > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > > > > promises and history. Answers for these questions would make me
> > more
> > > > > > confident:
> > > > > >
> > > > > > 1) Where are we on getting the client-side changes to HDFS pushed
> > > back
> > > > > > upstream?
> > > > > >
> > > > > No progress yet... Here I want to tell a good story that HBase is
> > > already
> > > > > use it as default :)
> > > > >
> > > > > >
> > > > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > > > fallback look like?
> > > > > >
> > > > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > > > output(call hflush in a separated thread). The performance is not
> > good
> > > I
> > > > > think.
> > > > >
> > > > > >
> > > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > > supported for HBase 2.y+?
> > > > > >
> > > > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't
> > > think
> > > > we
> > > > > need to change the supported versions?
> > > > >
> > > > > >
> > > > > > 4) How are we going to ensure our client remains compatible with
> > > newer
> > > > > > Hadoop releases?
> > > > > >
> > > > > We can not ensure, HDFS always breaks HBase at a new release...
> > > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > > compatible
> > > > > with that version. And back to #1, I think we should make sure that
> > the
> > > > > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can
> > introduce a
> > > > new
> > > > > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> > > > >
> > > > > >
> > > > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> > > > wrote:
> > > > > > > Six month after I filed HBASE-14790...
> > > > > > >
> > > > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > > > *1.4x~3.7x*
> > > > > > > faster than FSHLog. The ITBLL result turns out that it is *not
> > bad*
> > > > > than
> > > > > > > FSHLog(the master branch is not that stable itself...).
> > > > > > >
> > > > > > > More details can be found on HBASE-15536.
> > > > > > >
> > > > > > > So here we propose to change the default WAL from FSHLog to
> > > > AsyncFSWAL.
> > > > > > > Suggestions are welcomed.
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > busbey
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Stack
On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu  wrote:

> Last comment on HDFS-916 was from 2010.
>
> Suggest making a new issue or reviving discussion on HDFS-916 (currently
> assigned to Todd).
>
>
Duo is on it. Some mileage and confidence in the new code would be good to
have before going to HDFS (getting stuff into HDFS is a PITA at the best of
times... let's have a good case when we go there).


> bq. The fallback implementation is not aim to get a good performance
>
> For more than two weeks, I have been working with Azure Data Lake
> developers so that all hbase system tests pass on ADLS - there were subtle
> differences between ADLS and hdfs.
>
> If switching to AsyncWAL gives either WASB or ADLS subpar performance, it
> would make upgrading to hbase 2.x unacceptable for their users.
>
>
Just use FSHLog instead of asyncfswal when up on WASB. It's just a config
change.
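
A minimal sketch of that choice, assuming the WAL implementation is selected
through the `hbase.wal.provider` key with the values `asyncfs` (AsyncFSWAL)
and `filesystem` (FSHLog). Both the key and the values are assumptions here
and should be checked against the WALFactory in the build you run. The class
and method names are hypothetical, and a plain Map stands in for the HBase
Configuration so the example needs nothing beyond the JDK.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: picks a WAL provider from the root dir's scheme. */
final class WalProviderChoice {
  // Assumed config key and values; verify against your HBase version.
  static final String WAL_PROVIDER_KEY = "hbase.wal.provider";
  static final String ASYNC_FS = "asyncfs";       // AsyncFSWAL
  static final String FILESYSTEM = "filesystem";  // FSHLog

  /** Use AsyncFSWAL only on HDFS; anything else (WASB, ADLS, ...) gets FSHLog. */
  static String providerFor(URI rootDir) {
    return "hdfs".equalsIgnoreCase(rootDir.getScheme()) ? ASYNC_FS : FILESYSTEM;
  }

  /** The single override an operator would put in the site config. */
  static Map<String, String> overrides(URI rootDir) {
    Map<String, String> conf = new HashMap<>();
    conf.put(WAL_PROVIDER_KEY, providerFor(rootDir));
    return conf;
  }
}
```

In a real deploy this is one property in hbase-site.xml rather than code; the
point is only that the non-HDFS case is handled by a single override.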

St.Ack



> On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
>
> > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> >
> > > bq. AsyncFSOutput will be in HDFS-3.0
> > >
> > > Is there HDFS JIRA for the above ? Can you share the number ?
> > >
> > I have not filed a new one but there are bunch of related issues already,
> > such as this one https://issues.apache.org/jira/browse/HDFS-916
> >
> > >
> > > bq. Just wrap FSDataOutputStream to make it act like an asynchronous
> > output
> > >
> > > Can you be a bit more specific ?
> > > HBase currently works with WASB and Azure Data Lake. Does the above
> mean
> > > their performance would suffer ?
> > >
> > Yes, the performance will suffer...
> > The fallback implementation is not aim to get a good performance, just
> for
> > compatibility with any FileSystem implementation.
> >
> > >
> > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
> > >
> > > > Inline comments.
> > > > Thanks,
> > > >
> > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > > >
> > > > > I am nervous about having default out-of-the-box new HBase users
> > > reliant
> > > > on
> > > > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > > > promises and history. Answers for these questions would make me
> more
> > > > > confident:
> > > > >
> > > > > 1) Where are we on getting the client-side changes to HDFS pushed
> > back
> > > > > upstream?
> > > > >
> > > > No progress yet... Here I want to tell a good story that HBase is
> > already
> > > > use it as default :)
> > > >
> > > > >
> > > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > > fallback look like?
> > > > >
> > > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > > output(call hflush in a separated thread). The performance is not
> good
> > I
> > > > think.
> > > >
> > > > >
> > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > supported for HBase 2.y+?
> > > > >
> > > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't
> > think
> > > we
> > > > need to change the supported versions?
> > > >
> > > > >
> > > > > 4) How are we going to ensure our client remains compatible with
> > newer
> > > > > Hadoop releases?
> > > > >
> > > > We can not ensure, HDFS always breaks HBase at a new release...
> > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > compatible
> > > > with that version. And back to #1, I think we should make sure that
> the
> > > > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can
> introduce a
> > > new
> > > > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> > > >
> > > > >
> > > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> > > wrote:
> > > > > > Six month after I filed HBASE-14790...
> > > > > >
> > > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > > *1.4x~3.7x*
> > > > > > faster than FSHLog. The ITBLL result turns out that it is *not
> bad*
> > > > than
> > > > > > FSHLog(the master branch is not that stable itself...).
> > > > > >
> > > > > > More details can be found on HBASE-15536.
> > > > > >
> > > > > > So here we propose to change the default WAL from FSHLog to
> > > AsyncFSWAL.
> > > > > > Suggestions are welcomed.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > busbey
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Stack
On Thu, Apr 28, 2016 at 8:34 PM, Heng Chen  wrote:

> The performance is quite great,  but i think maybe we should collect some
> experience on real production cluster before we make it as default.
>

Yeah. It would be nice to have a production deploy before we make the
switch, but in the absence of that, let's get it enabled by default early in
the master/2.0 branch.

I've done testing on a cluster using ITBLL, trying to break it. I've found
that the asyncfs WAL is no worse than our FSHLog; it is able to handle at
least the same scale. It's hard to test master in its current state, but I
was able to do runs of billions over many hours of chaos on a cluster of 9
nodes (see the issue for detail).

In fact, asyncfswal can only be better. It is a massive simplification of
the disruptor+5 syncing threads+opaque dfsclient internal mess we currently
run with. If there is an issue, we are more likely to be able to figure it
out with asyncfswal in place.

St.Ack



> 2016-04-29 11:30 GMT+08:00 张铎 :
>
> > Inline comments.
> > Thanks,
> >
> > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> >
> > > I am nervous about having default out-of-the-box new HBase users
> reliant
> > on
> > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > promises and history. Answers for these questions would make me more
> > > confident:
> > >
> > > 1) Where are we on getting the client-side changes to HDFS pushed back
> > > upstream?
> > >
> > No progress yet... Here I want to tell a good story that HBase is already
> > use it as default :)
> >
> > >
> > > 2) How well do we detect when our FS is not HDFS and what does
> > > fallback look like?
> > >
> > Just wrap FSDataOutputStream to make it act like an asynchronous
> > output(call hflush in a separated thread). The performance is not good I
> > think.
> >
> > >
> > > 3) Will this mean altering the versions of Hadoop we label as
> > > supported for HBase 2.y+?
> > >
> > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't think
> we
> > need to change the supported versions?
> >
> > >
> > > 4) How are we going to ensure our client remains compatible with newer
> > > Hadoop releases?
> > >
> > We can not ensure, HDFS always breaks HBase at a new release...
> > I need to test AsyncFSWAL on every new 2.x release and make it compatible
> > with that version. And back to #1, I think we should make sure that the
> > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can introduce a
> new
> > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> >
> > >
> > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> wrote:
> > > > Six month after I filed HBASE-14790...
> > > >
> > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > *1.4x~3.7x*
> > > > faster than FSHLog. The ITBLL result turns out that it is *not bad*
> > than
> > > > FSHLog(the master branch is not that stable itself...).
> > > >
> > > > More details can be found on HBASE-15536.
> > > >
> > > > So here we propose to change the default WAL from FSHLog to
> AsyncFSWAL.
> > > > Suggestions are welcomed.
> > > >
> > > > Thanks.
> > >
> > >
> > >
> > > --
> > > busbey
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread 张铎
2016-04-29 11:47 GMT+08:00 Ted Yu :

> Last comment on HDFS-916 was from 2010.
>
> Suggest making a new issue or reviving discussion on HDFS-916 (currently
> assigned to Todd).
>
> bq. The fallback implementation is not aim to get a good performance
>
> For more than two weeks, I have been working with Azure Data Lake
> developers so that all hbase system tests pass on ADLS - there were subtle
> differences between ADLS and hdfs.
>
> If switching to AsyncWAL gives either WASB or ADLS subpar performance, it
> would make upgrading to hbase 2.x unacceptable for their users.
>
You can still use FSHLog, it is not removed...
But yes, this is a good point on how we choose default configs in HBase.
A config that performs normally for every case, or a config that performs
much better under the main scenario but worse for other scenarios...

>
> On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:
>
> > 2016-04-29 11:35 GMT+08:00 Ted Yu :
> >
> > > bq. AsyncFSOutput will be in HDFS-3.0
> > >
> > > Is there HDFS JIRA for the above ? Can you share the number ?
> > >
> > I have not filed a new one but there are bunch of related issues already,
> > such as this one https://issues.apache.org/jira/browse/HDFS-916
> >
> > >
> > > bq. Just wrap FSDataOutputStream to make it act like an asynchronous
> > output
> > >
> > > Can you be a bit more specific ?
> > > HBase currently works with WASB and Azure Data Lake. Does the above
> mean
> > > their performance would suffer ?
> > >
> > Yes, the performance will suffer...
> > The fallback implementation is not aim to get a good performance, just
> for
> > compatibility with any FileSystem implementation.
> >
> > >
> > > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
> > >
> > > > Inline comments.
> > > > Thanks,
> > > >
> > > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > > >
> > > > > I am nervous about having default out-of-the-box new HBase users
> > > reliant
> > > > on
> > > > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > > > promises and history. Answers for these questions would make me
> more
> > > > > confident:
> > > > >
> > > > > 1) Where are we on getting the client-side changes to HDFS pushed
> > back
> > > > > upstream?
> > > > >
> > > > No progress yet... Here I want to tell a good story that HBase is
> > already
> > > > use it as default :)
> > > >
> > > > >
> > > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > > fallback look like?
> > > > >
> > > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > > output(call hflush in a separated thread). The performance is not
> good
> > I
> > > > think.
> > > >
> > > > >
> > > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > > supported for HBase 2.y+?
> > > > >
> > > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't
> > think
> > > we
> > > > need to change the supported versions?
> > > >
> > > > >
> > > > > 4) How are we going to ensure our client remains compatible with
> > newer
> > > > > Hadoop releases?
> > > > >
> > > > We can not ensure, HDFS always breaks HBase at a new release...
> > > > I need to test AsyncFSWAL on every new 2.x release and make it
> > compatible
> > > > with that version. And back to #1, I think we should make sure that
> the
> > > > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can
> introduce a
> > > new
> > > > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> > > >
> > > > >
> > > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> > > wrote:
> > > > > > Six month after I filed HBASE-14790...
> > > > > >
> > > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > > *1.4x~3.7x*
> > > > > > faster than FSHLog. The ITBLL result turns out that it is *not
> bad*
> > > > than
> > > > > > FSHLog(the master branch is not that stable itself...).
> > > > > >
> > > > > > More details can be found on HBASE-15536.
> > > > > >
> > > > > > So here we propose to change the default WAL from FSHLog to
> > > AsyncFSWAL.
> > > > > > Suggestions are welcomed.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > busbey
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Ted Yu
Last comment on HDFS-916 was from 2010.

Suggest making a new issue or reviving discussion on HDFS-916 (currently
assigned to Todd).

bq. The fallback implementation is not aim to get a good performance

For more than two weeks, I have been working with Azure Data Lake
developers so that all hbase system tests pass on ADLS - there were subtle
differences between ADLS and hdfs.

If switching to AsyncWAL gives either WASB or ADLS subpar performance, it
would make upgrading to hbase 2.x unacceptable for their users.

On Thu, Apr 28, 2016 at 8:39 PM, 张铎  wrote:

> 2016-04-29 11:35 GMT+08:00 Ted Yu :
>
> > bq. AsyncFSOutput will be in HDFS-3.0
> >
> > Is there HDFS JIRA for the above ? Can you share the number ?
> >
> I have not filed a new one but there are bunch of related issues already,
> such as this one https://issues.apache.org/jira/browse/HDFS-916
>
> >
> > bq. Just wrap FSDataOutputStream to make it act like an asynchronous
> output
> >
> > Can you be a bit more specific ?
> > HBase currently works with WASB and Azure Data Lake. Does the above mean
> > their performance would suffer ?
> >
> Yes, the performance will suffer...
> The fallback implementation is not aim to get a good performance, just for
> compatibility with any FileSystem implementation.
>
> >
> > On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
> >
> > > Inline comments.
> > > Thanks,
> > >
> > > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> > >
> > > > I am nervous about having default out-of-the-box new HBase users
> > reliant
> > > on
> > > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > > promises and history. Answers for these questions would make me more
> > > > confident:
> > > >
> > > > 1) Where are we on getting the client-side changes to HDFS pushed
> back
> > > > upstream?
> > > >
> > > No progress yet... Here I want to tell a good story that HBase is
> already
> > > use it as default :)
> > >
> > > >
> > > > 2) How well do we detect when our FS is not HDFS and what does
> > > > fallback look like?
> > > >
> > > Just wrap FSDataOutputStream to make it act like an asynchronous
> > > output(call hflush in a separated thread). The performance is not good
> I
> > > think.
> > >
> > > >
> > > > 3) Will this mean altering the versions of Hadoop we label as
> > > > supported for HBase 2.y+?
> > > >
> > > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't
> think
> > we
> > > need to change the supported versions?
> > >
> > > >
> > > > 4) How are we going to ensure our client remains compatible with
> newer
> > > > Hadoop releases?
> > > >
> > > We can not ensure, HDFS always breaks HBase at a new release...
> > > I need to test AsyncFSWAL on every new 2.x release and make it
> compatible
> > > with that version. And back to #1, I think we should make sure that the
> > > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can introduce a
> > new
> > > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> > >
> > > >
> > > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> > wrote:
> > > > > Six month after I filed HBASE-14790...
> > > > >
> > > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > *1.4x~3.7x*
> > > > > faster than FSHLog. The ITBLL result turns out that it is *not bad*
> > > than
> > > > > FSHLog(the master branch is not that stable itself...).
> > > > >
> > > > > More details can be found on HBASE-15536.
> > > > >
> > > > > So here we propose to change the default WAL from FSHLog to
> > AsyncFSWAL.
> > > > > Suggestions are welcomed.
> > > > >
> > > > > Thanks.
> > > >
> > > >
> > > >
> > > > --
> > > > busbey
> > > >
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread 张铎
But 2.0 is not released yet...

Do you think it is worth backporting this feature to branch-1 and releasing
it in the next 1.x release? This may introduce a compatibility issue: as
discussed in HBASE-14949, we need HBASE-14949 to make sure that a rolling
upgrade does not lose data...

2016-04-29 11:34 GMT+08:00 Heng Chen :

> The performance is quite great,  but i think maybe we should collect some
> experience on real production cluster before we make it as default.
>
> 2016-04-29 11:30 GMT+08:00 张铎 :
>
> > Inline comments.
> > Thanks,
> >
> > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> >
> > > I am nervous about having default out-of-the-box new HBase users
> reliant
> > on
> > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > promises and history. Answers for these questions would make me more
> > > confident:
> > >
> > > 1) Where are we on getting the client-side changes to HDFS pushed back
> > > upstream?
> > >
> > No progress yet... Here I want to tell a good story that HBase is already
> > use it as default :)
> >
> > >
> > > 2) How well do we detect when our FS is not HDFS and what does
> > > fallback look like?
> > >
> > Just wrap FSDataOutputStream to make it act like an asynchronous
> > output(call hflush in a separated thread). The performance is not good I
> > think.
> >
> > >
> > > 3) Will this mean altering the versions of Hadoop we label as
> > > supported for HBase 2.y+?
> > >
> > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't think
> we
> > need to change the supported versions?
> >
> > >
> > > 4) How are we going to ensure our client remains compatible with newer
> > > Hadoop releases?
> > >
> > We can not ensure, HDFS always breaks HBase at a new release...
> > I need to test AsyncFSWAL on every new 2.x release and make it compatible
> > with that version. And back to #1, I think we should make sure that the
> > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can introduce a
> new
> > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> >
> > >
> > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> wrote:
> > > > Six month after I filed HBASE-14790...
> > > >
> > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > *1.4x~3.7x*
> > > > faster than FSHLog. The ITBLL result turns out that it is *not bad*
> > than
> > > > FSHLog(the master branch is not that stable itself...).
> > > >
> > > > More details can be found on HBASE-15536.
> > > >
> > > > So here we propose to change the default WAL from FSHLog to
> AsyncFSWAL.
> > > > Suggestions are welcomed.
> > > >
> > > > Thanks.
> > >
> > >
> > >
> > > --
> > > busbey
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread 张铎
2016-04-29 11:35 GMT+08:00 Ted Yu :

> bq. AsyncFSOutput will be in HDFS-3.0
>
> Is there HDFS JIRA for the above ? Can you share the number ?
>
I have not filed a new one, but there are a bunch of related issues already,
such as this one: https://issues.apache.org/jira/browse/HDFS-916

>
> bq. Just wrap FSDataOutputStream to make it act like an asynchronous output
>
> Can you be a bit more specific ?
> HBase currently works with WASB and Azure Data Lake. Does the above mean
> their performance would suffer ?
>
Yes, the performance will suffer...
The fallback implementation does not aim for good performance; it is just
for compatibility with any FileSystem implementation.
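
A rough sketch of what such a wrapper could look like, using only the JDK.
This is not the HBase code: the class name is hypothetical, a plain
java.io.OutputStream stands in for FSDataOutputStream, and flush() stands in
for hflush(). Writes are buffered in memory, and each flush hands the pending
bytes to a single background thread, so the caller gets a future back instead
of blocking.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical fallback: makes a blocking stream look asynchronous. */
final class WrappedAsyncOutput implements AutoCloseable {
  private final OutputStream out;
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  // One daemon thread, so flushes reach the stream in submission order.
  private final ExecutorService flusher = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "wrapped-async-output-flusher");
    t.setDaemon(true);
    return t;
  });
  private long flushedLength = 0; // touched only on the flusher thread

  WrappedAsyncOutput(OutputStream out) {
    this.out = out;
  }

  /** Buffers in memory; nothing reaches the stream until flush(). */
  synchronized void write(byte[] data) {
    buffer.write(data, 0, data.length);
  }

  /** Queues the pending bytes; the future yields the total length flushed. */
  synchronized CompletableFuture<Long> flush() {
    byte[] pending = buffer.toByteArray();
    buffer.reset();
    CompletableFuture<Long> done = new CompletableFuture<>();
    flusher.execute(() -> {
      try {
        out.write(pending);
        out.flush(); // stand-in for FSDataOutputStream.hflush()
        flushedLength += pending.length;
        done.complete(flushedLength);
      } catch (IOException e) {
        done.completeExceptionally(e);
      }
    });
    return done;
  }

  @Override
  public void close() throws IOException {
    flusher.shutdown(); // flushes that are already queued still run
    try {
      flusher.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    out.close();
  }
}
```

Every flush still pays for a full blocking call on the underlying stream, just
on another thread, which is why a wrapper like this buys compatibility and not
throughput.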

>
> On Thu, Apr 28, 2016 at 8:30 PM, 张铎  wrote:
>
> > Inline comments.
> > Thanks,
> >
> > 2016-04-29 10:57 GMT+08:00 Sean Busbey :
> >
> > > I am nervous about having default out-of-the-box new HBase users
> reliant
> > on
> > > a bespoke HDFS client, especially given Hadoop's compatibility
> > > promises and history. Answers for these questions would make me more
> > > confident:
> > >
> > > 1) Where are we on getting the client-side changes to HDFS pushed back
> > > upstream?
> > >
> > No progress yet... Here I want to tell a good story that HBase is already
> > use it as default :)
> >
> > >
> > > 2) How well do we detect when our FS is not HDFS and what does
> > > fallback look like?
> > >
> > Just wrap FSDataOutputStream to make it act like an asynchronous
> > output(call hflush in a separated thread). The performance is not good I
> > think.
> >
> > >
> > > 3) Will this mean altering the versions of Hadoop we label as
> > > supported for HBase 2.y+?
> > >
> > I have tested with hadoop versions from 2.4.x to 2.7.x, so I don't think
> we
> > need to change the supported versions?
> >
> > >
> > > 4) How are we going to ensure our client remains compatible with newer
> > > Hadoop releases?
> > >
> > We can not ensure, HDFS always breaks HBase at a new release...
> > I need to test AsyncFSWAL on every new 2.x release and make it compatible
> > with that version. And back to #1, I think we should make sure that the
> > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can introduce a
> new
> > 'AsyncFSWAL' that use the AsyncFSOutput in HDFS.
> >
> > >
> > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang 
> wrote:
> > > > Six month after I filed HBASE-14790...
> > > >
> > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > *1.4x~3.7x*
> > > > faster than FSHLog. The ITBLL result turns out that it is *not bad*
> > than
> > > > FSHLog(the master branch is not that stable itself...).
> > > >
> > > > More details can be found on HBASE-15536.
> > > >
> > > > So here we propose to change the default WAL from FSHLog to
> AsyncFSWAL.
> > > > Suggestions are welcomed.
> > > >
> > > > Thanks.
> > >
> > >
> > >
> > > --
> > > busbey
> > >
> >
>


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Ted Yu
+1 to what Heng said.

On Thu, Apr 28, 2016 at 8:34 PM, Heng Chen  wrote:

> The performance is quite great,  but i think maybe we should collect some
> experience on real production cluster before we make it as default.


Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Ted Yu
bq. AsyncFSOutput will be in HDFS-3.0

Is there an HDFS JIRA for the above? Can you share the number?

bq. Just wrap FSDataOutputStream to make it act like an asynchronous output

Can you be a bit more specific?
HBase currently works with WASB and Azure Data Lake. Does the above mean
their performance would suffer?



Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Heng Chen
The performance is quite great, but I think maybe we should collect some
experience on a real production cluster before we make it the default.



Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread 张铎
Inline comments.
Thanks,

2016-04-29 10:57 GMT+08:00 Sean Busbey :

> I am nervous about having default out-of-the-box new HBase users reliant on
> a bespoke HDFS client, especially given Hadoop's compatibility
> promises and history. Answers for these questions would make me more
> confident:
>
> 1) Where are we on getting the client-side changes to HDFS pushed back
> upstream?
>
No progress yet... Here I want to be able to tell the good story that HBase
already uses it as the default :)

>
> 2) How well do we detect when our FS is not HDFS and what does
> fallback look like?
>
Just wrap FSDataOutputStream to make it act like an asynchronous
output (call hflush in a separate thread). I don't think the performance
is good.
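
A minimal sketch of that fallback idea, under stated assumptions: the class
name WrappedAsyncOutput is hypothetical, a plain java.io.OutputStream stands
in for FSDataOutputStream, and OutputStream.flush() stands in for hflush().
It is not the code in HBase; it only illustrates buffering writes in the
caller and pushing the flush onto a separate thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper: makes a blocking stream look like an asynchronous output. */
final class WrappedAsyncOutput implements AutoCloseable {
  private final OutputStream out;  // stand-in for FSDataOutputStream
  // A single thread keeps flushes in submission order.
  private final ExecutorService flusher = Executors.newSingleThreadExecutor();
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private long flushedLength = 0;  // only touched by the flusher thread

  WrappedAsyncOutput(OutputStream out) {
    this.out = out;
  }

  /** Buffers data in memory; nothing reaches the underlying stream yet. */
  synchronized void write(byte[] data) {
    buffer.write(data, 0, data.length);
  }

  /**
   * Hands the buffered bytes to the flusher thread and returns immediately.
   * The future completes with the total number of bytes flushed so far.
   */
  synchronized CompletableFuture<Long> flush() {
    byte[] pending = buffer.toByteArray();
    buffer.reset();
    CompletableFuture<Long> result = new CompletableFuture<>();
    flusher.execute(() -> {
      try {
        out.write(pending);
        out.flush();  // with a real FSDataOutputStream this would be hflush()
        flushedLength += pending.length;
        result.complete(flushedLength);
      } catch (IOException e) {
        result.completeExceptionally(e);
      }
    });
    return result;
  }

  /** Waits for queued flushes to finish, then closes the underlying stream. */
  @Override
  public void close() {
    flusher.shutdown();
    try {
      flusher.awaitTermination(1, TimeUnit.MINUTES);
      out.close();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Every flush here still costs one blocking call on the flusher thread, which
is consistent with the expectation that this path would be slower than a
natively asynchronous output.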

>
> 3) Will this mean altering the versions of Hadoop we label as
> supported for HBase 2.y+?
>
I have tested with Hadoop versions from 2.4.x to 2.7.x, so I don't think we
need to change the supported versions.

>
> 4) How are we going to ensure our client remains compatible with newer
> Hadoop releases?
>
We cannot guarantee that; HDFS always breaks HBase in a new release...
I need to test AsyncFSWAL on every new 2.x release and make it compatible
with that version. And back to #1, I think we should make sure that
AsyncFSOutput will be in HDFS-3.0. Then in HBase-3.0, we can introduce a new
'AsyncFSWAL' that uses the AsyncFSOutput in HDFS.



Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Sean Busbey
I am nervous about having default out-of-the-box new HBase users reliant on
a bespoke HDFS client, especially given Hadoop's compatibility
promises and history. Answers for these questions would make me more
confident:

1) Where are we on getting the client-side changes to HDFS pushed back upstream?

2) How well do we detect when our FS is not HDFS and what does
fallback look like?

3) Will this mean altering the versions of Hadoop we label as
supported for HBase 2.y+?

4) How are we going to ensure our client remains compatible with newer
Hadoop releases?



-- 
busbey


[DISCUSS] Make AsyncFSWAL the default WAL in 2.0

2016-04-28 Thread Duo Zhang
Six months after I filed HBASE-14790...

AsyncFSWAL is now ready. The WALPE results show that it is *1.4x~3.7x*
faster than FSHLog. The ITBLL results show that it is *no worse* than
FSHLog (the master branch is not that stable itself...).

More details can be found in HBASE-15536.

So we propose changing the default WAL from FSHLog to AsyncFSWAL.
Suggestions are welcome.

Thanks.