I think at that point I will start a new project called AsyncDFSClient, which will implement the whole client-side logic of HDFS without using reflection :)
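To make the reflection concern in the quoted thread below concrete, here is a rough sketch of the kind of private-field lookup being debated. The class name, field name and helper are placeholders I made up for illustration, not the actual HDFS internals HBase touches:

    import java.lang.reflect.Field;

    public final class PrivateFieldAccess {

      // Stand-in for an HDFS-internal class with a private field; the real
      // classes and fields involved are not reproduced here.
      static class InternalStream {
        private long bytesCurBlock = 42L;
      }

      // Reads a private field by name via reflection. If a Hadoop point
      // release renames or removes the field, this only fails at runtime
      // with a NoSuchFieldException, which is the breakage risk raised below.
      static Object readPrivateField(Object target, String fieldName) throws Exception {
        Field f = target.getClass().getDeclaredField(fieldName);
        f.setAccessible(true);
        return f.get(target);
      }

      public static void main(String[] args) throws Exception {
        System.out.println(readPrivateField(new InternalStream(), "bytesCurBlock"));
      }
    }

The compiler offers no protection for lookups like this; a standalone client such as AsyncDFSClient would replace them with code we maintain ourselves.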
2016-05-12 10:27 GMT+08:00 Andrew Purtell <[email protected]>:

> If Hadoop refuses the changes before we release, we can change the default back.
>
> On May 11, 2016, at 6:50 PM, Gary Helmling <[email protected]> wrote:
>
> >> I was trying to avoid the below oft-repeated pattern, at least for the case of critical developments:
> >>
> >> + New feature arrives after much work by developer, reviewers and testers, accompanied by fanfare (blog, talks).
> >> + Developers and reviewers move on after getting it committed, or it gets hacked into a deploy so it works in a frankenstein form.
> >> + It sits in our code base across one or more releases marked as optional, 'experimental'.
> >> + The 'experimental' blemish discourages its exercise by users.
> >> + The feature lags, rots.
> >> + Or, the odd time, we go ahead and enable it as default in spite of the fact it was never tried when experimental.
> >>
> >> Distributed Log Replay sat in hbase across a few major versions. Only when the threat of our making an actual release with it on by default did it get serious attention, where it was found flawed and is now being actively purged. This was after it made it past reviews, multiple attempts at testing at scale, and so on; i.e. we'd done it all by the book. The time in an 'experimental' state added nothing.
>
> > Those are all valid concerns as well. It's certainly a pattern that we've seen repeated. That's also a broader concern I have: the farther we push out 2.0, the less exercised master is.
> >
> > I don't really know how best to balance this with concerns about user stability. Enabling by default in master would certainly be a forcing function and would help it get more testing before release. I hear that argument. But I'm worried about the impact after release, where something as simple as a bug-fix point release upgrade of Hadoop could result in runtime breakage of an HBase install. Will this happen in practice? I don't know. It seems unlikely that the private variable names being used, for example, would change in a point release. But we're violating the abstraction that Hadoop provides us, which guarantees such breakage won't occur.
>
> >> Yes. 2.0 is a bit out there so we have some time to iron out issues is the thought. Yes, it could push out delivery of 2.0.
>
> > Having this on by default in an unreleased master doesn't actually worry me that much. It's just the question of what happens when we do release. At that point, this discussion will be ancient history and I don't think we'll give any renewed consideration to what the impact of this change might be. Ideally it would be great to see this work in HDFS by that point, and for that HDFS version this becomes a non-issue.
>
> >> I think the discussion here has been helpful. Holes have been found (and plugged), the risk involved has gotten a good airing out here on dev, and in spite of the back and forth, one of our experts in good standing is still against it being on by default.
> >>
> >> If you are not down w/ the arguments, I'd be fine not making it the default.
> >> St.Ack
>
> > I don't think it's right to block this by myself, since I'm clearly in the minority. Since others clearly support this change, have at it.
> >
> > But let me pose an alternate question: what if HDFS flat out refuses to adopt this change? What are our options then with this already shipping as a default? Would we continue to endure breakage due to the use of HDFS private internals? Do we switch the default back? Do we do something else?
> >
> > Thanks for the discussion.
