On Wed, May 11, 2016 at 10:28 PM, Andrew Purtell <[email protected]> wrote:
> All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)
> *palm-all-the-faces*
> On May 11, 2016, at 9:46 PM, Stack <[email protected]> wrote:
>
> >> On Wed, May 11, 2016 at 7:53 PM, 张铎 <[email protected]> wrote:
> >>
> >> I think at that time I will start a new project called AsyncDFSClient
> >> which will implement the whole client side logic of HDFS without using
> >> reflection :)
> >
> > Haven't I seen this movie before? (smile)
> > St.Ack
> >
> >> 2016-05-12 10:27 GMT+08:00 Andrew Purtell <[email protected]>:
> >>
> >>> If Hadoop refuses the changes before we release, we can change the
> >>> default back.
> >>>
> >>> On May 11, 2016, at 6:50 PM, Gary Helmling <[email protected]> wrote:
> >>>
> >>>>> I was trying to avoid the below oft-repeated pattern at least for
> >>>>> the case of critical developments:
> >>>>>
> >>>>> + New feature arrives after much work by developer, reviewers and
> >>>>>   testers accompanied by fanfare (blog, talks).
> >>>>> + Developers and reviewers move on after getting it committed or it
> >>>>>   gets hacked into a deploy so it works in a frankenstein form
> >>>>> + It sits in our code base across one or more releases marked as
> >>>>>   optional, 'experimental'
> >>>>> + The 'experimental' blemish discourages its exercise by users
> >>>>> + The feature lags, rots
> >>>>> + Or, the odd time, we go ahead and enable it as default in spite of
> >>>>>   the fact it was never tried when experimental.
> >>>>>
> >>>>> Distributed Log Replay sat in hbase across a few major versions.
> >>>>> Only when the threat of our making an actual release with it on by
> >>>>> default did it get serious attention where it was found flawed and
> >>>>> is now being actively purged. This was after it made it past
> >>>>> reviews, multiple attempts at testing at scale, and so on; i.e. we'd
> >>>>> done it all by the book. The time in an 'experimental' state added
> >>>>> nothing.
> >>>>
> >>>> Those are all valid concerns as well. It's certainly a pattern that
> >>>> we've seen repeated. That's also a broader concern I have about the
> >>>> farther we push out 2.0, then the less exercised master is.
> >>>>
> >>>> I don't really know how best to balance this with concerns about user
> >>>> stability. Enabling by default in master would certainly be a forcing
> >>>> function and would help it get more testing before release. I hear
> >>>> that argument. But I'm worried about the impact after release, where
> >>>> something as simple as a bug-fix point release upgrade of Hadoop
> >>>> could result in runtime breakage of an HBase install. Will this
> >>>> happen in practice? I don't know. It seems unlikely that the private
> >>>> variable names being used for example would change in a point
> >>>> release. But we're violating the abstraction that Hadoop provides us
> >>>> which guarantees such breakage won't occur.
> >>>>
> >>>>
> >>>>>> Yes. 2.0 is a bit out there so we have some time to iron out issues
> >>>>> is the thought. Yes, it could push out delivery of 2.0.
> >>>>
> >>>> Having this on by default in an unreleased master doesn't actually
> >>>> worry me that much. It's just the question of what happens when we do
> >>>> release. At that point, this discussion will be ancient history and I
> >>>> don't think we'll give any renewed consideration to what the impact
> >>>> of this change might be. Ideally it would be great to see this work
> >>>> in HDFS by that point and for that HDFS version this becomes a
> >>>> non-issue.
> >>>>
> >>>>
> >>>>>
> >>>>> I think the discussion here has been helpful. Holes have been found
> >>>>> (and plugged), the risk involved has gotten a good airing out here
> >>>>> on dev, and in spite of the back and forth, one of our experts in
> >>>>> good standing is still against it being on by default.
> >>>>>
> >>>>> If you are not down w/ the arguments, I'd be fine not making it the
> >>>>> default.
> >>>>> St.Ack
> >>>>
> >>>> I don't think it's right to block this by myself, since I'm clearly
> >>>> in the minority. Since others clearly support this change, have at
> >>>> it.
> >>>>
> >>>> But let me pose an alternate question: what if HDFS flat out refuses
> >>>> to adopt this change? What are our options then with this already
> >>>> shipping as a default? Would we continue to endure breakage due to
> >>>> the use of HDFS private internals? Do we switch the default back? Do
> >>>> we do something else?
> >>>>
> >>>> Thanks for the discussion.
