>> You mentioned rolling-upgrades: It will be good to exactly outline the type of testing. For e.g., the rolling-upgrades orchestration order has direct implication on the testing done.
Complete details are available in HDFS-11096, where I'm trying to get scripts that automate these tests committed so we can run them on Jenkins. For HDFS, I follow the same order as the documentation. I did not see any documentation indicating when to upgrade the zkfc daemons, so that is done at the end. I also did not see any documentation about a rolling upgrade for YARN, so I'm upgrading the ResourceManagers first and then the NodeManagers, basically following the pattern used in HDFS.

I can't speak much about app compatibility in YARN, etc., but the rolling upgrade test runs Terasuite from Hadoop 2 continually while doing the upgrade and for some time afterward. One incompatibility was found and fixed in trunk quite a while ago; that part of the test has been working well for quite a while now.

>> Copying data between 2.x clusters and 3.x clusters: Does this work already? Is it broken anywhere that we cannot fix? Do we need bridging features for this work?

HDFS-11096 also includes tests that data can be copied with distcp over webhdfs:// to and from old and new clusters, regardless of where the distcp job is launched from. I'll try a test run that uses hdfs:// this week, too. As part of that JIRA I also looked through all the protobufs for any discrepancies or incompatibilities. One was found and fixed, but the rest looked good to me.

On Mon, Nov 6, 2017 at 6:42 PM, Vinod Kumar Vavilapalli <vino...@apache.org> wrote:

> The main goal of the bridging release is to ease transition on stuff that
> is guaranteed to be broken.
>
> Off the top of my head, one of the biggest areas is application
> compatibility. When folks move from 2.x to 3.x, are their apps binary
> compatible? Source compatible? Or need changes?
>
> In 1.x -> 2.x upgrade, we did a bunch of work to at least make old apps be
> source compatible. This means relooking at the API compatibility in 3.x and
> its impact on migrating applications.
> We will have to revisit and un-deprecate old APIs, un-delete old APIs and
> write documentation on how apps can be migrated.
>
> Most of this work will be in 3.x line. The bridging release on the other
> hand will have deprecation for APIs that cannot be undeleted. This may
> already have been done in many places. But we need to make sure and fill
> gaps if any.
>
> Other areas that I can recall from the old days
>  - Config migration: Many configs are deprecated or deleted. We need
>    documentation to help folks to move. We also need deprecations in the
>    bridging release for configs that cannot be undeleted.
>  - You mentioned rolling-upgrades: It will be good to exactly outline the
>    type of testing. For e.g., the rolling-upgrades orchestration order has
>    direct implication on the testing done.
>  - Story for downgrades?
>  - Copying data between 2.x clusters and 3.x clusters: Does this work
>    already? Is it broken anywhere that we cannot fix? Do we need bridging
>    features for this work?
>
> +Vinod
>
> > On Nov 6, 2017, at 12:49 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> >
> > What are the known gaps that need bridging between 2.x and 3.x?
> >
> > From an HDFS perspective, we've tested wire compat, rolling upgrade, and
> > rollback.
> >
> > From a YARN perspective, we've tested wire compat and rolling upgrade. Arun
> > just mentioned an NM rollback issue that I'm not familiar with.
> >
> > Anything else? External to this discussion, these should be documented as
> > known issues for 3.0.
> >
> > Best.
> > Andrew
> >
> > On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh <asur...@apache.org> wrote:
> >
> >> Thanks for starting this discussion Vinod.
> >>
> >> I agree (C) is a bad idea.
> >> I would prefer (A) given that ATM, branch-2 is still very close to
> >> branch-2.9 - and it is a good time to make a collective decision to lock
> >> down commits to branch-2.
> >>
> >> I think we should also clearly define what the 'bridging' release should
> >> be.
> >> I assume it means the following:
> >> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging
> >> release and then upgrade to the 3.x release.
> >> * With regard to state store upgrades (at least NM state stores) the
> >> bridging state stores should be aware of all new 3.x keys so the implicit
> >> assumption would be that a user can only roll back from the 3.x release to
> >> the bridging release and not to the old 2.x release.
> >> * Use the opportunity to clean up deprecated API ?
> >> * Do we even want to consider a separate bridging release for 2.7, 2.8 and
> >> 2.9 lines ?
> >>
> >> Cheers
> >> -Arun
> >>
> >> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli <vino...@apache.org>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out
> >>> (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we have
> >>> a discussion on how we manage our developmental bandwidth between 2.x line
> >>> and 3.x lines.
> >>>
> >>> Once 3.0 GA goes out, we will have two parallel and major release lines.
> >>> The last time we were in this situation was back when we did 1.x -> 2.x
> >>> jump.
> >>>
> >>> The parallel releases imply overhead of decisions, branch-merges and
> >>> back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1,
> >>> 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of
> >>> these lines - for e.g 2.8, 2.9 - are going to be used for a while at a
> >>> bunch of large sites! At the same time, our users won't migrate to 3.0 GA
> >>> overnight - so we do have to support two parallel lines.
> >>>
> >>> I propose we start thinking of the fate of branch-2. The idea is to have
> >>> one final release that helps our users migrate from 2.x to 3.x. This
> >>> includes any changes on the older line to bridge compatibility issues,
> >>> upgrade issues, layout changes, tooling etc.
> >>>
> >>> We have a few options I think
> >>> (A)
> >>>     -- Make 2.9.x the last minor release off branch-2
> >>>     -- Have a maintenance release that bridges 2.9 to 3.x
> >>>     -- Continue to make more maintenance releases on 2.8 and 2.9 as
> >>>        necessary
> >>>     -- All new features obviously only go into the 3.x line as no features
> >>>        can go into the maint line.
> >>>
> >>> (B)
> >>>     -- Create a new 2.10 release which doesn't have any new features, but
> >>>        as a bridging release
> >>>     -- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as
> >>>        necessary
> >>>     -- All new features, other than the bridging changes, go into the 3.x
> >>>        line
> >>>
> >>> (C)
> >>>     -- Continue making branch-2 releases and postpone this discussion for
> >>>        later
> >>>
> >>> I'm leaning towards (A) or to a lesser extent (B). Willing to hear
> >>> otherwise.
> >>>
> >>> Now, this obviously doesn't mean blocking of any more minor releases on
> >>> branch-2. Obviously, any interested committer / PMC can roll up his/her
> >>> sleeves, create a release plan and release, but we all need to acknowledge
> >>> that versions are not cheap and figure out how the community bandwidth is
> >>> split overall.
> >>>
> >>> Thanks
> >>> +Vinod
> >>> PS: The proposal is obviously not to force everyone to go in one direction
> >>> but more of nudging the community to figure out if we can focus a major
> >>> part of our bandwidth on one line. I had a similar concern when we were
> >>> doing 2.8 and 3.0 in parallel, but the impending possibility of spreading
> >>> too thin is much worse IMO.
> >>> PPS: (C) is a bad choice. With 2.8 and 2.9 we are already seeing user
> >>> adoption splintering between two lines. With 2.10, 2.11 etc coexisting with
> >>> 3.0, 3.1 etc, we will revisit the mad phase years ago when we had 0.20.x,
> >>> 0.20-security coexisting with 0.21, 0.22 etc.