Thanks, Vinod, for bringing up this discussion; it comes just in time. I agree with most responses that option C is not a good choice: our community bandwidth is precious, and we should focus on a very limited number of mainstream branches for development, testing, and deployment. Of course, we should still follow the Apache way and allow any interested committer to roll his/her own release to meet specific requirements beyond the mainstream releases.
I am not biased toward option A or B (I will discuss this later), but I think a bridge release for upgrading to and rolling back from 3.x is very necessary. The reasons are:

1. Given the lessons learned from the migration from 1.x to 2.x, no matter how careful we try to be, there is still a chance that some level of compatibility (source, binary, configuration, etc.) gets broken in the migration to a new major release. Some of these incompatibilities can only be identified at runtime, after a GA release has been widely deployed in production clusters. We have tons of downstream projects and numerous configurations, and we cannot cover them all with in-house deployment and testing.

2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) still consume many non-public, server-side APIs of Hadoop, not to mention the projects/products outside of the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.

3. Besides incompatibilities, there may also be performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint or even memory leaks, etc.) in new Hadoop releases. While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive to it and wish to roll back if it happens on their production cluster.

As Andrew mentioned in earlier email threads, some work has been done to verify rolling upgrade from 2.x to 3.0 (just curious: which 2.x release was the upgrade tested from? 2.8.2, or 2.9.0, which is still being released?).
But I am not aware of any work we are doing now to test downgrade from 3.0 to 2.x (correct me if I have missed any). If users hit any of the three situations I mentioned above, we should give them the chance to roll back if they are really conservative about these unexpected side effects of upgrading. Given this, we should have a bridge release to cover safe rollback from 3.0 (whether rolling or not). I am not sure for now whether it should be 2.9.x or 2.10.x (we can just call it the 2.BR release), because at this moment we are not sure exactly what changes we need to include to support rollback from 3.0. We can defer this decision until we have better ideas.

Summary of my two cents:
- No more feature releases should happen on branch-2. 2.9 or 2.10 should be the last minor release (mainstream of community) on branch-2.
- A bridge release is necessary for safe upgrade to and downgrade from 3.x.
- We can decide later whether 2.10 is necessary, once the scope of the bridge release is clearer.

Thanks,

Junping

________________________________________
From: Andrew Wang <andrew.w...@cloudera.com>
Sent: Tuesday, November 14, 2017 2:25 PM
To: Wangda Tan
Cc: Steve Loughran; Vinod Kumar Vavilapalli; Kai Zheng; Arun Suresh; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: [DISCUSS] A final minor release off branch-2?

To follow up on my earlier email, I don't think there's a need for a bridge release given that we've successfully tested rolling upgrade from 2.x to 3.0.0. I expect we'll keep making improvements to smooth over any additional incompatibilities found, but there isn't a requirement that a user upgrade to a bridge release before upgrading to 3.0.

Otherwise, I don't have a strong opinion about when to discontinue branch-2 releases. Historically, a release line is maintained until interest in it wanes.
If the maintainers are taking care of the backports, it's not much work for the rest of us to vote on the RCs.

Best,
Andrew

On Mon, Nov 13, 2017 at 4:19 PM, Wangda Tan <wheele...@gmail.com> wrote:

> Thanks Vinod for starting this,
>
> I'm also leaning towards the plan (A):
>
> (A)
> -- Make 2.9.x the last minor release off branch-2
> -- Have a maintenance release that bridges 2.9 to 3.x
> -- Continue to make more maintenance releases on 2.8 and 2.9 as necessary
>
> The only part I'm not sure about is having a separate bridge release other than 3.x.
>
> For the bridge release, Steve's suggestion sounds more doable:
>
> * 3.1+ for new features
> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> * whoever puts their hand up to do 2.x releases deserves support in testing &c
> * If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling
>
> This lets the community focus on 3.x releases and fill whatever gaps remain in migrating from 2.x to 3.x.
>
> Best,
> Wangda
>
> On Wed, Nov 8, 2017 at 3:57 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>> > On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalli <vino...@apache.org> wrote:
>> >
>> >> Frankly speaking, working on some bridging release not targeting any feature isn't so attractive to me as a contributor. Overall, the final minor release off branch-2 is good, we should also give 3.x more time to evolve and mature, therefore it looks to me we would have to work on two release lines meanwhile for some time. I'd like option C), and suggest we focus on the recent releases.
>> >
>> > Answering this question is also one of the goals of my starting this thread.
>> > Collectively we need to conclude if we are okay or not okay with no longer putting any new feature work in general on the 2.x line after the 2.9.0 release and moving our focus over to 3.0.
>> >
>> > Thanks
>> > +Vinod
>>
>> As a developer of new features (e.g. the Hadoop S3A committers), I'm mostly already committed to targeting 3.1; the code in there to deal with failures and retries has unashamedly embraced Java 8 lambda-expressions in production code: backporting that is going to be traumatic in terms of IDE-assisted code changes and the resultant diff in source between branch-2 & trunk. What's worse, it's going to be traumatic to test as all my JVMs start with an 8 at the moment, and I'm starting to worry about whether I should bump a Windows VM up to Java 9 to keep an eye on Akira's work there. Currently the only testing I'm really doing on Java 7 is Yetus branch-2 & internal test runs.
>>
>> 3.0 will be out the door, and we can assume that CDH will ship with it soon (*) which will allow for a rapid round trip time on inevitable bugs: 3.1 can be the release with compatibility tuned, those reported issues addressed. It's certainly where I'd like to focus.
>>
>> At the same time: 2.7.2-2.8.x are the broadly used versions, we can't just say "move to 3.0" & expect everyone to do it, not given we have explicitly got backwards-incompatible changes in. I don't see people rushing to do it until the layers above are all qualified (HBase, Hive, Spark, ...). Which means big users of 2.7/2.8 won't be in a rush to move and we are going to have to maintain 2.x for a while, including security patches for old versions. One issue there: what if a patch (such as bumping up a JAR version) is incompatible?
>>
>> For me then
>>
>> * 3.1+ for new features
>> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
>> * whoever puts their hand up to do 2.x releases deserves support in testing &c
>> * If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling
>>
>> -Steve
>>
>> Note: I'm implicitly assuming a timely 3.1 out the door with my work included, and all issues arriving from 3.0 fixed. We can worry when 3.1 ships whether there's any benefit in maintaining a 3.0.x, or whether it's best to say "move to 3.1"
>>
>> (*) just a guess based on the effort & test reports of Andrew & others
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org