Er, that should read "as Allen commented"

C.
On Tue, Mar 10, 2015 at 11:55 AM, Colin P. McCabe <cmcc...@apache.org> wrote:
> Hi Arun,
>
> Not all changes which are incompatible can be "fixed"-- sometimes an
> incompatibility is a necessary part of a change. For example, taking
> a really old library dependency with known security issues off the
> CLASSPATH will create incompatibilities, but it's also necessary. A
> minimum JDK version bump also falls in that category. There are also
> cases where we need to drop support for really obsolete and baroque
> features from the past. For example, it would be nice if we could
> finally get rid of the code to read pre-transactional edit logs. It's
> a substantial amount of code. We could argue that we should just
> support legacy stuff forever, but code quality will suffer.
>
> These changes need to be made sooner or later, and a major version
> bump is an ideal place to make them. I think that making these
> changes in a 2.x release is hostile to operators, as Alan commented.
> That's what we're trying to avoid by discussing Hadoop 3.x.
>
> Colin
>
> On Mon, Mar 9, 2015 at 3:54 PM, Arun Murthy <a...@hortonworks.com> wrote:
>> Colin,
>>
>> Do you have a list of incompatible changes other than the shell-script
>> rewrite? If we do have others we'd have to fix them anyway for the current
>> plan on hadoop-3.x right? So, I don't see the difference?
>>
>> Arun
>>
>> ________________________________________
>> From: Colin P. McCabe <cmcc...@apache.org>
>> Sent: Monday, March 09, 2015 3:05 PM
>> To: hdfs-dev@hadoop.apache.org
>> Cc: mapreduce-...@hadoop.apache.org; common-...@hadoop.apache.org;
>> yarn-...@hadoop.apache.org
>> Subject: Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?
>>
>> Java 7 will be end-of-lifed in April 2015. I think it would be unwise
>> to plan a new Hadoop release against a version of Java that is almost
>> obsolete and (soon) no longer receiving security updates.
>> I think people will be willing to roll out a new version of Java for Hadoop
>> 3.x.
>>
>> Similarly, the whole point of bumping the major version number is the
>> ability to make incompatible changes. There are already a bunch of
>> incompatible changes in the trunk branch. Are you proposing to revert
>> those? Or push them into newly created feature branches? This
>> doesn't seem like a good idea to me.
>>
>> I would be in favor of backporting targeted incompatible changes from
>> trunk to branch-2. For example, we could consider pulling in Allen's
>> shell script rewrite. But pulling in all of trunk seems like a bad
>> idea at this point, if we want a 2.x release.
>>
>> best,
>> Colin
>>
>> On Mon, Mar 9, 2015 at 2:15 PM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>>>
>>> If 3.x is going to be Java 8 & not backwards compatible, I don't expect
>>> anyone wanting to use this in production until some time deep into 2016.
>>>
>>> Issue: JDK 8 vs 7
>>>
>>> It will require Hadoop clusters to move up to Java 8. While there's dev
>>> pull for this, there's ops pull against this: people are still in the
>>> moving-off Java 6 phase due to that "it's working, don't update it"
>>> philosophy. Java 8 is compelling to us coders, but that doesn't mean ops
>>> want it.
>>>
>>> You can run JDK-8 code in a YARN cluster running on Hadoop 2.7 *today*, the
>>> main thing is setting up JAVA_HOME. That's something we could make easier
>>> somehow (maybe some min Java version field in resource requests that will
>>> let apps say java 8, java 9, ...). YARN could not only set up JVM paths, it
>>> could fail-fast if a Java version wasn't available.
>>>
>>> What we can't do in hadoop core today is set javac.version=1.8 & use java 8
>>> code. Downstream code can do that (Hive, etc); they just need to accept that
>>> they don't get to play on JDK7 clusters if they embrace lambda expressions.
>>>
>>> So...we need to stay on java 7 for some time due to ops pull; downstream
>>> apps get to choose what they want. We can/could enhance YARN to make JVM
>>> choice more declarative.
>>>
>>> Issue: Incompatible changes
>>>
>>> Without knowing what is proposed for "an incompatible classpath change", I
>>> can't say whether this is something that could be made optional. If it
>>> isn't, then it is a python-3 class option, "rewrite your code" event, which
>>> is going to be particularly traumatic to things like Hive that already do
>>> complex CP games. I'm currently against any mandatory change here, though
>>> would love to see an optional one. And if optional, it ceases to become an
>>> incompatible change...
>>>
>>> Issue: Getting trunk out the door
>>>
>>> The main diff from branch-2 and trunk is currently the bash script changes.
>>> These don't break client apps. May or may not break bigtop & other
>>> downstream hadoop stacks, but developers don't need to worry about this:
>>> no recompilation necessary
>>>
>>> Proposed: ship trunk as a 2.x release, compatible with JDK7 & Java code.
>>>
>>> It seems to me that I could go
>>>
>>> git checkout trunk
>>> mvn versions:set -DnewVersion=2.8.0-SNAPSHOT
>>>
>>> We'd then have a version of Hadoop-trunk we could ship later this year,
>>> compatible at the JDK and API level with the existing java code & JDK7+
>>> clusters.
>>>
>>> A classpath fix that is optional/compatible can then go out on the 2.x
>>> line, saving the 3.x tag for something that really breaks things, forces
>>> all downstream apps to set up new hadoop profiles, have separate modules &
>>> generally hate the hadoop dev team
>>>
>>> This lets us tick off the "recent trunk release" and "fixed shell scripts"
>>> items, pushing out those benefits to people sooner rather than later, and
>>> puts off the "Hello, we've just broken your code" event for another 12+
>>> months.
>>>
>>> Comments?
>>>
>>> -Steve