Is there going to be a general upgrade of dependencies? I'm thinking of jetty & jackson in particular.
On Mar 5, 2015, at 5:24 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:

> I've taken the liberty of adding a Hadoop 3 section to the Roadmap wiki
> page. In addition to the two things I've been pushing, I also looked
> through Allen's list (thanks Allen for making this) and picked out the
> shell script rewrite and the removal of HFTP as big changes. This would be
> the place to propose features for inclusion in 3.x; I'd particularly
> appreciate help on the YARN/MR side.
>
> Based on what I'm hearing, let me modulate my proposal to the following:
>
> - We avoid cutting branch-3, and release off of trunk. The trunk-only
>   changes don't look that scary, so I think this is fine. This does mean we
>   need to be more rigorous before merging branches to trunk. I think
>   Vinod/Giri's work on getting test-patch.sh runs on non-trunk branches
>   would be very helpful in this regard.
> - We do not include anything that breaks wire compatibility unless (as
>   Jason says) it's an unbelievably awesome feature.
> - No harm in rolling alphas from trunk, as it doesn't lock us to anything
>   compatibility-wise. Downstreams like releases.
>
> I'll take Steve's advice about not locking GA to a given date, but I also
> share his belief that we can alpha/beta/GA faster than it took for Hadoop
> 2. Let's roll some intermediate releases, work on the roadmap items, and
> see how we're feeling in a few months.
>
> Best,
> Andrew
>
> On Thu, Mar 5, 2015 at 3:21 PM, Siddharth Seth <ss...@apache.org> wrote:
>
>> I think it'll be useful to have a discussion about what else people would
>> like to see in Hadoop 3.x, especially if the change is potentially
>> incompatible. Also, what do we expect the release schedule to be for major
>> releases, and what triggers them: JVM version, major features, the need
>> for incompatible changes?
>> Assuming major versions will not be released every 6 months/1 year
>> (adoption time, fairly disruptive for downstream projects and users),
>> considering additional features/incompatible changes for 3.x would be
>> useful.
>>
>> Some features that come to mind immediately:
>> 1) Enhancements to the RPC mechanics, specifically support for async RPC
>> and two-way communication. There are a lot of places where we reuse
>> heartbeats to send more information than we would if the RPC layer
>> supported these features. Some of this can be done in a compatible manner
>> with the existing RPC subsystem; others, like two-way communication,
>> probably cannot. After this, having HDFS/YARN actually make use of these
>> changes. The other consideration is adoption of an alternate system like
>> gRPC, which would be incompatible.
>> 2) Simplification of configs, potentially separating client-side configs
>> from those used by daemons. This is another source of perpetual confusion
>> for users.
>>
>> Thanks
>> - Sid
>>
>> On Thu, Mar 5, 2015 at 2:46 PM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>>
>>> Sorry, Outlook dequoted Alejandro's comments.
>>>
>>> Let me try again with his comments in italic, and proofreading of mine.
>>>
>>> On 05/03/2015 13:59, "Steve Loughran" <ste...@hortonworks.com> wrote:
>>>
>>> On 05/03/2015 13:05, "Alejandro Abdelnur" <tuc...@gmail.com> wrote:
>>>
>>> IMO, if part of the community wants to take on the responsibility and
>>> work that it takes to do a new major release, we should not discourage
>>> them from doing that.
>>>
>>> Having multiple major branches active is a standard practice.
>>>
>>> Looking @ 2.x, the major work (HDFS HA, YARN) meant that it did take a
>>> long time to get out, and during that time 0.21 and 0.22 got released and
>>> ignored; 0.23 was picked up and used in production.
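To make Sid's point about heartbeat piggybacking concrete, here is a minimal sketch contrasting the two styles. All names here are hypothetical illustrations, not real Hadoop or gRPC APIs: today a server can only reach a client by stuffing commands into the reply to a client-initiated heartbeat, whereas an async/two-way RPC layer would let the server push messages and let calls complete via futures.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical sketch, not a real Hadoop API: contrasts heartbeat
// piggybacking with an async, two-way RPC style.
public class AsyncRpcSketch {

    // Today's pattern: the server can only talk to the client by
    // piggybacking extra payload on a client-initiated heartbeat response.
    static String heartbeat() {
        return "status=ok;pendingCommands=REBALANCE"; // piggybacked command
    }

    // With async/two-way RPC, the server pushes events as they happen,
    // and calls complete via futures rather than blocking a thread.
    static CompletableFuture<String> asyncCall(String request,
                                               Consumer<String> serverPush) {
        serverPush.accept("REBALANCE");              // server-initiated message
        return CompletableFuture.completedFuture("status=ok");
    }

    public static void main(String[] args) throws Exception {
        // Heartbeat style: the command arrives only when the client polls.
        System.out.println(heartbeat());

        // Async style: the push arrives independently of the reply.
        StringBuilder pushed = new StringBuilder();
        String reply = asyncCall("report", pushed::append).get();
        System.out.println(reply + ";pushed=" + pushed);
    }
}
```

The compatibility question Sid raises is visible even in this toy: the heartbeat variant fits the existing request/response wire protocol, while the server-push variant requires the transport itself to support server-initiated frames.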
>>> The 2.0.4-alpha release was more of a trouble spot, as it got picked up
>>> widely enough to be used in products, and changes were made between that
>>> alpha and 2.2 itself which raised compatibility issues.
>>>
>>> For 3.x I'd propose:
>>>
>>> 1. Have less longevity of 3.x alpha/beta artifacts.
>>> 2. Make clear there are no guarantees of compatibility from alpha/beta
>>> releases to shipping. Best effort, but not to the extent that it gets in
>>> the way. More succinctly: we will care more about seamless migration from
>>> 2.2+ to 3.x than from a 3.0-alpha to 3.3 production.
>>> 3. Anybody who ships code based on 3.x alpha/beta recognises and accepts
>>> policy (2): Hadoop's "instability guarantee" for the 3.x alpha/beta
>>> phase.
>>>
>>> As well as backwards compatibility, we need to think about forwards
>>> compatibility, with the goal being:
>>>
>>> Any app written/shipped with the 3.x release binaries (JAR and native)
>>> will work in and against a 3.y Hadoop cluster, for all x, y in the
>>> naturals where y >= x and is-release(x) and is-release(y).
>>>
>>> That's important, as it means all server-side changes in 3.x which are
>>> expected to mandate client-side updates (protocols, HDFS erasure
>>> decoding, security features) must be considered complete and stable
>>> before we can say is-release(x). In an ideal world, we'll even get the
>>> semantics right, with tests to show this.
>>>
>>> Fixing classpath hell downstream is certainly one feature I am +1 on.
>>> But: it's only one of the features, and given there's not any design doc
>>> on that JIRA, it's way too immature to set a release schedule on. An
>>> alpha schedule with no guarantees and a regular alpha roll could be
>>> viable, as new features go in and can then be used to experimentally try
>>> this stuff in branches of HBase (well volunteered, Stack!), etc. Of
>>> course, instability guarantees will be transitive downstream.
>>> This time around we are not replacing the guts as we did from Hadoop 1
>>> to Hadoop 2, but doing superficial surgery to address issues that were
>>> not considered (or were too much to take on top of the guts transplant).
>>>
>>> For the split-brain concern: we did a great job of maintaining Hadoop 1
>>> and Hadoop 2 until Hadoop 1 faded away.
>>>
>>> And a significant argument about 2.0.4-alpha to 2.2 protobuf/HDFS
>>> compatibility.
>>>
>>> Based on that experience, I would say that the coexistence of Hadoop 2
>>> and Hadoop 3 will be much less demanding/traumatic.
>>>
>>> The re-layout of all the source trees was a major change there; assuming
>>> there's no refactoring or switch of build tools, picking things back
>>> will be tractable.
>>>
>>> Also, to facilitate the coexistence, we should limit Java language
>>> features to Java 7 (even if the runtime is Java 8); once Java 7 is no
>>> longer used, we can remove this limitation.
>>>
>>> +1; setting javac.version will fix this.
>>>
>>> What is nice about having Java 8 as the base JVM is that it means you
>>> can be confident that all Hadoop 3 servers will be JDK 8+, so downstream
>>> apps and libs can use all the Java 8 features they want.
>>>
>>> There's one policy change to consider there, which is possibly, just
>>> possibly, we could allow new modules in hadoop-tools to adopt Java 8
>>> language features early, provided everyone recognises that "backport to
>>> branch-2" isn't going to happen.
>>>
>>> -Steve
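For reference, the javac.version approach Steve mentions looks roughly like the following in a Maven build. This is an illustrative sketch only: the `javac.version` property name mirrors Hadoop's convention, but treat the exact property wiring as an assumption rather than the project's actual POM.

```xml
<!-- Illustrative Maven settings: compile with the Java 7 language level
     even when building on a JDK 8, per the discussion above. -->
<properties>
  <javac.version>1.7</javac.version>
  <maven.compiler.source>${javac.version}</maven.compiler.source>
  <maven.compiler.target>${javac.version}</maven.compiler.target>
</properties>
```

With this in place, Java 8-only syntax (lambdas, default methods) fails at compile time even on a JDK 8 build machine, which is what keeps branch-2 backports feasible.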