On Thu, Apr 10, 2014 at 6:49 AM, Raymie Stata <rst...@altiscale.com> wrote:

> I think the problem to be solved here is to define a point in time
> when the average Hadoop contributor can start using Java7 dependencies
> in their code.
>
> The "use Java7 dependencies in trunk(/branch3)" plan, by itself, does
> not solve this problem.  The average Hadoop contributor wants to see
> their contributions make it into a stable release in a predictable
> amount of time.  Putting code with a Java7 dependency into trunk means
> the exact opposite: there is no timeline to a stable release.  So most
> contributors will stay away from Java7 dependencies, despite the
> nominal policy that they're allowed in trunk.  (And the few that do
> use Java7 dependencies are people who do not value releasing code into
> stable releases, which arguably could lead to a situation that the
> Java7-dependent code in trunk is, on average, on the buggy side.)
>
> I'm not saying the "branch2-in-the-future" plan is the only way to
> solve the problem of putting Java7 dependencies on a known time-table,
> but at least it solves it.  Is there another solution?
>

All good reasons for why we should start thinking about a plan for v3. The
points above pertain to any features for trunk that break compatibility,
not just ones that use new Java APIs.  We shouldn't permit incompatible
changes to merge to v2 just because we don't yet have a timeline for v3; we
should figure out the latter. This also motivates finishing the work to isolate
dependencies between Hadoop code, other framework code, and user code.

Let's speak less abstractly: are there particular features or new
dependencies that you would like to contribute (or see contributed) that
require using the Java 1.7 APIs?  Breaking compat in v2 or rolling a v3
release are both non-trivial, not something I suspect we'd want to do just
because it would be, for example, nicer to have a newer version of Jetty.

Thanks,
Eli

>
> On Thu, Apr 10, 2014 at 1:11 AM, Steve Loughran <ste...@hortonworks.com>
> wrote:
> > On 9 April 2014 23:52, Eli Collins <e...@cloudera.com> wrote:
> >
> >>
> >>
> >> For the sake of this discussion we should separate the runtime from
> >> the programming APIs. Users are already migrating to the java7 runtime
> >> for most of the reasons listed below (support, performance, bugs,
> >> etc), and the various distributions certify their Hadoop 2 based
> >> distributions on java7.  This gives users many of the benefits of
> >> java7, without forcing users off java6. I.e., Hadoop does not need to
> >> switch to the java7 programming APIs to make sure everyone has a
> >> supported runtime.
> >>
> >>
> > +1: you can use Java 7 today; I'm not sure how tested Java 8 is
> >
> >
> >> The question here is really about when Hadoop, and the Hadoop
> >> ecosystem (since adjacent projects often end up in the same classpath)
> >> start using the java7 programming APIs and therefore break
> >> compatibility with java6 runtimes. I think our java6 runtime users
> >> would consider dropping support for their java runtime in an update of
> >> a major release to be an incompatible change (the binaries stop
> >> working on their current jvm).
> >
> >
> > Do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1) here?
> >
> >
> >> That may be worth it if we can
> >> articulate sufficient value to offset the cost (they have to upgrade
> >> their environment, might make rolling upgrades stop working, etc), but
> >> I've not yet heard an argument that articulates the value relative to
> >> the cost.  Eg upgrading to the java7 APIs allows us to pull in
> >> dependencies with new major versions, but only if those dependencies
> >> don't break compatibility (which is likely given that our classpaths
> >> aren't so isolated), and, realistically, only if the entire Hadoop
> >> stack moves to java7 as well
> >
> >
> >
> >
> >> (eg we have to recompile HBase to
> >> generate v1.7 binaries even if they stick on API v1.6). I'm not aware
> >> of a feature, bug etc that really motivates this.
> >>
> > I don't see that being needed unless we move up to new java7+ only
> > libraries and HBase needs to track this.
> >
> > The big "recompile to work" issue is Google Guava, which is troublesome
> > enough that I'd be tempted to say "can we drop it entirely?"
> >
> >
> >
> >> An alternate approach is to keep the current stable release series
> >> (v2.x) as is, and start using new APIs in trunk (for v3). This will be
> >> a major upgrade for Hadoop and therefore an incompatible change like
> >> this is to be expected (it would be great if this came with additional
> >> changes to better isolate classpaths and dependencies from each
> >> other). It allows us to continue to support multiple types of users
> >> with different branches, vs forcing all users onto a new version. It
> >> of course means that 2.x users will not get the benefits of the new
> >> API, but it's unclear what those benefits are given they can already
> >> get the benefits of adopting the newer java runtimes today.
> >>
> >>
> >>
> > I'm (personally) +1 to this. I also think we should plan to do the switch
> > some time this year, not only to get the benefits but also to discover the costs.
> >
>
