Re: Plans of moving towards JDK7 in trunk

2014-06-23 Thread Vinod Kumar Vavilapalli
Hey all,

This one started as an innocuous thread about enabling JDK7 on trunk, and now it 
seems (I still haven't finished reading the entire thing, and I started a 
while ago) to have become a full-blown proposal on 2.x, 3.x and 4.x releases. 
Some of us haven't been tracking this (at least me and a few others who 
indicated as much offline), assuming this is only about letting Jenkins run 
JDK7, but it has the potential to impact all future work.

I propose we fork this thread into a new one whose subject states the topic 
clearly, so that others can follow too.

Thanks,
+Vinod

On Jun 23, 2014, at 1:53 PM, sanjay Radia  wrote:

> 
> On Jun 21, 2014, at 8:01 AM, Andrew Wang  wrote:
> 
>> This is why I'd like to keep my original proposal on the table: keep going
>> with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
>> by April next year. It doesn't need to be a big bang release either. I'd be
>> delighted if we could rolling upgrade from one to the other. I just didn't
>> want to rule out the inclusion of some very compelling feature outright.
>> Trust me though, I'd be the first person to ask about compatibility if such
>> a feature does come up.
> 
> 
> Given your above statement  on compatibility (such as rolling upgrades),  it 
> should be fine for the JDK8-based-Hadoop-release to not be 3.0 and instead 
> merely be 2.x? Or do you have any incompatible changes to Hadoop protocol or 
> APIs in mind during the same time period?
> 
> sanjay
> -- 
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to 
> which it is addressed and may contain information that is confidential, 
> privileged and exempt from disclosure under applicable law. If the reader 
> of this message is not the intended recipient, you are hereby notified that 
> any printing, copying, dissemination, distribution, disclosure or 
> forwarding of this communication is strictly prohibited. If you have 
> received this communication in error, please contact the sender immediately 
> and delete it from your system. Thank You.




Re: Plans of moving towards JDK7 in trunk

2014-06-23 Thread Sandy Ryza
Andrew, correct me if I'm misunderstanding, but the incompatible change
that would require a major version bump is dropping support for JDK6.


On Mon, Jun 23, 2014 at 1:53 PM, sanjay Radia 
wrote:

>
> On Jun 21, 2014, at 8:01 AM, Andrew Wang  wrote:
>
> > This is why I'd like to keep my original proposal on the table: keep going
> > with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
> > by April next year. It doesn't need to be a big bang release either. I'd be
> > delighted if we could rolling upgrade from one to the other. I just didn't
> > want to rule out the inclusion of some very compelling feature outright.
> > Trust me though, I'd be the first person to ask about compatibility if such
> > a feature does come up.
>
>
> Given your above statement  on compatibility (such as rolling upgrades),
>  it should be fine for the JDK8-based-Hadoop-release to not be 3.0 and
> instead merely be 2.x? Or do you have any incompatible changes to Hadoop
> protocol or APIs in mind during the same time period?
>
> sanjay


Re: Plans of moving towards JDK7 in trunk

2014-06-23 Thread sanjay Radia

On Jun 21, 2014, at 8:01 AM, Andrew Wang  wrote:

> This is why I'd like to keep my original proposal on the table: keep going
> with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
> by April next year. It doesn't need to be a big bang release either. I'd be
> delighted if we could rolling upgrade from one to the other. I just didn't
> want to rule out the inclusion of some very compelling feature outright.
> Trust me though, I'd be the first person to ask about compatibility if such
> a feature does come up.


Given your above statement  on compatibility (such as rolling upgrades),  it 
should be fine for the JDK8-based-Hadoop-release to not be 3.0 and instead 
merely be 2.x? Or do you have any incompatible changes to Hadoop protocol or 
APIs in mind during the same time period?

sanjay


Re: Plans of moving towards JDK7 in trunk

2014-06-21 Thread Alejandro Abdelnur
On Fri, Jun 20, 2014 at 10:02 PM, Arun C Murthy  wrote:

> > Hadoop  3.x out the door later this year
>
> +1 that makes sense to me. Thanks for volunteering Steve - I'm glad to
> share the pain… ;-)


Hey Arun, you may have missed that Andrew volunteered to do this as
well (the thread is long, so it's easy to miss).

Cheers

-- 
Alejandro


Re: Plans of moving towards JDK7 in trunk

2014-06-21 Thread Arun C Murthy

After further consideration, here is an alternate.

On Jun 21, 2014, at 11:14 AM, "Arun C. Murthy"  wrote:
> 
> JDK6 eol was Feb 2013 and, a year later, we still have customers using it 
> - which means we can't drop it yet.
> 
> http://www.oracle.com/technetwork/java/eol-135779.html
> 
> Given that, it seems highly unlikely everyone will suddenly jump to JDK8 by 
> April of next year... I suspect this means we'd have to support JDK7 at least 
> till late 2015. I think that is really key regardless of version numbers.
> 
> Furthermore, if we, as a community, maintain discipline in terms of 
> wire-compat, rolling-upgrades etc. we are better off making a major release 
> every year - as you put it, no more 'Big Bang' releases.


Looking at the big picture, I believe the users of Apache Hadoop would be 
better served by us if we prioritized operational aspects such as rolling 
upgrades, wire-compatibility, binary compatibility etc. for a couple of years.

Since not everyone has moved to hadoop-2 yet, talk of more incompatibility 
between hadoop-2/hadoop-3 or between hadoop-3/hadoop-4 within the next 12 
months would certainly be a big issue for users - especially w.r.t rolling 
upgrades, wire-compat etc.

So, I think we should prioritize these operational aspects for users above 
everything else. Sure, jdk versions, features etc. are important, but lower in 
priority.

I'd also like to reiterate my concern on *dropping* support for JDK7 - we 
need to support it till end of 2015 at the very least; happy to ship a version 
of Hadoop which is JDK8 only in 2015 - it just needs to support 
rolling-upgrades from the JDK7 Hadoop till end of 2015.

With that in mind... I actually like Andrew's suggestion below:

>  On Jun 21, 2014, at 8:01 AM, Andrew Wang  wrote:
> 
>  I'd be more okay with an intermediate release with no incompatible changes
>  whatsoever besides bumping the JDK requirement to JDK7.

Taking that thought to its logical conclusion, we can de-couple the dual 
concerns of JDK versions and major releases by bumping up our software 
dependencies (JDK, guice etc.) at well-defined and well-articulated releases.

The reason to do so would be to ensure we *do not* sneak in operational 
incompatibilities in the guise of bumping JDK versions.

So, we could do something like:
# hadoop-2.30+ is JDK7, but provides rolling upgrades and wire-compat with 
hadoop-2.2+; say in Oct 2014
# hadoop-2.50+ is JDK8, but provides rolling upgrades and wire-compat with 
hadoop-2.2+; say in June 2015 (or even earlier).

This scheme certainly has some disadvantages; however, it has the significant 
advantage of making it *very* clear to end-users and administrators that we 
take operational aspects seriously.

Also, this is something we have already done, i.e. we updated some of our 
software deps in hadoop-2.4 vs. hadoop-2.2 - clearly not something as dramatic 
as JDK. Here are some examples:
https://issues.apache.org/jira/browse/HADOOP-9991
https://issues.apache.org/jira/browse/HADOOP-10102
https://issues.apache.org/jira/browse/HADOOP-10103
https://issues.apache.org/jira/browse/HADOOP-10104
https://issues.apache.org/jira/browse/HADOOP-10503

In summary, the key goals we should keep in mind are:
# Operational aspects such as rolling upgrades, wire-compat etc. for the next 
couple of years.
# Support JDK7 till end of 2015 at least, even if we decide to support JDK8 
sometime in 2015. Just ensure wire-compat, rolling-upgrades etc.

Thoughts?

thanks,
Arun


Re: Plans of moving towards JDK7 in trunk

2014-06-21 Thread Arun C. Murthy
Andrew,


> On Jun 21, 2014, at 8:01 AM, Andrew Wang  wrote:
> 
> Hi Steve, let me confirm that I understand your proposal correctly:
> 
> - Release an intermediate Hadoop 3 a few months out, based on JDK7 and with
> bumped library versions
> - Release a Hadoop 4 mid next year, based on JDK8
> 
> I question the utility of an intermediate Hadoop 3 like this. Assuming that
> it gets out in September (i.e. roughly when a 2.6 would land), we're
> looking at a valid lifespan of about 7 months before JDK7 is EOL in April.

JDK6 eol was Feb 2013 and, a year later, we still have customers using it - 
which means we can't drop it yet.

http://www.oracle.com/technetwork/java/eol-135779.html

Given that, it seems highly unlikely everyone will suddenly jump to JDK8 by 
April of next year... I suspect this means we'd have to support JDK7 at least 
till late 2015. I think that is really key regardless of version numbers.

Furthermore, if we, as a community, maintain discipline in terms of 
wire-compat, rolling-upgrades etc. we are better off making a major release 
every year - as you put it, no more 'Big Bang' releases.

We have to, as a development community, ourselves get over the 'trauma' of 
major releases - I do realize the irony here - but it's requisite to help our 
users feel confident in upgrading at a reasonable rate.

So, something like this could work:
# hadoop-2 / jdk6 - Oct 2013
# hadoop-3 / jdk7 - Oct 2014
# hadoop-4 / jdk8 - Oct 2015

Having said that, it would also be prudent to co-release hadoop-2/hadoop-3 & 
hadoop-3/hadoop-4 with requisite jdk versions. Maybe even hadoop-4 beta by 
middle of 2015. As such, it is a good idea to allow trunk to move to jdk7 now - 
it's good practice as we will have to do the same for jdk8.

It does help, a lot, that we have now de-coupled user dependencies from the 
system with YARN. For e.g. we could run hadoop-2 MR on hadoop-3 YARN, even if 
there is some work remaining... see MAPREDUCE-4551. Future reliance on 
technologies like Docker will help further.

Thoughts?

Arun

> If this release also breaks compatibility by changing library versions,
> then it looks less and less appealing from a user perspective. I suspect it
> would end up seeing low adoption as everyone waits (at most) 7 months for
> the JDK8-based release to emerge.
> 
> I'd be more okay with an intermediate release with no incompatible changes
> whatsoever besides bumping the JDK requirement to JDK7. However, it'd still
> be a weak release considering that branch-2 already runs fine on JDK7, and
> it looks somewhat bad publicly as we burn another major release number less
> than a year since 2.x going GA.
> 
> This is why I'd like to keep my original proposal on the table: keep going
> with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
> by April next year. It doesn't need to be a big bang release either. I'd be
> delighted if we could rolling upgrade from one to the other. I just didn't
> want to rule out the inclusion of some very compelling feature outright.
> Trust me though, I'd be the first person to ask about compatibility if such
> a feature does come up.
> 
> I'll also posit that people will shy away from using JDK8 features while
> branch-2 remains in active use. There's definitely some new shiny there,
> but nothing compelling enough to me personally when weighed against the
> pain of harder branch-2 backports.
> 
> Let's try to keep this thread focused on the planning side of things
> though, deferring JDK-feature-related discussion to a different thread.
> We'd need to draw up a code-style doc on the wiki, but it sounds like
> something Steve and/or I could draft initially.
> 
> Thanks,
> Andrew
> 
> 
>> On Fri, Jun 20, 2014 at 10:02 PM, Arun C Murthy  wrote:
>> 
>> 
>> On Jun 20, 2014, at 9:51 PM, Steve Loughran 
>> wrote:
>> 
>>> On 20 June 2014 21:35, Steve Loughran  wrote:
>>>> 
>>>> 
>>>> This actually argues in favour of
>>>> 
>>>> -renaming branch-2 branch-3 after a release
>>>> -making trunk hadoop-4
>>>> 
>>>> -getting hadoop 3 released off the new branch-3 out in 2014, effectively
>>>> being an iteration of branch-2 with updated java , moves of (off?) guava,
>>>> off jetty, lib changes, but no other significant "big bang" features
>>>> 
>>>> 
>>>> Hadoop 4.x then becomes the 2015 release, which can add more stuff. In
>>>> particular, anything that goes into Hadoop 4 for which there's no intent to
>>>> support in hadoop 2 & 3, can use the java 8 language features sooner rather
>>>> than later.
>>> I should add that I'm willing to be the person who gets the Java-7 based
>>> Hadoop  3.x out the door later this year
>> 
>> +1 that makes sense to me. Thanks for volunteering Steve - I'm glad to
>> share the pain… ;-)
>> 
>> Arun

Re: Plans of moving towards JDK7 in trunk

2014-06-21 Thread Steve Loughran
On 21 June 2014 08:01, Andrew Wang  wrote:

> Hi Steve, let me confirm that I understand your proposal correctly:
>
> - Release an intermediate Hadoop 3 a few months out, based on JDK7 and with
> bumped library versions
> - Release a Hadoop 4 mid next year, based on JDK8
>
> I question the utility of an intermediate Hadoop 3 like this. Assuming that
> it gets out in September (i.e. roughly when a 2.6 would land), we're
> looking at a valid lifespan of about 7 months before JDK7 is EOL in April.
> If this release also breaks compatibility by changing library versions,
> then it looks less and less appealing from a user perspective. I suspect it
> would end up seeing low adoption as everyone waits (at most) 7 months for
> the JDK8-based release to emerge.
>


I'm saying that we'd replace hadoop 2.6 with a 3.x release that, along
with the 2.6 changes, ups the java version and the JARs and dependencies
which we are frozen with in Hadoop 2.x.

This issue of dependencies may not be so visible in hadoop's own codebase,
but when you write any downstream project, the majority of the xml
in your POM file is about excluding stuff Hadoop pulls in. I've
been quietly trying to address this at HADOOP-9991, but we've reached the
limit of what can get in.

I'd be happy enough with the original "Stata Plan": a release of Hadoop 2.x
that says "java 7 + new libs", but given we've committed to not doing that,
releasing a Hadoop 3 stating that lets us get a hadoop with a modern set of
underpinnings out in 2014


>
> I'd be more okay with an intermediate release with no incompatible changes
> whatsoever besides bumping the JDK requirement to JDK7. However, it'd still
> be a weak release considering that branch-2 already runs fine on JDK7, and
> it looks somewhat bad publicly as we burn another major release number less
> than a year since 2.x going GA.
>


it'll be > 1 year for 2.x to 3.

And to be realistic, the move to java 8+ across the entire hadoop stack
will probably take a year too.


>
> This is why I'd like to keep my original proposal on the table: keep going
> with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
> by April next year. It doesn't need to be a big bang release either. I'd be
> delighted if we could rolling upgrade from one to the other. I just didn't
> want to rule out the inclusion of some very compelling feature outright.
> Trust me though, I'd be the first person to ask about compatibility if such
> a feature does come up.
>
> I'll also posit that people will shy away from using JDK8 features while
> branch-2 remains in active use. There's definitely some new shiny there,
> but nothing compelling enough to me personally when weighed against the
> pain of harder branch-2 backports.
>


branch 2 would be frozen and we'd tell everyone "move to java 7+"; everything
downstream gets updated binaries and a chance to move forwards.

There's another issue, which is one Alejandro highlighted:

-- Forwarded message --
From: Alejandro Abdelnur 
Date: 10 April 2014 10:30
Subject: Re: Plans of moving towards JDK7 in trunk
To: "common-dev@hadoop.apache.org" 


A bit of a different angle.

As the bottom of the stack Hadoop has to be conservative in adopting
things, but it should not preclude consumers of Hadoop (downstream projects
and Hadoop application developers) from having additional requirements such as
a higher JDK API than JDK6.

Hadoop 2.x should stick to using JDK6  API
Hadoop 2.x should be tested with multiple runtimes: JDK6, JDK7 and
eventually JDK8
Downstream projects and Hadoop application developers are free to require
any JDK6+ version for development and runtime.

Hadoop 3.x should allow using JDK7 API, bumping the minimum runtime
requirement to JDK7 and be tested with JDK7 and JDK8 runtimes.

-- Forwarded message --

The minimum version of Java that Hadoop mandates is going to be the minimum
version of Java that the entire stack has to adopt, and the minimum version
of Java that has to be run in the datacentre.
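
To illustrate what a mandated minimum runtime means in practice, here is a minimal sketch of a startup check. It is not taken from Hadoop; the class name, the helper methods and the required version of 7 are all assumptions made for illustration. It relies only on the standard `java.specification.version` system property, which reports "1.6", "1.7" or "1.8" on older JVMs and "9", "11" and so on afterwards.

```java
public class MinimumJavaCheck {

    // Hypothetical helper: turn a "java.specification.version" string into a major number.
    // Pre-9 JVMs report "1.6", "1.7", "1.8"; later JVMs report "9", "11", and so on.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    // True when the given runtime version satisfies the minimum a release mandates.
    static boolean meetsMinimum(String specVersion, int requiredMajor) {
        return majorVersion(specVersion) >= requiredMajor;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        int required = 7; // assumed minimum, matching the "Hadoop 3.x on JDK7" idea discussed here
        if (!meetsMinimum(running, required)) {
            System.err.println("Java " + required + "+ required, found " + running);
            System.exit(1);
        }
        System.out.println("Java " + running + " is new enough");
    }
}
```

Whatever number goes into `required` at the bottom of the stack becomes the floor for every downstream project and every datacentre JVM, which is why the choice is contentious.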

I wonder about how easy it will be for us all to go to the big hadoop
sites and say "java 8+ only", as well as to all those Hadoop projects that
want to run on java 7 and say "upgrade time". I think we'll hit a lot of
inertia -and, to be fair- it's due to Hadoop core's long-standing support
for Java 6. If Hadoop 2.x had always been java7+ it would be simpler, but
we all know the trauma of getting hadoop 2.2 out the door and our lack of
enthusiasm for any major dependency updates apart from the protobuf one.


Re: Plans of moving towards JDK7 in trunk

2014-06-21 Thread Andrew Wang
Hi Steve, let me confirm that I understand your proposal correctly:

- Release an intermediate Hadoop 3 a few months out, based on JDK7 and with
bumped library versions
- Release a Hadoop 4 mid next year, based on JDK8

I question the utility of an intermediate Hadoop 3 like this. Assuming that
it gets out in September (i.e. roughly when a 2.6 would land), we're
looking at a valid lifespan of about 7 months before JDK7 is EOL in April.
If this release also breaks compatibility by changing library versions,
then it looks less and less appealing from a user perspective. I suspect it
would end up seeing low adoption as everyone waits (at most) 7 months for
the JDK8-based release to emerge.

I'd be more okay with an intermediate release with no incompatible changes
whatsoever besides bumping the JDK requirement to JDK7. However, it'd still
be a weak release considering that branch-2 already runs fine on JDK7, and
it looks somewhat bad publicly as we burn another major release number less
than a year since 2.x going GA.

This is why I'd like to keep my original proposal on the table: keep going
with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
by April next year. It doesn't need to be a big bang release either. I'd be
delighted if we could rolling upgrade from one to the other. I just didn't
want to rule out the inclusion of some very compelling feature outright.
Trust me though, I'd be the first person to ask about compatibility if such
a feature does come up.

I'll also posit that people will shy away from using JDK8 features while
branch-2 remains in active use. There's definitely some new shiny there,
but nothing compelling enough to me personally when weighed against the
pain of harder branch-2 backports.

Let's try to keep this thread focused on the planning side of things
though, deferring JDK-feature-related discussion to a different thread.
We'd need to draw up a code-style doc on the wiki, but it sounds like
something Steve and/or I could draft initially.

Thanks,
Andrew


On Fri, Jun 20, 2014 at 10:02 PM, Arun C Murthy  wrote:

>
> On Jun 20, 2014, at 9:51 PM, Steve Loughran 
> wrote:
>
> > On 20 June 2014 21:35, Steve Loughran  wrote:
> >
> >>
> >> This actually argues in favour of
> >>
> >> -renaming branch-2 branch-3 after a release
> >> -making trunk hadoop-4
> >>
> >> -getting hadoop 3 released off the new branch-3 out in 2014, effectively
> >> being an iteration of branch-2 with updated java , moves of (off?) guava,
> >> off jetty, lib changes, but no other significant "big bang" features
> >>
> >>
> >> Hadoop 4.x then becomes the 2015 release, which can add more stuff. In
> >> particular, anything that goes into Hadoop 4 for which there's no intent to
> >> support in hadoop 2 & 3, can use the java 8 language features sooner rather
> >> than later.
> >>
> >>
> >>
> > I should add that I'm willing to be the person who gets the Java-7 based
> > Hadoop  3.x out the door later this year
>
> +1 that makes sense to me. Thanks for volunteering Steve - I'm glad to
> share the pain… ;-)
>
> Arun


Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Arun C Murthy

On Jun 20, 2014, at 9:51 PM, Steve Loughran  wrote:

> On 20 June 2014 21:35, Steve Loughran  wrote:
> 
>> 
>> This actually argues in favour of
>> 
>> -renaming branch-2 branch-3 after a release
>> -making trunk hadoop-4
>> 
>> -getting hadoop 3 released off the new branch-3 out in 2014, effectively
>> being an iteration of branch-2 with updated java , moves of (off?) guava,
>> off jetty, lib changes, but no other significant "big bang" features
>> 
>> 
>> Hadoop 4.x then becomes the 2015 release, which can add more stuff. In
>> particular, anything that goes into Hadoop 4 for which there's no intent to
>> support in hadoop 2 & 3, can use the java 8 language features sooner rather
>> than later.
>> 
>> 
>> 
> I should add that I'm willing to be the person who gets the Java-7 based
> Hadoop  3.x out the door later this year

+1 that makes sense to me. Thanks for volunteering Steve - I'm glad to share 
the pain… ;-)

Arun


Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Steve Loughran
On 20 June 2014 21:35, Steve Loughran  wrote:

>
> This actually argues in favour of
>
> -renaming branch-2 branch-3 after a release
> -making trunk hadoop-4
>
> -getting hadoop 3 released off the new branch-3 out in 2014, effectively
> being an iteration of branch-2 with updated java , moves of (off?) guava,
> off jetty, lib changes, but no other significant "big bang" features
>
>
> Hadoop 4.x then becomes the 2015 release, which can add more stuff. In
> particular, anything that goes into Hadoop 4 for which there's no intent to
> support in hadoop 2 & 3, can use the java 8 language features sooner rather
> than later.
>
>
>
I should add that I'm willing to be the person who gets the Java-7 based
Hadoop  3.x out the door later this year



Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Steve Loughran
Having gone back through the entire thread I can see we've made progress
here, as the discussion has moved on from when to move to java 7 to when to
move to java 8, something I've always felt the appeal of from the
coding side. Java 8 tomorrow is the most compelling reason to move to java
7 today.

It also sets us up to thinking about java 9, where there are already early
access releases available on java.net.

But, as Colin McCabe  wrote on 14 April 2014 09:22,
> I think the bottom line here is that as long as our stable release
> uses JDK6, there is going to be a very, very strong disincentive to
> put any code which can't run on JDK6 into trunk.

and there's a problem. I'm seeing pushback now on flipping the java7 bit;
I can imagine how a patch that went to java 8 and added some of the new
streams operations would go down. Java 8 is radically different enough
code-wise from java 6 that if you embrace those new features, you don't
stand a chance of backporting.
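
As a rough illustration of the backporting concern (a sketch, not code from Hadoop; the class and method names are hypothetical), the same small operation is written below first in a style that compiles at Java 6 source level and then with Java 8 streams and lambdas. The second method cannot be compiled with a Java 6 or Java 7 source level, so a patch written that way would have to be rewritten by hand before it could go into an older branch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class BackportSketch {

    // Java 6-compatible style: explicit loop, explicit generic type arguments, no lambdas.
    static List<String> longNamesJava6(List<String> names, int minLength) {
        List<String> result = new ArrayList<String>();
        for (String name : names) {
            if (name.length() >= minLength) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Java 8 style: streams, a lambda and a method reference.
    // This method needs Java 8 source level and the java.util.stream API.
    static List<String> longNamesJava8(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() >= minLength)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("hdfs", "yarn", "mapreduce");
        System.out.println(longNamesJava6(names, 5));
        System.out.println(longNamesJava8(names, 5));
    }
}
```

Both methods return the same list for the same input; the difference is purely in which compilers and runtimes can accept the code.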

We need to move to a more recent java version in release hadoop, so that
trunk & backported code can use java 7 code and libraries. Then trunk can
flip the java 8 jvm bit -while still using java7 language- for as long as
we plan to be able to move code/patches from trunk to release.

This actually argues in favour of

-renaming branch-2 branch-3 after a release
-making trunk hadoop-4

-getting hadoop 3 released off the new branch-3 out in 2014, effectively
being an iteration of branch-2 with updated java , moves of (off?) guava,
off jetty, lib changes, but no other significant "big bang" features


Hadoop 4.x then becomes the 2015 release, which can add more stuff. In
particular, anything that goes into Hadoop 4 for which there's no intent to
support in hadoop 2 & 3, can use the java 8 language features sooner rather
than later.

-Steve



Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Steve Loughran
On 20 June 2014 17:01, Andrew Wang  wrote:

> Thanks everyone for the discussion so far. I talked with some of our other
> teams and thought about the issue some more.
>
> Regarding branch-2, we can't do much because of compatibility. Dropping
> support for a JDK is supposed to happen in a major release. I think we all
> understand this though, so it's not really under discussion.
>

...although we can just rename "hadoop 2.6" as "hadoop 3.0" and make that
the java 7 switch.


>
> Regarding trunk, I think that leapfrogging to JDK8 is the right move. JDK7
> is EOL April next year, so it'd be better to avoid going through this pain
> twice so soon. Developer momentum also seems very strong behind JDK8
> because of all the shiny new features, so I think we'll see quick adoption.
> We also need some time to clean up APIs and I'm sure people have big,
> incompatible project ideas floating around they'd like to get in.
>


> With the JDK7 EOL in mind, we need a JDK8-based 3.0 release by mid next
> year. Since I have a strong interest in all these things, I'd like to
> volunteer as release manager for this beast. This means, yep, I'll wrangle
> the builds, worry about compat, bump lib versions, and all those other fun
> tasks. There's clearly a lot to discuss logistically (let's take that to a
> different thread), but this feels like the right way forward to me.
>
> Best,
> Andrew
>
>

I feel the appeal of a jump to java 8, but also fear that it will postpone
that release even more.

If we had a java 7 flag today, we could think -as Raymie proposed- about
having a hadoop-only-runs-on-java7 release relatively easily. There's no
technical cost to "migrate" to java7, as it is effectively the java version
hadoop is running on. All we would be doing is documenting the fact.

In contrast, even making sure the entire Hadoop stack runs on Java 8 is a
major undertaking -which I know, as TWILL-82 shows that it isn't widely
tested. That's making sure it works -not even the big project ideas and any
java 8 migration.

which is something that worries me here: "big, incompatible project ideas
floating around they'd like to get in."

There's a risk that this becomes an opportunity for everything to go in, it
ends up taking too long and being pushed out, and Hadoop 2.x stays frozen in
java 6 mode for its code and all its dependencies, for at least another year
-which is what is being proposed here.

-steve



Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Aaron T. Myers
ebated, they are quite minor. On the other
> > hand,
> > > I
> > > >> would imagine discussion and debate on what 8+ language features
> might
> > > be
> > > >> useful to use at some future time could be a lively one.
> > > >>
> > > >>
> > > >>
> > > >> On Wed, Jun 18, 2014 at 3:03 PM, Colin McCabe <
> cmcc...@alumni.cmu.edu
> > >
> > > >> wrote:
> > > >>
> > > >>> In CDH5, Cloudera encourages people to use JDK7.  JDK6 has been EOL
> > > >>> for a while now and is not something we recommend.
> > > >>>
> > > >>> As we discussed before, everyone is in favor of upgrading to JDK7.
> > > >>> Every cluster operator of a reasonably modern Hadoop should do
> it
> > > >>> whatever distro or release you run.  As developers, we run JDK7 as
> > > >>> well.
> > > >>>
> > > >>> I'd just like to see a plan for when branch-2 (or some other
> branch)
> > > >>> will create a stable release that drops support for JDK1.6.  If we
> > > >>> don't have such a plan, I feel like it's too early to talk about
> this
> > > >>> stuff.
> > > >>>
> > > >>> If we drop support for 1.6 in trunk but not in branch-2, we are
> > > >>> fragmenting the project.  People will start writing unreleaseable
> > code
> > > >>> (because it doesn't work on branch-2) and we'll be back to the bad
> > old
> > > >>> days of Hadoop version fragmentation that branch-2 was intended to
> > > >>> solve.  Backports will become harder.  The biggest problem is that
> > > >>> trunk will start to depend on libraries or Maven plugins that
> > branch-2
> > > >>> can't even use, because they're JDK7+-only.
> > > >>>
> > > >>> Steve wrote: "if someone actually did file a bug on something on
> > > >>> branch-2 which didn't work on Java 6 but went away on Java7+, we'd
> > > >>> probably close it as a WORKSFORME".
> > > >>>
> > > >>> Steve, if this is true, we should just bump the minimum supported
> > > >>> version for branch-2 to 1.7 today and resolve this.  If we truly
> > > >>> believe that there are no issues here, then let's just decide to
> drop
> > > >>> 1.6 in a specific future release of Hadoop 2.  If there are issues
> > > >>> with releasing JDK1.7+ only code, then let's figure out what they
> are
> > > >>> before proceeding.
> > > >>>
> > > >>> best,
> > > >>> Colin
> > > >>>
> > > >>>
> > > >>> On Wed, Jun 18, 2014 at 1:41 PM, Sandy Ryza <
> sandy.r...@cloudera.com
> > >
> > > >>> wrote:
> > > >>> > We do release warnings when we are aware of vulnerabilities in
> our
> > > >>> > dependencies.
> > > >>> >
> > > >>> > However, unless I'm grossly misunderstanding, the vulnerability
> > that
> > > you
> > > >>> > point out is not a vulnerability within the context of our
> > software.
> > > >>> >  Hadoop doesn't try to sandbox within JVMs.  In a secure setup,
> any
> > > JVM
> > > >>> > running non-trusted user code is running as that user, so
> "breaking
> > > out"
> > > >>> > doesn't offer the ability to do anything malicious.
> > > >>> >
> > > >>> > -Sandy
> > > >>> >
> > > >>> > On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi <
> > > >>> davi.ottenhei...@emc.com
> > > >>> >> wrote:
> > > >>> >
> > > >>> >> Andrew,
> > > >>> >>
> > > >>> >>
> > > >>> >>
> > > >>> >> “I don't see any point to switching” is an interesting
> > perspective,
> > > >>> given
> > > >>> >> the well-known risks of running unsafe software. Clearly
> customer
> > > best
> > > >>> >> interest is stability. JDK6 is in a known unsafe state. The
>

Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Andrew Wang
 JDK7.
> > >>> Every cluster operator of a reasonably modern Hadoop should do it
> > >>> whatever distro or release you run.  As developers, we run JDK7 as
> > >>> well.
> > >>>
> > >>> I'd just like to see a plan for when branch-2 (or some other branch)
> > >>> will create a stable release that drops support for JDK1.6.  If we
> > >>> don't have such a plan, I feel like it's too early to talk about this
> > >>> stuff.
> > >>>
> > >>> If we drop support for 1.6 in trunk but not in branch-2, we are
> > >>> fragmenting the project.  People will start writing unreleaseable
> code
> > >>> (because it doesn't work on branch-2) and we'll be back to the bad
> old
> > >>> days of Hadoop version fragmentation that branch-2 was intended to
> > >>> solve.  Backports will become harder.  The biggest problem is that
> > >>> trunk will start to depend on libraries or Maven plugins that
> branch-2
> > >>> can't even use, because they're JDK7+-only.
> > >>>
> > >>> Steve wrote: "if someone actually did file a bug on something on
> > >>> branch-2 which didn't work on Java 6 but went away on Java7+, we'd
> > >>> probably close it as a WORKSFORME".
> > >>>
> > >>> Steve, if this is true, we should just bump the minimum supported
> > >>> version for branch-2 to 1.7 today and resolve this.  If we truly
> > >>> believe that there are no issues here, then let's just decide to drop
> > >>> 1.6 in a specific future release of Hadoop 2.  If there are issues
> > >>> with releasing JDK1.7+ only code, then let's figure out what they are
> > >>> before proceeding.
> > >>>
> > >>> best,
> > >>> Colin
> > >>>
> > >>>
> > >>> On Wed, Jun 18, 2014 at 1:41 PM, Sandy Ryza  >
> > >>> wrote:
> > >>> > We do release warnings when we are aware of vulnerabilities in our
> > >>> > dependencies.
> > >>> >
> > >>> > However, unless I'm grossly misunderstanding, the vulnerability
> that
> > you
> > >>> > point out is not a vulnerability within the context of our
> software.
> > >>> >  Hadoop doesn't try to sandbox within JVMs.  In a secure setup, any
> > JVM
> > >>> > running non-trusted user code is running as that user, so "breaking
> > out"
> > >>> > doesn't offer the ability to do anything malicious.
> > >>> >
> > >>> > -Sandy
> > >>> >
> > >>> > On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi <
> > >>> davi.ottenhei...@emc.com
> > >>> >> wrote:
> > >>> >
> > >>> >> Andrew,
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> “I don't see any point to switching” is an interesting
> perspective,
> > >>> given
> > >>> >> the well-known risks of running unsafe software. Clearly customer
> > best
> > >>> >> interest is stability. JDK6 is in a known unsafe state. The longer
> > >>> anyone
> > >>> >> delays the necessary transition to safety the longer the door is
> > left
> > >>> open
> > >>> >> to predictable disaster.
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> You also said "we still test and support JDK6". I searched but
> have
> > not
> > >>> >> been able to find Cloudera critical security fixes for JDK6.
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465?
> In
> > >>> other
> > >>> >> words, did you release to your customers any kind of public alert
> or
> > >>> >> warning of this CVSS 10.0 event as part of your JDK6 support?
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> http://www.cvedetails.com/cve/CVE-2013-2465/
> > >>> >>
> > >>> >>
> > >>> >

Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Bryan Beaudreault
c future release of Hadoop 2.  If there are issues
> >>> with releasing JDK1.7+ only code, then let's figure out what they are
> >>> before proceeding.
> >>>
> >>> best,
> >>> Colin
> >>>
> >>>
> >>> On Wed, Jun 18, 2014 at 1:41 PM, Sandy Ryza 
> >>> wrote:
> >>> > We do release warnings when we are aware of vulnerabilities in our
> >>> > dependencies.
> >>> >
> >>> > However, unless I'm grossly misunderstanding, the vulnerability that
> you
> >>> > point out is not a vulnerability within the context of our software.
> >>> >  Hadoop doesn't try to sandbox within JVMs.  In a secure setup, any
> JVM
> >>> > running non-trusted user code is running as that user, so "breaking
> out"
> >>> > doesn't offer the ability to do anything malicious.
> >>> >
> >>> > -Sandy
> >>> >
> >>> > On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi <
> >>> davi.ottenhei...@emc.com
> >>> >> wrote:
> >>> >
> >>> >> Andrew,
> >>> >>
> >>> >>
> >>> >>
> >>> >> “I don't see any point to switching” is an interesting perspective,
> >>> given
> >>> >> the well-known risks of running unsafe software. Clearly customer
> best
> >>> >> interest is stability. JDK6 is in a known unsafe state. The longer
> >>> anyone
> >>> >> delays the necessary transition to safety the longer the door is
> left
> >>> open
> >>> >> to predictable disaster.
> >>> >>
> >>> >>
> >>> >>
> >>> >> You also said "we still test and support JDK6". I searched but have
> not
> >>> >> been able to find Cloudera critical security fixes for JDK6.
> >>> >>
> >>> >>
> >>> >>
> >>> >> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In
> >>> other
> >>> >> words, did you release to your customers any kind of public alert or
> >>> >> warning of this CVSS 10.0 event as part of your JDK6 support?
> >>> >>
> >>> >>
> >>> >>
> >>> >> http://www.cvedetails.com/cve/CVE-2013-2465/
> >>> >>
> >>> >>
> >>> >>
> >>> >> If you are not releasing your own security fixes for JDK6 post-EOL
> would
> >>> >> it perhaps be safer to say Cloudera is hands-off; neither supports,
> nor
> >>> >> opposes the known insecure and deprecated/unpatched JDK?
> >>> >>
> >>> >>
> >>> >>
> >>> >> I mentioned before in this thread the Oracle support timeline:
> >>> >>
> >>> >>
> >>> >>
> >>> >> - official public EOL (end of life) was more than a year ago
> >>> >>
> >>> >> - premier support ended more than six months ago
> >>> >>
> >>> >> - extended support may get critical security fixes until the end of
> 2016
> >>> >>
> >>> >>
> >>> >>
> >>> >> Given this timeline, does Cloudera officially take responsibility
> for
> >>> >> Hadoop customer safety? Are you going to be releasing critical
> security
> >>> >> fixes to a known unsafe JDK?
> >>> >>
> >>> >>
> >>> >>
> >>> >> Davi
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> > -Original Message-
> >>> >>
> >>> >> > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
> >>> >>
> >>> >> > Sent: Wednesday, June 18, 2014 12:33 PM
> >>> >>
> >>> >> > To: common-dev@hadoop.apache.org
> >>> >>
> >>> >> > Subject: Re: Plans of moving towards JDK7 in trunk
> >>> >>
> >>> >> >
> >>> >>
> >>> >> > Actually, a lot of our customers are still on JDK6, so if
> anythin

Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Colin McCabe
> wrote:
>>> >
>>> >> Andrew,
>>> >>
>>> >>
>>> >>
>>> >> “I don't see any point to switching” is an interesting perspective,
>>> given
>>> >> the well-known risks of running unsafe software. Clearly customer best
>>> >> interest is stability. JDK6 is in a known unsafe state. The longer
>>> anyone
>>> >> delays the necessary transition to safety the longer the door is left
>>> open
>>> >> to predictable disaster.
>>> >>
>>> >>
>>> >>
>>> >> You also said "we still test and support JDK6". I searched but have not
>>> >> been able to find Cloudera critical security fixes for JDK6.
>>> >>
>>> >>
>>> >>
>>> >> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In
>>> other
>>> >> words, did you release to your customers any kind of public alert or
>>> >> warning of this CVSS 10.0 event as part of your JDK6 support?
>>> >>
>>> >>
>>> >>
>>> >> http://www.cvedetails.com/cve/CVE-2013-2465/
>>> >>
>>> >>
>>> >>
>>> >> If you are not releasing your own security fixes for JDK6 post-EOL would
>>> >> it perhaps be safer to say Cloudera is hands-off; neither supports, nor
>>> >> opposes the known insecure and deprecated/unpatched JDK?
>>> >>
>>> >>
>>> >>
>>> >> I mentioned before in this thread the Oracle support timeline:
>>> >>
>>> >>
>>> >>
>>> >> - official public EOL (end of life) was more than a year ago
>>> >>
>>> >> - premier support ended more than six months ago
>>> >>
>>> >> - extended support may get critical security fixes until the end of 2016
>>> >>
>>> >>
>>> >>
>>> >> Given this timeline, does Cloudera officially take responsibility for
>>> >> Hadoop customer safety? Are you going to be releasing critical security
>>> >> fixes to a known unsafe JDK?
>>> >>
>>> >>
>>> >>
>>> >> Davi
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> > -Original Message-
>>> >>
>>> >> > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
>>> >>
>>> >> > Sent: Wednesday, June 18, 2014 12:33 PM
>>> >>
>>> >> > To: common-dev@hadoop.apache.org
>>> >>
>>> >> > Subject: Re: Plans of moving towards JDK7 in trunk
>>> >>
>>> >> >
>>> >>
>>> >> > Actually, a lot of our customers are still on JDK6, so if anything,
>>> its
>>> >> popularity
>>> >>
>>> >> > hasn't significantly decreased. We still test and support JDK6 for
>>> CDH4
>>> >> and
>>> >>
>>> >> > CDH5. The claim that branch-2 is effectively JDK7 because no one
>>> supports
>>> >>
>>> >> > JDK6 is untrue.
>>> >>
>>> >> >
>>> >>
>>> >> > One issue with your proposal is that java 7+ libraries can have
>>> >> incompatible
>>> >>
>>> >> > APIs compared to their java 6 versions. Guava moves very quickly with
>>> >> regard
>>> >>
>>> >> > to the deprecate+remove cycle. This means branch-2 and trunk
>>> divergence,
>>> >>
>>> >> > as we're stuck using different Guava APIs to do the same thing.
>>> >>
>>> >> >
>>> >>
>>> >> > No one's arguing against moving to Java 7+ in trunk eventually, but
>>> >> there isn't
>>> >>
>>> >> > a clear plan for a trunk-based release. I don't see any point to
>>> >> switching trunk
>>> >>
>>> >> > over until that's true, for the aforementioned reasons.
>>> >>
>>> >> >
>>> >>
>>> >> > Best,
>>> >>
>>> >> > Andrew
>>

Re: Plans of moving towards JDK7 in trunk

2014-06-20 Thread Colin McCabe
>> open
>> >> to predictable disaster.
>> >>
>> >>
>> >>
>> >> You also said "we still test and support JDK6". I searched but have not
>> >> been able to find Cloudera critical security fixes for JDK6.
>> >>
>> >>
>> >>
>> >> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In
>> other
>> >> words, did you release to your customers any kind of public alert or
>> >> warning of this CVSS 10.0 event as part of your JDK6 support?
>> >>
>> >>
>> >>
>> >> http://www.cvedetails.com/cve/CVE-2013-2465/
>> >>
>> >>
>> >>
>> >> If you are not releasing your own security fixes for JDK6 post-EOL would
>> >> it perhaps be safer to say Cloudera is hands-off; neither supports, nor
>> >> opposes the known insecure and deprecated/unpatched JDK?
>> >>
>> >>
>> >>
>> >> I mentioned before in this thread the Oracle support timeline:
>> >>
>> >>
>> >>
>> >> - official public EOL (end of life) was more than a year ago
>> >>
>> >> - premier support ended more than six months ago
>> >>
>> >> - extended support may get critical security fixes until the end of 2016
>> >>
>> >>
>> >>
>> >> Given this timeline, does Cloudera officially take responsibility for
>> >> Hadoop customer safety? Are you going to be releasing critical security
>> >> fixes to a known unsafe JDK?
>> >>
>> >>
>> >>
>> >> Davi
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> > -Original Message-
>> >>
>> >> > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
>> >>
>> >> > Sent: Wednesday, June 18, 2014 12:33 PM
>> >>
>> >> > To: common-dev@hadoop.apache.org
>> >>
>> >> > Subject: Re: Plans of moving towards JDK7 in trunk
>> >>
>> >> >
>> >>
>> >> > Actually, a lot of our customers are still on JDK6, so if anything,
>> its
>> >> popularity
>> >>
>> >> > hasn't significantly decreased. We still test and support JDK6 for
>> CDH4
>> >> and
>> >>
>> >> > CDH5. The claim that branch-2 is effectively JDK7 because no one
>> supports
>> >>
>> >> > JDK6 is untrue.
>> >>
>> >> >
>> >>
>> >> > One issue with your proposal is that java 7+ libraries can have
>> >> incompatible
>> >>
>> >> > APIs compared to their java 6 versions. Guava moves very quickly with
>> >> regard
>> >>
>> >> > to the deprecate+remove cycle. This means branch-2 and trunk
>> divergence,
>> >>
>> >> > as we're stuck using different Guava APIs to do the same thing.
>> >>
>> >> >
>> >>
>> >> > No one's arguing against moving to Java 7+ in trunk eventually, but
>> >> there isn't
>> >>
>> >> > a clear plan for a trunk-based release. I don't see any point to
>> >> switching trunk
>> >>
>> >> > over until that's true, for the aforementioned reasons.
>> >>
>> >> >
>> >>
>> >> > Best,
>> >>
>> >> > Andrew
>> >>
>> >> >
>> >>
>> >> >
>> >>
>> >> > On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran
>> >>
>> >> > mailto:ste...@hortonworks.com>>
>> >>
>> >> > wrote:
>> >>
>> >> >
>> >>
>> >> > > I also think we need to recognise that its been three months since
>> >>
>> >> > > that last discussion, and Java 6 has not suddenly burst back into
>> >>
>> >> > > popularity
>> >>
>> >> > >
>> >>
>> >> > >
>> >>
>> >> > >- nobody providing commercial support for Hadoop is offering
>> >> branch-2
>> >>
>> >> > >support on Java 6 AFAIK
>> >>
>> >> > >- therefore, 

Re: Plans of moving towards JDK7 in trunk

2014-06-19 Thread Andrew Purtell
support timeline:
> >>
> >>
> >>
> >> - official public EOL (end of life) was more than a year ago
> >>
> >> - premier support ended more than six months ago
> >>
> >> - extended support may get critical security fixes until the end of 2016
> >>
> >>
> >>
> >> Given this timeline, does Cloudera officially take responsibility for
> >> Hadoop customer safety? Are you going to be releasing critical security
> >> fixes to a known unsafe JDK?
> >>
> >>
> >>
> >> Davi
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> > -Original Message-
> >>
> >> > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
> >>
> >> > Sent: Wednesday, June 18, 2014 12:33 PM
> >>
> >> > To: common-dev@hadoop.apache.org
> >>
> >> > Subject: Re: Plans of moving towards JDK7 in trunk
> >>
> >> >
> >>
> >> > Actually, a lot of our customers are still on JDK6, so if anything,
> its
> >> popularity
> >>
> >> > hasn't significantly decreased. We still test and support JDK6 for
> CDH4
> >> and
> >>
> >> > CDH5. The claim that branch-2 is effectively JDK7 because no one
> supports
> >>
> >> > JDK6 is untrue.
> >>
> >> >
> >>
> >> > One issue with your proposal is that java 7+ libraries can have
> >> incompatible
> >>
> >> > APIs compared to their java 6 versions. Guava moves very quickly with
> >> regard
> >>
> >> > to the deprecate+remove cycle. This means branch-2 and trunk
> divergence,
> >>
> >> > as we're stuck using different Guava APIs to do the same thing.
> >>
> >> >
> >>
> >> > No one's arguing against moving to Java 7+ in trunk eventually, but
> >> there isn't
> >>
> >> > a clear plan for a trunk-based release. I don't see any point to
> >> switching trunk
> >>
> >> > over until that's true, for the aforementioned reasons.
> >>
> >> >
> >>
> >> > Best,
> >>
> >> > Andrew
> >>
> >> >
> >>
> >> >
> >>
> >> > On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran
> >>
> >> > mailto:ste...@hortonworks.com>>
> >>
> >> > wrote:
> >>
> >> >
> >>
> >> > > I also think we need to recognise that its been three months since
> >>
> >> > > that last discussion, and Java 6 has not suddenly burst back into
> >>
> >> > > popularity
> >>
> >> > >
> >>
> >> > >
> >>
> >> > >- nobody providing commercial support for Hadoop is offering
> >> branch-2
> >>
> >> > >support on Java 6 AFAIK
> >>
> >> > >- therefore, nobody is testing it at scale except privately, and
> >> they
> >>
> >> > >aren't reporting bugs if they are
> >>
> >> > >- if someone actually did file a bug on something on branch-2
> which
> >>
> >> > >didn't work on Java 6 but went away on Java7+, we'd probably
> close
> >>
> >> > > it as a
> >>
> >> > >WORKSFORME
> >>
> >> > >
> >>
> >> > >
> >>
> >> > > whether we acknowledge it or not, Hadoop 2.x is now really Java 7+.
> >>
> >> > >
> >>
> >> > > We do all agree that hadoop 3 will not be java 6, so the only issue
> is
> >>
> >> > > "when and how to make that transition".
> >>
> >> > >
> >>
> >> > > That patch of mine just makes it possible to do today.
> >>
> >> > >
> >>
> >> > > I have actually jumped to Java7 in the slider project, and actually
> >>
> >> > > being using Java 8 and twill; the new language features there are
> >>
> >> > > significant and would be great to use in Hadoop *at some point in
> the
> >>
> >> > > future*
> >>
> >> > >
> >>
> >> > > For Java 7 though, based on that exp

Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Colin McCabe
In CDH5, Cloudera encourages people to use JDK7.  JDK6 has been EOL
for a while now and is not something we recommend.

As we discussed before, everyone is in favor of upgrading to JDK7.
Every cluster operator of a reasonably modern Hadoop should do it
whatever distro or release you run.  As developers, we run JDK7 as
well.

I'd just like to see a plan for when branch-2 (or some other branch)
will create a stable release that drops support for JDK1.6.  If we
don't have such a plan, I feel like it's too early to talk about this
stuff.

If we drop support for 1.6 in trunk but not in branch-2, we are
fragmenting the project.  People will start writing unreleasable code
(because it doesn't work on branch-2) and we'll be back to the bad old
days of Hadoop version fragmentation that branch-2 was intended to
solve.  Backports will become harder.  The biggest problem is that
trunk will start to depend on libraries or Maven plugins that branch-2
can't even use, because they're JDK7+-only.

Steve wrote: "if someone actually did file a bug on something on
branch-2 which didn't work on Java 6 but went away on Java7+, we'd
probably close it as a WORKSFORME".

Steve, if this is true, we should just bump the minimum supported
version for branch-2 to 1.7 today and resolve this.  If we truly
believe that there are no issues here, then let's just decide to drop
1.6 in a specific future release of Hadoop 2.  If there are issues
with releasing JDK1.7+ only code, then let's figure out what they are
before proceeding.
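
As an illustration only of what "bumping the minimum supported version to
1.7" could mean at runtime, here is a minimal sketch of a startup guard. It
is not taken from the Hadoop codebase: the class name MinJavaVersion, the
constant REQUIRED_MAJOR and both helper methods are hypothetical. It relies
only on the standard java.specification.version system property, which
reports "1.6", "1.7" or "1.8" on older JVMs and "9", "11" and so on on newer
ones.

```java
// Hypothetical startup guard; names are illustrative, not Hadoop APIs.
final class MinJavaVersion {
    /** Assumed floor for this sketch: Java 7. */
    static final int REQUIRED_MAJOR = 7;

    /** Maps "1.6" to 6, "1.7" to 7, "1.8" to 8, and "9" or "11" to themselves. */
    static int parseMajor(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Legacy scheme: the major version is the second component of "1.x".
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True when the given specification version meets the assumed floor. */
    static boolean isSupported(String specVersion) {
        return parseMajor(specVersion) >= REQUIRED_MAJOR;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        if (!isSupported(running)) {
            System.err.println("Java " + REQUIRED_MAJOR
                + "+ is required, found " + running);
            System.exit(1);
        }
        System.out.println("Running on Java specification version " + running);
    }
}
```

A guard like this only produces its message if the class itself is compiled
for the older bytecode target; a class compiled for Java 7 would fail to load
on a Java 6 JVM before the check ever ran. A check at build time would likely
be needed as well.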

best,
Colin


On Wed, Jun 18, 2014 at 1:41 PM, Sandy Ryza  wrote:
> We do release warnings when we are aware of vulnerabilities in our
> dependencies.
>
> However, unless I'm grossly misunderstanding, the vulnerability that you
> point out is not a vulnerability within the context of our software.
>  Hadoop doesn't try to sandbox within JVMs.  In a secure setup, any JVM
> running non-trusted user code is running as that user, so "breaking out"
> doesn't offer the ability to do anything malicious.
>
> -Sandy
>
> On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi > wrote:
>
>> Andrew,
>>
>>
>>
>> “I don't see any point to switching” is an interesting perspective, given
>> the well-known risks of running unsafe software. Clearly customer best
>> interest is stability. JDK6 is in a known unsafe state. The longer anyone
>> delays the necessary transition to safety the longer the door is left open
>> to predictable disaster.
>>
>>
>>
>> You also said "we still test and support JDK6". I searched but have not
>> been able to find Cloudera critical security fixes for JDK6.
>>
>>
>>
>> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other
>> words, did you release to your customers any kind of public alert or
>> warning of this CVSS 10.0 event as part of your JDK6 support?
>>
>>
>>
>> http://www.cvedetails.com/cve/CVE-2013-2465/
>>
>>
>>
>> If you are not releasing your own security fixes for JDK6 post-EOL would
>> it perhaps be safer to say Cloudera is hands-off; neither supports, nor
>> opposes the known insecure and deprecated/unpatched JDK?
>>
>>
>>
>> I mentioned before in this thread the Oracle support timeline:
>>
>>
>>
>> - official public EOL (end of life) was more than a year ago
>>
>> - premier support ended more than six months ago
>>
>> - extended support may get critical security fixes until the end of 2016
>>
>>
>>
>> Given this timeline, does Cloudera officially take responsibility for
>> Hadoop customer safety? Are you going to be releasing critical security
>> fixes to a known unsafe JDK?
>>
>>
>>
>> Davi
>>
>>
>>
>>
>>
>>
>>
>> > -Original Message-
>>
>> > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
>>
>> > Sent: Wednesday, June 18, 2014 12:33 PM
>>
>> > To: common-dev@hadoop.apache.org
>>
>> > Subject: Re: Plans of moving towards JDK7 in trunk
>>
>> >
>>
>> > Actually, a lot of our customers are still on JDK6, so if anything, its
>> popularity
>>
>> > hasn't significantly decreased. We still test and support JDK6 for CDH4
>> and
>>
>> > CDH5. The claim that branch-2 is effectively JDK7 because no one supports
>>
>> > JDK6 is untrue.
>>
>> >
>>
>> > One issue with your proposal is that java 7+ libraries can have
>> incompatible
>>
>> > APIs compared to their j

Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Steve Loughran
Most of the security problems in Java are sandbox jailbreaks and are not
relevant here. Anything related to Kerberos, HTTPS or other in-cluster
security issues would be a different story, but I haven't heard of anything.
It's a different matter client-side, but anyone who enables Java in their
web browser is doomed already.

Java security issues may matter developer-side: if you really want to
support Java 6, you need a Java 6 JVM to hand. There's a risk there, but if
you run an OS X box, Apple keeps the old JVMs around for you even after you
upgrade (try /usr/libexec/java_home -V to see this).


On 18 June 2014 13:41, Sandy Ryza  wrote:

> We do release warnings when we are aware of vulnerabilities in our
> dependencies.
>
> However, unless I'm grossly misunderstanding, the vulnerability that you
> point out is not a vulnerability within the context of our software.
>  Hadoop doesn't try to sandbox within JVMs.  In a secure setup, any JVM
> running non-trusted user code is running as that user, so "breaking out"
> doesn't offer the ability to do anything malicious.
>
> -Sandy
>
> On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi <
> davi.ottenhei...@emc.com
> > wrote:
>
> > Andrew,
> >
> >
> >
> > “I don't see any point to switching” is an interesting perspective, given
> > the well-known risks of running unsafe software. Clearly customer best
> > interest is stability. JDK6 is in a known unsafe state. The longer anyone
> > delays the necessary transition to safety the longer the door is left
> open
> > to predictable disaster.
> >
> >
> >
> > You also said "we still test and support JDK6". I searched but have not
> > been able to find Cloudera critical security fixes for JDK6.
> >
> >
> >
> > Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In
> other
> > words, did you release to your customers any kind of public alert or
> > warning of this CVSS 10.0 event as part of your JDK6 support?
> >
> >
> >
> > http://www.cvedetails.com/cve/CVE-2013-2465/
> >
> >
> >
> > If you are not releasing your own security fixes for JDK6 post-EOL would
> > it perhaps be safer to say Cloudera is hands-off; neither supports, nor
> > opposes the known insecure and deprecated/unpatched JDK?
> >
> >
> >
> > I mentioned before in this thread the Oracle support timeline:
> >
> >
> >
> > - official public EOL (end of life) was more than a year ago
> >
> > - premier support ended more than six months ago
> >
> > - extended support may get critical security fixes until the end of 2016
> >
> >
> >
> > Given this timeline, does Cloudera officially take responsibility for
> > Hadoop customer safety? Are you going to be releasing critical security
> > fixes to a known unsafe JDK?
> >
> >
> >
> > Davi
> >
> >
> >
> >
> >
> >
> >
> > > -Original Message-
> >
> > > From: Andrew Wang [mailto:andrew.w...@cloudera.com]
> >
> > > Sent: Wednesday, June 18, 2014 12:33 PM
> >
> > > To: common-dev@hadoop.apache.org
> >
> > > Subject: Re: Plans of moving towards JDK7 in trunk
> >
> > >
> >
> > > Actually, a lot of our customers are still on JDK6, so if anything, its
> > popularity
> >
> > > hasn't significantly decreased. We still test and support JDK6 for CDH4
> > and
> >
> > > CDH5. The claim that branch-2 is effectively JDK7 because no one
> supports
> >
> > > JDK6 is untrue.
> >
> > >
> >
> > > One issue with your proposal is that java 7+ libraries can have
> > incompatible
> >
> > > APIs compared to their java 6 versions. Guava moves very quickly with
> > regard
> >
> > > to the deprecate+remove cycle. This means branch-2 and trunk
> divergence,
> >
> > > as we're stuck using different Guava APIs to do the same thing.
> >
> > >
> >
> > > No one's arguing against moving to Java 7+ in trunk eventually, but
> > there isn't
> >
> > > a clear plan for a trunk-based release. I don't see any point to
> > switching trunk
> >
> > > over until that's true, for the aforementioned reasons.
> >
> > >
> >
> > > Best,
> >
> > > Andrew
> >
> > >
> >
> > >
> >
> > > On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran
> >
> > > mailto:ste...@hortonworks

Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Sandy Ryza
We do release warnings when we are aware of vulnerabilities in our
dependencies.

However, unless I'm grossly misunderstanding, the vulnerability that you
point out is not a vulnerability within the context of our software.
 Hadoop doesn't try to sandbox within JVMs.  In a secure setup, any JVM
running non-trusted user code is running as that user, so "breaking out"
doesn't offer the ability to do anything malicious.

-Sandy

On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi  wrote:

> Andrew,
>
>
>
> “I don't see any point to switching” is an interesting perspective, given
> the well-known risks of running unsafe software. Clearly customer best
> interest is stability. JDK6 is in a known unsafe state. The longer anyone
> delays the necessary transition to safety the longer the door is left open
> to predictable disaster.
>
>
>
> You also said "we still test and support JDK6". I searched but have not
> been able to find Cloudera critical security fixes for JDK6.
>
>
>
> Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other
> words, did you release to your customers any kind of public alert or
> warning of this CVSS 10.0 event as part of your JDK6 support?
>
>
>
> http://www.cvedetails.com/cve/CVE-2013-2465/
>
>
>
> If you are not releasing your own security fixes for JDK6 post-EOL would
> it perhaps be safer to say Cloudera is hands-off; neither supports, nor
> opposes the known insecure and deprecated/unpatched JDK?
>
>
>
> I mentioned before in this thread the Oracle support timeline:
>
>
>
> - official public EOL (end of life) was more than a year ago
>
> - premier support ended more than six months ago
>
> - extended support may get critical security fixes until the end of 2016
>
>
>
> Given this timeline, does Cloudera officially take responsibility for
> Hadoop customer safety? Are you going to be releasing critical security
> fixes to a known unsafe JDK?
>
>
>
> Davi
>
>
>
>
>
>
>

RE: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Ottenheimer, Davi
Andrew,

“I don't see any point to switching” is an interesting perspective, given the
well-known risks of running unsafe software. Clearly the customer's best
interest is stability. JDK6 is in a known unsafe state. The longer anyone
delays the necessary transition to safety, the longer the door is left open to
predictable disaster.

You also said "we still test and support JDK6". I searched but have not been
able to find Cloudera critical security fixes for JDK6.

Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other
words, did you release to your customers any kind of public alert or warning of
this CVSS 10.0 event as part of your JDK6 support?

http://www.cvedetails.com/cve/CVE-2013-2465/

If you are not releasing your own security fixes for JDK6 post-EOL, would it
perhaps be safer to say Cloudera is hands-off; neither supports, nor opposes
the known insecure and deprecated/unpatched JDK?

I mentioned before in this thread the Oracle support timeline:

- official public EOL (end of life) was more than a year ago
- premier support ended more than six months ago
- extended support may get critical security fixes until the end of 2016

Given this timeline, does Cloudera officially take responsibility for Hadoop
customer safety? Are you going to be releasing critical security fixes to a
known unsafe JDK?

Davi


Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Steve Loughran
On 18 June 2014 12:32, Andrew Wang  wrote:

> Actually, a lot of our customers are still on JDK6, so if anything, its
> popularity hasn't significantly decreased. We still test and support JDK6
> for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no
> one supports JDK6 is untrue.
>

Really?  I was misinformed
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Requirements-and-Supported-Versions/cdhrsv_jdk.html



Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Andrew Wang
Actually, a lot of our customers are still on JDK6, so if anything, its
popularity hasn't significantly decreased. We still test and support JDK6
for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no
one supports JDK6 is untrue.

One issue with your proposal is that java 7+ libraries can have
incompatible APIs compared to their java 6 versions. Guava moves very
quickly with regard to the deprecate+remove cycle. This means branch-2 and
trunk divergence, as we're stuck using different Guava APIs to do the same
thing.

No one's arguing against moving to Java 7+ in trunk eventually, but there
isn't a clear plan for a trunk-based release. I don't see any point to
switching trunk over until that's true, for the aforementioned reasons.

Best,
Andrew


On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran 
wrote:

> I also think we need to recognise that its been three months since that
> last discussion, and Java 6 has not suddenly burst back into popularity
>
>
>- nobody providing commercial support for Hadoop is offering branch-2
>support on Java 6 AFAIK
>- therefore, nobody is testing it at scale except privately, and they
>aren't reporting bugs if they are
>- if someone actually did file a bug on something on branch-2 which
>didn't work on Java 6 but went away on Java7+, we'd probably close it
> as a
>WORKSFORME
>
>
> whether we acknowledge it or not, Hadoop 2.x is now really Java 7+.
>
> We do all agree that hadoop 3 will not be java 6, so the only issue is
> "when and how to make that transition".
>
> That patch of mine just makes it possible to do today.
>
> I have actually jumped to Java7 in the slider project, and actually being
> using Java 8 and twill; the new language features there are significant and
> would be great to use in Hadoop *at some point in the future*
>
> For Java 7 though, based on that experience, the language changes are
> convenient but not essential
>
>- try-with-resources simply swallows close failures without the log
>integration we have with IOUtils.closeStream(), so shoudn't be used in
>hadoop core anyway.
>- string based switching: convenient, but not critical
>- type inference on template constructors. Modern IDEs handle the pain
>anyway
>
> The only feature I like is multi-catch and typed rethrow
>
> catch(IOException | ExitException e) {
>  log.warn(e.toString();
>  throw e;
> }
>
> this would make "e" look like Exception, but when rethrown go back to its
> original type.
>
> This reduces duplicate work, and is the bit l actually value. Is it enough
> to justify making code incompatible across branches? No.
>
> So i'm going to propose this, and would like to start a vote on it soon
>
>
>1. we parameterize java versions in the POMs on all branches, with
>separate JDK versions and Java language
>2. branch-2: java-6-language and JDK-6 minimum JDK
>3. trunk: java-6-language and JDK-7 minimum JDK
>
> This would guarantee that none of the java 7 language features went in, but
> we could move trunk up to java 7+ only libraries (jersey, guava). Adopting
> JDK7 features then becomes no more different from adopting java7+
> libraries: those bits of code that have moved can't be backported.
>
> -Steve
>
>
>
>
>
> On 17 June 2014 22:08, Andrew Wang  wrote:
>
> > Reviving this thread, I noticed there's been a patch and +1 on
> > HADOOP-10530, and I don't think we actually reached a conclusion.
> >
> > I (and others) have expressed concerns about moving to JDK7 for trunk.
> > Summarizing a few points:
> >
> > - We can't move to JDK7 in branch-2 because of compatibility
> > - branch-2 is currently the only Hadoop release vehicle, there are no
> plans
> > for a trunk-based Hadoop 3
> > - Introducing JDK7-only APIs in trunk will increase divergence with
> > branch-2 and make backports harder
> > - Almost all developers care only about branch-2, since it is the only
> > release vehicle
> >
> > With this in mind, I struggle to see any upsides to introducing JDK7-only
> > APIs to trunk. Please let's not do anything on HADOOP-10530 or related
> > until we agree on this.
> >
> > Thanks,
> > Andrew
> >
> >
> > On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran 
> > wrote:
> >
> > > On 14 April 2014 17:46, Andrew Purtell  wrote:
> > >
> > > > How well is trunk tested? Does anyone deploy it with real
> applications
> > > > running on top? When will the trunk codebase next be the basis for a
> > > > production release? An impromptu diff of hadoop-common trunk against
> > > > branch-2 as of today is 38,625 lines. Can they be said to be the same
> > > > animal? I ask because any disincentive toward putting code in trunk
> is
> > > > beside the point, if the only target worth pursuing today is branch-2
> > > > unless one doesn't care if the code is released for production use.
> > > > Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter
> > for
> > > > the vast majority of Hadoopers if talking about branch-2

Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Steve Loughran
I also think we need to recognise that it's been three months since that
last discussion, and Java 6 has not suddenly burst back into popularity:


   - nobody providing commercial support for Hadoop is offering branch-2
   support on Java 6 AFAIK
   - therefore, nobody is testing it at scale except privately, and they
   aren't reporting bugs if they are
   - if someone actually did file a bug on something on branch-2 which
   didn't work on Java 6 but went away on Java7+, we'd probably close it as a
   WORKSFORME


Whether we acknowledge it or not, Hadoop 2.x is now really Java 7+.

We do all agree that hadoop 3 will not be java 6, so the only issue is
"when and how to make that transition".

That patch of mine just makes it possible to do today.

I have actually jumped to Java7 in the slider project, and have actually been
using Java 8 and twill; the new language features there are significant and
would be great to use in Hadoop *at some point in the future*.

For Java 7 though, based on that experience, the language changes are
convenient but not essential

   - try-with-resources handles close failures on its own (rethrowing them,
   or attaching them as suppressed exceptions) without the log integration
   we have with IOUtils.closeStream(), so shouldn't be used in hadoop core
   anyway.
   - string based switching: convenient, but not critical
   - type inference on generic constructors (the diamond operator). Modern
   IDEs handle the pain anyway
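
To make the first bullet concrete, here is a minimal sketch of what happens to a close failure under try-with-resources. The class and method names are hypothetical and none of this is Hadoop code; it only assumes standard Java 7 behaviour. Nothing gets logged in either case, which is the gap compared with a logging close helper.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical demo class, not part of Hadoop.
class SuppressedCloseDemo {

    // A resource whose close() always fails, to make the behaviour visible.
    static final class FailingResource implements Closeable {
        @Override
        public void close() throws IOException {
            throw new IOException("close failed");
        }
    }

    // Body succeeds, close() fails: the close failure reaches the caller.
    static String closeOnlyFailure() {
        String result = "no exception";
        try (FailingResource r = new FailingResource()) {
            // nothing goes wrong in the body
        } catch (IOException e) {
            result = e.getMessage();
        }
        return result;
    }

    // Body fails and close() fails: the body's exception is the one thrown,
    // and the close failure is attached to it as a suppressed exception.
    static String bothFail() {
        String result = "no exception";
        try (FailingResource r = new FailingResource()) {
            throw new IOException("body failed");
        } catch (IOException e) {
            result = e.getMessage() + " / suppressed: "
                    + e.getSuppressed()[0].getMessage();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(closeOnlyFailure()); // close failed
        System.out.println(bothFail());         // body failed / suppressed: close failed
    }
}
```

A caller that wants the close failure in the logs has to inspect getSuppressed() itself.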

The only feature I like is multi-catch and typed rethrow

catch(IOException | ExitException e) {
 log.warn(e.toString());
 throw e;
}

this would make "e" look like Exception, but when rethrown go back to its
original type.
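
A self-contained sketch of that pattern, for anyone who wants to try it. ExitException here is a hypothetical stand-in defined locally (assumed to be unchecked), and System.err replaces the logger; the point is only that run() can declare `throws IOException` and still `throw e` from the shared handler.

```java
import java.io.IOException;

// Hypothetical demo class, not part of Hadoop.
class MultiCatchDemo {

    // Local stand-in for an unchecked exit exception; an assumption made
    // so the sketch compiles on its own.
    static final class ExitException extends RuntimeException {
        ExitException(String message) {
            super(message);
        }
    }

    // Declares only IOException: "throw e" compiles because the compiler
    // knows e is either an IOException or the unchecked ExitException.
    static void run(boolean failWithIo) throws IOException {
        try {
            if (failWithIo) {
                throw new IOException("disk error");
            }
            throw new ExitException("exit requested");
        } catch (IOException | ExitException e) {
            System.err.println("warn: " + e); // one shared handler
            throw e;                           // rethrown with its original type
        }
    }

    // Reports which concrete type the caller actually receives.
    static String caughtType(boolean failWithIo) {
        try {
            run(failWithIo);
            return "none";
        } catch (IOException e) {
            return "IOException";
        } catch (ExitException e) {
            return "ExitException";
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtType(true));  // IOException
        System.out.println(caughtType(false)); // ExitException
    }
}
```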

This reduces duplicate work, and is the bit I actually value. Is it enough
to justify making code incompatible across branches? No.

So I'm going to propose this, and would like to start a vote on it soon:


   1. we parameterize java versions in the POMs on all branches, with
   separate settings for the minimum JDK version and the Java language level
   2. branch-2: java-6-language and JDK-6 minimum JDK
   3. trunk: java-6-language and JDK-7 minimum JDK

This would guarantee that none of the java 7 language features went in, but
we could move trunk up to java 7+ only libraries (jersey, guava). Adopting
JDK7 features then becomes no different from adopting java7+
libraries: those bits of code that have moved can't be backported.
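
As a rough illustration of what "java-6-language and JDK-7 minimum JDK" would allow on trunk: the sketch below sticks to Java 6 syntax (no diamond, no try-with-resources, no multi-catch) but calls java.nio.file, which only exists in the JDK 7 class library. The class name is made up for the example.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of Hadoop.
class Jdk7LibraryJava6Syntax {

    // Writes text to a temp file and reads it back using the JDK 7
    // java.nio.file API, with cleanup done in a Java 6 style finally block.
    static String roundTrip(String text) throws IOException {
        Charset utf8 = Charset.forName("UTF-8");
        Path tmp = Files.createTempFile("jdk7-demo", ".txt");
        try {
            Files.write(tmp, text.getBytes(utf8));
            return new String(Files.readAllBytes(tmp), utf8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello")); // hello
    }
}
```

Code like this could not be backported to branch-2 as-is, which is the same constraint as for a java7+ only library.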

-Steve





On 17 June 2014 22:08, Andrew Wang  wrote:

> Reviving this thread, I noticed there's been a patch and +1 on
> HADOOP-10530, and I don't think we actually reached a conclusion.
>
> I (and others) have expressed concerns about moving to JDK7 for trunk.
> Summarizing a few points:
>
> - We can't move to JDK7 in branch-2 because of compatibility
> - branch-2 is currently the only Hadoop release vehicle, there are no plans
> for a trunk-based Hadoop 3
> - Introducing JDK7-only APIs in trunk will increase divergence with
> branch-2 and make backports harder
> - Almost all developers care only about branch-2, since it is the only
> release vehicle
>
> With this in mind, I struggle to see any upsides to introducing JDK7-only
> APIs to trunk. Please let's not do anything on HADOOP-10530 or related
> until we agree on this.
>
> Thanks,
> Andrew
>
>
> On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran 
> wrote:
>
> > On 14 April 2014 17:46, Andrew Purtell  wrote:
> >
> > > How well is trunk tested? Does anyone deploy it with real applications
> > > running on top? When will the trunk codebase next be the basis for a
> > > production release? An impromptu diff of hadoop-common trunk against
> > > branch-2 as of today is 38,625 lines. Can they be said to be the same
> > > animal? I ask because any disincentive toward putting code in trunk is
> > > beside the point, if the only target worth pursuing today is branch-2
> > > unless one doesn't care if the code is released for production use.
> > > Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter
> for
> > > the vast majority of Hadoopers if talking about branch-2.
> > >
> > >
> > I think its partly a timescale issue; its also because the 1-2 transition
> > was so significant, especially at the YARN layer, that it's still taking
> > time to trickle through.
> >
> > If you do want code to ship this year, branch-2 is where you are going to
> > try and get it in -and like you say, that's where things get tried in the
> > field. At the same time, the constraints of stability are holding us back
> > -already-.
> >
> > I don't see why we should have such another major 1-2 transition in
> future;
> > the rate that Arun is pushing out 2.x releases its almost back to the
> 0.1x
> > timescale -though at that point most people were fending for themselves
> and
> > expectations of stability were less. We do want smaller version
> increments
> > in future, which branch-2 is -mostly- delivering.
> >
> > While Java 7 doesn't have some must-have features, Java 8 is a
> significant
> > improvement in the language, and we should be looking ahead to that,
> maybe
> > even doing some leadin

Re: Plans of moving towards JDK7 in trunk

2014-06-18 Thread Colin McCabe
I think we should come up with a plan for when the next Hadoop release
will drop support for JDK6.  We all know that day needs to come... the
only question is when.  I agree that writing the JDK7-only code
doesn't seem very productive unless we have a plan for when it will be
released and usable.

best,
Colin

On Tue, Jun 17, 2014 at 10:08 PM, Andrew Wang  wrote:
> Reviving this thread, I noticed there's been a patch and +1 on
> HADOOP-10530, and I don't think we actually reached a conclusion.
>
> I (and others) have expressed concerns about moving to JDK7 for trunk.
> Summarizing a few points:
>
> - We can't move to JDK7 in branch-2 because of compatibility
> - branch-2 is currently the only Hadoop release vehicle, there are no plans
> for a trunk-based Hadoop 3
> - Introducing JDK7-only APIs in trunk will increase divergence with
> branch-2 and make backports harder
> - Almost all developers care only about branch-2, since it is the only
> release vehicle
>
> With this in mind, I struggle to see any upsides to introducing JDK7-only
> APIs to trunk. Please let's not do anything on HADOOP-10530 or related
> until we agree on this.
>
> Thanks,
> Andrew
>
>
> On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran 
> wrote:
>
>> On 14 April 2014 17:46, Andrew Purtell  wrote:
>>
>> > How well is trunk tested? Does anyone deploy it with real applications
>> > running on top? When will the trunk codebase next be the basis for a
>> > production release? An impromptu diff of hadoop-common trunk against
>> > branch-2 as of today is 38,625 lines. Can they be said to be the same
>> > animal? I ask because any disincentive toward putting code in trunk is
>> > beside the point, if the only target worth pursuing today is branch-2
>> > unless one doesn't care if the code is released for production use.
>> > Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for
>> > the vast majority of Hadoopers if talking about branch-2.
>> >
>> >
>> I think its partly a timescale issue; its also because the 1-2 transition
>> was so significant, especially at the YARN layer, that it's still taking
>> time to trickle through.
>>
>> If you do want code to ship this year, branch-2 is where you are going to
>> try and get it in -and like you say, that's where things get tried in the
>> field. At the same time, the constraints of stability are holding us back
>> -already-.
>>
>> I don't see why we should have such another major 1-2 transition in future;
>> the rate that Arun is pushing out 2.x releases its almost back to the 0.1x
>> timescale -though at that point most people were fending for themselves and
>> expectations of stability were less. We do want smaller version increments
>> in future, which branch-2 is -mostly- delivering.
>>
>> While Java 7 doesn't have some must-have features, Java 8 is a significant
>> improvement in the language, and we should be looking ahead to that, maybe
>> even doing some leading-edge work on the side, so the same discussion
>> doesn't come up in two years time when java 7 goes EOL.
>>
>>
>> -steve
>>
>> (personal opinions only, etc, )
>>
>>
>> >
>> > On Mon, Apr 14, 2014 at 9:22 AM, Colin McCabe wrote:
>> >
>> > > I think the bottom line here is that as long as our stable release
>> > > uses JDK6, there is going to be a very, very strong disincentive to
>> > > put any code which can't run on JDK6 into trunk.
>> > >
>> > > Like I said earlier, the traditional reason for putting something in
>> > > trunk but not the stable release is that it needs more testing.  If a
>> > > stable release that drops support for JDK6 is more than a year away,
>> > > does it make sense to put anything in trunk like that?  What might
>> > > need more than a year of testing?  Certainly not changes to
>> > > LocalFileSystem to use the new APIs.  I also don't think an upgrade to
>> > > various libraries qualifies.
>> > >
>> > > It might be best to shelve this for now, like we've done in the past,
>> > > until we're ready to talk about a stable release that requires JDK7+.
>> > > At least that's my feeling.
>> > >
>> > > If we're really desperate for the new file APIs JDK7 provides, we
>> > > could consider using loadable modules for it in branch-2.  This is
>> > > similar to how we provide JNI versions of certain things on certain
>> > > platforms, without dropping support for the other platforms.
>> > >
>> > > best,
>> > > Colin
>> > >
>> > > On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata 
>> > > wrote:
>> > > > There's an outstanding question addressed to me: "Are there
>> particular
>> > > > features or new dependencies that you would like to contribute (or
>> see
>> > > > contributed) that require using the Java 1.7 APIs?"  The question
>> > > > misses the point: We'd figure out how to write something we wanted to
>> > > > contribute to Hadoop against the APIs of Java4 if that's what it took
>> > > > to get them into a stable release.  And at current course and speed,
>> > > > that's how ridiculous things

Re: Plans of moving towards JDK7 in trunk

2014-06-17 Thread Andrew Wang
Reviving this thread, I noticed there's been a patch and +1 on
HADOOP-10530, and I don't think we actually reached a conclusion.

I (and others) have expressed concerns about moving to JDK7 for trunk.
Summarizing a few points:

- We can't move to JDK7 in branch-2 because of compatibility
- branch-2 is currently the only Hadoop release vehicle; there are no plans
for a trunk-based Hadoop 3
- Introducing JDK7-only APIs in trunk will increase divergence with
branch-2 and make backports harder
- Almost all developers care only about branch-2, since it is the only
release vehicle

With this in mind, I struggle to see any upsides to introducing JDK7-only
APIs to trunk. Please let's not do anything on HADOOP-10530 or related
until we agree on this.

Thanks,
Andrew


On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran 
wrote:

> On 14 April 2014 17:46, Andrew Purtell  wrote:
>
> > How well is trunk tested? Does anyone deploy it with real applications
> > running on top? When will the trunk codebase next be the basis for a
> > production release? An impromptu diff of hadoop-common trunk against
> > branch-2 as of today is 38,625 lines. Can they be said to be the same
> > animal? I ask because any disincentive toward putting code in trunk is
> > beside the point, if the only target worth pursuing today is branch-2
> > unless one doesn't care if the code is released for production use.
> > Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for
> > the vast majority of Hadoopers if talking about branch-2.
> >
> >
> I think its partly a timescale issue; its also because the 1-2 transition
> was so significant, especially at the YARN layer, that it's still taking
> time to trickle through.
>
> If you do want code to ship this year, branch-2 is where you are going to
> try and get it in -and like you say, that's where things get tried in the
> field. At the same time, the constraints of stability are holding us back
> -already-.
>
> I don't see why we should have such another major 1-2 transition in future;
> the rate that Arun is pushing out 2.x releases its almost back to the 0.1x
> timescale -though at that point most people were fending for themselves and
> expectations of stability were less. We do want smaller version increments
> in future, which branch-2 is -mostly- delivering.
>
> While Java 7 doesn't have some must-have features, Java 8 is a significant
> improvement in the language, and we should be looking ahead to that, maybe
> even doing some leading-edge work on the side, so the same discussion
> doesn't come up in two years time when java 7 goes EOL.
>
>
> -steve
>
> (personal opinions only, etc, )
>
>
> >
> > On Mon, Apr 14, 2014 at 9:22 AM, Colin McCabe wrote:
> >
> > > I think the bottom line here is that as long as our stable release
> > > uses JDK6, there is going to be a very, very strong disincentive to
> > > put any code which can't run on JDK6 into trunk.
> > >
> > > Like I said earlier, the traditional reason for putting something in
> > > trunk but not the stable release is that it needs more testing.  If a
> > > stable release that drops support for JDK6 is more than a year away,
> > > does it make sense to put anything in trunk like that?  What might
> > > need more than a year of testing?  Certainly not changes to
> > > LocalFileSystem to use the new APIs.  I also don't think an upgrade to
> > > various libraries qualifies.
> > >
> > > It might be best to shelve this for now, like we've done in the past,
> > > until we're ready to talk about a stable release that requires JDK7+.
> > > At least that's my feeling.
> > >
> > > If we're really desperate for the new file APIs JDK7 provides, we
> > > could consider using loadable modules for it in branch-2.  This is
> > > similar to how we provide JNI versions of certain things on certain
> > > platforms, without dropping support for the other platforms.
> > >
> > > best,
> > > Colin
> > >
> > > On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata 
> > > wrote:
> > > > There's an outstanding question addressed to me: "Are there
> particular
> > > > features or new dependencies that you would like to contribute (or
> see
> > > > contributed) that require using the Java 1.7 APIs?"  The question
> > > > misses the point: We'd figure out how to write something we wanted to
> > > > contribute to Hadoop against the APIs of Java4 if that's what it took
> > > > to get them into a stable release.  And at current course and speed,
> > > > that's how ridiculous things could get.
> > > >
> > > > To summarize, it seems like there's a vague consensus that it might
> be
> > > > okay to eventually allow the use of Java7 in trunk, but there's no
> > > > decision.  And there's been no answer to the concern that even if
> such
> > > > dependencies were allowed in Java7, the only people using them would
> > > > be people who uninterested in getting their patches into a stable
> > > > release of Hadoop on any knowable timeframe, which doesn't bode well
> 

Re: Plans of moving towards JDK7 in trunk

2014-04-14 Thread Steve Loughran
On 14 April 2014 17:46, Andrew Purtell  wrote:

> How well is trunk tested? Does anyone deploy it with real applications
> running on top? When will the trunk codebase next be the basis for a
> production release? An impromptu diff of hadoop-common trunk against
> branch-2 as of today is 38,625 lines. Can they be said to be the same
> animal? I ask because any disincentive toward putting code in trunk is
> beside the point, if the only target worth pursuing today is branch-2
> unless one doesn't care if the code is released for production use.
> Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for
> the vast majority of Hadoopers if talking about branch-2.
>
>
I think it's partly a timescale issue; it's also because the 1-2 transition
was so significant, especially at the YARN layer, that it's still taking
time to trickle through.

If you do want code to ship this year, branch-2 is where you are going to
try and get it in -and like you say, that's where things get tried in the
field. At the same time, the constraints of stability are holding us back
-already-.

I don't see why we should have another such major 1-2 transition in future;
at the rate that Arun is pushing out 2.x releases, it's almost back to the 0.1x
timescale -though at that point most people were fending for themselves and
expectations of stability were lower. We do want smaller version increments
in future, which branch-2 is -mostly- delivering.

While Java 7 doesn't have some must-have features, Java 8 is a significant
improvement in the language, and we should be looking ahead to that, maybe
even doing some leading-edge work on the side, so the same discussion
doesn't come up in two years' time when java 7 goes EOL.


-steve

(personal opinions only, etc.)


>
> On Mon, Apr 14, 2014 at 9:22 AM, Colin McCabe wrote:
>
> > I think the bottom line here is that as long as our stable release
> > uses JDK6, there is going to be a very, very strong disincentive to
> > put any code which can't run on JDK6 into trunk.
> >
> > Like I said earlier, the traditional reason for putting something in
> > trunk but not the stable release is that it needs more testing.  If a
> > stable release that drops support for JDK6 is more than a year away,
> > does it make sense to put anything in trunk like that?  What might
> > need more than a year of testing?  Certainly not changes to
> > LocalFileSystem to use the new APIs.  I also don't think an upgrade to
> > various libraries qualifies.
> >
> > It might be best to shelve this for now, like we've done in the past,
> > until we're ready to talk about a stable release that requires JDK7+.
> > At least that's my feeling.
> >
> > If we're really desperate for the new file APIs JDK7 provides, we
> > could consider using loadable modules for it in branch-2.  This is
> > similar to how we provide JNI versions of certain things on certain
> > platforms, without dropping support for the other platforms.
> >
> > best,
> > Colin
> >
> > On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata 
> > wrote:
> > > There's an outstanding question addressed to me: "Are there particular
> > > features or new dependencies that you would like to contribute (or see
> > > contributed) that require using the Java 1.7 APIs?"  The question
> > > misses the point: We'd figure out how to write something we wanted to
> > > contribute to Hadoop against the APIs of Java4 if that's what it took
> > > to get them into a stable release.  And at current course and speed,
> > > that's how ridiculous things could get.
> > >
> > > To summarize, it seems like there's a vague consensus that it might be
> > > okay to eventually allow the use of Java7 in trunk, but there's no
> > > decision.  And there's been no answer to the concern that even if such
> > > dependencies were allowed in Java7, the only people using them would
> > > be people who uninterested in getting their patches into a stable
> > > release of Hadoop on any knowable timeframe, which doesn't bode well
> > > for the ability to stabilize that Java7 code when it comes time to
> > > attempt to.
> > >
> > > I don't have more to add, so I'll go back to lurking.  It'll be
> > > interesting to see where we'll be standing a year from now.
> > >
> > > On Sun, Apr 13, 2014 at 2:09 AM, Tsuyoshi OZAWA
> > >  wrote:
> > >> Hi,
> > >>
> > >> +1 for Karthik's idea(non-binding).
> > >>
> > >> IMO, we should keep the compatibility between JDK 6 and JDK 7 on both
> > branch-1
> > >> and branch-2, because users can be using them. For future releases
> that
> > we can
> > >> declare breaking compatibility(e.g. 3.0.0 release), we can use JDK 7
> > >> features if we
> > >> can get benefits. However, it can increase maintenance costs and
> > distributes the
> > >> efforts of contributions to maintain branches. Then, I think it is
> > >> reasonable approach
> > >> that we use limited and minimum JDK-7 APIs when we have reasons we
> need
> > to use
> > >> the features.
> > >> By the way, if we start to

Re: Plans of moving towards JDK7 in trunk

2014-04-14 Thread Sangjin Lee
I would say, to an extent. The current state of the jetty version is
*severe*. We're 3 major versions behind, and if my understanding is
correct, it was a long time ago they EOL'ed version 6.x.

Yes, upgrading jetty could break some customers. However, we need to view
it in balance. We're constantly making customers scramble to work around
this stale dependency (and its transitive dependencies).

Sangjin

On Sat, Apr 12, 2014 at 7:29 AM, Alejandro Abdelnur wrote:

> i disagree, mustn't break downstrea
>
> Alejandro
> (phone typing)
>
> > On Apr 12, 2014, at 3:15, Steve Loughran  wrote:
> >
> > 1. I wasn't thinking of sticking of jetty in in the web ui or webhdfs at
> > all.
> > 2. the later jetties change their packaing, so should be able to co-exist
> > anyway.
> >
> > Jetty is a fundamental cause of problems, especially on things like
> > webhdfs. We can't use the excuse of "mustn't break downstream app
> classpath
> > compatibility" to avoid fixing significant problems
> >
> >
> >> On 11 April 2014 23:05, Alejandro Abdelnur  wrote:
> >>
> >> newer jetties have non backwards compat APIs, we would break any user
> app
> >> using jetty (consumed via hadoop classpath)
> >>
> >>
> >>
> >> On Fri, Apr 11, 2014 at 2:16 PM, Steve Loughran  >>> wrote:
> >>
> >>> that doesn't actually stop is from switching in our own code to
> alternate
> >>> web servers,  only that jetty can remain a published artifact in the
> >>> hadoop/lib dir
> >>>
> >>>
>  On 11 April 2014 21:16, Alejandro Abdelnur  wrote:
> 
>  because it is exposed as classpath dependency, changing it breaks
> >>> backward
>  compatibility.
> 
> 
>  On Fri, Apr 11, 2014 at 1:02 PM, Steve Loughran <
> >> ste...@hortonworks.com
> > wrote:
> 
> > Jetty's a big change, it's fairly intimately involved in bits of the
> >>> code
> >
> > but: it's a source of grief, currently webhdfs is an example
> > https://issues.apache.org/jira/browse/HDFS-6221
> >
> > all YARN apps seem to get hosted by it too
> >
> >
> >> On 11 April 2014 20:56, Robert Rati  wrote:
> >>
> >> I don't mean to be dense, but can you expand on why jetty 8 can't
> >> go
>  into
> >> branch2?  What is the concern?
> >>
> >> Rob
> >>
> >>
> >>> On 04/11/2014 10:55 AM, Alejandro Abdelnur wrote:
> >>>
> >>> if you mean updating jetty on branch2, we cannot do that. it has
> >> to
> >>> be
> >>> done in trunk.
> >>>
> >>> thx
> >>>
> >>> Alejandro
> >>> (phone typing)
> >>>
>  On Apr 11, 2014, at 4:46, Robert Rati  wrote:
> 
>  Just an FYI, but I'm working on updating that jetty patch for the
>  current 2.4.0 release.  The one that is there won't cleanly apply
> > because
>  so much has changed since it was posted.  I'll post a new patch
> >>> when
> > it's
>  done.
> 
>  Rob
> 
> > On 04/11/2014 04:24 AM, Steve Loughran wrote:
> >
> >> On 10 April 2014 18:12, Eli Collins  wrote:
> >>
> >> Let's speak less abstractly, are there particular features or
> >> new
> >> dependencies that you would like to contribute (or see
> >>> contributed)
> >> that
> >> require using the Java 1.7 APIs?  Breaking compat in v2 or
> >>> rolling
>  a
> > v3
> >> release are both non-trivial, not something I suspect we'd want
> >>> to
>  do
> >> just
> >> because it would be, for example, nicer to have a newer version
> >>> of
> >> Jetty.
> >
> > Oddly enough, rolling the web framework is something I'd like to
> >>> see
> > in
> > a
> > v3. the shuffle may be off jetty, but webhdfs isn't. Moving up
> >>> also
> > lets is
> > reliably switch to servlet API v3
> >
> > But.. I think we may be able to increment Jetty more without
> >> going
>  to
> > java7, see https://issues.apache.org/jira/browse/HADOOP-9650 .
> >
> 
> 
> 
>  --
>  Alejandro
> >>>

Re: Plans of moving towards JDK7 in trunk

2014-04-14 Thread Andrew Purtell
How well is trunk tested? Does anyone deploy it with real applications
running on top? When will the trunk codebase next be the basis for a
production release? An impromptu diff of hadoop-common trunk against
branch-2 as of today is 38,625 lines. Can they be said to be the same
animal? I ask because any disincentive toward putting code in trunk is
beside the point, if the only target worth pursuing today is branch-2
unless one doesn't care if the code is released for production use.
Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for
the vast majority of Hadoopers if talking about branch-2.


On Mon, Apr 14, 2014 at 9:22 AM, Colin McCabe wrote:

> I think the bottom line here is that as long as our stable release
> uses JDK6, there is going to be a very, very strong disincentive to
> put any code which can't run on JDK6 into trunk.
>
> Like I said earlier, the traditional reason for putting something in
> trunk but not the stable release is that it needs more testing.  If a
> stable release that drops support for JDK6 is more than a year away,
> does it make sense to put anything in trunk like that?  What might
> need more than a year of testing?  Certainly not changes to
> LocalFileSystem to use the new APIs.  I also don't think an upgrade to
> various libraries qualifies.
>
> It might be best to shelve this for now, like we've done in the past,
> until we're ready to talk about a stable release that requires JDK7+.
> At least that's my feeling.
>
> If we're really desperate for the new file APIs JDK7 provides, we
> could consider using loadable modules for it in branch-2.  This is
> similar to how we provide JNI versions of certain things on certain
> platforms, without dropping support for the other platforms.
>
> best,
> Colin
>
> On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata 
> wrote:
> > There's an outstanding question addressed to me: "Are there particular
> > features or new dependencies that you would like to contribute (or see
> > contributed) that require using the Java 1.7 APIs?"  The question
> > misses the point: We'd figure out how to write something we wanted to
> > contribute to Hadoop against the APIs of Java4 if that's what it took
> > to get them into a stable release.  And at current course and speed,
> > that's how ridiculous things could get.
> >
> > To summarize, it seems like there's a vague consensus that it might be
> > okay to eventually allow the use of Java7 in trunk, but there's no
> > decision.  And there's been no answer to the concern that even if such
> > dependencies were allowed in Java7, the only people using them would
> > be people who uninterested in getting their patches into a stable
> > release of Hadoop on any knowable timeframe, which doesn't bode well
> > for the ability to stabilize that Java7 code when it comes time to
> > attempt to.
> >
> > I don't have more to add, so I'll go back to lurking.  It'll be
> > interesting to see where we'll be standing a year from now.
> >
> > On Sun, Apr 13, 2014 at 2:09 AM, Tsuyoshi OZAWA
> >  wrote:
> >> Hi,
> >>
> >> +1 for Karthik's idea(non-binding).
> >>
> >> IMO, we should keep the compatibility between JDK 6 and JDK 7 on both
> branch-1
> >> and branch-2, because users can be using them. For future releases that
> we can
> >> declare breaking compatibility(e.g. 3.0.0 release), we can use JDK 7
> >> features if we
> >> can get benefits. However, it can increase maintenance costs and
> distributes the
> >> efforts of contributions to maintain branches. Then, I think it is
> >> reasonable approach
> >> that we use limited and minimum JDK-7 APIs when we have reasons we need
> to use
> >> the features.
> >> By the way, if we start to use JDK 7 APIs, we should declare the basis
> >> when to use
> >> JDK 7 APIs on Wiki not to confuse contributors.
> >>
> >> Thanks,
> >> - Tsuyoshi
> >>
> >> On Wed, Apr 9, 2014 at 11:44 AM, Raymie Stata 
> wrote:
> >>>> It might make sense to try to enumerate the benefits of switching to
> >>>> Java7 APIs and dependencies.
> >>>
> >>>   - Java7 introduced a huge number of language, byte-code, API, and
> >>> tooling enhancements!  Just to name a few: try-with-resources, newer
> >>> and stronger encyrption methods, more scalable concurrency primitives.
> >>>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
> >>>
> >>>   - We can't update current dependencies, and we can't add cool new
> ones.
> >>>
> >>>   - Putting language/APIs aside, don't forget that a huge amount of
> effort
> >>> goes into qualifying for Java6 (at least, I hope the folks claiming to
> >>> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
> >>> users/customers be better served if qualification effort went into
> >>> Java7/8 versus Java6/7?
> >>>
> >>> Getting to Java7 as a development env (and Java8 as a runtime env)
> >>> seems like a no-brainer.  Question is: How?
> >>>
> >>> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza 
> wrote:
> >>>> It might make sense to try to enu

Re: Plans of moving towards JDK7 in trunk

2014-04-14 Thread Colin McCabe
I think the bottom line here is that as long as our stable release
uses JDK6, there is going to be a very, very strong disincentive to
put any code which can't run on JDK6 into trunk.

Like I said earlier, the traditional reason for putting something in
trunk but not the stable release is that it needs more testing.  If a
stable release that drops support for JDK6 is more than a year away,
does it make sense to put anything in trunk like that?  What might
need more than a year of testing?  Certainly not changes to
LocalFileSystem to use the new APIs.  I also don't think an upgrade to
various libraries qualifies.

It might be best to shelve this for now, like we've done in the past,
until we're ready to talk about a stable release that requires JDK7+.
At least that's my feeling.

If we're really desperate for the new file APIs JDK7 provides, we
could consider using loadable modules for it in branch-2.  This is
similar to how we provide JNI versions of certain things on certain
platforms, without dropping support for the other platforms.
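
A minimal sketch of the loadable-module idea above, not taken from the Hadoop code base. It assumes a hypothetical FileRenamer interface, a JDK6 baseline implementation, and a JDK7-only implementation that is compiled in a separate module and named by class name. The loader probes for java.nio.file.Files (added in Java 7) and falls back to the baseline when either the API or the optional class is missing.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical interface for illustration; not a Hadoop API. */
interface FileRenamer {
    void rename(File src, File dst) throws IOException;
}

/** Baseline implementation that uses only APIs available on JDK6. */
class Jdk6FileRenamer implements FileRenamer {
    @Override
    public void rename(File src, File dst) throws IOException {
        // File.renameTo reports failure only as a boolean, with no cause.
        if (!src.renameTo(dst)) {
            throw new IOException("rename failed: " + src + " -> " + dst);
        }
    }
}

final class FileRenamerLoader {
    private FileRenamerLoader() {}

    /**
     * Tries to load the named implementation reflectively and falls back to
     * the JDK6 baseline if the class, or the JDK7 API it needs, is missing.
     * The implementation class name is an assumption supplied by the caller.
     */
    static FileRenamer load(String implClassName) {
        try {
            // java.nio.file.Files was added in Java 7; it is absent on JDK6.
            Class.forName("java.nio.file.Files");
            Class<?> impl = Class.forName(implClassName);
            return (FileRenamer) impl.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            // Class not found, not instantiable, or not a FileRenamer.
            return new Jdk6FileRenamer();
        } catch (LinkageError e) {
            // For example, class files compiled for a newer JVM than this one.
            return new Jdk6FileRenamer();
        }
    }
}
```

The core code only ever refers to the interface and the loader, so it would still compile with a JDK6 compiler; only the optional module needs JDK7.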

best,
Colin

On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata  wrote:
> There's an outstanding question addressed to me: "Are there particular
> features or new dependencies that you would like to contribute (or see
> contributed) that require using the Java 1.7 APIs?"  The question
> misses the point: We'd figure out how to write something we wanted to
> contribute to Hadoop against the APIs of Java4 if that's what it took
> to get them into a stable release.  And at current course and speed,
> that's how ridiculous things could get.
>
> To summarize, it seems like there's a vague consensus that it might be
> okay to eventually allow the use of Java7 in trunk, but there's no
> decision.  And there's been no answer to the concern that even if such
> dependencies were allowed in Java7, the only people using them would
> be people who uninterested in getting their patches into a stable
> release of Hadoop on any knowable timeframe, which doesn't bode well
> for the ability to stabilize that Java7 code when it comes time to
> attempt to.
>
> I don't have more to add, so I'll go back to lurking.  It'll be
> interesting to see where we'll be standing a year from now.
>
> On Sun, Apr 13, 2014 at 2:09 AM, Tsuyoshi OZAWA
>  wrote:
>> Hi,
>>
>> +1 for Karthik's idea(non-binding).
>>
>> IMO, we should keep the compatibility between JDK 6 and JDK 7 on both 
>> branch-1
>> and branch-2, because users can be using them. For future releases that we 
>> can
>> declare breaking compatibility(e.g. 3.0.0 release), we can use JDK 7
>> features if we
>> can get benefits. However, it can increase maintenance costs and distributes 
>> the
>> efforts of contributions to maintain branches. Then, I think it is
>> reasonable approach
>> that we use limited and minimum JDK-7 APIs when we have reasons we need to 
>> use
>> the features.
>> By the way, if we start to use JDK 7 APIs, we should declare the basis
>> when to use
>> JDK 7 APIs on Wiki not to confuse contributors.
>>
>> Thanks,
>> - Tsuyoshi
>>
>> On Wed, Apr 9, 2014 at 11:44 AM, Raymie Stata  wrote:
>>>> It might make sense to try to enumerate the benefits of switching to
>>>> Java7 APIs and dependencies.
>>>
>>>   - Java7 introduced a huge number of language, byte-code, API, and
>>> tooling enhancements!  Just to name a few: try-with-resources, newer
>>> and stronger encyrption methods, more scalable concurrency primitives.
>>>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>>>
>>>   - We can't update current dependencies, and we can't add cool new ones.
>>>
>>>   - Putting language/APIs aside, don't forget that a huge amount of effort
>>> goes into qualifying for Java6 (at least, I hope the folks claiming to
>>> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
>>> users/customers be better served if qualification effort went into
>>> Java7/8 versus Java6/7?
>>>
>>> Getting to Java7 as a development env (and Java8 as a runtime env)
>>> seems like a no-brainer.  Question is: How?
>>>
>>> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza  wrote:
>>>> It might make sense to try to enumerate the benefits of switching to Java7
>>>> APIs and dependencies.  IMO, the ones listed so far on this thread don't
>>>> make a compelling enough case to drop Java6 in branch-2 on any time frame,
>>>> even if this means supporting Java6 through 2015.  For example, the change
>>>> in RawLocalFileSystem semantics might be an incompatible change for
>>>> branch-2 any way.
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
>>>>
>>>>> +1 to NOT breaking compatibility in branch-2.
>>>>>
>>>>> I think it is reasonable to require JDK7 for trunk, if we limit use of
>>>>> JDK7-only API to security fixes etc. If we make other optimizations (like
>>>>> IO), it would be a pain to backport things to branch-2. I guess this all
>>>>> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
>>>>>
>>>>>
>>>>> On 

Re: Plans of moving towards JDK7 in trunk

2014-04-13 Thread Raymie Stata
There's an outstanding question addressed to me: "Are there particular
features or new dependencies that you would like to contribute (or see
contributed) that require using the Java 1.7 APIs?"  The question
misses the point: We'd figure out how to write something we wanted to
contribute to Hadoop against the APIs of Java4 if that's what it took
to get them into a stable release.  And at current course and speed,
that's how ridiculous things could get.

To summarize, it seems like there's a vague consensus that it might be
okay to eventually allow the use of Java7 in trunk, but there's no
decision.  And there's been no answer to the concern that even if such
dependencies were allowed in Java7, the only people using them would
be people who are uninterested in getting their patches into a stable
release of Hadoop on any knowable timeframe, which doesn't bode well
for the ability to stabilize that Java7 code when it comes time to
attempt to.

I don't have more to add, so I'll go back to lurking.  It'll be
interesting to see where we'll be standing a year from now.

On Sun, Apr 13, 2014 at 2:09 AM, Tsuyoshi OZAWA
 wrote:
> Hi,
>
> +1 for Karthik's idea(non-binding).
>
> IMO, we should keep the compatibility between JDK 6 and JDK 7 on both branch-1
> and branch-2, because users can be using them. For future releases that we can
> declare breaking compatibility(e.g. 3.0.0 release), we can use JDK 7
> features if we
> can get benefits. However, it can increase maintenance costs and distributes 
> the
> efforts of contributions to maintain branches. Then, I think it is
> reasonable approach
> that we use limited and minimum JDK-7 APIs when we have reasons we need to use
> the features.
> By the way, if we start to use JDK 7 APIs, we should declare the basis
> when to use
> JDK 7 APIs on Wiki not to confuse contributors.
>
> Thanks,
> - Tsuyoshi
>
> On Wed, Apr 9, 2014 at 11:44 AM, Raymie Stata  wrote:
>>> It might make sense to try to enumerate the benefits of switching to
>>> Java7 APIs and dependencies.
>>
>>   - Java7 introduced a huge number of language, byte-code, API, and
>> tooling enhancements!  Just to name a few: try-with-resources, newer
>> and stronger encyrption methods, more scalable concurrency primitives.
>>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>>
>>   - We can't update current dependencies, and we can't add cool new ones.
>>
>>   - Putting language/APIs aside, don't forget that a huge amount of effort
>> goes into qualifying for Java6 (at least, I hope the folks claiming to
>> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
>> users/customers be better served if qualification effort went into
>> Java7/8 versus Java6/7?
>>
>> Getting to Java7 as a development env (and Java8 as a runtime env)
>> seems like a no-brainer.  Question is: How?
>>
>> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza  wrote:
>>> It might make sense to try to enumerate the benefits of switching to Java7
>>> APIs and dependencies.  IMO, the ones listed so far on this thread don't
>>> make a compelling enough case to drop Java6 in branch-2 on any time frame,
>>> even if this means supporting Java6 through 2015.  For example, the change
>>> in RawLocalFileSystem semantics might be an incompatible change for
>>> branch-2 any way.
>>>
>>>
>>> On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
>>>
>>>> +1 to NOT breaking compatibility in branch-2.
>>>>
>>>> I think it is reasonable to require JDK7 for trunk, if we limit use of
>>>> JDK7-only API to security fixes etc. If we make other optimizations (like
>>>> IO), it would be a pain to backport things to branch-2. I guess this all
>>>> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
>>>>
>>>> > On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
>>>> >  wrote:
>>>> > >> From: Eli Collins [mailto:e...@cloudera.com]
>>>> > >> Sent: Monday, April 07, 2014 11:54 AM
>>>> > >>
>>>> > >>
>>>> > >> IMO we should not drop support for Java 6 in a minor update of a
>>>> stable
>>>> > >> release (v2).  I don't think the larger Hadoop user base would find it
>>>> > >> acceptable that upgrading to a minor update caused their systems to
>>>> stop
>>>> > >> working because they didn't upgrade Java. There are people still
>>>> getting
>>>> > >> support for Java 6. ...
>>>> > >>
>>>> > >> Thanks,
>>>> > >> Eli
>>>> > >
>>>> > > Hi Eli,
>>>> > >
>>>> > > Technically you are correct those with extended support get critical
>>>> > security fixes for 6 until the end of 2016. I am curious whether many of
>>>> > those are in the Hadoop user base. Do you know? My guess is the vast
>>>> > majority are within Oracle's official public end of life, which was over
>>>> 12
>>>> > months ago. Even Premier support ended Dec 2013:
>>>> > >
>>>> > > http://www.oracle.com/technetwork/java/eol-135779.html
>>>> > >
>>>> > > The end of Java 6 support carries much risk. It has to be considered

Re: Plans of moving towards JDK7 in trunk

2014-04-13 Thread Tsuyoshi OZAWA
Hi,

+1 for Karthik's idea (non-binding).

IMO, we should keep compatibility with both JDK 6 and JDK 7 on branch-1
and branch-2, because users may still be running either. For future releases
in which we can declare a compatibility break (e.g. the 3.0.0 release), we can
use JDK 7 features where they bring a benefit. However, doing so can increase
maintenance costs and spread contributors' effort across more branches. So I
think a reasonable approach is to use a limited, minimal set of JDK 7 APIs,
and only when we have a concrete reason to need them.
By the way, if we start to use JDK 7 APIs, we should document the criteria for
when to use them on the Wiki, so as not to confuse contributors.

Thanks,
- Tsuyoshi

On Wed, Apr 9, 2014 at 11:44 AM, Raymie Stata  wrote:
>> It might make sense to try to enumerate the benefits of switching to
>> Java7 APIs and dependencies.
>
>   - Java7 introduced a huge number of language, byte-code, API, and
> tooling enhancements!  Just to name a few: try-with-resources, newer
> and stronger encyrption methods, more scalable concurrency primitives.
>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>
>   - We can't update current dependencies, and we can't add cool new ones.
>
>   - Putting language/APIs aside, don't forget that a huge amount of effort
> goes into qualifying for Java6 (at least, I hope the folks claiming to
> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
> users/customers be better served if qualification effort went into
> Java7/8 versus Java6/7?
>
> Getting to Java7 as a development env (and Java8 as a runtime env)
> seems like a no-brainer.  Question is: How?
>
> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza  wrote:
>> It might make sense to try to enumerate the benefits of switching to Java7
>> APIs and dependencies.  IMO, the ones listed so far on this thread don't
>> make a compelling enough case to drop Java6 in branch-2 on any time frame,
>> even if this means supporting Java6 through 2015.  For example, the change
>> in RawLocalFileSystem semantics might be an incompatible change for
>> branch-2 any way.
>>
>>
>> On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
>>
>>> +1 to NOT breaking compatibility in branch-2.
>>>
>>> I think it is reasonable to require JDK7 for trunk, if we limit use of
>>> JDK7-only API to security fixes etc. If we make other optimizations (like
>>> IO), it would be a pain to backport things to branch-2. I guess this all
>>> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
>>>
>>>
>>> On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
>>>
>>> > On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
>>> >  wrote:
>>> > >> From: Eli Collins [mailto:e...@cloudera.com]
>>> > >> Sent: Monday, April 07, 2014 11:54 AM
>>> > >>
>>> > >>
>>> > >> IMO we should not drop support for Java 6 in a minor update of a
>>> stable
>>> > >> release (v2).  I don't think the larger Hadoop user base would find it
>>> > >> acceptable that upgrading to a minor update caused their systems to
>>> stop
>>> > >> working because they didn't upgrade Java. There are people still
>>> getting
>>> > >> support for Java 6. ...
>>> > >>
>>> > >> Thanks,
>>> > >> Eli
>>> > >
>>> > > Hi Eli,
>>> > >
>>> > > Technically you are correct those with extended support get critical
>>> > security fixes for 6 until the end of 2016. I am curious whether many of
>>> > those are in the Hadoop user base. Do you know? My guess is the vast
>>> > majority are within Oracle's official public end of life, which was over
>>> 12
>>> > months ago. Even Premier support ended Dec 2013:
>>> > >
>>> > > http://www.oracle.com/technetwork/java/eol-135779.html
>>> > >
>>> > > The end of Java 6 support carries much risk. It has to be considered in
>>> > terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
>>> > score 10.0.
>>> > >
>>> > > http://www.cvedetails.com/cve/CVE-2013-2465/
>>> > >
>>> > > Since you mentioned "caused systems to stop" as an example of what
>>> would
>>> > be a concern to Hadoop users, please note the CVE-2013-2465 availability
>>> > impact:
>>> > >
>>> > > "Complete (There is a total shutdown of the affected resource. The
>>> > attacker can render the resource completely unavailable.)"
>>> > >
>>> > > This vulnerability was patched in Java 6 Update 51, but post end of
>>> > life. Apple pushed out the update specifically because of this
>>> > vulnerability (http://support.apple.com/kb/HT5717) as did some other
>>> > vendors privately, but for the majority of people using Java 6 means they
>>> > have a ticking time bomb.
>>> > >
>>> > > Allowing it to stay should be considered in terms of accepting the
>>> whole
>>> > risk posture.
>>> > >
>>> >
>>> > There are some who get extended support, but I suspect many just have
>>> > a if-it's-not-broke mentality when it comes to production deployments.
>>> > The current code supports both java6 and java7 and so allows these
>>> > people to remain compatible, while enabling oth

Re: Plans of moving towards JDK7 in trunk

2014-04-12 Thread Alejandro Abdelnur
i disagree, mustn't break downstrea

Alejandro
(phone typing)

> On Apr 12, 2014, at 3:15, Steve Loughran  wrote:
> 
> 1. I wasn't thinking of sticking of jetty in in the web ui or webhdfs at
> all.
> 2. the later jetties change their packaing, so should be able to co-exist
> anyway.
> 
> Jetty is a fundamental cause of problems, especially on things like
> webhdfs. We can't use the excuse of "mustn't break downstream app classpath
> compatibility" to avoid fixing significant problems
> 
> 
>> On 11 April 2014 23:05, Alejandro Abdelnur  wrote:
>> 
>> newer jetties have non backwards compat APIs, we would break any user app
>> using jetty (consumed via hadoop classpath)
>> 
>> 
>> 
>> On Fri, Apr 11, 2014 at 2:16 PM, Steve Loughran >> wrote:
>> 
>>> that doesn't actually stop is from switching in our own code to alternate
>>> web servers,  only that jetty can remain a published artifact in the
>>> hadoop/lib dir
>>> 
>>> 
 On 11 April 2014 21:16, Alejandro Abdelnur  wrote:
 
 because it is exposed as classpath dependency, changing it breaks
>>> backward
 compatibility.
 
 
 On Fri, Apr 11, 2014 at 1:02 PM, Steve Loughran <
>> ste...@hortonworks.com
> wrote:
 
> Jetty's a big change, it's fairly intimately involved in bits of the
>>> code
> 
> but: it's a source of grief, currently webhdfs is an example
> https://issues.apache.org/jira/browse/HDFS-6221
> 
> all YARN apps seem to get hosted by it too
> 
> 
>> On 11 April 2014 20:56, Robert Rati  wrote:
>> 
>> I don't mean to be dense, but can you expand on why jetty 8 can't
>> go
 into
>> branch2?  What is the concern?
>> 
>> Rob
>> 
>> 
>>> On 04/11/2014 10:55 AM, Alejandro Abdelnur wrote:
>>> 
>>> if you mean updating jetty on branch2, we cannot do that. it has
>> to
>>> be
>>> done in trunk.
>>> 
>>> thx
>>> 
>>> Alejandro
>>> (phone typing)
>>> 
 On Apr 11, 2014, at 4:46, Robert Rati  wrote:
 
 Just an FYI, but I'm working on updating that jetty patch for the
 current 2.4.0 release.  The one that is there won't cleanly apply
> because
 so much has changed since it was posted.  I'll post a new patch
>>> when
> it's
 done.
 
 Rob
 
> On 04/11/2014 04:24 AM, Steve Loughran wrote:
> 
>> On 10 April 2014 18:12, Eli Collins  wrote:
>> 
>> Let's speak less abstractly, are there particular features or
>> new
>> dependencies that you would like to contribute (or see
>>> contributed)
>> that
>> require using the Java 1.7 APIs?  Breaking compat in v2 or
>>> rolling
 a
> v3
>> release are both non-trivial, not something I suspect we'd want
>>> to
 do
>> just
>> because it would be, for example, nicer to have a newer version
>>> of
>> Jetty.
> 
> Oddly enough, rolling the web framework is something I'd like to
>>> see
> in
> a
> v3. the shuffle may be off jetty, but webhdfs isn't. Moving up
>>> also
> lets is
> reliably switch to servlet API v3
> 
> But.. I think we may be able to increment Jetty more without
>> going
 to
> java7, see https://issues.apache.org/jira/browse/HADOOP-9650 .
> 
 
 
 
 --
 Alejandro
>>> 
>> 
>> 
>> 
>> --
>> Alejandro
> 

Re: Plans of moving towards JDK7 in trunk

2014-04-12 Thread Steve Loughran
1. I wasn't thinking of sticking with jetty in the web ui or webhdfs at
all.
2. the later jetties change their packaging, so should be able to co-exist
anyway.

Jetty is a fundamental cause of problems, especially on things like
webhdfs. We can't use the excuse of "mustn't break downstream app classpath
compatibility" to avoid fixing significant problems
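
A rough sketch of the co-existence point above, under the assumption that Jetty 6 classes live under org.mortbay.jetty and Jetty 7 and later under org.eclipse.jetty. Because the package names differ, both generations can sit on one classpath, and a probe like the hypothetical JettyProbe below can report which ones are present without initializing them.

```java
/** Hypothetical helper for illustration; not part of Hadoop or Jetty. */
final class JettyProbe {
    // Assumed entry-point class names for the two Jetty package layouts.
    static final String JETTY6_SERVER = "org.mortbay.jetty.Server";
    static final String JETTY7PLUS_SERVER = "org.eclipse.jetty.server.Server";

    private JettyProbe() {}

    /** Returns true if the named class can be found, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, JettyProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // Class is present but cannot be linked in this JVM.
            return false;
        }
    }

    /** Summarizes which Jetty generations are visible to this class loader. */
    static String describe() {
        return "jetty6=" + isOnClasspath(JETTY6_SERVER)
            + ", jetty7plus=" + isOnClasspath(JETTY7PLUS_SERVER);
    }
}
```

This only shows that the two versions need not collide by class name; it says nothing about transitive dependencies such as the servlet API, which could still conflict.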


On 11 April 2014 23:05, Alejandro Abdelnur  wrote:

> newer jetties have non backwards compat APIs, we would break any user app
> using jetty (consumed via hadoop classpath)

Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Alejandro Abdelnur
newer jetties have non backwards compat APIs, we would break any user app
using jetty (consumed via hadoop classpath)



On Fri, Apr 11, 2014 at 2:16 PM, Steve Loughran wrote:

> that doesn't actually stop us from switching in our own code to alternate
> web servers,  only that jetty can remain a published artifact in the
> hadoop/lib dir



-- 
Alejandro


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Steve Loughran
that doesn't actually stop us from switching in our own code to alternate
web servers, only that jetty can remain a published artifact in the
hadoop/lib dir


On 11 April 2014 21:16, Alejandro Abdelnur  wrote:

> because it is exposed as classpath dependency, changing it breaks backward
> compatibility.


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Alejandro Abdelnur
because it is exposed as classpath dependency, changing it breaks backward
compatibility.


On Fri, Apr 11, 2014 at 1:02 PM, Steve Loughran wrote:

> Jetty's a big change, it's fairly intimately involved in bits of the code
>
> but: it's a source of grief, currently webhdfs is an example
> https://issues.apache.org/jira/browse/HDFS-6221
>
> all YARN apps seem to get hosted by it too



-- 
Alejandro


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Steve Loughran
Jetty's a big change, it's fairly intimately involved in bits of the code

but: it's a source of grief, currently webhdfs is an example
https://issues.apache.org/jira/browse/HDFS-6221

all YARN apps seem to get hosted by it too


On 11 April 2014 20:56, Robert Rati  wrote:

> I don't mean to be dense, but can you expand on why jetty 8 can't go into
> branch2?  What is the concern?
>
> Rob


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Robert Rati
I don't mean to be dense, but can you expand on why jetty 8 can't go 
into branch2?  What is the concern?


Rob

On 04/11/2014 10:55 AM, Alejandro Abdelnur wrote:

> if you mean updating jetty on branch2, we cannot do that. it has to be done in
> trunk.
>
> thx
>
> Alejandro
> (phone typing)





Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Alejandro Abdelnur
if you mean updating jetty on branch2, we cannot do that. it has to be done in 
trunk. 

thx

Alejandro
(phone typing)

> On Apr 11, 2014, at 4:46, Robert Rati  wrote:
> 
> Just an FYI, but I'm working on updating that jetty patch for the current 
> 2.4.0 release.  The one that is there won't cleanly apply because so much has 
> changed since it was posted.  I'll post a new patch when it's done.
> 
> Rob


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Robert Rati
Just an FYI, but I'm working on updating that jetty patch for the 
current 2.4.0 release.  The one that is there won't cleanly apply 
because so much has changed since it was posted.  I'll post a new patch 
when it's done.


Rob

On 04/11/2014 04:24 AM, Steve Loughran wrote:

> On 10 April 2014 18:12, Eli Collins  wrote:
>
>> Let's speak less abstractly, are there particular features or new
>> dependencies that you would like to contribute (or see contributed) that
>> require using the Java 1.7 APIs?  Breaking compat in v2 or rolling a v3
>> release are both non-trivial, not something I suspect we'd want to do just
>> because it would be, for example, nicer to have a newer version of Jetty.
>
> Oddly enough, rolling the web framework is something I'd like to see in a
> v3. the shuffle may be off jetty, but webhdfs isn't. Moving up also lets us
> reliably switch to servlet API v3
>
> But.. I think we may be able to increment Jetty more without going to
> java7, see https://issues.apache.org/jira/browse/HADOOP-9650 .



Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Steve Loughran
On 10 April 2014 18:12, Eli Collins  wrote:

> Let's speak less abstractly, are there particular features or new
> dependencies that you would like to contribute (or see contributed) that
> require using the Java 1.7 APIs?  Breaking compat in v2 or rolling a v3
> release are both non-trivial, not something I suspect we'd want to do just
> because it would be, for example, nicer to have a newer version of Jetty.
>

Oddly enough, rolling the web framework is something I'd like to see in a
v3. The shuffle may be off Jetty, but webhdfs isn't. Moving up also lets us
reliably switch to servlet API v3.

But I think we may be able to increment Jetty further without going to
Java 7; see https://issues.apache.org/jira/browse/HADOOP-9650 .



Re: Plans of moving towards JDK7 in trunk

2014-04-10 Thread Alejandro Abdelnur
A bit of a different angle.

As the bottom of the stack, Hadoop has to be conservative in adopting
things, but that should not preclude consumers of Hadoop (downstream projects
and Hadoop application developers) from having additional requirements, such as
a higher JDK API than JDK6.

Hadoop 2.x should stick to using the JDK6 API.
Hadoop 2.x should be tested with multiple runtimes: JDK6, JDK7 and
eventually JDK8.
Downstream projects and Hadoop application developers are free to require
any JDK6+ version for development and runtime.

Hadoop 3.x should allow using JDK7 API, bumping the minimum runtime
requirement to JDK7 and be tested with JDK7 and JDK8 runtimes.
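
To make the split between API level and runtime a bit more concrete, here is
a minimal sketch (not from any Hadoop patch) that reads the class-file major
version from the header of a compiled class. Major version 50 corresponds to a
Java 6 target and 51 to Java 7, so a JDK6 runtime will refuse to load anything
above 50. The class name ClassVersionCheck and its methods are hypothetical,
chosen only for illustration. It checks the bytecode target only; it would not
catch code compiled for a JDK6 target that still calls JDK7-only library APIs.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Hadoop.
final class ClassVersionCheck {
    static final int JAVA6_MAJOR = 50; // class-file major version for a Java 6 target
    static final int JAVA7_MAJOR = 51; // class-file major version for a Java 7 target

    /** Returns the major version stored in the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
        if (header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("missing class-file magic number");
        }
        header.getShort();                 // minor version, not needed here
        return header.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** True if a Java 6 runtime would accept this class-file version. */
    static boolean loadableOnJava6(byte[] classBytes) {
        return majorVersion(classBytes) <= JAVA6_MAJOR;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ClassVersionCheck path/to/Some.class [more .class files]
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(path + ": major=" + majorVersion(bytes)
                    + ", loadable on Java 6: " + loadableOnJava6(bytes));
        }
    }
}
```

Run against the .class files of a 2.x build, a check like this would flag any
artifact that was accidentally compiled for a newer target than JDK6.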

Thanks.



On Thu, Apr 10, 2014 at 10:04 AM, Eli Collins  wrote:

> On Thu, Apr 10, 2014 at 1:11 AM, Steve Loughran wrote:
>
> > On 9 April 2014 23:52, Eli Collins  wrote:
> >
> > >
> > >
> > > For the sake of this discussion we should separate the runtime from
> > > the programming APIs. Users are already migrating to the java7 runtime
> > > for most of the reasons listed below (support, performance, bugs,
> > > etc), and the various distributions cert their Hadoop 2 based
> > > distributions on java7.  This gives users many of the benefits of
> > > java7, without forcing users off java6. Ie Hadoop does not need to
> > > switch to the java7 programming APIs to make sure everyone has a
> > > supported runtime.
> > >
> > >
> > +1: you can use Java 7 today; I'm not sure how tested Java 8 is
> >
> >
> > > The question here is really about when Hadoop, and the Hadoop
> > > ecosystem (since adjacent projects often end up in the same classpath)
> > > start using the java7 programming APIs and therefore break
> > > compatibility with java6 runtimes. I think our java6 runtime users
> > > would consider dropping support for their java runtime in an update of
> > > a major release to be an incompatible change (the binaries stop
> > > working on their current jvm).
> >
> >
> > do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1)  here?
> >
>
> I mean 2.x --> 2.(x+1).  Ie I'm running the 2.4 stable and upgrading to
> 2.5.
>
>
> >
> >
> > > That may be worth it if we can
> > > articulate sufficient value to offset the cost (they have to upgrade
> > > their environment, might make rolling upgrades stop working, etc), but
> > > I've not yet heard an argument that articulates the value relative to
> > > the cost.  Eg upgrading to the java7 APIs allows us to pull in
> > > dependencies with new major versions, but only if those dependencies
> > > don't break compatibility (which is likely given that our classpaths
> > > aren't so isolated), and, realistically, only if the entire Hadoop
> > > stack moves to java7 as well
> >
> >
> >
> >
> > > (eg we have to recompile HBase to
> > > generate v1.7 binaries even if they stick on API v1.6). I'm not aware
> > > of a feature, bug etc that really motivates this.
> > >
> > > I don't see that being needed unless we move up to new java7+ only
> > libraries and HBase needs to track this.
> >
> >  The big "recompile to work" issue is google guava, which is troublesome
> > enough I'd be tempted to say "can we drop it entirely"
> >
> >
> >
> > > An alternate approach is to keep the current stable release series
> > > (v2.x) as is, and start using new APIs in trunk (for v3). This will be
> > > a major upgrade for Hadoop and therefore an incompatible change like
> > > this is to be expected (it would be great if this came with additional
> > > changes to better isolate classpaths and dependencies from each
> > > other). It allows us to continue to support multiple types of users
> > > with different branches, vs forcing all users onto a new version. It
> > > of course means that 2.x users will not get the benefits of the new
> > > API, but it's unclear what those benefits are given they can already
> > > get the benefits of adopting the newer java runtimes today.
> > >
> > >
> > >
> > I'm (personally) +1 to this, I also think we should plan to do the switch
> > some time this year to not only get the benefits, but discover the costs
> >
>
>
> Agree
>
>
>
>



-- 
Alejandro


Re: Plans of moving towards JDK7 in trunk

2014-04-10 Thread Eli Collins
On Thu, Apr 10, 2014 at 6:49 AM, Raymie Stata  wrote:

> I think the problem to be solved here is to define a point in time
> when the average Hadoop contributor can start using Java7 dependencies
> in their code.
>
> The "use Java7 dependencies in trunk(/branch3)" plan, by itself, does
> not solve this problem.  The average Hadoop contributor wants to see
> their contributions make it into a stable release in a predictable
> amount of time.  Putting code with a Java7 dependency into trunk means
> the exact opposite: there is no timeline to a stable release.  So most
> contributors will stay away from Java7 dependencies, despite the
> nominal policy that they're allowed in trunk.  (And the few that do
> use Java7 dependencies are people who do not value releasing code into
> stable releases, which arguably could lead to a situation that the
> Java7-dependent code in trunk is, on average, on the buggy side.)
>
> I'm not saying the "branch2-in-the-future" plan is the only way to
> solve the problem of putting Java7 dependencies on a known time-table,
> but at least it solves it.  Is there another solution?
>

All good reasons for why we should start thinking about a plan for v3. The
points above pertain to any features for trunk that break compatibility,
not just ones that use new Java APIs.  We shouldn't permit incompatible
changes to merge to v2 just because we don't yet have a timeline for v3; we
should figure out the latter. This also motivates finishing the work to isolate
dependencies between Hadoop code, other framework code, and user code.
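
As a rough illustration of what isolating dependencies can mean at the JVM
level, the sketch below builds a class loader whose parent is the bootstrap
loader, so code loaded through it sees only the JDK classes plus the jars it
is explicitly given, and nothing from the application classpath. This is an
assumption-laden sketch, not how Hadoop implements or plans to implement
isolation; the class name IsolationSketch and its methods are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch for illustration only; not Hadoop's isolation mechanism.
final class IsolationSketch {

    /** Creates a loader that sees only the given URLs plus the JDK's bootstrap classes. */
    static URLClassLoader isolatedLoader(URL... userJars) {
        // A null parent sends delegation straight to the bootstrap loader,
        // skipping the application classpath where the framework's jars live.
        return new URLClassLoader(userJars, null);
    }

    /** Reports whether the named class can be resolved through the given loader. */
    static boolean isVisible(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (URLClassLoader loader = isolatedLoader()) {
            // JDK classes stay visible through the bootstrap loader.
            System.out.println("java.lang.String visible: "
                    + isVisible(loader, "java.lang.String"));
            // Classes from the application classpath (this one included) are hidden.
            System.out.println("IsolationSketch visible: "
                    + isVisible(loader, IsolationSketch.class.getName()));
        }
    }
}
```

With a scheme along these lines, a user app could bring its own Jetty version
in its own jars without ever seeing the one in the framework's lib directory.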

Let's speak less abstractly, are there particular features or new
dependencies that you would like to contribute (or see contributed) that
require using the Java 1.7 APIs?  Breaking compat in v2 or rolling a v3
release are both non-trivial, not something I suspect we'd want to do just
because it would be, for example, nicer to have a newer version of Jetty.

Thanks,
Eli







Re: Plans of moving towards JDK7 in trunk

2014-04-10 Thread Eli Collins
On Thu, Apr 10, 2014 at 1:11 AM, Steve Loughran wrote:

> On 9 April 2014 23:52, Eli Collins  wrote:
>
> >
> >
> > For the sake of this discussion we should separate the runtime from
> > the programming APIs. Users are already migrating to the java7 runtime
> > for most of the reasons listed below (support, performance, bugs,
> > etc), and the various distributions cert their Hadoop 2 based
> > distributions on java7.  This gives users many of the benefits of
> > java7, without forcing users off java6. Ie Hadoop does not need to
> > switch to the java7 programming APIs to make sure everyone has a
> > supported runtime.
> >
> >
> +1: you can use Java 7 today; I'm not sure how tested Java 8 is
>
>
> > The question here is really about when Hadoop, and the Hadoop
> > ecosystem (since adjacent projects often end up in the same classpath)
> > start using the java7 programming APIs and therefore break
> > compatibility with java6 runtimes. I think our java6 runtime users
> > would consider dropping support for their java runtime in an update of
> > a major release to be an incompatible change (the binaries stop
> > working on their current jvm).
>
>
> do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1)  here?
>

I mean 2.x --> 2.(x+1).  Ie I'm running the 2.4 stable and upgrading to 2.5.


>
>
> > That may be worth it if we can
> > articulate sufficient value to offset the cost (they have to upgrade
> > their environment, might make rolling upgrades stop working, etc), but
> > I've not yet heard an argument that articulates the value relative to
> > the cost.  Eg upgrading to the java7 APIs allows us to pull in
> > dependencies with new major versions, but only if those dependencies
> > don't break compatibility (which is likely given that our classpaths
> > aren't so isolated), and, realistically, only if the entire Hadoop
> > stack moves to java7 as well
>
>
>
>
> > (eg we have to recompile HBase to
> > generate v1.7 binaries even if they stick on API v1.6). I'm not aware
> > of a feature, bug etc that really motivates this.
> >
> I don't see that being needed unless we move up to new java7+ only
> libraries and HBase needs to track this.
>
>  The big "recompile to work" issue is google guava, which is troublesome
> enough I'd be tempted to say "can we drop it entirely"
>
>
>
> > An alternate approach is to keep the current stable release series
> > (v2.x) as is, and start using new APIs in trunk (for v3). This will be
> > a major upgrade for Hadoop and therefore an incompatible change like
> > this is to be expected (it would be great if this came with additional
> > changes to better isolate classpaths and dependencies from each
> > other). It allows us to continue to support multiple types of users
> > with different branches, vs forcing all users onto a new version. It
> > of course means that 2.x users will not get the benefits of the new
> > API, but it's unclear what those benefits are given they can already
> > get the benefits of adopting the newer java runtimes today.
> >
> >
> >
> I'm (personally) +1 to this, I also think we should plan to do the switch
> some time this year to not only get the benefits, but discover the costs
>


Agree





Re: Plans of moving towards JDK7 in trunk

2014-04-10 Thread Raymie Stata
I think the problem to be solved here is to define a point in time
when the average Hadoop contributor can start using Java7 dependencies
in their code.

The "use Java7 dependencies in trunk(/branch3)" plan, by itself, does
not solve this problem.  The average Hadoop contributor wants to see
their contributions make it into a stable release in a predictable
amount of time.  Putting code with a Java7 dependency into trunk means
the exact opposite: there is no timeline to a stable release.  So most
contributors will stay away from Java7 dependencies, despite the
nominal policy that they're allowed in trunk.  (And the few that do
use Java7 dependencies are people who do not value releasing code into
stable releases, which arguably could lead to a situation that the
Java7-dependent code in trunk is, on average, on the buggy side.)

I'm not saying the "branch2-in-the-future" plan is the only way to
solve the problem of putting Java7 dependencies on a known time-table,
but at least it solves it.  Is there another solution?

On Thu, Apr 10, 2014 at 1:11 AM, Steve Loughran  wrote:
> On 9 April 2014 23:52, Eli Collins  wrote:
>
>>
>>
>> For the sake of this discussion we should separate the runtime from
>> the programming APIs. Users are already migrating to the java7 runtime
>> for most of the reasons listed below (support, performance, bugs,
>> etc), and the various distributions cert their Hadoop 2 based
>> distributions on java7.  This gives users many of the benefits of
>> java7, without forcing users off java6. Ie Hadoop does not need to
>> switch to the java7 programming APIs to make sure everyone has a
>> supported runtime.
>>
>>
> +1: you can use Java 7 today; I'm not sure how tested Java 8 is
>
>
>> The question here is really about when Hadoop, and the Hadoop
>> ecosystem (since adjacent projects often end up in the same classpath)
>> start using the java7 programming APIs and therefore break
>> compatibility with java6 runtimes. I think our java6 runtime users
>> would consider dropping support for their java runtime in an update of
>> a major release to be an incompatible change (the binaries stop
>> working on their current jvm).
>
>
> do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1)  here?
>
>
>> That may be worth it if we can
>> articulate sufficient value to offset the cost (they have to upgrade
>> their environment, might make rolling upgrades stop working, etc), but
>> I've not yet heard an argument that articulates the value relative to
>> the cost.  Eg upgrading to the java7 APIs allows us to pull in
>> dependencies with new major versions, but only if those dependencies
>> don't break compatibility (which is likely given that our classpaths
>> aren't so isolated), and, realistically, only if the entire Hadoop
>> stack moves to java7 as well
>
>
>
>
>> (eg we have to recompile HBase to
>> generate v1.7 binaries even if they stick on API v1.6). I'm not aware
>> of a feature, bug etc that really motivates this.
>>
> I don't see that being needed unless we move up to new java7+ only
> libraries and HBase needs to track this.
>
>  The big "recompile to work" issue is google guava, which is troublesome
> enough I'd be tempted to say "can we drop it entirely"
>
>
>
>> An alternate approach is to keep the current stable release series
>> (v2.x) as is, and start using new APIs in trunk (for v3). This will be
>> a major upgrade for Hadoop and therefore an incompatible change like
>> this is to be expected (it would be great if this came with additional
>> changes to better isolate classpaths and dependencies from each
>> other). It allows us to continue to support multiple types of users
>> with different branches, vs forcing all users onto a new version. It
>> of course means that 2.x users will not get the benefits of the new
>> API, but it's unclear what those benefits are given they can already
>> get the benefits of adopting the newer java runtimes today.
>>
>>
>>
> I'm (personally) +1 to this, I also think we should plan to do the switch
> some time this year to not only get the benefits, but discover the costs
>


Re: Plans of moving towards JDK7 in trunk

2014-04-10 Thread Steve Loughran
On 9 April 2014 23:52, Eli Collins  wrote:

>
>
> For the sake of this discussion we should separate the runtime from
> the programming APIs. Users are already migrating to the java7 runtime
> for most of the reasons listed below (support, performance, bugs,
> etc), and the various distributions cert their Hadoop 2 based
> distributions on java7.  This gives users many of the benefits of
> java7, without forcing users off java6. Ie Hadoop does not need to
> switch to the java7 programming APIs to make sure everyone has a
> supported runtime.
>
>
+1: you can use Java 7 today; I'm not sure how tested Java 8 is


> The question here is really about when Hadoop, and the Hadoop
> ecosystem (since adjacent projects often end up in the same classpath)
> start using the java7 programming APIs and therefore break
> compatibility with java6 runtimes. I think our java6 runtime users
> would consider dropping support for their java runtime in an update of
> a major release to be an incompatible change (the binaries stop
> working on their current jvm).


do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1)  here?


> That may be worth it if we can
> articulate sufficient value to offset the cost (they have to upgrade
> their environment, might make rolling upgrades stop working, etc), but
> I've not yet heard an argument that articulates the value relative to
> the cost.  Eg upgrading to the java7 APIs allows us to pull in
> dependencies with new major versions, but only if those dependencies
> don't break compatibility (which is likely given that our classpaths
> aren't so isolated), and, realistically, only if the entire Hadoop
> stack moves to java7 as well




> (eg we have to recompile HBase to
> generate v1.7 binaries even if they stick on API v1.6). I'm not aware
> of a feature, bug etc that really motivates this.
>
I don't see that being needed unless we move up to new java7+ only
libraries and HBase needs to track this.

 The big "recompile to work" issue is google guava, which is troublesome
enough I'd be tempted to say "can we drop it entirely"



> An alternate approach is to keep the current stable release series
> (v2.x) as is, and start using new APIs in trunk (for v3). This will be
> a major upgrade for Hadoop and therefore an incompatible change like
> this is to be expected (it would be great if this came with additional
> changes to better isolate classpaths and dependencies from each
> other). It allows us to continue to support multiple types of users
> with different branches, vs forcing all users onto a new version. It
> of course means that 2.x users will not get the benefits of the new
> API, but it's unclear what those benefits are given they can already
> get the benefits of adopting the newer java runtimes today.
>
>
>
I'm (personally) +1 to this, I also think we should plan to do the switch
some time this year to not only get the benefits, but discover the costs



Re: Plans of moving towards JDK7 in trunk

2014-04-09 Thread Vinayakumar B
+1 for keeping jdk 6 support in branch-2 and starting to use jdk 7 in trunk.

I agree that this approach makes patch generation difficult for branch-2
and trunk.

Also, the actual benefits and real issues of using jdk7 will be known only
once at least one release is out from trunk.

Regards,
Vinay
I think this thread isn't so much about whether java7, 8 etc features
are valuable, they are useful of course, and we'll want to adopt them,
it's a question of how we adopt them and in which releases.

For the sake of this discussion we should separate the runtime from
the programming APIs. Users are already migrating to the java7 runtime
for most of the reasons listed below (support, performance, bugs,
etc), and the various distributions cert their Hadoop 2 based
distributions on java7.  This gives users many of the benefits of
java7, without forcing users off java6. Ie Hadoop does not need to
switch to the java7 programming APIs to make sure everyone has a
supported runtime.

The question here is really about when Hadoop, and the Hadoop
ecosystem (since adjacent projects often end up in the same classpath)
start using the java7 programming APIs and therefore break
compatibility with java6 runtimes. I think our java6 runtime users
would consider dropping support for their java runtime in an update of
a major release to be an incompatible change (the binaries stop
working on their current jvm). That may be worth it if we can
articulate sufficient value to offset the cost (they have to upgrade
their environment, might make rolling upgrades stop working, etc), but
I've not yet heard an argument that articulates the value relative to
the cost.  Eg upgrading to the java7 APIs allows us to pull in
dependencies with new major versions, but only if those dependencies
don't break compatibility (which is likely given that our classpaths
aren't so isolated), and, realistically, only if the entire Hadoop
stack moves to java7 as well (eg we have to recompile HBase to
generate v1.7 binaries even if they stick on API v1.6). I'm not aware
of a feature, bug etc that really motivates this.

An alternate approach is to keep the current stable release series
(v2.x) as is, and start using new APIs in trunk (for v3). This will be
a major upgrade for Hadoop and therefore an incompatible change like
this is to be expected (it would be great if this came with additional
changes to better isolate classpaths and dependencies from each
other). It allows us to continue to support multiple types of users
with different branches, vs forcing all users onto a new version. It
of course means that 2.x users will not get the benefits of the new
API, but it's unclear what those benefits are given they can already
get the benefits of adopting the newer java runtimes today.

Thanks,
Eli


On Wed, Apr 9, 2014 at 9:38 AM, Andrew Purtell  wrote:
> A Java 8 runtime would also offer transparent performance improvements like
> a reimplementation of ConcurrentSkipListMap, C2 support for AES cipher
> acceleration with native CPU instructions, perf improvements for going from
> String to byte[] or vice versa, and IIRC after 8u20 monitor lock elision
> using restricted transactional memory with hardware support (if available).
> Getting away from fully transparent changes but tractable to deal with
> using reflection, removal of the permanent generation, support for AEAD
> cipher modes like AES-GCM, stronger cipher and key exchange algorithms, TLS
> 1.2, support for some krb 5 features not handled previously.
>
>
>
> On Tue, Apr 8, 2014 at 7:44 PM, Raymie Stata  wrote:
>
>> > It might make sense to try to enumerate the benefits of switching to
>> > Java7 APIs and dependencies.
>>
>>   - Java7 introduced a huge number of language, byte-code, API, and
>> tooling enhancements!  Just to name a few: try-with-resources, newer
>> and stronger encryption methods, more scalable concurrency primitives.
>>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>>
>>   - We can't update current dependencies, and we can't add cool new ones.
>>
>>   - Putting language/APIs aside, don't forget that a huge amount of effort
>> goes into qualifying for Java6 (at least, I hope the folks claiming to
>> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
>> users/customers be better served if qualification effort went into
>> Java7/8 versus Java6/7?
>>
>> Getting to Java7 as a development env (and Java8 as a runtime env)
>> seems like a no-brainer.  Question is: How?
>>
>> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza wrote:
>> > It might make sense to try to enumerate the benefits of switching to Java7
>> > APIs and dependencies.  IMO, the ones listed so far on this thread don't
>> > make a compelling enough case to drop Java6 in branch-2 on any time frame,
>> > even if this means supporting Java6 through 2015.  For example, the change
>> > in RawLocalFileSystem semantics might be an incompatible change for
>> > branch-2 any way.
>> >
>

Re: Plans of moving towards JDK7 in trunk

2014-04-09 Thread Eli Collins
I think this thread isn't so much about whether java7, 8 etc features
are valuable, they are useful of course, and we'll want to adopt them,
it's a question of how we adopt them and in which releases.

For the sake of this discussion we should separate the runtime from
the programming APIs. Users are already migrating to the java7 runtime
for most of the reasons listed below (support, performance, bugs,
etc), and the various distributions cert their Hadoop 2 based
distributions on java7.  This gives users many of the benefits of
java7, without forcing users off java6. Ie Hadoop does not need to
switch to the java7 programming APIs to make sure everyone has a
supported runtime.

The question here is really about when Hadoop, and the Hadoop
ecosystem (since adjacent projects often end up in the same classpath)
start using the java7 programming APIs and therefore break
compatibility with java6 runtimes. I think our java6 runtime users
would consider dropping support for their java runtime in an update of
a major release to be an incompatible change (the binaries stop
working on their current jvm). That may be worth it if we can
articulate sufficient value to offset the cost (they have to upgrade
their environment, might make rolling upgrades stop working, etc), but
I've not yet heard an argument that articulates the value relative to
the cost.  Eg upgrading to the java7 APIs allows us to pull in
dependencies with new major versions, but only if those dependencies
don't break compatibility (which is likely given that our classpaths
aren't so isolated), and, realistically, only if the entire Hadoop
stack moves to java7 as well (eg we have to recompile HBase to
generate v1.7 binaries even if they stick on API v1.6). I'm not aware
of a feature, bug etc that really motivates this.
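
To make the "v1.7 binaries" point concrete, here is a minimal sketch of how
the target level shows up in a compiled class: the class-file header carries
a major version (50 for Java 6, 51 for Java 7, 52 for Java 8), and a JVM
refuses to load anything newer than it understands. The class and method
names are hypothetical and not part of Hadoop or HBase.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not part of Hadoop): inspects a class-file header. */
final class ClassFileVersion {

    /** Returns the major version stored in bytes 6-7 of a class file. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        header.getShort();                 // minor version, ignored in this sketch
        return header.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** True if a JVM of the given Java release (6, 7, 8, ...) accepts this major version. */
    static boolean loadableOn(int javaRelease, int major) {
        // Java 6 accepts up to 50, Java 7 up to 51, Java 8 up to 52.
        return major <= javaRelease + 44;
    }

    public static void main(String[] args) {
        // Synthetic header of a class compiled with -target 1.7 (major version 51).
        byte[] java7Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        int major = majorVersion(java7Header);
        System.out.println("major=" + major
                + " java6=" + loadableOn(6, major)
                + " java7=" + loadableOn(7, major));
    }
}
```

A jar built this way fails on a java6 runtime regardless of which APIs the
code calls, which is why the runtime and the programming APIs are worth
discussing separately.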

An alternate approach is to keep the current stable release series
(v2.x) as is, and start using new APIs in trunk (for v3). This will be
a major upgrade for Hadoop and therefore an incompatible change like
this is to be expected (it would be great if this came with additional
changes to better isolate classpaths and dependencies from each
other). It allows us to continue to support multiple types of users
with different branches, vs forcing all users onto a new version. It
of course means that 2.x users will not get the benefits of the new
API, but it's unclear what those benefits are given they can already
get the benefits of adopting the newer java runtimes today.

Thanks,
Eli


On Wed, Apr 9, 2014 at 9:38 AM, Andrew Purtell  wrote:
> A Java 8 runtime would also offer transparent performance improvements like
> a reimplementation of ConcurrentSkipListMap, C2 support for AES cipher
> acceleration with native CPU instructions, perf improvements for going from
> String to byte[] or vice versa, and IIRC after 8u20 monitor lock elision
> using restricted transactional memory with hardware support (if available).
> Getting away from fully transparent changes but tractable to deal with
> using reflection, removal of the permanent generation, support for AEAD
> cipher modes like AES-GCM, stronger cipher and key exchange algorithms, TLS
> 1.2, support for some krb 5 features not handled previously.
>
>
>
> On Tue, Apr 8, 2014 at 7:44 PM, Raymie Stata  wrote:
>
>> > It might make sense to try to enumerate the benefits of switching to
>> > Java7 APIs and dependencies.
>>
>>   - Java7 introduced a huge number of language, byte-code, API, and
>> tooling enhancements!  Just to name a few: try-with-resources, newer
>> and stronger encryption methods, more scalable concurrency primitives.
>>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>>
>>   - We can't update current dependencies, and we can't add cool new ones.
>>
>>   - Putting language/APIs aside, don't forget that a huge amount of effort
>> goes into qualifying for Java6 (at least, I hope the folks claiming to
>> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
>> users/customers be better served if qualification effort went into
>> Java7/8 versus Java6/7?
>>
>> Getting to Java7 as a development env (and Java8 as a runtime env)
>> seems like a no-brainer.  Question is: How?
>>
>> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza wrote:
>> > It might make sense to try to enumerate the benefits of switching to Java7
>> > APIs and dependencies.  IMO, the ones listed so far on this thread don't
>> > make a compelling enough case to drop Java6 in branch-2 on any time frame,
>> > even if this means supporting Java6 through 2015.  For example, the change
>> > in RawLocalFileSystem semantics might be an incompatible change for
>> > branch-2 any way.
>> >
>> >
>> > On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
>> >
>> >> +1 to NOT breaking compatibility in branch-2.
>> >>
>> >> I think it is reasonable to require JDK7 for trunk, if we limit use of
>> >> JDK7-only API to security fixes etc. If we make other optimizations (like
>> >> IO), it would 

Re: Plans of moving towards JDK7 in trunk

2014-04-09 Thread Andrew Purtell
A Java 8 runtime would also offer transparent performance improvements like
a reimplementation of ConcurrentSkipListMap, C2 support for AES cipher
acceleration with native CPU instructions, perf improvements for going from
String to byte[] or vice versa, and IIRC after 8u20 monitor lock elision
using restricted transactional memory with hardware support (if available).
Getting away from fully transparent changes but tractable to deal with
using reflection, removal of the permanent generation, support for AEAD
cipher modes like AES-GCM, stronger cipher and key exchange algorithms, TLS
1.2, support for some krb 5 features not handled previously.
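
As a rough illustration of the "tractable to deal with using reflection"
case, the sketch below probes at runtime for AES-GCM and falls back to an
older mode when it is missing. It uses only calls that exist on Java 6, so
the probing class itself still loads there. The class name, method names and
the CBC fallback are assumptions made for this example, not anything taken
from Hadoop.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;
import javax.crypto.NoSuchPaddingException;

/** Hypothetical probe (not Hadoop code): detects AES-GCM without linking to newer classes. */
final class RuntimeFeatureProbe {

    /** True if both the GCM parameter class and a provider implementation are present. */
    static boolean gcmAvailable() {
        try {
            // Looked up by name so this class does not depend on it at link time.
            Class.forName("javax.crypto.spec.GCMParameterSpec");
            Cipher.getInstance("AES/GCM/NoPadding");
            return true;
        } catch (ClassNotFoundException e) {
            return false; // parameter class missing on this runtime
        } catch (NoSuchAlgorithmException e) {
            return false; // no installed provider implements the mode
        } catch (NoSuchPaddingException e) {
            return false;
        }
    }

    /** Picks the AEAD mode when the runtime offers it, otherwise an assumed CBC fallback. */
    static String preferredTransformation() {
        return gcmAvailable() ? "AES/GCM/NoPadding" : "AES/CBC/PKCS5Padding";
    }

    public static void main(String[] args) {
        System.out.println("Using " + preferredTransformation());
    }
}
```

The separate catch blocks are deliberate: multi-catch is itself a Java 7
language feature, so code that must still compile at source level 1.6 cannot
use it.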



On Tue, Apr 8, 2014 at 7:44 PM, Raymie Stata  wrote:

> > It might make sense to try to enumerate the benefits of switching to
> > Java7 APIs and dependencies.
>
>   - Java7 introduced a huge number of language, byte-code, API, and
> tooling enhancements!  Just to name a few: try-with-resources, newer
> and stronger encryption methods, more scalable concurrency primitives.
>  See http://www.slideshare.net/boulderjug/55-things-in-java-7
>
>   - We can't update current dependencies, and we can't add cool new ones.
>
>   - Putting language/APIs aside, don't forget that a huge amount of effort
> goes into qualifying for Java6 (at least, I hope the folks claiming to
> support Java6 are putting in such an effort :-).  Wouldn't Hadoop
> users/customers be better served if qualification effort went into
> Java7/8 versus Java6/7?
>
> Getting to Java7 as a development env (and Java8 as a runtime env)
> seems like a no-brainer.  Question is: How?
>
> On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza wrote:
> > It might make sense to try to enumerate the benefits of switching to Java7
> > APIs and dependencies.  IMO, the ones listed so far on this thread don't
> > make a compelling enough case to drop Java6 in branch-2 on any time frame,
> > even if this means supporting Java6 through 2015.  For example, the change
> > in RawLocalFileSystem semantics might be an incompatible change for
> > branch-2 any way.
> >
> >
> > On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
> >
> >> +1 to NOT breaking compatibility in branch-2.
> >>
> >> I think it is reasonable to require JDK7 for trunk, if we limit use of
> >> JDK7-only API to security fixes etc. If we make other optimizations (like
> >> IO), it would be a pain to backport things to branch-2. I guess this all
> >> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
> >>
> >>
> >> On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
> >>
> >> > On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
> >> >  wrote:
> >> > >> From: Eli Collins [mailto:e...@cloudera.com]
> >> > >> Sent: Monday, April 07, 2014 11:54 AM
> >> > >>
> >> > >>
> >> > >> IMO we should not drop support for Java 6 in a minor update of a stable
> >> > >> release (v2).  I don't think the larger Hadoop user base would find it
> >> > >> acceptable that upgrading to a minor update caused their systems to stop
> >> > >> working because they didn't upgrade Java. There are people still getting
> >> > >> support for Java 6. ...
> >> > >>
> >> > >> Thanks,
> >> > >> Eli
> >> > >
> >> > > Hi Eli,
> >> > >
> >> > > Technically you are correct those with extended support get critical
> >> > > security fixes for 6 until the end of 2016. I am curious whether many of
> >> > > those are in the Hadoop user base. Do you know? My guess is the vast
> >> > > majority are within Oracle's official public end of life, which was over 12
> >> > > months ago. Even Premier support ended Dec 2013:
> >> > >
> >> > > http://www.oracle.com/technetwork/java/eol-135779.html
> >> > >
> >> > > The end of Java 6 support carries much risk. It has to be considered in
> >> > > terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
> >> > > score 10.0.
> >> > >
> >> > > http://www.cvedetails.com/cve/CVE-2013-2465/
> >> > >
> >> > > Since you mentioned "caused systems to stop" as an example of what would
> >> > > be a concern to Hadoop users, please note the CVE-2013-2465 availability
> >> > > impact:
> >> > >
> >> > > "Complete (There is a total shutdown of the affected resource. The
> >> > > attacker can render the resource completely unavailable.)"
> >> > >
> >> > > This vulnerability was patched in Java 6 Update 51, but post end of
> >> > > life. Apple pushed out the update specifically because of this
> >> > > vulnerability (http://support.apple.com/kb/HT5717) as did some other
> >> > > vendors privately, but for the majority of people using Java 6 means they
> >> > > have a ticking time bomb.
> >> > >
> >> > > Allowing it to stay should be considered in terms of accepting the whole
> >> > > risk posture.
> >> > >
> >> >
> >> > There are some who get extended support, but I suspect many just have
> >> > an if-it's-not-broke mentality when it comes to production deployments.
> >> > The current code supports both java6 and java7 and so all

Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Raymie Stata
> It might make sense to try to enumerate the benefits of switching to
> Java7 APIs and dependencies.

  - Java7 introduced a huge number of language, byte-code, API, and
tooling enhancements!  Just to name a few: try-with-resources, newer
and stronger encryption methods, more scalable concurrency primitives.
 See http://www.slideshare.net/boulderjug/55-things-in-java-7

  - We can't update current dependencies, and we can't add cool new ones.

  - Putting language/APIs aside, don't forget that a huge amount of effort
goes into qualifying for Java6 (at least, I hope the folks claiming to
support Java6 are putting in such an effort :-).  Wouldn't Hadoop
users/customers be better served if qualification effort went into
Java7/8 versus Java6/7?

Getting to Java7 as a development env (and Java8 as a runtime env)
seems like a no-brainer.  Question is: How?
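
For readers who have not used try-with-resources, a minimal sketch of the
difference follows: the same helper written in the Java 6 style and in the
Java 7 style. The class and method names are hypothetical and chosen only for
this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical example (not Hadoop code): the same helper in Java 6 and Java 7 style. */
final class FirstLine {

    /** Java 6 style: the reader must be closed by hand in a finally block. */
    static String java6Style(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            try {
                return reader.readLine();
            } finally {
                reader.close(); // runs on both the normal and the exceptional path
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Java 7 style: the resource declared in the try header is closed automatically. */
    static String java7Style(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "alpha\nbeta";
        System.out.println(java6Style(text));
        System.out.println(java7Style(text));
    }
}
```

Besides being shorter, the Java 7 form keeps the original exception when both
the body and close() fail, recording the close() failure as a suppressed
exception, whereas the hand-written finally block lets the close() failure
replace it.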

On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza  wrote:
> It might make sense to try to enumerate the benefits of switching to Java7
> APIs and dependencies.  IMO, the ones listed so far on this thread don't
> make a compelling enough case to drop Java6 in branch-2 on any time frame,
> even if this means supporting Java6 through 2015.  For example, the change
> in RawLocalFileSystem semantics might be an incompatible change for
> branch-2 any way.
>
>
> On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:
>
>> +1 to NOT breaking compatibility in branch-2.
>>
>> I think it is reasonable to require JDK7 for trunk, if we limit use of
>> JDK7-only API to security fixes etc. If we make other optimizations (like
>> IO), it would be a pain to backport things to branch-2. I guess this all
>> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
>>
>>
>> On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
>>
>> > On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
>> >  wrote:
>> > >> From: Eli Collins [mailto:e...@cloudera.com]
>> > >> Sent: Monday, April 07, 2014 11:54 AM
>> > >>
>> > >>
>> > >> IMO we should not drop support for Java 6 in a minor update of a stable
>> > >> release (v2).  I don't think the larger Hadoop user base would find it
>> > >> acceptable that upgrading to a minor update caused their systems to stop
>> > >> working because they didn't upgrade Java. There are people still getting
>> > >> support for Java 6. ...
>> > >>
>> > >> Thanks,
>> > >> Eli
>> > >
>> > > Hi Eli,
>> > >
>> > > Technically you are correct those with extended support get critical
>> > > security fixes for 6 until the end of 2016. I am curious whether many of
>> > > those are in the Hadoop user base. Do you know? My guess is the vast
>> > > majority are within Oracle's official public end of life, which was over 12
>> > > months ago. Even Premier support ended Dec 2013:
>> > >
>> > > http://www.oracle.com/technetwork/java/eol-135779.html
>> > >
>> > > The end of Java 6 support carries much risk. It has to be considered in
>> > > terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
>> > > score 10.0.
>> > >
>> > > http://www.cvedetails.com/cve/CVE-2013-2465/
>> > >
>> > > Since you mentioned "caused systems to stop" as an example of what would
>> > > be a concern to Hadoop users, please note the CVE-2013-2465 availability
>> > > impact:
>> > >
>> > > "Complete (There is a total shutdown of the affected resource. The
>> > > attacker can render the resource completely unavailable.)"
>> > >
>> > > This vulnerability was patched in Java 6 Update 51, but post end of
>> > > life. Apple pushed out the update specifically because of this
>> > > vulnerability (http://support.apple.com/kb/HT5717) as did some other
>> > > vendors privately, but for the majority of people using Java 6 means they
>> > > have a ticking time bomb.
>> > >
>> > > Allowing it to stay should be considered in terms of accepting the whole
>> > > risk posture.
>> > >
>> >
>> > There are some who get extended support, but I suspect many just have
>> > an if-it's-not-broke mentality when it comes to production deployments.
>> > The current code supports both java6 and java7 and so allows these
>> > people to remain compatible, while enabling others to upgrade to the
>> > java7 runtime. This seems like the right compromise for a stable
>> > release series. Again, absolutely makes sense for trunk (ie v3) to
>> > require java7 or greater.
>> >
>>


Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Sandy Ryza
It might make sense to try to enumerate the benefits of switching to Java7
APIs and dependencies.  IMO, the ones listed so far on this thread don't
make a compelling enough case to drop Java6 in branch-2 on any time frame,
even if this means supporting Java6 through 2015.  For example, the change
in RawLocalFileSystem semantics might be an incompatible change for
branch-2 any way.


On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla wrote:

> +1 to NOT breaking compatibility in branch-2.
>
> I think it is reasonable to require JDK7 for trunk, if we limit use of
> JDK7-only API to security fixes etc. If we make other optimizations (like
> IO), it would be a pain to backport things to branch-2. I guess this all
> depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
>
>
> On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
>
> > On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
> >  wrote:
> > >> From: Eli Collins [mailto:e...@cloudera.com]
> > >> Sent: Monday, April 07, 2014 11:54 AM
> > >>
> > >>
> > >> IMO we should not drop support for Java 6 in a minor update of a stable
> > >> release (v2).  I don't think the larger Hadoop user base would find it
> > >> acceptable that upgrading to a minor update caused their systems to stop
> > >> working because they didn't upgrade Java. There are people still getting
> > >> support for Java 6. ...
> > >>
> > >> Thanks,
> > >> Eli
> > >
> > > Hi Eli,
> > >
> > > Technically you are correct those with extended support get critical
> > > security fixes for 6 until the end of 2016. I am curious whether many of
> > > those are in the Hadoop user base. Do you know? My guess is the vast
> > > majority are within Oracle's official public end of life, which was over 12
> > > months ago. Even Premier support ended Dec 2013:
> > >
> > > http://www.oracle.com/technetwork/java/eol-135779.html
> > >
> > > The end of Java 6 support carries much risk. It has to be considered in
> > > terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
> > > score 10.0.
> > >
> > > http://www.cvedetails.com/cve/CVE-2013-2465/
> > >
> > > Since you mentioned "caused systems to stop" as an example of what would
> > > be a concern to Hadoop users, please note the CVE-2013-2465 availability
> > > impact:
> > >
> > > "Complete (There is a total shutdown of the affected resource. The
> > > attacker can render the resource completely unavailable.)"
> > >
> > > This vulnerability was patched in Java 6 Update 51, but post end of
> > > life. Apple pushed out the update specifically because of this
> > > vulnerability (http://support.apple.com/kb/HT5717) as did some other
> > > vendors privately, but for the majority of people using Java 6 means they
> > > have a ticking time bomb.
> > >
> > > Allowing it to stay should be considered in terms of accepting the whole
> > > risk posture.
> > >
> >
> > There are some who get extended support, but I suspect many just have
> > a if-it's-not-broke mentality when it comes to production deployments.
> > The current code supports both java6 and java7 and so allows these
> > people to remain compatible, while enabling others to upgrade to the
> > java7 runtime. This seems like the right compromise for a stable
> > release series. Again, absolutely makes sense for trunk (ie v3) to
> > require java7 or greater.
> >
>


Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Raymie Stata
Is there broad consensus that, by end of 3Q2014 at the latest, "the
average" contributor to Hadoop should be free to use Java7 features?
And start pulling in libraries that have a Java7 dependency?  And
start doing the "janitorial" work of taking advantage of the Java7
APIs?  Or do we think that the bulk of Hadoop work will be done
against Java6 APIs (and avoiding Java7-dependent libraries) through
the end of the year?

If the consensus is that we introduce Java7 into the bulk of Hadoop
coding, what's the plan for getting there?  The answer can't be "right
now, in trunk."  Even if we agreed to start allowing Java7
dependencies into trunk, as a practical matter this isn't enough.
Right now, if I'm a random Hadoop contributor, I'd be stupid to
contribute to trunk: I know that any stable release in the near term
will be from branch2, so if I want a prayer of seeing my change in a
stable release, I'd better contribute to branch2.

If we want a path to allowing Java7 dependencies by Q4, then we need
one of the following:

1) "branch3 plan:" The major Hadoop vendors (you know who you are)
commit to shipping a "v3" of Hadoop in Q4 that allows Java7
dependencies and show signs of living up to that commitment (e.g., a
branch3 is created sometime soon).  This puts us all on a path towards
a "real" release of Hadoop that allows Java7 dependencies.

2) "branch2 plan:" deprecate Java6 as a runtime environment now,
publicly declare a time frame (e.g., 4Q2014) when _future development_
stops supporting Java6 runtime, and work with our customers in the
meantime to get them off a crazy-old version of Java (that's what
we're doing right now).

I don't see another path to allowing Java7 dependencies.  In the
current state of indecision, the smart programmer would be assuming no
Java7 dependencies into 2015.

On the one hand, I don't see the branch3 plan actually happening.
This is a big decision involving marketing, engineering, customer
support.  Plus it creates a problem for sales: Come summertime,
they'll have a hard time selling 2.x-based releases because they've
pre-announced support for 3.x.  It's just not going to happen.

On the other hand, I don't see the problem with the branch2 plan.  The
branch2 plan also requires the commitment from the major vendors, but
this decision is not nearly as galactic.  By the time 3Q2014 comes
along, this problem will be very rarified.  Also, don't forget that it
typically takes a customer 3-6 months to upgrade their Hadoop -- and a
customer who's afraid to shift off Java6 in 3Q2014 will probably take
a year to upgrade.  The branch2 plan implies a last Java6 release of
Hadoop in 3Q2014.  If we assume a Java7-averse customer will take a
year to upgrade to this release -- and then will take another year to
upgrade their cluster after that -- then they can be happily using
Java6 all the way into 2016.  (Another point, if 3Q2014 comes along
and vendors find they have so many customers still on Java6 that they
can't afford the discontinuity, then they can shift their MAJOR
version number of their product to communicate the discontinuity --
there's nothing that says that a vendor's versioning scheme must agree
exactly with Hadoop's.)

In short, we don't currently have a realistic path for introducing
Java7 dependencies into Hadoop.  Simply allowing them into trunk will
NOT solve this problem: any contributor who wants to see their code in
a stable release knows it'll have to flow through branch2 -- and thus
they'll have to avoid Java7 dependencies.  The branch2 plan is the
only plan proposed so far that gets us to Java7 dependencies by Q4.
And the important part of the branch2 plan is we make the decision
soon -- so we have time to notify folks and otherwise work that
decision out into the field.

  Raymie



On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:
> On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
>  wrote:
>>> From: Eli Collins [mailto:e...@cloudera.com]
>>> Sent: Monday, April 07, 2014 11:54 AM
>>>
>>>
>>> IMO we should not drop support for Java 6 in a minor update of a stable
>>> release (v2).  I don't think the larger Hadoop user base would find it
>>> acceptable that upgrading to a minor update caused their systems to stop
>>> working because they didn't upgrade Java. There are people still getting
>>> support for Java 6. ...
>>>
>>> Thanks,
>>> Eli
>>
>> Hi Eli,
>>
>> Technically you are correct those with extended support get critical 
>> security fixes for 6 until the end of 2016. I am curious whether many of 
>> those are in the Hadoop user base. Do you know? My guess is the vast 
>> majority are within Oracle's official public end of life, which was over 12 
>> months ago. Even Premier support ended Dec 2013:
>>
>> http://www.oracle.com/technetwork/java/eol-135779.html
>>
>> The end of Java 6 support carries much risk. It has to be considered in 
>> terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS 
>> score 10.0.
>>
>> http://www.cvedetails.c

Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Karthik Kambatla
+1 to NOT breaking compatibility in branch-2.

I think it is reasonable to require JDK7 for trunk, if we limit use of
JDK7-only API to security fixes etc. If we make other optimizations (like
IO), it would be a pain to backport things to branch-2. I guess this all
depends on when we see ourselves shipping Hadoop-3. Any ideas on that?


On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins  wrote:

> On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
>  wrote:
> >> From: Eli Collins [mailto:e...@cloudera.com]
> >> Sent: Monday, April 07, 2014 11:54 AM
> >>
> >>
> >> IMO we should not drop support for Java 6 in a minor update of a stable
> >> release (v2).  I don't think the larger Hadoop user base would find it
> >> acceptable that upgrading to a minor update caused their systems to stop
> >> working because they didn't upgrade Java. There are people still getting
> >> support for Java 6. ...
> >>
> >> Thanks,
> >> Eli
> >
> > Hi Eli,
> >
> > Technically you are correct those with extended support get critical
> security fixes for 6 until the end of 2016. I am curious whether many of
> those are in the Hadoop user base. Do you know? My guess is the vast
> majority are within Oracle's official public end of life, which was over 12
> months ago. Even Premier support ended Dec 2013:
> >
> > http://www.oracle.com/technetwork/java/eol-135779.html
> >
> > The end of Java 6 support carries much risk. It has to be considered in
> terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
> score 10.0.
> >
> > http://www.cvedetails.com/cve/CVE-2013-2465/
> >
> > Since you mentioned "caused systems to stop" as an example of what would
> be a concern to Hadoop users, please note the CVE-2013-2465 availability
> impact:
> >
> > "Complete (There is a total shutdown of the affected resource. The
> attacker can render the resource completely unavailable.)"
> >
> > This vulnerability was patched in Java 6 Update 51, but post end of
> life. Apple pushed out the update specifically because of this
> vulnerability (http://support.apple.com/kb/HT5717) as did some other
> vendors privately, but for the majority of people using Java 6 means they
> have a ticking time bomb.
> >
> > Allowing it to stay should be considered in terms of accepting the whole
> risk posture.
> >
>
> There are some who get extended support, but I suspect many just have
> a if-it's-not-broke mentality when it comes to production deployments.
> The current code supports both java6 and java7 and so allows these
> people to remain compatible, while enabling others to upgrade to the
> java7 runtime. This seems like the right compromise for a stable
> release series. Again, absolutely makes sense for trunk (ie v3) to
> require java7 or greater.
>


Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Eli Collins
On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
 wrote:
>> From: Eli Collins [mailto:e...@cloudera.com]
>> Sent: Monday, April 07, 2014 11:54 AM
>>
>>
>> IMO we should not drop support for Java 6 in a minor update of a stable
>> release (v2).  I don't think the larger Hadoop user base would find it
>> acceptable that upgrading to a minor update caused their systems to stop
>> working because they didn't upgrade Java. There are people still getting
>> support for Java 6. ...
>>
>> Thanks,
>> Eli
>
> Hi Eli,
>
> Technically you are correct those with extended support get critical security 
> fixes for 6 until the end of 2016. I am curious whether many of those are in 
> the Hadoop user base. Do you know? My guess is the vast majority are within 
> Oracle's official public end of life, which was over 12 months ago. Even 
> Premier support ended Dec 2013:
>
> http://www.oracle.com/technetwork/java/eol-135779.html
>
> The end of Java 6 support carries much risk. It has to be considered in terms 
> of serious security vulnerabilities such as CVE-2013-2465 with CVSS score 
> 10.0.
>
> http://www.cvedetails.com/cve/CVE-2013-2465/
>
> Since you mentioned "caused systems to stop" as an example of what would be a 
> concern to Hadoop users, please note the CVE-2013-2465 availability impact:
>
> "Complete (There is a total shutdown of the affected resource. The attacker 
> can render the resource completely unavailable.)"
>
> This vulnerability was patched in Java 6 Update 51, but post end of life. 
> Apple pushed out the update specifically because of this vulnerability 
> (http://support.apple.com/kb/HT5717) as did some other vendors privately, but 
> for the majority of people using Java 6 means they have a ticking time bomb.
>
> Allowing it to stay should be considered in terms of accepting the whole risk 
> posture.
>

There are some who get extended support, but I suspect many just have
an if-it's-not-broke mentality when it comes to production deployments.
The current code supports both java6 and java7 and so allows these
people to remain compatible, while enabling others to upgrade to the
java7 runtime. This seems like the right compromise for a stable
release series. Again, absolutely makes sense for trunk (ie v3) to
require java7 or greater.


Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Steve Loughran
Davi,

If you look at the security issues, they mostly come down to the same
thing: the sandbox isn't secure. Instead of running applets or web
applications in a locked down environment, malicious code can get out and
access private data, manipulate the filesystem, get out on the network, etc.

As a result of sandbox vulnerabilities, Java sandbox attacks are the #1 way
to exploit client machines, with flash 0-days following straight after.

I wouldn't recommend anyone having java 6 on their desktop, and even with
java 7u51 "signed apps only" installed, I'd go to the java properties and
disable applets. Then go to firefox and chrome and disable the java plugin,
before going to IE and changing the ActiveX security policy to "never
download". Next: install flashblock so you don't get flash loading except
on sites you trust, and set your RSS reader up to subscribe to
https://isc.sans.edu/ to get alerts. Because if you don't do that, your
desktops are not secure.

But that has nothing to do with server-side security: people aren't running
sandbox applets in their Java cluster. So that's not the risk. Stability of
running code is more of an issue, and that's where the pressure of patching
java client code to fix 0-day exploits comes into direct conflict with the
need for server stability. Client security holes: fast patch, minimal
testing, ship ASAP. Stable: test for a while and make sure things don't
crash or leak. Hadoop installations  tend to be trailing edge, because the
latter matters more in a hadoop cluster.

And that's where we are today: some people like java6 because it is stable.
Hadoop is tested on it and it works. Hadoop also now appears to work well
on java7 and openjdk7.  I think everyone who can should move to either of
those, as it's where the stability patches go in, it's got lots of
performance improvements -as well as the API and library changes we are
discussing.

What I don't see us doing is telling people who are using branch-2 releases
on java 6 to upgrade on a point release. That just increases the risk of
the upgrade -and may just hold them back from updating hadoop itself.

-steve



If there is an issue with java6, it is "who has it on their machines for
builds"? I don't, but I have one linux VM with Java6 -and another with java
8.


On 8 April 2014 10:00, Ottenheimer, Davi  wrote:

>
>
> Hi Eli,
>
> Technically you are correct those with extended support get critical
> security fixes for 6 until the end of 2016. I am curious whether many of
> those are in the Hadoop user base. Do you know? My guess is the vast
> majority are within Oracle's official public end of life, which was over 12
> months ago. Even Premier support ended Dec 2013:
>
> http://www.oracle.com/technetwork/java/eol-135779.html
>
> The end of Java 6 support carries much risk. It has to be considered in
> terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
> score 10.0.
>
> http://www.cvedetails.com/cve/CVE-2013-2465/
>
> Since you mentioned "caused systems to stop" as an example of what would
> be a concern to Hadoop users, please note the CVE-2013-2465 availability
> impact:
>
> "Complete (There is a total shutdown of the affected resource. The
> attacker can render the resource completely unavailable.)"
>
> This vulnerability was patched in Java 6 Update 51, but post end of life.
> Apple pushed out the update specifically because of this vulnerability (
> http://support.apple.com/kb/HT5717) as did some other vendors privately,
> but for the majority of people using Java 6 means they have a ticking time
> bomb.
>
> Allowing it to stay should be considered in terms of accepting the whole
> risk posture.
>
> Davi
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Sandy Ryza
+1 for maintaining Java 6 support in branch-2.

Hadoop continuing to support Java 6 is not an endorsement of Java 6.  It's
an acknowledgement that many users of Hadoop 2 have Java 6 embedded in
their stack, and that upgrading is costly for some users and simply not an
option for others.  If a similar vulnerability were to be discovered in a
recent version of RHEL, I don't think it would make sense for Hadoop to
drop that version as a supported platform.

Assuming that we want to maintain Java 6 compatibility in branch-2, it
seems to me that we should do the same in trunk until we start seriously
planning a release of Hadoop 3.  Since we released 2.2 GA, trunk has mainly
been used as a staging area for changes that will go into branch-2.  The
larger the divergence between trunk and branch-2, the higher the overhead
for developers writing patches that need to go into both.  Eventually we'll
need to stomach this, but is there an advantage to doing so while Hadoop 3
is still remote?

-Sandy

On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi
wrote:

> > From: Eli Collins [mailto:e...@cloudera.com]
> > Sent: Monday, April 07, 2014 11:54 AM
> >
> >
> > IMO we should not drop support for Java 6 in a minor update of a stable
> > release (v2).  I don't think the larger Hadoop user base would find it
> > acceptable that upgrading to a minor update caused their systems to stop
> > working because they didn't upgrade Java. There are people still getting
> > support for Java 6. ...
> >
> > Thanks,
> > Eli
>
> Hi Eli,
>
> Technically you are correct those with extended support get critical
> security fixes for 6 until the end of 2016. I am curious whether many of
> those are in the Hadoop user base. Do you know? My guess is the vast
> majority are within Oracle's official public end of life, which was over 12
> months ago. Even Premier support ended Dec 2013:
>
> http://www.oracle.com/technetwork/java/eol-135779.html
>
> The end of Java 6 support carries much risk. It has to be considered in
> terms of serious security vulnerabilities such as CVE-2013-2465 with CVSS
> score 10.0.
>
> http://www.cvedetails.com/cve/CVE-2013-2465/
>
> Since you mentioned "caused systems to stop" as an example of what would
> be a concern to Hadoop users, please note the CVE-2013-2465 availability
> impact:
>
> "Complete (There is a total shutdown of the affected resource. The
> attacker can render the resource completely unavailable.)"
>
> This vulnerability was patched in Java 6 Update 51, but post end of life.
> Apple pushed out the update specifically because of this vulnerability (
> http://support.apple.com/kb/HT5717) as did some other vendors privately,
> but for the majority of people using Java 6 means they have a ticking time
> bomb.
>
> Allowing it to stay should be considered in terms of accepting the whole
> risk posture.
>
> Davi
>


RE: Plans of moving towards JDK7 in trunk

2014-04-08 Thread Ottenheimer, Davi
> From: Eli Collins [mailto:e...@cloudera.com]
> Sent: Monday, April 07, 2014 11:54 AM
> 
> 
> IMO we should not drop support for Java 6 in a minor update of a stable
> release (v2).  I don't think the larger Hadoop user base would find it
> acceptable that upgrading to a minor update caused their systems to stop
> working because they didn't upgrade Java. There are people still getting
> support for Java 6. ...
> 
> Thanks,
> Eli

Hi Eli, 

Technically you are correct: those with extended support get critical security 
fixes for 6 until the end of 2016. I am curious whether many of those are in 
the Hadoop user base. Do you know? My guess is the vast majority are within 
Oracle's official public end of life, which was over 12 months ago. Even 
Premier support ended Dec 2013:

http://www.oracle.com/technetwork/java/eol-135779.html

The end of Java 6 support carries much risk. It has to be considered in terms 
of serious security vulnerabilities such as CVE-2013-2465 with CVSS score 10.0. 

http://www.cvedetails.com/cve/CVE-2013-2465/

Since you mentioned "caused systems to stop" as an example of what would be a 
concern to Hadoop users, please note the CVE-2013-2465 availability impact:

"Complete (There is a total shutdown of the affected resource. The attacker can 
render the resource completely unavailable.)"

This vulnerability was patched in Java 6 Update 51, but post end of life. Apple 
pushed out the update specifically because of this vulnerability 
(http://support.apple.com/kb/HT5717), as did some other vendors privately, but 
for the majority of people, using Java 6 means they have a ticking time bomb. 

Allowing it to stay should be considered in terms of accepting the whole risk 
posture.

Davi


Re: Plans of moving towards JDK7 in trunk

2014-04-07 Thread Eli Collins
On Sat, Apr 5, 2014 at 12:54 PM, Raymie Stata  wrote:
> To summarize the thread so far:
>
> a) Java7 is already a supported compile- and runtime environment for
> Hadoop branch2 and trunk
> b) Java6 must remain a supported compile- and runtime environment for
> Hadoop branch2
> c) (b) implies that branch2 must stick to Java6 APIs
>
> I wonder if point (b) should be revised.  We could immediately
> deprecate Java6 as a runtime (and thus compile-time) environment for
> Hadoop.  We could end support for in some published time frame
> (perhaps 3Q2014).  That is, we'd say that all future 2.x release past
> some date would not be guaranteed to run on Java6.  This would set us
> up for using Java7 APIs into branch2.

IMO we should not drop support for Java 6 in a minor update of a
stable release (v2).  I don't think the larger Hadoop user base would
find it acceptable that upgrading to a minor update caused their
systems to stop working because they didn't upgrade Java. There are
people still getting support for Java 6. For the same reason, the
various distributions will not want to drop support in a minor update
of their products either, and since distros are using the Apache v2.x
update releases as the basis for their updates it would mean they have
to stop shipping v2.x updates, which makes it harder to collaborate
upstream.

Your point with regard to testing and releasing trunk is valid, though
we need to address that anyway, outside the context of Java versions.

Thanks,
Eli


Re: Plans of moving towards JDK7 in trunk

2014-04-07 Thread Haohui Mai
It looks to me that the majority of this thread welcomes JDK7. Just to
reiterate, there are two separate questions here:

1. When should hadoop-trunk become buildable only on JDK7?
2. When should hadoop-branch-2 become buildable only on JDK7?

The answers to these questions directly determine when and how hadoop can
break compatibility with the JDK6 runtime.

There appear to be quite a few compatibility concerns about question
(2). Should we focus on question (1) and come up with a plan? Personally
I'd love to see (1) happen as soon as possible.

~Haohui

On Sun, Apr 6, 2014 at 11:37 AM, Steve Loughran wrote:

> On 5 April 2014 20:54, Raymie Stata  wrote:
>
> > To summarize the thread so far:
> >
> > a) Java7 is already a supported compile- and runtime environment for
> > Hadoop branch2 and trunk
> > b) Java6 must remain a supported compile- and runtime environment for
> > Hadoop branch2
> > c) (b) implies that branch2 must stick to Java6 APIs
> >
> > I wonder if point (b) should be revised.  We could immediately
> > deprecate Java6 as a runtime (and thus compile-time) environment for
> > Hadoop.  We could end support for in some published time frame
> > (perhaps 3Q2014).  That is, we'd say that all future 2.x release past
> > some date would not be guaranteed to run on Java6.  This would set us
> > up for using Java7 APIs into branch2.
> >
>
> I'll let others deal with that question.
>
>
> >
> > An alternative might be to keep branch2 on Java6 APIs forever, and to
> > start using Java7 APIs in trunk relatively soon.  The concern here
> > would be that trunk isn't getting the kind of production torture
> > testing that branch2 is subjected to, and won't be for a while.  If
> > trunk and branch2 diverge too much too quickly, trunk could become a
> > nest of bugs, endangering the timeline and quality of Hadoop 3.  This
> > would argue for keeping trunk and branch2 in closer sync (maybe until
> > a branch3 is created and starts getting used by bleeding-edge users).
> > However, as just suggested, keeping them in closer sync need _not_
> > imply that Java7 features be avoided indefinitely: again, with
> > sufficient warning, Java6 support could be sunset within branch2.
> >
>
> One thing we could do is have a policy towards new features where there's
> consensus that they won't go into branch-2, especially things in their own
> JARs.
>
> Here we could consider a policy of build set up to be Java 7 only, with
> Java7 APIs.
>
> That would be justified if there was some special java 7 feature -such as
> its infiniband support. Add a feature like that in its own module (under
> hadoop-tools, presumably), and use Java7 and Java 7 libraries. If someone
> really did want to use the feature in hadoop-2, they'd be able to, in a
> java7+ only backport.
>
>
> >
> > On a related note, Steve points out that we need to start thinking
> > about Java8.  YES!!  Lambdas are a Really Big Deal!  If we sunset
> > Java6 in a few quarters, maybe we can add Java8 compile and runtime
> > (but not API) support about the same time.  This does NOT imply
> > bringing Java8 APIs into branch2: Even if we do allow Java7 APIs into
> > branch2 in the future, I doubt that bringing Java8 APIs into it will
> > ever make sense.  However, if Java8 is a supported runtime environment
> > for Hadoop 2, that sets us up for using Java8 APIs for the eventual
> > branch3 sometime in 2015.
> >
> >
> Testing Hadoop on Java 8 would let the rest of the stack move forward.
>
> In the meantime, can I point out that both Scala-on-Java7 and
> Groovy-on-Java 7 offer closures quite nicely, with performance by way of
> INVOKEDYNAMIC opcodes.
>
> -steve
>



Re: Plans of moving towards JDK7 in trunk

2014-04-06 Thread Steve Loughran
On 5 April 2014 20:54, Raymie Stata  wrote:

> To summarize the thread so far:
>
> a) Java7 is already a supported compile- and runtime environment for
> Hadoop branch2 and trunk
> b) Java6 must remain a supported compile- and runtime environment for
> Hadoop branch2
> c) (b) implies that branch2 must stick to Java6 APIs
>
> I wonder if point (b) should be revised.  We could immediately
> deprecate Java6 as a runtime (and thus compile-time) environment for
> Hadoop.  We could end support for in some published time frame
> (perhaps 3Q2014).  That is, we'd say that all future 2.x release past
> some date would not be guaranteed to run on Java6.  This would set us
> up for using Java7 APIs into branch2.
>

I'll let others deal with that question.


>
> An alternative might be to keep branch2 on Java6 APIs forever, and to
> start using Java7 APIs in trunk relatively soon.  The concern here
> would be that trunk isn't getting the kind of production torture
> testing that branch2 is subjected to, and won't be for a while.  If
> trunk and branch2 diverge too much too quickly, trunk could become a
> nest of bugs, endangering the timeline and quality of Hadoop 3.  This
> would argue for keeping trunk and branch2 in closer sync (maybe until
> a branch3 is created and starts getting used by bleeding-edge users).
> However, as just suggested, keeping them in closer sync need _not_
> imply that Java7 features be avoided indefinitely: again, with
> sufficient warning, Java6 support could be sunset within branch2.
>

One thing we could do is have a policy towards new features where there's
consensus that they won't go into branch-2, especially things in their own
JARs.

Here we could consider a policy of build set up to be Java 7 only, with
Java7 APIs.

That would be justified if there was some special java 7 feature -such as
its infiniband support. Add a feature like that in its own module (under
hadoop-tools, presumably), and use Java7 and Java 7 libraries. If someone
really did want to use the feature in hadoop-2, they'd be able to, in a
java7+ only backport.


>
> On a related note, Steve points out that we need to start thinking
> about Java8.  YES!!  Lambdas are a Really Big Deal!  If we sunset
> Java6 in a few quarters, maybe we can add Java8 compile and runtime
> (but not API) support about the same time.  This does NOT imply
> bringing Java8 APIs into branch2: Even if we do allow Java7 APIs into
> branch2 in the future, I doubt that bringing Java8 APIs into it will
> ever make sense.  However, if Java8 is a supported runtime environment
> for Hadoop 2, that sets us up for using Java8 APIs for the eventual
> branch3 sometime in 2015.
>
>
Testing Hadoop on Java 8 would let the rest of the stack move forward.

In the meantime, can I point out that both Scala-on-Java7 and
Groovy-on-Java 7 offer closures quite nicely, with performance by way of
INVOKEDYNAMIC opcodes.

-steve



Re: Plans of moving towards JDK7 in trunk

2014-04-05 Thread Raymie Stata
To summarize the thread so far:

a) Java7 is already a supported compile- and runtime environment for
Hadoop branch2 and trunk
b) Java6 must remain a supported compile- and runtime environment for
Hadoop branch2
c) (b) implies that branch2 must stick to Java6 APIs

I wonder if point (b) should be revised.  We could immediately
deprecate Java6 as a runtime (and thus compile-time) environment for
Hadoop.  We could end support for it in some published time frame
(perhaps 3Q2014).  That is, we'd say that all future 2.x releases past
some date would not be guaranteed to run on Java6.  This would set us
up for using Java7 APIs into branch2.
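
As a rough illustration of what "deprecating Java6 as a runtime" could look like in practice, here is a minimal sketch of a startup check that warns when the JVM is older than Java7. This is only a sketch under stated assumptions: the class name, the MIN_SUPPORTED constant, and the warning text are all hypothetical and nothing like this is claimed to exist in Hadoop. The only real API used is the standard java.specification.version system property.

```java
/** Hypothetical helper; not part of Hadoop. */
final class JavaRuntimeCheck {

    /** Minimum non-deprecated major version under the assumed policy. */
    static final int MIN_SUPPORTED = 7;

    /** Maps "1.6" to 6, "1.7" to 7, "1.8" to 8, and "9" or "11" to 9 or 11. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Old scheme: the major version is the second component.
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True when the given runtime falls below the assumed minimum. */
    static boolean isDeprecated(String specVersion) {
        return majorVersion(specVersion) < MIN_SUPPORTED;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        if (isDeprecated(spec)) {
            System.err.println("WARNING: Java " + spec
                + " is deprecated as a runtime; please plan an upgrade.");
        } else {
            System.out.println("Java " + spec + " is a supported runtime.");
        }
    }
}
```

A warning of this kind changes nothing about compatibility by itself; it only gives users notice ahead of whatever date gets published.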

An alternative might be to keep branch2 on Java6 APIs forever, and to
start using Java7 APIs in trunk relatively soon.  The concern here
would be that trunk isn't getting the kind of production torture
testing that branch2 is subjected to, and won't be for a while.  If
trunk and branch2 diverge too much too quickly, trunk could become a
nest of bugs, endangering the timeline and quality of Hadoop 3.  This
would argue for keeping trunk and branch2 in closer sync (maybe until
a branch3 is created and starts getting used by bleeding-edge users).
However, as just suggested, keeping them in closer sync need _not_
imply that Java7 features be avoided indefinitely: again, with
sufficient warning, Java6 support could be sunset within branch2.

On a related note, Steve points out that we need to start thinking
about Java8.  YES!!  Lambdas are a Really Big Deal!  If we sunset
Java6 in a few quarters, maybe we can add Java8 compile and runtime
(but not API) support about the same time.  This does NOT imply
bringing Java8 APIs into branch2: Even if we do allow Java7 APIs into
branch2 in the future, I doubt that bringing Java8 APIs into it will
ever make sense.  However, if Java8 is a supported runtime environment
for Hadoop 2, that sets us up for using Java8 APIs for the eventual
branch3 sometime in 2015.


On Sat, Apr 5, 2014 at 10:52 AM, Steve Loughran  wrote:
> On 5 April 2014 11:53, Colin McCabe  wrote:
>
>> I've been using JDK7 for Hadoop development for a while now, and I
>> know a lot of other folks have as well.  Correct me if I'm wrong, but
>> what we're talking about here is not "moving towards JDK7" but
>> "breaking compatibility with JDK6."
>>
>
> +1
>
>>
>> There are a lot of good reasons to ditch JDK6.  It would let us use
>> new APIs in JDK7, especially the new file APIs.  It would let us
>> update a few dependencies to newer versions.
>>
>>
> +1
>
>
>
>> I don't like the idea of breaking compatibility with JDK6 in trunk,
>> but not in branch-2.  The traditional reason for putting something in
>> trunk but not in branch-2 is that it is new code that needs some time
>> to prove itself.
>
>
> +1. branch-2 must continue to run on JDK6
>
>
>> This doesn't really apply to incrementing min.jdk--
>> we could do that easily whenever we like.  Meanwhile, if trunk starts
>> accumulating jdk7-only code and dependencies, backports from trunk to
>> branch-2 will become harder and harder over time.
>>
>
>
> I agree, but note that trunk diverges from branch-2 over time anyway -it's
> happening.
>
>
>>
>> Since we make stable releases off of branch-2, and not trunk, I don't
>> see any upside to this.  To be honest, I see only negatives here.
>> More time backporting, more issues that show up only in production
>> (branch-2) and not on dev machines (trunk).
>>
>
>> Maybe it's time to start thinking about what version of branch-2 will
>> drop jdk6 support.  But until there is such a version, I don't think
>> trunk should do it.
>>
>
>
>
>1. Let's assume that branch-2 will never drop JDK6 -clusters are
>committed to it, and saying "JDK updated needed" will simply stop updates.
>2. By the hadoop 3.0 ships -2015?- JDK6 will be EOL, java 8 will be in
>common use, and even JDK7 seen as trailing edge.
>3. JDK7  improves JVM performance: NUMA, nativeIO &c -which you get for
>free -as we're confident its stable there's no reason to not move to it in
>production.
>4. As we update the dependencies on hadoop 3, we'll end up upgrading to
>libraries that are JDK7+ only (jetty!), so JDK6 is implicitly abandoned.
>5. There are new packages and APIs in Java7 which we can adopt to make
>our lives better and development more productive -as well as improving the
>user experience.
>
> as a case in point, java.io.File.mkdirs() says "true if and only if the
> directory was created; false otherwise " , and returns false in either of
> the two cases:
>  -the path resolves to a directory that exists
>  -the path resolves to a file
> think about that, anyone using local filesystems could write code that
> assumes that mkdir()==0 is harmless, because if you apply it more than once
> on a directory it is. But call it on a file and you don't get told its only
> a file until you try to do something under it, and then things stop
> behaving.
>
> In comparison, java.nio.files.F

Re: Plans of moving towards JDK7 in trunk

2014-04-05 Thread Steve Loughran
On 5 April 2014 11:53, Colin McCabe  wrote:

> I've been using JDK7 for Hadoop development for a while now, and I
> know a lot of other folks have as well.  Correct me if I'm wrong, but
> what we're talking about here is not "moving towards JDK7" but
> "breaking compatibility with JDK6."
>

+1

>
> There are a lot of good reasons to ditch JDK6.  It would let us use
> new APIs in JDK7, especially the new file APIs.  It would let us
> update a few dependencies to newer versions.
>
>
+1



> I don't like the idea of breaking compatibility with JDK6 in trunk,
> but not in branch-2.  The traditional reason for putting something in
> trunk but not in branch-2 is that it is new code that needs some time
> to prove itself.


+1. branch-2 must continue to run on JDK6


> This doesn't really apply to incrementing min.jdk--
> we could do that easily whenever we like.  Meanwhile, if trunk starts
> accumulating jdk7-only code and dependencies, backports from trunk to
> branch-2 will become harder and harder over time.
>


I agree, but note that trunk diverges from branch-2 over time anyway -it's
happening.


>
> Since we make stable releases off of branch-2, and not trunk, I don't
> see any upside to this.  To be honest, I see only negatives here.
> More time backporting, more issues that show up only in production
> (branch-2) and not on dev machines (trunk).
>

> Maybe it's time to start thinking about what version of branch-2 will
> drop jdk6 support.  But until there is such a version, I don't think
> trunk should do it.
>



   1. Let's assume that branch-2 will never drop JDK6: clusters are
   committed to it, and saying "JDK update needed" will simply stop updates.
   2. By the time Hadoop 3.0 ships (2015?) JDK6 will be EOL, Java 8 will be in
   common use, and even JDK7 will be seen as trailing edge.
   3. JDK7 improves JVM performance (NUMA, native IO, etc.), which you get for
   free. As we're confident it's stable, there's no reason not to move to it in
   production.
   4. As we update the dependencies on Hadoop 3, we'll end up upgrading to
   libraries that are JDK7+ only (Jetty!), so JDK6 is implicitly abandoned.
   5. There are new packages and APIs in Java7 which we can adopt to make
   our lives better and development more productive, as well as improving the
   user experience.

As a case in point, java.io.File.mkdirs() says "true if and only if the
directory was created; false otherwise", and returns false in either of
these two cases:
 -the path resolves to a directory that exists
 -the path resolves to a file
Think about that: anyone using local filesystems could write code that
assumes a false return from mkdirs() is harmless, because if you apply it
more than once on a directory it is. But call it on a file and you don't get
told it's only a file until you try to do something under it, and then things
stop behaving.

In comparison, java.nio.file.Files differentiates this case by declaring
"FileAlreadyExistsException - if dir exists but is not a directory". Which
is the kind of thing that would make RawLocalFS behave a lot more like
HDFS. Similarly, if we could switch to Files.move(), then the destination
file would stop being overwritten if it existed, so RawLocalFS's rename()
semantics would come closer to HDFS.
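
To make the difference concrete, here is a minimal sketch of the two
behaviours using only JDK classes. The class and method names (MkdirsDemo,
legacyMkdirs, nioMkdirs, nioRename) are made up for illustration and are not
Hadoop code; RawLocalFS itself is not touched here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirsDemo {

    /** Java 6 style: a bare boolean, so "directory exists" and "a file is in the way" look the same. */
    static boolean legacyMkdirs(Path p) {
        return new File(p.toString()).mkdirs();
    }

    /** Java 7 style: true if p is a directory afterwards, false if a non-directory blocks it. */
    static boolean nioMkdirs(Path p) throws IOException {
        try {
            Files.createDirectories(p); // no error if p is already a directory
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;               // p exists but is not a directory
        }
    }

    /** Java 7 style rename: without REPLACE_EXISTING an existing destination is not overwritten. */
    static boolean nioRename(Path src, Path dst) throws IOException {
        try {
            Files.move(src, dst);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("mkdirs-demo");
        Path dir = Files.createDirectory(root.resolve("dir"));
        Path file = Files.createFile(root.resolve("file"));
        Path other = Files.createFile(root.resolve("other"));

        System.out.println("File.mkdirs on existing dir:  " + legacyMkdirs(dir));
        System.out.println("File.mkdirs on existing file: " + legacyMkdirs(file));
        System.out.println("createDirectories on dir:     " + nioMkdirs(dir));
        System.out.println("createDirectories on file:    " + nioMkdirs(file));
        System.out.println("move onto existing file:      " + nioRename(file, other));
    }
}
```

The old API gives the same answer for both paths, while the new one lets the
caller tell them apart. Whether a local filesystem wrapper should turn the
exception into a boolean or rethrow it is a design choice left open here.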

These are things we just can't do while retaining Java 6 compatibility,
which is why I am looking forward to the time when we can stop caring about
Java 6.

Now, assuming that Hadoop 3.x will be Java7+ only, we have the option
between now and its future ship date to move to those Java7 APIs. So when
to make the move?

   1. It can be done late -in which case few changes will happen, nobody
   sees much benefit.
   2. We can do it now, and have 12+ months to adopt the new features, make
   the move -and be set up for Java 8 migration in later versions.

Yes, code that uses the new APIs won't work on Java6, but that doesn't mean
it shouldn't happen. Hadoop made the jump from Java 5 to Java 6, after all.

-Steve



Re: Plans of moving towards JDK7 in trunk

2014-04-05 Thread Colin McCabe
I've been using JDK7 for Hadoop development for a while now, and I
know a lot of other folks have as well.  Correct me if I'm wrong, but
what we're talking about here is not "moving towards JDK7" but
"breaking compatibility with JDK6."

There are a lot of good reasons to ditch JDK6.  It would let us use
new APIs in JDK7, especially the new file APIs.  It would let us
update a few dependencies to newer versions.

I don't like the idea of breaking compatibility with JDK6 in trunk,
but not in branch-2.  The traditional reason for putting something in
trunk but not in branch-2 is that it is new code that needs some time
to prove itself.  This doesn't really apply to incrementing min.jdk--
we could do that easily whenever we like.  Meanwhile, if trunk starts
accumulating jdk7-only code and dependencies, backports from trunk to
branch-2 will become harder and harder over time.

Since we make stable releases off of branch-2, and not trunk, I don't
see any upside to this.  To be honest, I see only negatives here.
More time backporting, more issues that show up only in production
(branch-2) and not on dev machines (trunk).

Maybe it's time to start thinking about what version of branch-2 will
drop jdk6 support.  But until there is such a version, I don't think
trunk should do it.

best,
Colin


On Fri, Apr 4, 2014 at 3:15 PM, Haohui Mai  wrote:
> I'm referring to the later case. Indeed migrating JDK7 for branch-2 is more
> difficult.
>
> I think one reasonable approach is to put the hdfs / yarn clients into
> separate jars. The client-side jars can only use JDK6 APIs, so that
> downstream projects running on top of JDK6 continue to work.
>
> The HDFS/YARN/MR servers need to be run on top of JDK7, and we're free to
> use JDK7 APIs inside them. Given the fact that there're way more code in
> the server-side compared to the client-side, having the ability to use JDK7
> in the server-side only might still be a win.
>
> The downside I can think of is that it might complicate the effort of
> publishing maven jars, but this should be an one-time issue.
>
> ~Haohui
>
>
> On Fri, Apr 4, 2014 at 2:37 PM, Alejandro Abdelnur wrote:
>
>> Haohui,
>>
>> Is the idea to compile/test with JDK7 and recommend it for runtime and stop
>> there? Or to start using JDK7 API stuff as well? If the later is the case,
>> then backporting stuff to branch-2 may break and patches may have to be
>> refactored for JDK6. Given that branch-2 got GA status not so long ago, I
>> assume it will be active for a while.
>>
>> What are your thoughts on this regard?
>>
>> Thanks
>>
>>
>> On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai  wrote:
>>
>> > Hi,
>> >
>> > There have been multiple discussions on deprecating supports of JDK6 and
>> > moving towards JDK7. It looks to me that the consensus is that now hadoop
>> > is ready to drop the support of JDK6 and to move towards JDK7. Based on
>> the
>> > consensus, I wonder whether it is a good time to start the migration.
>> >
>> > Here are my understandings of the current status:
>> >
>> > 1. There is no more public updates of JDK6 since Feb 2013. Users no
>> longer
>> > get fixes of security vulnerabilities through official public updates.
>> > 2. Hadoop core is stuck with out-of-date dependency unless moving towards
>> > JDK7. (see
>> > http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
>> > The implementation can also benefit from it thanks to the new
>> > functionalities in JDK7.
>> > 3. The code is ready for JDK7. Cloudera and Hortonworks have successful
>> > stories of supporting Hadoop on JDK7.
>> >
>> >
>> > It seems that the real work of moving to JDK7 is minimal. We only need to
>> > (1) make sure the jenkins are running on top of JDK7, and (2) to update
>> the
>> > minimum required Java version from 6 to 7. Therefore I propose that let's
>> > move towards JDK7 in trunk in the short term.
>> >
>> > Your feedbacks are appreciated.
>> >
>> > Regards,
>> > Haohui
>> >
>> >
>>
>>
>>
>> --
>> Alejandro
>>
>

Re: Plans of moving towards JDK7 in trunk

2014-04-04 Thread Alejandro Abdelnur
So, you want to compile hdfs/yarn/mapred clients (and hadoop-common and
hadoop-auth) with JDK6 and the rest with JDK7?


On Fri, Apr 4, 2014 at 3:15 PM, Haohui Mai  wrote:

> I'm referring to the later case. Indeed migrating JDK7 for branch-2 is more
> difficult.
>
> I think one reasonable approach is to put the hdfs / yarn clients into
> separate jars. The client-side jars can only use JDK6 APIs, so that
> downstream projects running on top of JDK6 continue to work.
>
> The HDFS/YARN/MR servers need to be run on top of JDK7, and we're free to
> use JDK7 APIs inside them. Given the fact that there're way more code in
> the server-side compared to the client-side, having the ability to use JDK7
> in the server-side only might still be a win.
>
> The downside I can think of is that it might complicate the effort of
> publishing maven jars, but this should be an one-time issue.
>
> ~Haohui
>
>
> On Fri, Apr 4, 2014 at 2:37 PM, Alejandro Abdelnur  >wrote:
>
> > Haohui,
> >
> > Is the idea to compile/test with JDK7 and recommend it for runtime and
> stop
> > there? Or to start using JDK7 API stuff as well? If the later is the
> case,
> > then backporting stuff to branch-2 may break and patches may have to be
> > refactored for JDK6. Given that branch-2 got GA status not so long ago, I
> > assume it will be active for a while.
> >
> > What are your thoughts on this regard?
> >
> > Thanks
> >
> >
> > On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai  wrote:
> >
> > > Hi,
> > >
> > > There have been multiple discussions on deprecating supports of JDK6
> and
> > > moving towards JDK7. It looks to me that the consensus is that now
> hadoop
> > > is ready to drop the support of JDK6 and to move towards JDK7. Based on
> > the
> > > consensus, I wonder whether it is a good time to start the migration.
> > >
> > > Here are my understandings of the current status:
> > >
> > > 1. There is no more public updates of JDK6 since Feb 2013. Users no
> > longer
> > > get fixes of security vulnerabilities through official public updates.
> > > 2. Hadoop core is stuck with out-of-date dependency unless moving
> towards
> > > JDK7. (see
> > > http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
> > > The implementation can also benefit from it thanks to the new
> > > functionalities in JDK7.
> > > 3. The code is ready for JDK7. Cloudera and Hortonworks have successful
> > > stories of supporting Hadoop on JDK7.
> > >
> > >
> > > It seems that the real work of moving to JDK7 is minimal. We only need
> to
> > > (1) make sure the jenkins are running on top of JDK7, and (2) to update
> > the
> > > minimum required Java version from 6 to 7. Therefore I propose that
> let's
> > > move towards JDK7 in trunk in the short term.
> > >
> > > Your feedbacks are appreciated.
> > >
> > > Regards,
> > > Haohui
> > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>
>



-- 
Alejandro


Re: Plans of moving towards JDK7 in trunk

2014-04-04 Thread Haohui Mai
bq. It might not be as clear cut...

Totally agree. I think the key is that we can do the work incrementally.
We can introduce the JDK7 dependency on the server side only. In order to
do this we need to split the client-side code into separate jars. I've
already proposed creating an hdfs-client jar on the hdfs-dev mailing list.

bq.  I would have thought it could be easily achieved by marking certain
project poms with source/target 1.6 in their maven compiler plugin
configuration while upgrading the default setting to 1.7. Do you anticipate
more issues?

Correct me if I'm wrong, but I think that's enough. The work should be
minimal.

~Haohui

On Fri, Apr 4, 2014 at 3:43 PM, Sangjin Lee  wrote:

> Please don't forget the mac os build on JDK 7. :)
>
>
> On Fri, Apr 4, 2014 at 3:15 PM, Haohui Mai  wrote:
>
> > I'm referring to the later case. Indeed migrating JDK7 for branch-2 is
> more
> > difficult.
> >
> > I think one reasonable approach is to put the hdfs / yarn clients into
> > separate jars. The client-side jars can only use JDK6 APIs, so that
> > downstream projects running on top of JDK6 continue to work.
> >
>
> It might not be as clear cut. For clients to run clean on JDK 6, not only
> the client projects/artifacts but also any of their dependencies must be
> free of JDK 7 code. And this obviously includes things like hadoop-common
> (or any downstream dependencies for that matter).
>
>
> >
> > The HDFS/YARN/MR servers need to be run on top of JDK7, and we're free to
> > use JDK7 APIs inside them. Given the fact that there're way more code in
> > the server-side compared to the client-side, having the ability to use
> JDK7
> > in the server-side only might still be a win.
> >
> > The downside I can think of is that it might complicate the effort of
> > publishing maven jars, but this should be an one-time issue.
> >
>
> Could you elaborate on why it would complicate maven jar publication?
> Perhaps I'm over-simplifying things, but I would have thought it could be
> easily achieved by marking certain project poms with source/target 1.6 in
> their maven compiler plugin configuration while upgrading the default
> setting to 1.7. Do you anticipate more issues?
>
>
> >
> > ~Haohui
> >
> >
> > On Fri, Apr 4, 2014 at 2:37 PM, Alejandro Abdelnur  > >wrote:
> >
> > > Haohui,
> > >
> > > Is the idea to compile/test with JDK7 and recommend it for runtime and
> > stop
> > > there? Or to start using JDK7 API stuff as well? If the later is the
> > case,
> > > then backporting stuff to branch-2 may break and patches may have to be
> > > refactored for JDK6. Given that branch-2 got GA status not so long
> ago, I
> > > assume it will be active for a while.
> > >
> > > What are your thoughts on this regard?
> > >
> > > Thanks
> > >
> > >
> > > On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai 
> wrote:
> > >
> > > > Hi,
> > > >
> > > > There have been multiple discussions on deprecating supports of JDK6
> > and
> > > > moving towards JDK7. It looks to me that the consensus is that now
> > hadoop
> > > > is ready to drop the support of JDK6 and to move towards JDK7. Based
> on
> > > the
> > > > consensus, I wonder whether it is a good time to start the migration.
> > > >
> > > > Here are my understandings of the current status:
> > > >
> > > > 1. There is no more public updates of JDK6 since Feb 2013. Users no
> > > longer
> > > > get fixes of security vulnerabilities through official public
> updates.
> > > > 2. Hadoop core is stuck with out-of-date dependency unless moving
> > towards
> > > > JDK7. (see
> > > > http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
> > > > The implementation can also benefit from it thanks to the new
> > > > functionalities in JDK7.
> > > > 3. The code is ready for JDK7. Cloudera and Hortonworks have
> successful
> > > > stories of supporting Hadoop on JDK7.
> > > >
> > > >
> > > > It seems that the real work of moving to JDK7 is minimal. We only
> need
> > to
> > > > (1) make sure the jenkins are running on top of JDK7, and (2) to
> update
> > > the
> > > > minimum required Java version from 6 to 7. Therefore I propose that
> > let's
> > > > move towards JDK7 in trunk in the short term.
> > > >
> > > > Your feedbacks are appreciated.
> > > >
> > > > Regards,
> > > > Haohui
> > > >
> > > >
> > >
> >

Re: Plans of moving towards JDK7 in trunk

2014-04-04 Thread Sangjin Lee
Please don't forget the mac os build on JDK 7. :)


On Fri, Apr 4, 2014 at 3:15 PM, Haohui Mai  wrote:

> I'm referring to the later case. Indeed migrating JDK7 for branch-2 is more
> difficult.
>
> I think one reasonable approach is to put the hdfs / yarn clients into
> separate jars. The client-side jars can only use JDK6 APIs, so that
> downstream projects running on top of JDK6 continue to work.
>

It might not be as clear cut. For clients to run clean on JDK 6, not only
the client projects/artifacts but also any of their dependencies must be
free of JDK 7 code. And this obviously includes things like hadoop-common
(or any downstream dependencies for that matter).
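
One rough way to audit this is to read the class-file version of every class
in the client jars and their dependencies. The sketch below is only an
illustration: ClassVersionCheck and its methods are hypothetical names, not an
existing Hadoop or Maven tool. It relies on the class-file header layout
(magic number 0xCAFEBABE, then minor and major version), where major version
50 is Java 6 and 51 is Java 7. Note that this only catches classes compiled
for a newer target; code built with -target 1.6 against the JDK 7 class
library can still call JDK 7-only APIs, so a check like this is necessary but
not sufficient.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersionCheck {

    /** Class-file major version emitted for Java 6 (Java 7 is 51, Java 8 is 52). */
    static final int JAVA6_MAJOR = 50;

    /** Reads the class-file header and returns the major version. */
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();        // minor version, ignored here
        return data.readUnsignedShort(); // major version
    }

    /** True if a Java 6 JVM would accept this class-file version. */
    static boolean loadableOnJava6(InputStream in) throws IOException {
        return majorVersion(in) <= JAVA6_MAJOR;
    }

    /** Usage: java ClassVersionCheck Foo.class Bar.class ... */
    public static void main(String[] args) throws IOException {
        for (String path : args) {
            try (InputStream in = new FileInputStream(path)) {
                int major = majorVersion(in);
                String verdict = major <= JAVA6_MAJOR ? "ok for Java 6" : "needs a newer JVM";
                System.out.println(path + ": major=" + major + " (" + verdict + ")");
            }
        }
    }
}
```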


>
> The HDFS/YARN/MR servers need to be run on top of JDK7, and we're free to
> use JDK7 APIs inside them. Given the fact that there're way more code in
> the server-side compared to the client-side, having the ability to use JDK7
> in the server-side only might still be a win.
>
> The downside I can think of is that it might complicate the effort of
> publishing maven jars, but this should be an one-time issue.
>

Could you elaborate on why it would complicate maven jar publication?
Perhaps I'm over-simplifying things, but I would have thought it could be
easily achieved by marking certain project poms with source/target 1.6 in
their maven compiler plugin configuration while upgrading the default
setting to 1.7. Do you anticipate more issues?


>
> ~Haohui
>
>
> On Fri, Apr 4, 2014 at 2:37 PM, Alejandro Abdelnur  >wrote:
>
> > Haohui,
> >
> > Is the idea to compile/test with JDK7 and recommend it for runtime and
> stop
> > there? Or to start using JDK7 API stuff as well? If the later is the
> case,
> > then backporting stuff to branch-2 may break and patches may have to be
> > refactored for JDK6. Given that branch-2 got GA status not so long ago, I
> > assume it will be active for a while.
> >
> > What are your thoughts on this regard?
> >
> > Thanks
> >
> >
> > On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai  wrote:
> >
> > > Hi,
> > >
> > > There have been multiple discussions on deprecating supports of JDK6
> and
> > > moving towards JDK7. It looks to me that the consensus is that now
> hadoop
> > > is ready to drop the support of JDK6 and to move towards JDK7. Based on
> > the
> > > consensus, I wonder whether it is a good time to start the migration.
> > >
> > > Here are my understandings of the current status:
> > >
> > > 1. There is no more public updates of JDK6 since Feb 2013. Users no
> > longer
> > > get fixes of security vulnerabilities through official public updates.
> > > 2. Hadoop core is stuck with out-of-date dependency unless moving
> towards
> > > JDK7. (see
> > > http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
> > > The implementation can also benefit from it thanks to the new
> > > functionalities in JDK7.
> > > 3. The code is ready for JDK7. Cloudera and Hortonworks have successful
> > > stories of supporting Hadoop on JDK7.
> > >
> > >
> > > It seems that the real work of moving to JDK7 is minimal. We only need
> to
> > > (1) make sure the jenkins are running on top of JDK7, and (2) to update
> > the
> > > minimum required Java version from 6 to 7. Therefore I propose that
> let's
> > > move towards JDK7 in trunk in the short term.
> > >
> > > Your feedbacks are appreciated.
> > >
> > > Regards,
> > > Haohui
> > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>
>


Re: Plans of moving towards JDK7 in trunk

2014-04-04 Thread Haohui Mai
I'm referring to the latter case. Indeed, migrating branch-2 to JDK7 is more
difficult.

I think one reasonable approach is to put the hdfs / yarn clients into
separate jars. The client-side jars can only use JDK6 APIs, so that
downstream projects running on top of JDK6 continue to work.

The HDFS/YARN/MR servers need to be run on top of JDK7, and we're free to
use JDK7 APIs inside them. Given that there is far more code on the server
side than on the client side, having the ability to use JDK7 on the server
side only might still be a win.

The downside I can think of is that it might complicate the effort of
publishing Maven jars, but this should be a one-time issue.

~Haohui


On Fri, Apr 4, 2014 at 2:37 PM, Alejandro Abdelnur wrote:

> Haohui,
>
> Is the idea to compile/test with JDK7 and recommend it for runtime and stop
> there? Or to start using JDK7 API stuff as well? If the later is the case,
> then backporting stuff to branch-2 may break and patches may have to be
> refactored for JDK6. Given that branch-2 got GA status not so long ago, I
> assume it will be active for a while.
>
> What are your thoughts on this regard?
>
> Thanks
>
>
> On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai  wrote:
>
> > Hi,
> >
> > There have been multiple discussions on deprecating supports of JDK6 and
> > moving towards JDK7. It looks to me that the consensus is that now hadoop
> > is ready to drop the support of JDK6 and to move towards JDK7. Based on
> the
> > consensus, I wonder whether it is a good time to start the migration.
> >
> > Here are my understandings of the current status:
> >
> > 1. There is no more public updates of JDK6 since Feb 2013. Users no
> longer
> > get fixes of security vulnerabilities through official public updates.
> > 2. Hadoop core is stuck with out-of-date dependency unless moving towards
> > JDK7. (see
> > http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
> > The implementation can also benefit from it thanks to the new
> > functionalities in JDK7.
> > 3. The code is ready for JDK7. Cloudera and Hortonworks have successful
> > stories of supporting Hadoop on JDK7.
> >
> >
> > It seems that the real work of moving to JDK7 is minimal. We only need to
> > (1) make sure the jenkins are running on top of JDK7, and (2) to update
> the
> > minimum required Java version from 6 to 7. Therefore I propose that let's
> > move towards JDK7 in trunk in the short term.
> >
> > Your feedbacks are appreciated.
> >
> > Regards,
> > Haohui
> >
> >
>
>
>
> --
> Alejandro
>



Re: Plans of moving towards JDK7 in trunk

2014-04-04 Thread Alejandro Abdelnur
Haohui,

Is the idea to compile/test with JDK7, recommend it for runtime, and stop
there? Or to start using JDK7 APIs as well? If the latter is the case,
then backporting to branch-2 may break, and patches may have to be
refactored for JDK6. Given that branch-2 reached GA status not so long ago, I
assume it will be active for a while.

What are your thoughts in this regard?

Thanks


On Fri, Apr 4, 2014 at 2:29 PM, Haohui Mai  wrote:

> Hi,
>
> There have been multiple discussions on deprecating support for JDK6 and
> moving towards JDK7. It looks to me that the consensus is that Hadoop is
> now ready to drop support for JDK6 and move to JDK7. Based on that
> consensus, I wonder whether it is a good time to start the migration.
>
> Here is my understanding of the current status:
>
> 1. There have been no public updates of JDK6 since Feb 2013. Users no
> longer get fixes for security vulnerabilities through official public
> updates.
> 2. Hadoop core is stuck with out-of-date dependencies unless it moves to
> JDK7. (see
> http://hadoop.6.n7.nabble.com/very-old-dependencies-td71486.html)
> The implementation can also benefit from the new functionality in JDK7.
> 3. The code is ready for JDK7. Cloudera and Hortonworks have successful
> stories of supporting Hadoop on JDK7.
>
>
> It seems that the real work of moving to JDK7 is minimal. We only need to
> (1) make sure Jenkins runs on JDK7, and (2) update the minimum required
> Java version from 6 to 7. Therefore I propose that we move towards JDK7 in
> trunk in the short term.
>
> Your feedback is appreciated.
>
> Regards,
> Haohui
>



-- 
Alejandro