Re: [discuss] dropping Hadoop 2.2 and 2.3 support in Spark 2.0?

Sean Owen Thu, 14 Jan 2016 02:18:13 -0800

I personally support this. I'd suggest drawing the line at Hadoop
2.6, but that's minor. More info:

Hadoop 2.7: April 2015
Hadoop 2.6: Nov 2014
Hadoop 2.5: Aug 2014
Hadoop 2.4: April 2014
Hadoop 2.3: Feb 2014
Hadoop 2.2: Oct 2013

CDH 5.0/5.1 = Hadoop 2.3 + backports
CDH 5.2/5.3 = Hadoop 2.5 + backports
CDH 5.4+ = Hadoop 2.6 + chunks of 2.7 + backports.

I can only imagine that CDH6 this year will be based on something
later still like 2.8 (no idea about the 3.0 schedule). In the sense
that 5.2 was released about a year and a half ago, yes, this vendor has
moved on from 2.3 a while ago. These releases will also never contain
a different minor Spark release. For example 5.7 will have Spark 1.6,
I believe, and not 2.0.

Here, I listed some additional things we could clean up in Spark if
Hadoop 2.6 was assumed. By itself, not a lot:
https://github.com/apache/spark/pull/10446#issuecomment-167971026

Yes, we also get less Jenkins complexity. Mostly, the jar-hell that's
biting now gets a little more feasible to fix. And we get Hadoop fixes
as well as new APIs, which helps mostly for YARN.

My general position is that backwards-compatibility and supporting
older platforms need to be a low priority in a major release; it's a
decision about what to support for users in the next couple years, not
the preceding couple years. Users on older technologies simply stay on
the older Spark until ready to update; they are in no sense suddenly
left behind otherwise.

On Thu, Jan 14, 2016 at 6:29 AM, Reynold Xin <r...@databricks.com> wrote:
> We've dropped Hadoop 1.x support in Spark 2.0.
>
> There is also a proposal to drop Hadoop 2.2 and 2.3, i.e. the minimal Hadoop
> version we support would be Hadoop 2.4. The main advantage is then we'd be
> able to focus our Jenkins resources (and the associated maintenance of
> Jenkins) to create builds for Hadoop 2.6/2.7. It is my understanding that
> all Hadoop vendors have moved away from 2.2/2.3, but there might be some
> users that are on these older versions.
>
> What do you think about this idea?
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
