On Fri, Nov 20, 2015 at 10:39 PM, Reynold Xin <r...@databricks.com> wrote:
> I don't think we should look at it from only maintenance point of view --
> because in that case the answer is clearly supporting as few versions as
> possible (or just rm -rf spark source code and call it a day). It is a
> tradeoff between the number of users impacted and the maintenance burden.
The upside to supporting only newer versions is less maintenance (no
small thing, given how sprawling the build is) and more freedom to use
newer functionality. The downside, of course, is that users on older
Hadoop versions can't use the latest Spark.

> 1. Can Hadoop 2.6 client read Hadoop 2.4 / 2.3?

If the question is really about HDFS, then I think the answer is "yes".
The big compatibility problem has been protobuf, but all of Hadoop 2.2+
is on protobuf 2.5. (See the HdfsCompatCheck sketch at the end of this
mail for a quick way to try it.)

> 3. Can Hadoop 2.6+ YARN work on older versions of YARN clusters?

Same client/server question? This is where I'm not as clear. I think
the answer is "yes" to the extent you're using functionality that
existed in the older YARN. Of course, calling a newer API against an
old cluster doesn't work. (See the ReflectionGuard sketch at the end
for the usual way one build copes with that.)

> 4. (for Hadoop vendors) When did/will support for Hadoop 2.4 and below stop?
> To what extent do you care about running Spark on older Hadoop clusters?

CDH 5.3 = Hadoop 2.6, FWIW, and it came out about a year ago. Support
continues for a long time in the sense that CDH 5 will be supported for
years. However, Spark 2 would never be shipped / supported in CDH 5, so
it's not an issue there; Spark 2 will probably be "supported" only
against Hadoop 3, or at least something later in 2.x than 2.6.

The question here is really whether Spark should specially support,
say, Spark 2 + CDH 5.0. My experience so far is that Spark has not
really supported the older vendor versions it claims to, and I'd rather
not pretend it does. So this doesn't strike me as a great reason either.

This is roughly why 2.6, as a pretty safely recent version, seems like
an OK place to draw the line 6-8 months from now.
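For the curious, here's a minimal sketch of checking question 1 for
yourself, assuming a client linked against Hadoop 2.6 and an older HDFS
cluster to point it at. The NameNode address below is a placeholder,
not a real host:

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object HdfsCompatCheck {
      def main(args: Array[String]): Unit = {
        // Placeholder address: substitute a real 2.3/2.4 NameNode.
        val uri = new URI("hdfs://old-nn:8020/")
        val fs = FileSystem.get(uri, new Configuration())
        // A plain listing exercises only the ordinary client RPC; if
        // the 2.6 client can talk to the old NameNode, this just works.
        fs.listStatus(new Path("/")).foreach(status => println(status.getPath))
        fs.close()
      }
    }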
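And for question 3, a sketch of the usual reflection guard: probe for a
method before calling it, so a single build degrades gracefully when
the client jars on the classpath are older than the ones it was built
against. The probe target below (String.isBlank, a JDK 11+ method) is
just a stand-in so the sketch runs anywhere; it's not a claim about any
particular YARN API:

    import scala.util.Try

    object ReflectionGuard {
      // True only if the linked class actually declares the method.
      def hasMethod(cls: Class[_], name: String, params: Class[_]*): Boolean =
        Try(cls.getMethod(name, params: _*)).isSuccess

      def main(args: Array[String]): Unit = {
        // Stand-in probe; a YARN client class would go here instead.
        if (hasMethod(classOf[String], "isBlank"))
          println("newer API present; use it")
        else
          println("older jars on classpath; take the fallback path")
      }
    }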