+1 for Java 8 only. I think it will make it easier to build a unified API for Java and Scala, instead of maintaining Java wrappers over the Scala API.

On Mar 24, 2016 11:46 AM, "Stephen Boesch" <java...@gmail.com> wrote:
> +1 for Java 8 only
> +1 for 2.11+ only. At this point Scala libraries supporting only 2.10 are
> typically less active and/or poorly maintained. That trend will only
> continue when considering the lifespan of Spark 2.x.
>
> 2016-03-24 11:32 GMT-07:00 Steve Loughran <ste...@hortonworks.com>:
>
>> On 24 Mar 2016, at 15:27, Koert Kuipers <ko...@tresata.com> wrote:
>>
>> i think the arguments are convincing, but it also makes me wonder if i
>> live in some kind of alternate universe... we deploy on customers' clusters,
>> where the OS, python version, java version and hadoop distro are not chosen
>> by us. so think centos 6, cdh5 or hdp 2.3, java 7 and python 2.6. we simply
>> have access to a single proxy machine and launch through yarn. asking them
>> to upgrade java is pretty much out of the question or a 6+ month ordeal. of
>> the 10 client clusters i can think of off the top of my head, all of them are
>> on java 7, none are on java 8. so by doing this you would make spark 2
>> basically unusable for us (unless most of them have plans of upgrading in
>> the near term to java 8, i will ask around and report back...).
>>
>> It's not actually mandatory for the process executing in the YARN cluster
>> to run with the same JVM as the rest of the Hadoop stack; all that is
>> needed is for the environment variables to set up JAVA_HOME and PATH.
>> Switching JVMs is not something which YARN makes easy to do, but it may be
>> possible, especially if Spark itself provides some hooks, so you don't have
>> to manually play with setting things up. That may be something which could
>> significantly ease adoption of Spark 2 in YARN clusters. Same for Python.
>>
>> This is something I could probably help others to address.
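
To make the JAVA_HOME suggestion above concrete, here is a minimal sketch of passing a different JVM to the YARN containers through Spark's per-container environment properties (`spark.yarn.appMasterEnv.*` and `spark.executorEnv.*`). It assumes a Java 8 JDK is already installed at the same path on every node. The JDK path, jar name and main class are hypothetical, and the helper functions are illustrative only, not part of Spark.

```python
import shlex


def java_home_conf(java_home):
    """Return spark-submit --conf arguments that set JAVA_HOME for the
    YARN application master and for the executors."""
    props = {
        "spark.yarn.appMasterEnv.JAVA_HOME": java_home,
        "spark.executorEnv.JAVA_HOME": java_home,
    }
    args = []
    for key, value in props.items():
        args += ["--conf", f"{key}={value}"]
    return args


def build_command(app_jar, main_class, java_home):
    """Assemble a spark-submit command line for YARN cluster mode."""
    return (
        ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster",
         "--class", main_class]
        + java_home_conf(java_home)
        + [app_jar]
    )


if __name__ == "__main__":
    # Hypothetical values: adjust the JDK path to whatever exists on the nodes.
    cmd = build_command("app.jar", "com.example.Main", "/opt/jdk1.8.0")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Whether this is enough on a given cluster depends on how the distro launches containers, so treat it as a starting point rather than a tested recipe.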