I'd advocate 2.7 over 2.6, primarily due to Kerberos and JVM versions.
2.6 is not even qualified for Java 7, let alone Java 8: you've got no
guarantees that things work on the min Java version Spark requires.
Kerberos is always the failure point here, as well as various libraries
(Jetty) which ...
wire compatibility is relevant if hadoop is included in the spark build.
for those of us that build spark without hadoop included, hadoop (binary)
api compatibility matters. i wouldn't want to build against hadoop 2.7 and
deploy on hadoop 2.6, but i am ok the other way around. so to get the ...
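For concreteness, a sketch of the "build low, deploy high" setup being
described, using Spark's standard build tooling; the exact Hadoop version
and distribution name here are illustrative:

    # compile against Hadoop 2.6 APIs, but leave the Hadoop jars out of
    # the distribution (the hadoop-provided profile), so whatever Hadoop
    # the cluster ships -- e.g. 2.7 -- is picked up at runtime
    ./dev/make-distribution.sh --name hadoop-free --tgz \
      -Pyarn -Phadoop-provided -Dhadoop.version=2.6.5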
I think it would make sense to drop one of them, but not necessarily 2.6.
It kinda depends on what wire compatibility guarantees the Hadoop
libraries have; can a 2.6 client talk to 2.7 (pretty certain it can)?
Is the opposite safe (not sure)?
If the answer to the latter question is "no", then ...
oh nevermind, i am used to spark builds without hadoop included. but i
realize that if hadoop is included it matters if it's 2.6 or 2.7...
On Thu, Feb 8, 2018 at 5:06 PM, Koert Kuipers wrote:
> wouldn't a hadoop 2.7 profile mean someone could by accident introduce
> usage of some hadoop apis that don't exist in hadoop 2.6?
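To make the included-vs-not-included distinction concrete, a sketch using
the standard distribution script (names and versions illustrative): with
Hadoop bundled, the 2.6-vs-2.7 choice is frozen into the tarball; without
it, Spark uses whatever the cluster provides.

    # Hadoop included: the profile picks which Hadoop ships in the tarball
    ./dev/make-distribution.sh --name bundled --tgz -Pyarn -Phadoop-2.7

    # Hadoop not included: build hadoop-free, then point Spark at the
    # cluster's own jars (this line goes in conf/spark-env.sh)
    ./dev/make-distribution.sh --name hadoop-free --tgz -Pyarn -Phadoop-provided
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)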
wouldn't a hadoop 2.7 profile mean someone could by accident introduce
usage of some hadoop apis that don't exist in hadoop 2.6?
why not keep 2.6 and ditch 2.7 given that hadoop 2.7 is backwards
compatible with 2.6? what is the added value of having a 2.7 profile?
On Thu, Feb 8, 2018 at 5:03 PM, Sean Owen wrote:
> That would still work with a Hadoop-2.7-based profile, as there isn't
> actually any code difference in Spark that treats the two versions
> differently (nor, really, much different between 2.6 and 2.7 to begin
> with). This practice of different profile builds was pretty unnecessary
> after 2.2; it's ...
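A sketch of why the profiles are nearly interchangeable in the 2.x line:
they mostly just pin dependency versions, so guarding against 2.7-only API
usage creeping in is a matter of compiling against the older line (version
numbers illustrative):

    # the hadoop-2.7 profile largely amounts to selecting dependency
    # versions, roughly equivalent to overriding hadoop.version directly:
    ./build/mvn -Phadoop-2.7 -DskipTests package
    ./build/mvn -Dhadoop.version=2.7.3 -DskipTests package

    # accidental use of a 2.7-only API would fail compilation here:
    ./build/mvn -Phadoop-2.6 -DskipTests clean package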
CDH 5 is still based on hadoop 2.6
On Thu, Feb 8, 2018 at 2:03 PM, Sean Owen wrote:
> Mostly just shedding the extra build complexity, and builds. The primary
> little annoyance is it's 2x the number of flaky build failures to examine.
> I suppose it allows using a 2.7+-only feature, but outside of YARN, not
> sure there is anything compelling.
Mostly just shedding the extra build complexity, and builds. The primary
little annoyance is it's 2x the number of flaky build failures to examine.
I suppose it allows using a 2.7+-only feature, but outside of YARN, not
sure there is anything compelling.
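The 2x is literal: every change gets a full build-and-test pass per
profile, so the matrix looks roughly like this (commands illustrative),
and dropping either profile halves it:

    # one pass per Hadoop profile
    ./build/mvn -Phadoop-2.6 -Pyarn clean package
    ./build/mvn -Phadoop-2.7 -Pyarn clean package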
It's something that probably gains us ...
Does it gain us anything to drop 2.6?
> On Feb 8, 2018, at 10:50 AM, Sean Owen wrote:
>
> At this point, with Hadoop 3 on deck, I think hadoop 2.6 is both fairly old,
> and actually, not different from 2.7 with respect to Spark. That is, I don't
> know if we are actually maintaining anything here but a separate profile and
> 2x the number of test builds.
At this point, with Hadoop 3 on deck, I think hadoop 2.6 is both fairly
old, and actually, not different from 2.7 with respect to Spark. That is, I
don't know if we are actually maintaining anything here but a separate
profile and 2x the number of test builds.
The cost is, by the same token, low.