This change will be merged shortly for Spark 1.4, and has a minor
implication for those creating their own Spark builds:

https://issues.apache.org/jira/browse/SPARK-7249
https://github.com/apache/spark/pull/5786

The default Hadoop dependency has actually been Hadoop 2.2 for some
time, but the default build was not fully consistent with Hadoop 2.2.
That is what this change resolves. The discussion also highlights that
it's best not to rely on the Hadoop defaults at all if you care about
the Hadoop binding, and that it's good practice to set an explicit
-Phadoop-x.y profile in any build.
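
For example, an explicit Hadoop 2.2 build might look something like
the following (the profile here is just illustrative; pick whichever
matches your cluster, and plain mvn works too):

  build/mvn -Phadoop-2.2 -DskipTests clean package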


The net changes are:

If you don't care about Hadoop at all, you can ignore this. You will
get a consistent Hadoop 2.2 binding by default now. Still, you may
wish to set a Hadoop profile explicitly.

If you build for Hadoop 1, you need to set -Phadoop-1 now.
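
For example, a Hadoop 1 build would now look something like this
(the Hadoop version shown is just illustrative):

  build/mvn -Phadoop-1 -Dhadoop.version=1.2.1 -DskipTests clean package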

If you build for Hadoop 2.2, you should still set -Phadoop-2.2, even
though it is now the default and effectively a no-op profile.

You can continue to set other Hadoop profiles and override
hadoop.version; these are unaffected.
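
For example, a build against a different Hadoop 2.x line, overriding
hadoop.version, would still look something like this (profile and
version are illustrative):

  build/mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package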
