Re: “mapreduce.job.user.classpath.first” for Spark

Marcelo Vanzin Wed, 04 Feb 2015 13:02:33 -0800

Hi Koert,

On Wed, Feb 4, 2015 at 11:35 AM, Koert Kuipers <ko...@tresata.com> wrote:
> do i understand it correctly that on yarn the customer jars are truly
> placed before the yarn and spark jars on classpath? meaning at container
> construction time, on the same classloader? that would be great news for me.
> it would open up the possibility of using newer versions of many libraries.


That's correct, the Yarn setting places the user's jars in the system
classpath before Spark/Hadoop jars, so they can override classes
needed by Spark/Hadoop.
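
To make the ordering concrete, here is a toy model in Python. It is not
Spark or Yarn code, and the jar and class names are made up for
illustration. It only shows the rule that, on a flat classpath, the
first entry providing a class wins.

```python
def resolve(class_name, classpath):
    """Return the name of the first classpath entry that provides class_name."""
    for jar_name, provided_classes in classpath:
        if class_name in provided_classes:
            return jar_name
    return None  # no entry on the classpath provides the class


# Hypothetical jars: both ship a copy of the same library class.
user_jar = ("user-app.jar",
            {"com.example.App", "com.google.common.base.Optional"})
spark_jar = ("spark-assembly.jar",
             {"org.apache.spark.SparkContext", "com.google.common.base.Optional"})

default_order = [spark_jar, user_jar]     # Spark/Hadoop jars first
user_first_order = [user_jar, spark_jar]  # user jars first

shared = "com.google.common.base.Optional"
print(resolve(shared, default_order))     # copy from spark-assembly.jar
print(resolve(shared, user_first_order))  # copy from user-app.jar
```

Since there is a single classpath in this model, whichever copy wins is
the copy everything sees, Spark's own code included.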

That's the main reason why it's not documented and not suggested
unless there's no other workaround: you're potentially overriding
classes in a way that might break Spark, Hadoop or something else
that's packaged with those. But if it works for your case, that's
great.

As for the "userClassPath first" option, I've made some changes to the
class loaders as part of implementing that option for Yarn [1], and
someone also made similar changes in isolation [2]. So maybe the
issues you were running into are fixed by either of those? In the
future, it would be great to be able to declare that feature stable,
since I believe it's a better alternative to overriding libraries that
Spark or Hadoop depend on.
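
As a rough sketch of why a separate class loader is the gentler
approach, here is a toy Python model of parent-first versus child-first
delegation. It is not the code from either pull request; the Loader
class and all class names are hypothetical. The point it illustrates is
that a child-first loader changes what user code sees without changing
what the parent loader (standing in for Spark/Hadoop) sees.

```python
class Loader:
    """Toy class loader: knows some classes and may delegate to a parent."""

    def __init__(self, name, classes, parent=None, child_first=False):
        self.name = name
        self.classes = set(classes)
        self.parent = parent
        self.child_first = child_first

    def load(self, class_name):
        """Return the name of the loader that supplies class_name, or None."""
        own = self.name if class_name in self.classes else None
        inherited = self.parent.load(class_name) if self.parent else None
        if self.child_first:
            return own or inherited  # prefer our copy, fall back to parent
        return inherited or own      # usual JVM-style parent-first order


# Hypothetical setup: the system loader holds Spark and its library copy.
system = Loader("system", {"spark.Core", "lib.Json"})
parent_first = Loader("user", {"app.Main", "lib.Json"}, parent=system)
child_first = Loader("user", {"app.Main", "lib.Json"}, parent=system,
                     child_first=True)

print(parent_first.load("lib.Json"))  # system: the user's copy is shadowed
print(child_first.load("lib.Json"))   # user: the user's copy wins
print(system.load("lib.Json"))        # system: unchanged either way
```

In this model the system loader keeps resolving its own copy of the
library, which is the difference from putting user jars at the front of
the system classpath.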

[1] https://github.com/apache/spark/pull/3233
[2] https://github.com/apache/spark/pull/3725

-- 
Marcelo

