Hi Corey,

I see. Thanks for making it clear. I may just be lucky in not hitting the
code paths of those Guava classes. But I did hit some other jar conflicts
when using Spark to connect to AWS S3, and had to manually try each version
of org.apache.httpcomponents until I found an old version that worked.
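
If you build with sbt, the pin looks roughly like this (4.2.6 is just an
illustration; use whichever old version turns out to work for your setup):

  // force a single httpclient version instead of letting the transitive
  // dependency graph pick one
  dependencyOverrides += "org.apache.httpcomponents" % "httpclient" % "4.2.6"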

Another suggestion is to build Spark yourself, so you control the dependency
versions that go into it.
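
For Spark 1.2 on YARN, the documented build command is roughly:

  mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

(adjust the profile and hadoop.version to match your cluster); you can then
bump the conflicting dependency versions in the pom before building.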

Anyway, I would like to see an update once you figure out the solution.

Best wishes!
Bo



On Wed, Feb 4, 2015 at 4:47 AM, Corey Nolet <cjno...@gmail.com> wrote:

> Bo yang-
>
> I am using Spark 1.2.0, and undoubtedly there are older Guava classes being
> picked up and serialized with the closures when they are sent from the
> driver to the executors, because the class serialVersionUIDs don't match
> between the driver and the executors. Have you tried doing this? Guava
> works fine for me when this is not the case, but as soon as a Guava class
> that changed between versions <15.0 and 15.0 is serialized, it fails. See
> [1] for more info; we did fairly extensive testing last night. I've
> isolated the issue to Hadoop's really old version of Guava being picked up.
> Again, this is only noticeable when classes from Guava 15.0 that changed
> from previous versions are used, serialized on the driver, and shipped to
> the executors.
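>
> As a minimal sketch of the failure mode (Optional here is just a stand-in
> for whichever Guava class actually changed; see [1] for the real case):
>
>   import com.google.common.base.Optional
>
>   val rdd = sc.parallelize(Seq(1, 2, 3))
>   // Optional.of(i) is serialized with the closure on the driver (Guava
>   // 15.0). If the executors resolve Hadoop's older Guava first, the
>   // serialVersionUIDs disagree and deserialization fails with an
>   // InvalidClassException.
>   rdd.map(i => Optional.of(i).or(0)).collect()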
>
>
> [1] https://github.com/calrissian/mango/issues/158
>
> On Wed, Feb 4, 2015 at 1:31 AM, bo yang <bobyan...@gmail.com> wrote:
>
>> Corey,
>>
>> Which version of Spark do you use? I am using Spark 1.2.0 and Guava 15.0,
>> and it seems fine.
>>
>> Best,
>> Bo
>>
>>
>> On Tue, Feb 3, 2015 at 8:56 PM, M. Dale <medal...@yahoo.com.invalid>
>> wrote:
>>
>>> Try spark.yarn.user.classpath.first (see
>>> https://issues.apache.org/jira/browse/SPARK-2996 - it only works for
>>> YARN). Also see the thread at
>>> http://apache-spark-user-list.1001560.n3.nabble.com/netty-on-classpath-when-using-spark-submit-td18030.html
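>>>
>>> For example, a sketch of setting it programmatically (it only takes
>>> effect when running on YARN):
>>>
>>>   import org.apache.spark.{SparkConf, SparkContext}
>>>
>>>   val conf = new SparkConf()
>>>     .setAppName("guava-test") // placeholder app name
>>>     // ask YARN to put the user's jars ahead of Spark's/Hadoop's
>>>     .set("spark.yarn.user.classpath.first", "true")
>>>   val sc = new SparkContext(conf)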
>>>
>>> HTH,
>>> Markus
>>>
>>> On 02/03/2015 11:20 PM, Corey Nolet wrote:
>>>
>>> I'm having a really bad dependency conflict right now between the Guava
>>> version my Spark application uses on YARN and (I believe) the version
>>> Hadoop provides.
>>>
>>> The problem is, my driver has the version of Guava my application is
>>> expecting (15.0), while it appears the Spark executors that are working
>>> on my RDDs have a much older version (presumably the old version on the
>>> Hadoop classpath).
>>>
>>> Is there a property like "mapreduce.job.user.classpath.first" that I can
>>> set to make sure my own classpath is established first on the executors?
>>>
