That's what I'm saying you don't want to do :) If you have two versions of
a library with different APIs, the safest approach is shading; classpath
ordering probably can't be relied on. In my experience, reflection can
behave in ways you may not like, and so can classpath priority when a class
is loading. spark.jars will never be able to reorder the classpath, so
you'd need to get those jars onto the system class loader using the driver
(and executor) extra classpath args (together with userClassPathFirst). I
will stress again that it would be my last choice for getting this working,
and I would try shading first if I really had a conflict.
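
As a rough sketch of that last-resort approach (the jar name, paths, and
main class below are hypothetical, and note the userClassPathFirst confs
are marked experimental in the Spark docs):

```shell
# Prepend an updated jar to the driver/executor launch classpath and tell
# Spark to prefer user-supplied classes over its own bundled copies.
# All paths and names here are placeholders for illustration only.
spark-submit \
  --conf spark.driver.extraClassPath=/home/me/libs/sample-project-2.0.0.jar \
  --conf spark.executor.extraClassPath=/home/me/libs/sample-project-2.0.0.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.Main \
  my-app.jar
```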

On Thu, Jul 16, 2020 at 2:17 PM Nupur Shukla <nupur14shu...@gmail.com>
wrote:

> Thank you Russell and Jeff,
>
> My bad, I wasn't clear before about the conflicting jars. By that, I meant
> my application needs to use updated versions of certain jars compared to
> what is present in the default classpath. What would be the best way to
> use the confs spark.jars and spark.driver.extraClassPath together to do a
> classpath reordering so that the updated versions get picked up first? It
> looks like the only way here is to use the extraClassPath conf.
>
>
>
>
> On Thu, 16 Jul 2020 at 12:05, Jeff Evans <jeffrey.wayne.ev...@gmail.com>
> wrote:
>
>> If you can't avoid it, you need to make use of the
>> spark.driver.userClassPathFirst and/or spark.executor.userClassPathFirst
>> properties.
>>
>> On Thu, Jul 16, 2020 at 2:03 PM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> I believe the main issue here is that spark.jars is a bit "too late" to
>>> actually prepend anything to the classpath. For most use cases this value
>>> is not read until after the JVM has already started and the system
>>> classloader has already loaded.
>>>
>>> Jars passed via spark.jars get added through the dynamic class loader, so
>>> they necessarily come afterwards :/ Driver extra classpath and its
>>> friends modify the actual launch command of the driver (or executors), so
>>> they can prepend whatever they need.
>>>
>>> In general, you do not want to have conflicting jars at all if you can
>>> avoid it, and I would recommend looking into shading if it's really
>>> important for your application to use a specific incompatible version of
>>> a library. spark.jars (and extraClassPath) are really just for adding
>>> additional jars, and I personally would try not to rely on classpath
>>> ordering to get the right libraries recognized.
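
A sketch of the shading approach described above, using Maven Shade
relocation (the package names here are placeholders, not from the thread):

```xml
<!-- Relocate a conflicting library into a private package so the
     application's copy and Spark's bundled copy can coexist on the
     same classpath. Adjust pattern/shadedPattern to your library. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.example.sample</pattern>
            <shadedPattern>myapp.shaded.com.example.sample</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this in place, references to the library inside the shaded application
jar are rewritten to the relocated package, so Spark's own copy never
conflicts with it.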
>>>
>>> On Thu, Jul 16, 2020 at 1:55 PM Nupur Shukla <nupur14shu...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> How can we use *spark.jars* to specify conflicting jars (that is,
>>>> jars that are already present in Spark's default classpath)? Jars
>>>> specified in this conf get "appended" to the classpath, and thus get
>>>> looked at after the default classpath. Is it not intended to be used to
>>>> specify conflicting jars?
>>>> Meanwhile, when the *spark.driver.extraClassPath* conf is specified, its
>>>> path is "prepended" to the classpath and thus takes precedence over the
>>>> default classpath.
>>>>
>>>> How can I use both to specify different jars and paths but achieve a
>>>> precedence of spark.jars path > spark.driver.extraClassPath > spark default
>>>> classpath (left to right precedence order)?
>>>>
>>>> Experiment conducted:
>>>>
>>>> I am using sample-project.jar, which has one class in it, SampleProject.
>>>> This class has a method which prints the version number of the jar. For
>>>> this experiment I am using 3 versions of this sample-project.jar:
>>>> - sample-project-1.0.0.jar is present in the Spark default classpath in
>>>> my test cluster
>>>> - sample-project-2.0.0.jar is present in folder
>>>> /home/<user>/ClassPathConf on the driver
>>>> - sample-project-3.0.0.jar is present in folder /home/<user>/JarsConf on
>>>> the driver
>>>>
>>>> (Empty cell in img below means that conf was not specified)
>>>>
>>>> [image: image.png]
>>>>
>>>>
>>>> Thank you,
>>>> Nupur
>>>>
>>>>
>>>>
