Thank you Russell and Jeff,

My bad, I wasn't clear before about the conflicting jars. By that I meant
my application needs to use newer versions of certain jars than the ones
present in the default classpath. What would be the best way to use the
spark.jars and spark.driver.extraClassPath confs together to reorder the
classpath so that the updated versions get picked up first? It looks like
extraClassPath is the only conf that can do that here.
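
For reference, this is roughly the kind of submit I have been trying (the
application jar and main class below are placeholders; the jar paths are
from my experiment further down):

  spark-submit \
    --conf spark.jars=/home/<user>/JarsConf/Sample-project-3.0.0.jar \
    --conf spark.driver.extraClassPath=/home/<user>/ClassPathConf/Sample-project-2.0.0.jar \
    --class com.example.MyApp \
    my-application.jar

The intent is that the version given in spark.jars wins over both the
extraClassPath entry and the default classpath.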




On Thu, 16 Jul 2020 at 12:05, Jeff Evans <jeffrey.wayne.ev...@gmail.com>
wrote:

> If you can't avoid it, you need to make use of the
> spark.driver.userClassPathFirst and/or spark.executor.userClassPathFirst
> properties.
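>
> For example, something along these lines (just a sketch, adjust the
> application jar and main class for your own job; note both properties are
> marked experimental in the Spark configuration docs, so worth testing):
>
>   spark-submit \
>     --conf spark.driver.userClassPathFirst=true \
>     --conf spark.executor.userClassPathFirst=true \
>     --conf spark.jars=/home/<user>/JarsConf/Sample-project-3.0.0.jar \
>     --class com.example.MyApp \
>     my-application.jar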
>
> On Thu, Jul 16, 2020 at 2:03 PM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> I believe the main issue here is that spark.jars is a bit "too late" to
>> actually prepend things to the classpath. For most use cases this value is
>> not read until after the JVM has already started and the system classloader
>> has already loaded the default classpath.
>>
>> The jars argument gets added via the dynamic class loader, so it
>> necessarily has to come afterwards :/ Driver extraClassPath and its
>> friends modify the actual launch command of the driver (or executors), so
>> they can prepend whatever they want.
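>>
>> Roughly speaking (client mode, simplified), extraClassPath ends up inside
>> the java command that starts the driver JVM, something like:
>>
>>   java -cp /home/<user>/ClassPathConf/*:<spark default classpath> \
>>     org.apache.spark.deploy.SparkSubmit ...
>>
>> whereas anything in spark.jars is only fetched and added to a child
>> classloader after that JVM is already running, so it can never come first.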
>>
>> In general you do not want to have conflicting jars at all if possible,
>> and I would recommend looking into shading if it's really important for
>> your application to use a specific incompatible version of a library.
>> spark.jars (and extraClassPath) are really just for adding additional
>> jars, and I personally would try not to rely on classpath ordering to get
>> the right libraries loaded.
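>>
>> For example, with sbt-assembly you could relocate the conflicting packages
>> inside your application jar with something along these lines (the package
>> names here are made up, substitute the library you actually need):
>>
>>   assembly / assemblyShadeRules := Seq(
>>     ShadeRule.rename("com.example.conflictinglib.**" -> "myapp.shaded.conflictinglib.@1").inAll
>>   )
>>
>> Your code then loads its own relocated copy while Spark keeps using
>> whatever version is on its classpath, so classpath ordering stops
>> mattering.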
>>
>> On Thu, Jul 16, 2020 at 1:55 PM Nupur Shukla <nupur14shu...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> How can we use *spark.jars* to specify conflicting jars (that is,
>>> jars that are already present in Spark's default classpath)? Jars
>>> specified in this conf get "appended" to the classpath, and thus get
>>> looked at after the default classpath. Is this conf not intended to be
>>> used to specify conflicting jars?
>>> Meanwhile, when the *spark.driver.extraClassPath* conf is specified, that
>>> path is "prepended" to the classpath and thus takes precedence over the
>>> default classpath.
>>>
>>> How can I use both to specify different jars and paths but achieve a
>>> precedence order of spark.jars path > spark.driver.extraClassPath path >
>>> Spark default classpath (highest precedence on the left)?
>>>
>>> Experiment conducted:
>>>
>>> I am using sample-project.jar, which has one class in it, SampleProject.
>>> This class has a method which prints the version number of the jar (a
>>> rough sketch is below). For this experiment I am using 3 versions of
>>> sample-project.jar:
>>> - Sample-project-1.0.0.jar is present in the Spark default classpath on
>>> my test cluster
>>> - Sample-project-2.0.0.jar is present in folder /home/<user>/ClassPathConf
>>> on the driver
>>> - Sample-project-3.0.0.jar is present in folder /home/<user>/JarsConf on
>>> the driver
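>>>
>>> For context, SampleProject is roughly along these lines (simplified here;
>>> each jar version prints its own version string):
>>>
>>>   object SampleProject {
>>>     // prints which version of sample-project.jar was actually loaded
>>>     def printVersion(): Unit = println("sample-project version 1.0.0")
>>>   }
>>>
>>> and the test job simply calls SampleProject.printVersion() to see which
>>> jar won.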
>>>
>>> (An empty cell in the image below means that conf was not specified.)
>>>
>>> [image: image.png]
>>>
>>>
>>> Thank you,
>>> Nupur
>>>
>>>
>>>
