Thanks, Russell

> Since the driver is responsible for moving jars specified in --jars, you
> cannot use a jar specified by --jars to be in driver-class-path, since the
> driver is already started and its classpath is already set before any jars
> are moved.

Your point is interesting; however, I see some discrepancy with what the
Spark doc says:

""When using spark-submit, the application jar along with any jars included
with the --jars option will be automatically transferred to the cluster.
URLs supplied after --jars must be separated by commas. That list is
included on the driver and executor classpaths. ""

The most interesting part here (for the discussion) is "That list [from
--jars] is included on the driver and executor classpaths."

That seems to contradict your sentence (as you state that a jar specified
by --jars can't be on the driver classpath).

Hmm, I am still thinking about how to reconcile the two sides; one idea is
sketched below.
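
For what it's worth, here is what I plan to test next. This is only a
sketch, assuming YARN cluster mode, where the driver itself runs inside a
YARN container and the jars shipped with --jars are localized into that
container's working directory before the driver JVM starts:

spark-submit --master yarn --deploy-mode cluster \
  --jars /a/b/some1.jar,/a/b/c/some2.jar \
  --driver-class-path some1.jar:some2.jar \
  --conf spark.executor.extraClassPath=some1.jar:some2.jar \
  --class MyClass main-application.jar

If that reading is right, the doc sentence would hold in cluster mode
(bare file names resolve in the container's working directory), while your
point would apply in client mode, where the driver JVM starts on the
submitting machine before any jars are moved.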

Thanks anyway

Dominique



On Thu, Nov 12, 2020 at 5:34 PM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> --driver-class-path does not move jars, so it is dependent on your Spark
> resource manager (master). It is interpreted literally, so if your files do
> not exist in the location you provide relative to where the driver is run,
> they will not be placed on the classpath.
>
> Since the driver is responsible for moving jars specified in --jars, you
> cannot use a jar specified by --jars to be in driver-class-path, since the
> driver is already started and its classpath is already set before any jars
> are moved.
>
> Some distributions may change this behavior, but this is the gist of it.
>
> On Thu, Nov 12, 2020 at 10:02 AM Dominique De Vito <ddv36...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am using Spark 2.1 (BTW) on YARN.
>>
>> I am trying to upload JARs to the YARN cluster, and to use them to
>> replace on-site (already in-place) JARs.
>>
>> I am trying to do so through spark-submit.
>>
>> One helpful answer
>> (https://stackoverflow.com/questions/37132559/add-jars-to-a-spark-job-spark-submit/37348234)
>> suggests the following:
>>
>> spark-submit --jars additional1.jar,additional2.jar \
>>   --driver-class-path additional1.jar:additional2.jar \
>>   --conf spark.executor.extraClassPath=additional1.jar:additional2.jar \
>>   --class MyClass main-application.jar
>>
>> So, I understand the following:
>>
>>    - "--jars" is for uploading jar on each node
>>    - "--driver-class-path" is for using uploaded jar for the driver.
>>    - "--conf spark.executor.extraClassPath" is for using uploaded jar
>>    for executors.
>>
>> While I control the file paths for "--jars" within a spark-submit
>> command, what will be the file paths of the uploaded JARs to use in
>> "--driver-class-path", for example?
>>
>> The doc says: "JARs and files are copied to the working directory for
>> each SparkContext on the executor nodes"
>>
>> Fine, but for the following command, what should I put instead of XXX
>> and YYY?
>>
>> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>>   --driver-class-path XXX:YYY \
>>   --conf spark.executor.extraClassPath=XXX:YYY \
>>   --class MyClass main-application.jar
>>
>> When using spark-submit, how can I reference the "working directory for
>> the SparkContext" to form the XXX and YYY file paths?
>>
>> Thanks.
>>
>> Dominique
>>
>> PS: I have tried
>>
>> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>>   --driver-class-path some1.jar:some2.jar \
>>   --conf spark.executor.extraClassPath=some1.jar:some2.jar  \
>>   --class MyClass main-application.jar
>>
>> No success (unless I made a mistake).
>>
>> And I have also tried:
>>
>> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>>    --driver-class-path ./some1.jar:./some2.jar \
>>    --conf spark.executor.extraClassPath=./some1.jar:./some2.jar \
>>    --class MyClass main-application.jar
>>
>> No success either.
>>
>
