[ https://issues.apache.org/jira/browse/SPARK-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243074#comment-14243074 ]

Daniel Darabos commented on SPARK-4831:
---------------------------------------

bq. Is it perhaps finding an exploded directory of classes?

Yes, that is exactly the situation. One instance of the file is inside a jar, 
and another is just there ("free-floating") in a directory. It is a 
configuration file. (Actually it's in a "conf" directory, but Play looks for 
both "play.plugins" and "conf/play.plugins" with getResources on the 
classpath. So it finds the copy inside the generated jar and also the one in 
the "conf" directory of the project. We can of course work around this in 
numerous ways.)
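
For illustration, a minimal shell reproduction (a sketch only; the jar name, 
paths, and main class are placeholders, not our actual project):

{code}
# Hypothetical layout: app.jar packages conf/play.plugins, and a
# free-floating copy sits in ./conf of the project root.
unzip -l app.jar | grep play.plugins   # copy 1: inside the jar
ls conf/play.plugins                   # copy 2: in the working directory
# The "" before the ':' is an empty classpath entry, which the JVM
# resolves to ".", so getResources finds both copies:
java -cp ":app.jar" your.app.Main
{code}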

I think there is no reason for spark-submit to add an empty entry to the 
classpath. It will just lead to accidents like ours. If the user wants to add 
an empty entry, they can easily do so.
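
For the record, anyone who genuinely wants the current directory on the 
classpath can ask for it explicitly (the class and jar names here are 
placeholders):

{code}
# "." puts the current directory on the driver's classpath on purpose,
# rather than by accident through an empty entry:
spark-submit --driver-class-path "." --class your.app.Main app.jar
{code}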

I've sent https://github.com/apache/spark/pull/3678 as a possible fix. Thanks 
for investigating!

> Current directory always on classpath with spark-submit
> -------------------------------------------------------
>
>                 Key: SPARK-4831
>                 URL: https://issues.apache.org/jira/browse/SPARK-4831
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.1.1, 1.2.0
>            Reporter: Daniel Darabos
>            Priority: Minor
>
> We had a situation where we were launching an application with spark-submit, 
> and a file (play.plugins) was on the classpath twice, which caused problems 
> (the plugins were registered twice). Upon investigating how it got on the 
> classpath twice, we found that it was present in one of our jars and also in 
> the current working directory. But the one in the current working directory 
> should not be on the classpath: we never asked spark-submit to put the 
> current directory there.
> I think this is caused by a line in 
> [compute-classpath.sh|https://github.com/apache/spark/blob/v1.2.0-rc2/bin/compute-classpath.sh#L28]:
> {code}
> CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH"
> {code}
> Now if SPARK_CLASSPATH is empty, an empty entry is added to the classpath, 
> and the JVM interprets an empty classpath entry as the current working 
> directory.
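> A quick way to see it (a sketch; the jar path is made up):
> {code}
> # With SPARK_CLASSPATH unset or empty, the concatenation leaves a
> # leading colon, i.e. an empty first entry:
> SPARK_SUBMIT_CLASSPATH="/opt/spark/app.jar"
> CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH"
> echo "$CLASSPATH"   # prints ":/opt/spark/app.jar"
> # java treats the empty entry as ".", the current working directory.
> {code}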
> We tried setting SPARK_CLASSPATH to a bogus value, but that is [not 
> allowed|https://github.com/apache/spark/blob/v1.2.0-rc2/core/src/main/scala/org/apache/spark/SparkConf.scala#L312].
> What is the right solution? Only add SPARK_CLASSPATH if it's non-empty? I 
> can send a pull request for that, I think. Thanks!
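> For concreteness, a guard along these lines would avoid the empty entry (a 
> sketch of the idea, not an actual patch):
> {code}
> # Only prepend SPARK_CLASSPATH when it is non-empty, so no empty
> # entry ends up on the classpath:
> if [ -n "$SPARK_CLASSPATH" ]; then
>   CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH"
> else
>   CLASSPATH="$SPARK_SUBMIT_CLASSPATH"
> fi
> {code}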


