[ https://issues.apache.org/jira/browse/SPARK-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375707#comment-14375707 ]

vijay commented on SPARK-6435:
------------------------------

I tested this on Linux with the 1.3.0 release and it works fine, so this is apparently a 
Windows-specific issue: on Windows only the 1st jar is picked up.  It appears to be a 
problem with parsing the command line, introduced by the change to the Windows scripts 
between 1.2.0 and 1.3.0.  A simple fix to bin\windows-utils.cmd resolves the issue.

I ran this command to test with 'real' jars:
{code}
%SPARK_HOME%\bin\spark-shell --master local --jars c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar,c:\temp\guava-14.0.1.jar
{code}

Here are some snippets from the console - note that only the 1st jar is added; 
I can load classes from the 1st jar but not the 2nd:
{code}
15/03/23 10:57:41 INFO SparkUI: Started SparkUI at http://vgarla-t440P.fritz.box:4040
15/03/23 10:57:41 INFO SparkContext: Added JAR file:/c:/code/elasticsearch-1.4.2/lib/lucene-core-4.10.2.jar at http://192.168.178.41:54601/jars/lucene-core-4.10.2.jar with timestamp 1427104661969
15/03/23 10:57:42 INFO Executor: Starting executor ID <driver> on host localhost
...
scala> import org.apache.lucene.util.IOUtils
import org.apache.lucene.util.IOUtils

scala> import com.google.common.base.Strings
<console>:20: error: object Strings is not a member of package com.google.common.base
{code}

Looking at the command line in jvisualvm, I see that only the 1st jar is added:
{code}
Main class: org.apache.spark.deploy.SparkSubmit
Arguments: --class org.apache.spark.repl.Main --master local --jars c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar spark-shell c:\temp\guava-14.0.1.jar
{code}
In Spark 1.2.0, spark-shell2.cmd just passed the arguments as-is to the java command 
line:
{code}
cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
{code}

In Spark 1.3.0, spark-shell2.cmd calls windows-utils.cmd to parse the arguments 
into SUBMISSION_OPTS and APPLICATION_OPTS.  Only the first jar in the list 
passed to --jars makes it into SUBMISSION_OPTS; the remaining jars are added to 
APPLICATION_OPTS (a simplified sketch of the parsing loop follows the snippet below):
{code}
call %SPARK_HOME%\bin\windows-utils.cmd %*
if %ERRORLEVEL% equ 1 (
  call :usage
  exit /b 1
)
echo SUBMISSION_OPTS=%SUBMISSION_OPTS%
echo APPLICATION_OPTS=%APPLICATION_OPTS%

cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %SUBMISSION_OPTS% spark-shell %APPLICATION_OPTS%
{code}
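
For context, here is a simplified sketch of the kind of option-parsing loop 
windows-utils.cmd uses (the option list and details are abbreviated, not the verbatim 
script).  Since cmd has already split --jars a.jar,b.jar into three separate tokens by 
the time the loop runs, --jars gets paired with only the next token; every jar after 
the first falls through into APPLICATION_OPTS:
{code}
rem Simplified sketch of the windows-utils.cmd parsing loop (not verbatim).
rem %* has already been split on commas, e.g.: --jars a.jar b.jar
set SUBMISSION_OPTS=
set APPLICATION_OPTS=

:OptsLoop
if "x%1"=="x" goto :OptsLoopEnd

rem Options that take a value (list abbreviated here)
echo %1 | findstr /C:"--jars" /C:"--master" /C:"--class" >nul
if %ERRORLEVEL% equ 0 (
  rem This is the comparison that later chokes on quoted values
  if "x%2"=="x" (
    echo "%1" requires an argument. >&2
    exit /b 1
  )
  rem The option is paired with ONLY the next token, i.e. the first jar
  set SUBMISSION_OPTS=%SUBMISSION_OPTS% %1 %2
  shift
  shift
  goto :OptsLoop
)

rem Anything unrecognized, including the second jar, lands here
set APPLICATION_OPTS=%APPLICATION_OPTS% %1
shift
goto :OptsLoop

:OptsLoopEnd
{code}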

The problem is that by the time the command-line arguments reach 
windows-utils.cmd, the Windows command processor has already split the 
comma-separated list into distinct arguments.  The Windows way of saying "treat 
this as a single argument" is to surround it in double quotes.  However, when I 
surround the jar list in quotes, I get an error:
{code}
%SPARK_HOME%\bin\spark-shell --master local --jars "c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar,c:\temp\guava-14.0.1.jar"
c:\temp\guava-14.0.1.jar""=="x" was unexpected at this time.
{code}
Digging in, I see this is caused by this line from windows-utils.cmd:
{code}
  if "x%2"=="x" (
{code}
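
To make the failure concrete: with the quoted jar list in %2, that line expands to 
approximately the following.  The comma after the first jar now sits outside any 
quoted region, so cmd splits the expression there and gives up, which is exactly the 
"was unexpected at this time" error shown above:
{code}
  if "x"c:\code\elasticsearch-1.4.2\lib\lucene-core-4.10.2.jar,c:\temp\guava-14.0.1.jar""=="x" (
{code}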

Replacing the quotes with square brackets does the trick:
{code}
  if [x%2]==[x] (
{code}

Now the command line is processed correctly.
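
For anyone who wants to verify the cmd behavior in isolation, here is a small 
standalone batch sketch (a hypothetical argtest.cmd, not part of Spark) showing both 
the comma splitting and why the bracket form of the comparison survives a quoted 
argument:
{code}
@echo off
rem argtest.cmd - hypothetical demo script, not part of Spark

rem cmd splits unquoted commas into separate arguments:
rem   argtest.cmd a.jar,b.jar     gives  %1 = a.jar  and  %2 = b.jar
rem   argtest.cmd "a.jar,b.jar"   gives  %1 = "a.jar,b.jar"  and  %2 = (empty)
echo arg1=%1  arg2=%2

rem The quote-based test breaks once %1 carries its own quotes:
rem   if "x"a.jar,b.jar""=="x" ...   is a cmd syntax error,
rem so the next line is left commented out on purpose.
rem if "x%1"=="x" (echo no arguments) else (echo got an argument)

rem The bracket form tolerates the embedded quotes:
if [x%1]==[x] (echo no arguments) else (echo got an argument)
{code}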



> spark-shell --jars option does not add all jars to classpath
> ------------------------------------------------------------
>
>                 Key: SPARK-6435
>                 URL: https://issues.apache.org/jira/browse/SPARK-6435
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 1.3.0
>         Environment: Win64
>            Reporter: vijay
>
> Not all jars supplied via the --jars option will be added to the driver (and 
> presumably executor) classpath.  The first jar(s) will be added, but not all.
> To reproduce this, just add a few jars (I tested 5) to the --jars option, and 
> then try to import a class from the last jar.  This fails.  A simple 
> reproducer: 
> Create a bunch of dummy jars:
> jar cfM jar1.jar log.txt
> jar cfM jar2.jar log.txt
> jar cfM jar3.jar log.txt
> jar cfM jar4.jar log.txt
> Start the spark-shell with the dummy jars and guava at the end:
> %SPARK_HOME%\bin\spark-shell --master local --jars 
> jar1.jar,jar2.jar,jar3.jar,jar4.jar,c:\code\lib\guava-14.0.1.jar
> In the shell, try importing from guava; you'll get an error:
> {code}
> scala> import com.google.common.base.Strings
> <console>:19: error: object Strings is not a member of package 
> com.google.common.base
>        import com.google.common.base.Strings
>               ^
> {code}


