[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972526#comment-13972526
 ] 

Patrick Wendell commented on SPARK-1520:
----------------------------------------

[~srowen] heading to bed for the night... but would welcome help with this. I 
looked earlier and I don't think breeze is doing anything fancy with its 
manifest or META-INF directories. I diffed the breeze directory itself 
between the Java 6 and Java 7 compiled jars and the contents were identical. The 
classloading messages don't provide any useful output.
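
For anyone who wants to repeat that comparison without unpacking both jars, here is a minimal sketch in Python. The function names and the `breeze/` prefix are assumptions made for illustration, not anything from the Spark build. It compares only the entry names, CRC-32 values, and uncompressed sizes under one prefix, so a clean result says nothing about the zip container itself.

```python
import zipfile


def entries_under(jar_path, prefix):
    """Map entry name -> (CRC-32, uncompressed size) for entries under prefix."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            info.filename: (info.CRC, info.file_size)
            for info in jar.infolist()
            if info.filename.startswith(prefix)
        }


def diff_jars(jar_a, jar_b, prefix="breeze/"):
    """Return (only_in_a, only_in_b, changed) as sorted lists of entry names."""
    a = entries_under(jar_a, prefix)
    b = entries_under(jar_b, prefix)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # "changed" means present in both jars but with a different CRC or size.
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, changed
```

Three empty lists would match the "identical" result described above.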

My best guess at present is that we are hitting corner cases in the 
compatibility of the jar format itself, due to having individual directories 
with thousands of class files, and that these cause the Java 6 JRE to silently 
treat the jar as corrupt. I have no evidence to support that claim, however.
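
One way to start gathering evidence is to measure the jar rather than guess. The sketch below (Python, with hypothetical names) reports the total entry count and the most crowded directories. It also flags whether the total exceeds 65535, the largest count the classic zip end-of-central-directory record can hold in its 16-bit field; archives beyond that need the Zip64 extension. Whether that limit is relevant to this bug is an assumption to be tested, not a finding.

```python
import zipfile
from collections import Counter

# The classic zip format stores the entry count in a 16-bit field.
ZIP_CLASSIC_MAX_ENTRIES = 65535


def jar_stats(jar_path, top=5):
    """Summarize entry counts for a jar: total, limit check, busiest directories."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    # Count files per parent directory; explicit directory entries are skipped.
    per_dir = Counter(
        name.rsplit("/", 1)[0] if "/" in name else ""
        for name in names
        if not name.endswith("/")
    )
    return {
        "total_entries": len(names),
        "exceeds_classic_limit": len(names) > ZIP_CLASSIC_MAX_ENTRIES,
        "largest_dirs": per_dir.most_common(top),
    }
```

Running `jar_stats` on the JDK 6 and JDK 7 builds of the assembly and comparing the results would show whether the entry counts differ or sit near that limit.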

> Inclusion of breeze corrupts assembly when compiled with JDK7 and run on JDK6
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-1520
>                 URL: https://issues.apache.org/jira/browse/SPARK-1520
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Spark Core
>            Reporter: Patrick Wendell
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> This is a real doozy - when compiling a Spark assembly with JDK7, the 
> produced jar does not work well with JRE6. I confirmed the byte code being 
> produced is JDK 6 compatible (major version 50). What happens is that, 
> silently, the JRE will not load any class files from the assembled jar.
> {code}
> $> sbt/sbt assembly/assembly
> $> /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -cp /home/patrick/Documents/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar org.apache.spark.ui.UIWorkloadGenerator
> usage: ./bin/spark-class org.apache.spark.ui.UIWorkloadGenerator [master] [FIFO|FAIR]
> $> /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp /home/patrick/Documents/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar org.apache.spark.ui.UIWorkloadGenerator
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/ui/UIWorkloadGenerator
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.ui.UIWorkloadGenerator
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
> Could not find the main class: org.apache.spark.ui.UIWorkloadGenerator. Program will exit.
> {code}
> I also noticed that if the jar is unzipped, and the classpath set to the 
> current directory, it "just works". Finally, if the assembly jar is 
> compiled with JDK6, it also works. The error is seen with any class, not just 
> the UIWorkloadGenerator. Also, this error doesn't exist in branch 0.9, only 
> in master.
> *Isolation*
> -I ran a git bisection and this appeared after the MLlib sparse vector patch 
> was merged:-
> https://github.com/apache/spark/commit/80c29689ae3b589254a571da3ddb5f9c866ae534
> SPARK-1212
> -I narrowed this down specifically to the inclusion of the breeze library. 
> Just adding breeze to an older (unaffected) build triggered the issue.-
> I've found that if I just unpack and re-pack the jar (using `jar` from Java 6 
> or 7) it always works:
> {code}
> $ cd assembly/target/scala-2.10/
> $ /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp ./spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar org.apache.spark.ui.UIWorkloadGenerator # fails
> $ jar xvf spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
> $ jar cvf spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar *
> $ /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp ./spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar org.apache.spark.ui.UIWorkloadGenerator # succeeds
> {code}
> One other thing of note: the Breeze package contains single 
> directories that hold huge numbers of files (e.g. 2000+ class files 
> in one directory). It's possible we are hitting some weird bugs/corner cases 
> in the compatibility of the jar's internal storage format.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
