[ https://issues.apache.org/jira/browse/SPARK-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187453#comment-14187453 ]

Patrick Wendell commented on SPARK-4121:
----------------------------------------

[~srowen] - can you help with this? This is likely happening because the 
PoissonSampler on the driver is using the classpath from Maven (with the 
unmodified version of PoissonSampler) and the executors are using the version 
from the assembly jar, which has package relocations of the commons math 
dependency in the byte code. This is a test that uses "local-cluster" mode.

Is there a reason we are doing these relocations in the assembly only? Would it 
be better to actually shade-and-inline commons-math in both the spark-core and 
spark-mllib package jars?
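To make the shade-and-inline idea concrete, a hypothetical maven-shade-plugin sketch of relocating commons-math3 in a module's own package jar (the shaded pattern and placement are illustrative assumptions, not Spark's actual build config):

```xml
<!-- Hypothetical sketch: relocate commons-math3 inside the module jar
     itself at package time, instead of only in the assembly, so every
     artifact carries the same rewritten bytecode. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- pattern below is an assumption for illustration -->
            <pattern>org.apache.commons.math3</pattern>
            <shadedPattern>org.spark-project.commons.math3</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

If both spark-core and spark-mllib did this, the driver and executor classpaths would agree on the relocated names regardless of whether the assembly jar is used.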

Having discrepancies between the assembly and package jars could, I'm guessing, 
lead to problems beyond this test failure. It also means that applications 
which compile against Spark's dependencies rather than running through the 
Spark assembly packages won't get the benefit of the shading we've done.
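The serialVersionUID mismatch above follows from how Java derives the default UID from a class's structure, including the fully qualified names of its field types; relocating commons-math3 in only one of the two classpaths therefore changes the computed UID on one side. A minimal sketch (class names are hypothetical, not Spark's) of the usual mitigation, pinning the UID explicitly:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical illustration: the default serialVersionUID is computed
// from the class's structure (fields, their declared types, etc.), so
// shading that rewrites org.apache.commons.math3.* field types in the
// assembly bytecode makes the driver (unshaded Maven classpath) and the
// executors (shaded assembly) disagree on the UID -- the
// InvalidClassException seen in the test.
public class UidDemo {

    // Declaring serialVersionUID explicitly pins the UID, so shaded and
    // unshaded builds of the class stay wire-compatible even if a
    // relocated field type would otherwise change the computed value.
    static class PinnedSampler implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    public static void main(String[] args) {
        long uid = ObjectStreamClass.lookup(PinnedSampler.class)
                                    .getSerialVersionUID();
        System.out.println(uid); // prints 1
    }
}
```

Pinning works around the symptom; making both classpaths use the same (relocated or unrelocated) bytecode fixes the cause.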

> Master build failures after shading commons-math3
> -------------------------------------------------
>
>                 Key: SPARK-4121
>                 URL: https://issues.apache.org/jira/browse/SPARK-4121
>             Project: Spark
>          Issue Type: Bug
>          Components: Build, MLlib, Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Xiangrui Meng
>            Priority: Blocker
>
> The Spark master Maven build kept failing after we replaced colt with 
> commons-math3 and shaded the latter:
> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/
> The error message is:
> {code}
> KMeansClusterSuite:
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> - task size should be small in both training and prediction *** FAILED ***
>   org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 9, localhost): java.io.InvalidClassException: org.apache.spark.util.random.PoissonSampler; local class incompatible: stream classdesc serialVersionUID = -795011761847245121, local class serialVersionUID = 4249244967777318419
>         java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
>         java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
>         java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
>         org.apache.spark.scheduler.Task.run(Task.scala:56)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:745)
> {code}
> This test passed in the local sbt build, so the issue is likely caused by 
> shading. Maybe there are two versions of commons-math3 on the classpath 
> (Hadoop depends on it), or MLlib doesn't use the shaded version at compile time.
> [~srowen] Could you take a look? Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
