[ https://issues.apache.org/jira/browse/SPARK-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020545#comment-15020545 ]

sam commented on SPARK-11853:
-----------------------------

OK, I'll look into that next week and see if I can put together a minimal 
example along with a script to spin up a one-node EMR cluster.

If there is a conflict in the libraries, what I would propose changing in Spark 
is how it falls over. A ClassNotFoundException is misleading, since the class 
*can* be found: it is in the jar, it has just been evicted due to some overlap. 
Furthermore, *when* it falls over is undesirable, i.e. during execution rather 
than during initialisation. If Spark needed to evict the user's classes, it 
should fail *at that point*, rather than hours later when the user's code first 
tries to use those classes. E.g. `ConflictingLibrariesException: Consider 
userClassPathFirst`. I know that's probably easier said than done.
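
To illustrate the lazy failure mode I mean, here is a minimal sketch (plain 
Scala, no Spark; `com.example.Evicted` is a hypothetical stand-in for the 
evicted class):

```scala
// The JVM resolves classes lazily, so a missing/evicted class only surfaces
// as a ClassNotFoundException when something first touches it -- potentially
// hours into a job, not at startup.
object LazyLoadDemo {
  def tryLoad(name: String): Either[ClassNotFoundException, Class[_]] =
    try Right(Class.forName(name))
    catch { case e: ClassNotFoundException => Left(e) }

  def main(args: Array[String]): Unit = {
    // Present on every JVM classpath: loads fine.
    println(tryLoad("java.lang.String").isRight)
    // Hypothetical evicted class: fails only now, at first touch.
    println(tryLoad("com.example.Evicted").isLeft)
  }
}
```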

> java.lang.ClassNotFoundException with spray-json on EMR
> -------------------------------------------------------
>
>                 Key: SPARK-11853
>                 URL: https://issues.apache.org/jira/browse/SPARK-11853
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.1
>            Reporter: sam
>
> I'm using a fat jar (sbt assembly), so there is no reason for Spark to do this.
> MORE INFO
> ENVIRONMENT
> Using emr-4.1.0 with the latest EMR Spark, i.e. 1.5.0. We run the job with 
> `spark-submit --master yarn-client ... etc...`.
> SBT
> Build file dependencies
> libraryDependencies ++= Seq(
>   "org.scalacheck" %% "scalacheck" % "1.12.1" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-core" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-scalacheck" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.apache.spark" %% "spark-core" % "1.5.1" withSources() withJavadoc(),
>   "org.rogach" %% "scallop" % "0.9.5" withSources() withJavadoc(),
>   "org.scalaz" %% "scalaz-core" % "7.1.4" withSources() withJavadoc(),
>   "io.spray" %% "spray-json" % "1.3.2" withSources() withJavadoc(),
>   "com.m3" %% "curly-scala" % "0.5.+" withSources() withJavadoc(),
>   "com.amazonaws" % "aws-java-sdk" % "1.10.30"
> )
> dependencyOverrides ++= Set(
>   "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
> )
> scalaVersion := "2.10.4"
> With a "first" merge strategy instead of "deduplicate".
> STACK TRACE
> 15/11/19 16:03:30 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
> java.lang.ClassNotFoundException: spray.json.DeserializationException
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>       at java.lang.Class.forName0(Native Method)
>       at java.lang.Class.forName(Class.java:278)
>       at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>       at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>       at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
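
For reference, the "first" merge strategy mentioned in the build above is 
presumably something along these lines in build.sbt (a sketch only; this 
assumes sbt-assembly, and the exact setting key varies by plugin version):

```scala
// build.sbt sketch (assumes sbt-assembly). Keeping the first copy on any
// conflict silences deduplicate errors, but it is exactly what can silently
// evict a class another library also ships.
assemblyMergeStrategy in assembly := {
  case _ => MergeStrategy.first
}
```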



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
