[ https://issues.apache.org/jira/browse/SPARK-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020545#comment-15020545 ]
sam commented on SPARK-11853:
-----------------------------

OK, I'll look into that next week and see if I can put together a minimal example along with a script to spin up a 1-node EMR cluster.

If there is a conflict in the libraries, what I would propose changing in Spark is how it falls over. A `ClassNotFoundException` is misleading, since the class *can* be found: it is in the jar; it has just been evicted due to some overlap. Furthermore, *when* it falls over is undesirable, i.e. during execution rather than during initialisation. If Spark needed to evict the user's classes, it should fall over *at that point*, rather than when the user's code tries to call those classes, potentially hours later. E.g. `ConflictingLibrariesException: Consider userClassPathFirst`. I know that's probably easier said than done.

> java.lang.ClassNotFoundException with spray-json on EMR
> -------------------------------------------------------
>
>                 Key: SPARK-11853
>                 URL: https://issues.apache.org/jira/browse/SPARK-11853
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.1
>            Reporter: sam
>
> I'm using a fat jar (sbt assembly), so there is no reason for Spark to do this.
>
> MORE INFO
>
> ENVIRONMENT
>
> Using emr-4.1.0 with the latest EMR Spark, so 1.5.0. We run the job with
> `spark-submit --master yarn-client ... etc...`.
> SBT
>
> Build file dependencies:
>
> libraryDependencies ++= Seq(
>   "org.scalacheck" %% "scalacheck" % "1.12.1" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-core" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-scalacheck" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.apache.spark" %% "spark-core" % "1.5.1" withSources() withJavadoc(),
>   "org.rogach" %% "scallop" % "0.9.5" withSources() withJavadoc(),
>   "org.scalaz" %% "scalaz-core" % "7.1.4" withSources() withJavadoc(),
>   "io.spray" %% "spray-json" % "1.3.2" withSources() withJavadoc(),
>   "com.m3" %% "curly-scala" % "0.5.+" withSources() withJavadoc(),
>   "com.amazonaws" % "aws-java-sdk" % "1.10.30"
> )
>
> dependencyOverrides ++= Set(
>   "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
> )
>
> scalaVersion := "2.10.4"
>
> With a "first" merge strategy instead of deduplicate.
>
> STACK TRACE
>
> 15/11/19 16:03:30 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
> java.lang.ClassNotFoundException: spray.json.DeserializationException
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:278)
>         at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>         at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>         at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
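The `userClassPathFirst` escape hatch mentioned in the comment can be tried at submit time. A minimal sketch, assuming Spark's experimental `spark.driver.userClassPathFirst` / `spark.executor.userClassPathFirst` properties (available since Spark 1.3); the main class and jar name are placeholders:

```
# Sketch: make classes in the fat jar win over the copies Spark ships.
# Both userClassPathFirst properties are marked experimental in Spark 1.x.
# com.example.Main and my-assembly.jar are placeholders, not from the report.
spark-submit \
  --master yarn-client \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.Main \
  my-assembly.jar
```

Note this flips which copy of every overlapping class wins, so it can just as easily surface a different conflict; it is a diagnostic lever, not a guaranteed fix.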
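The "first" merge strategy the reporter mentions is exactly the kind of setting that silently picks a winner when two jars ship the same path. A sketch of what such a setting looks like in build.sbt with the sbt-assembly plugin (the `assemblyMergeStrategy` key is sbt-assembly 0.14.x's; the pattern list here is illustrative, not the reporter's actual config):

```scala
// build.sbt sketch (sbt-assembly 0.14.x).
// MergeStrategy.first keeps the first copy of every conflicting path and
// silently discards the rest, whereas the default "deduplicate" fails the
// build on any non-identical duplicate -- making the overlap visible.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```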
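When chasing a CNFE like this, it helps to confirm at runtime which artifact a suspect class is actually loaded from. A small sketch using plain JDK reflection (`WhichJar` is a hypothetical helper; on the cluster one would pass `"spray.json.DeserializationException"`):

```scala
// Sketch: report the jar or directory a class was loaded from, or note
// that it came from the bootstrap classloader (which has no CodeSource).
object WhichJar {
  def locate(className: String): String = {
    val cls = Class.forName(className)
    Option(cls.getProtectionDomain.getCodeSource) match {
      case Some(src) => src.getLocation.toString
      case None      => s"$className: loaded by the bootstrap classloader"
    }
  }

  def main(args: Array[String]): Unit =
    // e.g. WhichJar.locate("spray.json.DeserializationException") on the cluster
    println(locate("scala.Option"))
}
```

Running this inside a Spark task (rather than on the driver) shows what the executor's classloader actually resolves, which is where the eviction described above bites.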