[ https://issues.apache.org/jira/browse/SPARK-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018032#comment-15018032 ]

sam commented on SPARK-11853:
-----------------------------

[~srowen] We are not using --jars or anything like that, just executor cores, 
num executors, the one conf I gave you, and the args to our application.

// why not look at what actually was built into it to make sure it matches expectation //

I don't see why this is necessary since, as mentioned, it runs fine outside of Spark, but sure:

> jar tf my.jar | grep DeserializationException
spray/json/DeserializationException$.class
spray/json/DeserializationException.class

Stabbing in the dark based on my limited understanding of Spark: it seems Spark is helpfully trying to serialize the task exception and send it back to the driver, so the user can see what went wrong without digging through the worker logs. But however spark-submit populates the classes available to that path isn't quite working, since this part of the code doesn't seem to have access to the assembled dependencies in the fat jar. After all, we only see the CNFE when a task throws an exception; during happy running (which still exercises the same dependencies) no CNFE is thrown.
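
For what it's worth, the ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163) frame in the trace below suggests the mechanism looks roughly like this (a paraphrased sketch from my reading of the 1.5 sources; details may be off):

    package org.apache.spark  // the real class lives here (TaskEndReason.scala)

    import java.io.{ObjectInputStream, ObjectOutputStream}

    private[spark] class ThrowableSerializationWrapper(var exception: Throwable)
        extends Serializable {

      // Executor side: serializing works because the task's classpath has the class.
      private def writeObject(out: ObjectOutputStream): Unit =
        out.writeObject(exception)

      // Driver side: the stream resolves classes with whatever classloader it was
      // created with; if that loader cannot see the fat jar's classes, resolving
      // spray.json.DeserializationException throws the CNFE in the trace below.
      private def readObject(in: ObjectInputStream): Unit = {
        try {
          exception = in.readObject().asInstanceOf[Throwable]
        } catch {
          case e: Exception =>
            // the real code logs the WARN seen below and swallows the error
            System.err.println(s"Task exception could not be deserialized: $e")
        }
      }
    }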

15/11/19 16:03:30 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
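
If someone who knows this code can confirm: a quick way to test the theory (a hypothetical probe, we have not run this) would be to compare loaders on the driver:

    // Hypothetical probe, assuming it runs in the driver JVM:
    val cls = "spray.json.DeserializationException"

    // The thread context classloader, which spark-submit should point at our
    // fat jar; I'd expect this to succeed.
    Class.forName(cls, false, Thread.currentThread.getContextClassLoader)

    // Plain Class.forName uses the caller's own defining loader; if this throws
    // the same CNFE, it would confirm that some code path resolves classes with
    // a loader that cannot see the assembled dependencies.
    Class.forName(cls)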

> java.lang.ClassNotFoundException with spray-json on EMR
> -------------------------------------------------------
>
>                 Key: SPARK-11853
>                 URL: https://issues.apache.org/jira/browse/SPARK-11853
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.1
>            Reporter: sam
>
> I'm using a fat jar (sbt assembly), so there is no reason for Spark to throw 
> this ClassNotFoundException.
> MORE INFO
> ENVIRONMENT
> Using emr-4.1.0 with the latest EMR Spark, so 1.5.0. We run the job with 
> `spark-submit --master yarn-client ... etc...`.
> SBT
> Build file dependencies
> libraryDependencies ++= Seq(
>   "org.scalacheck" %% "scalacheck" % "1.12.1" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-core" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.specs2" %% "specs2-scalacheck" % "2.4.15" % "test" withSources() withJavadoc(),
>   "org.apache.spark" %% "spark-core" % "1.5.1" withSources() withJavadoc(),
>   "org.rogach" %% "scallop" % "0.9.5" withSources() withJavadoc(),
>   "org.scalaz" %% "scalaz-core" % "7.1.4" withSources() withJavadoc(),
>   "io.spray" %% "spray-json" % "1.3.2" withSources() withJavadoc(),
>   "com.m3" %% "curly-scala" % "0.5.+" withSources() withJavadoc(),
>   "com.amazonaws" % "aws-java-sdk" % "1.10.30"
> )
> dependencyOverrides ++= Set(
>   "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
> )
> scalaVersion := "2.10.4"
> With a "first" merge strategy instead of "deduplicate".
> STACK TRACE
> 15/11/19 16:03:30 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
> java.lang.ClassNotFoundException: spray.json.DeserializationException
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>       at java.lang.Class.forName0(Native Method)
>       at java.lang.Class.forName(Class.java:278)
>       at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>       at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>       at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>       at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>       at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)


