Another idea: you could add this fat jar explicitly to the classpath of the executors... it's not a real solution, but it might work. I mean, place it somewhere locally on each executor and add it to the classpath with spark.executor.extraClassPath.
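A rough sketch of that workaround (the hostnames, paths, and main class below are placeholders, not values from this thread; note that spark.executor.extraClassPath is resolved locally on each executor node, so the jar must already sit at the same path on every node):

```shell
# Hypothetical sketch only: copy the fat jar to the same local path on
# every executor node, then point the executors' classpath at that copy.
# worker1/worker2, /opt/jars, and com.example.Main are placeholders.

for host in worker1 worker2; do
  scp lumiata-evaluation-assembly-1.0.jar "$host":/opt/jars/
done

spark-submit \
  --master yarn-cluster \
  --class com.example.Main \
  --conf spark.executor.extraClassPath=/opt/jars/lumiata-evaluation-assembly-1.0.jar \
  lumiata-evaluation-assembly-1.0.jar
```

Since extraClassPath entries are prepended per executor JVM, this sidesteps whatever is keeping __app__.jar off the classpath, at the cost of manual jar distribution.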
On 8 September 2015 at 18:30, Nick Peterson <nrpeter...@gmail.com> wrote:

> Yeah... none of the jars listed on the classpath contain this class. The
> only jar that does is the fat jar that I'm submitting with spark-submit,
> which, as mentioned, isn't showing up on the classpath anywhere.
>
> -- Nick
>
> On Tue, Sep 8, 2015 at 8:26 AM Igor Berman <igor.ber...@gmail.com> wrote:
>
>> hmm... out of ideas.
>> Can you check in the Spark UI Environment tab that this jar doesn't
>> somehow appear 2 or more times? Or, more generally, whether there are
>> any 2 jars that could contain this class.
>>
>> Regarding your question about the classloader - no idea; probably there
>> is a way. I remember Stack Overflow has some examples of how to print
>> all loaded classes, but how to print all the classes visible to Kryo's
>> classloader - no idea.
>>
>> On 8 September 2015 at 16:43, Nick Peterson <nrpeter...@gmail.com> wrote:
>>
>>> Yes, the jar contains the class:
>>>
>>> $ jar -tf lumiata-evaluation-assembly-1.0.jar | grep 2028/Document/Document
>>> com/i2028/Document/Document$1.class
>>> com/i2028/Document/Document.class
>>>
>>> What else can I do? Is there any way to get more information about the
>>> classes available to the particular classloader Kryo is using?
>>>
>>> On Tue, Sep 8, 2015 at 6:34 AM Igor Berman <igor.ber...@gmail.com> wrote:
>>>
>>>> java.lang.ClassNotFoundException: com.i2028.Document.Document
>>>>
>>>> 1. So, have you checked that the jar you create (the fat jar) contains
>>>> this class?
>>>>
>>>> 2. Maybe there is some stale cache issue... not sure though.
>>>>
>>>> On 8 September 2015 at 16:12, Nicholas R. Peterson <nrpeter...@gmail.com> wrote:
>>>>
>>>>> Here is the stack trace. (Sorry for the duplicate, Igor -- I forgot
>>>>> to include the list.)
>>>>>
>>>>> 15/09/08 05:56:43 WARN scheduler.TaskSetManager: Lost task 183.0 in stage 41.0 (TID 193386, ds-compute2.lumiata.com): java.io.IOException: com.esotericsoftware.kryo.KryoException: Error constructing instance of class: com.lumiata.patientanalysis.utils.CachedGraph
>>>>>     at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1257)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
>>>>>     at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>>>>>     at com.lumiata.evaluation.analysis.prod.ProductionAnalyzer$$anonfun$apply$1.apply(ProductionAnalyzer.scala:44)
>>>>>     at com.lumiata.evaluation.analysis.prod.ProductionAnalyzer$$anonfun$apply$1.apply(ProductionAnalyzer.scala:43)
>>>>>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
>>>>>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
>>>>>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>>>>>     at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>> Caused by: com.esotericsoftware.kryo.KryoException: Error constructing instance of class: com.lumiata.patientanalysis.utils.CachedGraph
>>>>>     at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:126)
>>>>>     at com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1065)
>>>>>     at com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:228)
>>>>>     at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:217)
>>>>>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>>>     at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:182)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:217)
>>>>>     at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)
>>>>>     at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1254)
>>>>>     ... 24 more
>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>>>>     at com.twitter.chill.Instantiators$$anonfun$normalJava$1.apply(KryoBase.scala:160)
>>>>>     at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:123)
>>>>>     ... 32 more
>>>>> Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: com.i2028.Document.Document
>>>>>     at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>>>>     at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>>>>     at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>>>>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>>>>     at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:134)
>>>>>     at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>>>>>     at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
>>>>>     at com.lumiata.patientanalysis.utils.CachedGraph.loadCacheFromSerializedData(CachedGraph.java:221)
>>>>>     at com.lumiata.patientanalysis.utils.CachedGraph.<init>(CachedGraph.java:182)
>>>>>     at com.lumiata.patientanalysis.utils.CachedGraph.<init>(CachedGraph.java:178)
>>>>>     ... 38 more
>>>>> Caused by: java.lang.ClassNotFoundException: com.i2028.Document.Document
>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>     at java.lang.Class.forName0(Native Method)
>>>>>     at java.lang.Class.forName(Class.java:348)
>>>>>     at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
>>>>>     ... 47 more
>>>>>
>>>>>> On Tue, Sep 8, 2015 at 6:01 AM Igor Berman <igor.ber...@gmail.com> wrote:
>>>>>>
>>>>>>> I wouldn't build on this. Local mode & YARN are different, so the
>>>>>>> jars you use with spark-submit are handled differently.
>>>>>>>
>>>>>>> On 8 September 2015 at 15:43, Nicholas R. Peterson <nrpeter...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks, Igor; I've got it running again right now, and can attach
>>>>>>>> the stack trace when it finishes.
>>>>>>>>
>>>>>>>> In the meantime, I've noticed something interesting: in the Spark
>>>>>>>> UI, the application jar that I submit is not being included on the
>>>>>>>> classpath. It has been successfully uploaded to the nodes -- in the
>>>>>>>> nodemanager directory for the application, I see __app__.jar and
>>>>>>>> __spark__.jar. The directory itself is on the classpath, and
>>>>>>>> __spark__.jar and __hadoop_conf__ are as well. When I do everything
>>>>>>>> the same but switch the master to local[*], the jar I submit IS
>>>>>>>> added to the classpath.
>>>>>>>>
>>>>>>>> This seems like a likely culprit. What could cause this, and how
>>>>>>>> can I fix it?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Nick
>>>>>>>>
>>>>>>>> On Tue, Sep 8, 2015 at 1:14 AM Igor Berman <igor.ber...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> As a starting point, attach your stack trace...
>>>>>>>>> PS: Look for duplicates in your classpath; maybe you include
>>>>>>>>> another jar with the same class.
>>>>>>>>>
>>>>>>>>> On 8 September 2015 at 06:38, Nicholas R. Peterson <nrpeter...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I'm trying to run a Spark 1.4.1 job on my CDH 5.4 cluster,
>>>>>>>>>> through YARN. Serialization is set to use Kryo.
>>>>>>>>>>
>>>>>>>>>> I have a large object which I send to the executors as a
>>>>>>>>>> Broadcast. The object seems to serialize just fine. When it
>>>>>>>>>> attempts to deserialize, though, Kryo throws a
>>>>>>>>>> ClassNotFoundException... for a class that I include in the fat
>>>>>>>>>> jar that I spark-submit.
>>>>>>>>>>
>>>>>>>>>> What could be causing this classpath issue with Kryo on the
>>>>>>>>>> executors? Where should I even start looking to try to diagnose
>>>>>>>>>> the problem? I appreciate any help you can provide.
>>>>>>>>>>
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>> -- Nick