Hi all, I am running some Spark Scala code in Zeppelin on CDH 5.5.1 (Spark version 1.5.0). I customized the Spark interpreter to use org.apache.spark.serializer.KryoSerializer as spark.serializer, and in the dependencies I added kryo 3.0.3 as follows: com.esotericsoftware:kryo:3.0.3
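For reference, this is roughly what my interpreter setup looks like (a sketch of the two settings described above, written as they appear in Zeppelin's Spark interpreter edit page; the property name and dependency coordinate are the ones I actually used):

```
# Zeppelin > Interpreter > spark > Properties
spark.serializer = org.apache.spark.serializer.KryoSerializer

# Zeppelin > Interpreter > spark > Dependencies (artifact coordinate)
com.esotericsoftware:kryo:3.0.3
```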
When I write the code in a Scala notebook and run it, I get the errors below. But if I compile the same code into a jar and run it on the cluster with spark-submit, it works without errors.

WARN [2016-10-10 23:43:40,801] ({task-result-getter-1} Logging.scala[logWarning]:71) - Lost task 0.0 in stage 3.0 (TID 9, svr-A3-A-U20): java.io.EOFException
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:196)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:217)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1175)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

There were also errors when I ran the Zeppelin Tutorial:

Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
    at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    ... 3 more
Caused by: java.lang.NullPointerException
    at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:38)
    at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:23)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:192)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1$$anonfun$apply$mcV$sp$2.apply(ParallelCollectionRDD.scala:80)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1$$anonfun$apply$mcV$sp$2.apply(ParallelCollectionRDD.scala:80)
    at org.apache.spark.util.Utils$.deserializeViaNestedStream(Utils.scala:142)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:80)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1160)

Does anyone know why this happens?

Thanks in advance,
Fei