You might be referring to some class-level variables in your code. I was able to see the actual field that caused the error once I marked the class as serializable and ran it on the cluster.
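A minimal plain-JDK sketch of that fix, independent of Spark (the `Connection` and `MyClass` names here are hypothetical stand-ins): marking the enclosing class `Serializable` lets serialization proceed, and marking a problematic field `transient` excludes it from the serialized form.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableFix {
    // A field type that cannot be serialized (stands in for e.g. a
    // SparkContext or a database connection captured by a closure).
    static class Connection { }

    // Marking the enclosing class Serializable and the problematic field
    // transient lets the rest of the object's state be serialized.
    static class MyClass implements Serializable {
        int counter = 42;                             // serialized normally
        transient Connection conn = new Connection(); // skipped during serialization
    }

    // Attempt a round of Java serialization and report success or failure.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {        // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Connection())); // false: not Serializable
        System.out.println(canSerialize(new MyClass()));    // true: transient field skipped
    }
}
```

In Scala the equivalent is `class MyClass extends Serializable`, with `@transient` on any field that must not travel with the closure.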
class MyClass extends java.io.Serializable

The following resources will also help:
https://youtu.be/mHF3UPqLOL8?t=54m57s
http://stackoverflow.com/questions/22592811/task-not-serializable-java-io-notserializableexception-when-calling-function-ou

~Pratik

On Fri, Oct 23, 2015 at 10:30 AM Ted Yu <yuzhih...@gmail.com> wrote:
> Mind sharing your code, if possible?
>
> Thanks
>
> On Fri, Oct 23, 2015 at 9:49 AM, crakjie <w...@hotmail.fr> wrote:
>> Hello.
>>
>> I have activated file checkpointing for a DStream to enable
>> updateStateByKey.
>> My unit test worked with no problem, but when I integrated this into my
>> full stream I got this exception:
>>
>> java.io.NotSerializableException: DStream checkpointing has been enabled
>> but the DStreams with their functions are not serializable
>> Serialization stack:
>>
>>   at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:550)
>>   at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:587)
>>   at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:586)
>>   at com.misterbell.shiva.StreamingApp$.main(StreamingApp.scala:196)
>>   at com.misterbell.shiva.StreamingApp.main(StreamingApp.scala)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:497)
>>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> But this exception is not very clear about which part of my stream is not
>> serializable.
>>
>> I tried adding
>>
>> .set("spark.driver.extraJavaOptions", "-Dsun.io.serialization.extendedDebugInfo=true")
>> .set("spark.executor.extraJavaOptions", "-Dsun.io.serialization.extendedDebugInfo=true")
>>
>> to my Spark conf to get more information, but it changed nothing (it
>> should have).
>>
>> So how can I find which function or part of my stream is not serializable?
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Stream-are-not-serializable-tp25185.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
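When the `extendedDebugInfo` flag does not reveal the culprit, one can also probe an object directly: try to serialize each of its fields in isolation and report the ones that fail. The sketch below is a rough, reflection-based imitation of what `-Dsun.io.serialization.extendedDebugInfo=true` (or Spark's own serialization debugging) reports; the `StreamJob` and `Handle` classes are made-up examples, not from the original post.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class SerializationDebug {
    // Try to serialize each field of an object individually and collect the
    // names (and types) of the ones that fail.
    static List<String> findUnserializableFields(Object obj) throws IllegalAccessException {
        List<String> bad = new ArrayList<>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            // static and transient fields are not part of the serialized form
            if (Modifier.isStatic(f.getModifiers()) || Modifier.isTransient(f.getModifiers())) {
                continue;
            }
            f.setAccessible(true);
            Object value = f.get(obj);
            if (value == null) continue;
            try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(value);
            } catch (IOException e) {
                bad.add(f.getName() + " (" + value.getClass().getName() + ")");
            }
        }
        return bad;
    }

    static class Handle { }  // hypothetical non-serializable member

    static class StreamJob implements Serializable {
        String topic = "events";      // serializes fine
        Handle handle = new Handle(); // this one breaks serialization
    }

    public static void main(String[] args) throws Exception {
        // Reports something like: [handle (SerializationDebug$Handle)]
        System.out.println(findUnserializableFields(new StreamJob()));
    }
}
```

For the DStream case, the same idea applies to the objects captured by the functions passed to `map`, `updateStateByKey`, etc.: it is usually one field of the enclosing class, dragged in by the closure, that fails.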