If you just add the "extends Serializable" changes from that pull request, it should work.
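For anyone on an older checkout, the shape of the fix is simply making the example's event class serializable, since instances of it are stored and shuffled by Spark. A minimal sketch of what that looks like (the field names below are illustrative, not copied from the PR):

    // Illustrative sketch: a class whose instances flow through a DStream/RDD
    // must be serializable, otherwise block storage and shuffle fail with
    // java.io.NotSerializableException. Field names here are assumptions.
    class PageView(val url: String, val status: Int, val zipCode: Int,
        val userID: Int) extends Serializable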
On Tue, Oct 29, 2013 at 9:36 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> This was fixed on 0.8 branch and master:
> https://github.com/apache/incubator-spark/pull/63/files
>
> - Patrick
>
> On Tue, Oct 29, 2013 at 9:17 AM, Thunder Stumpges
> <thunder.stump...@gmail.com> wrote:
>> I vaguely remember running into this same error. It says
>> "java.io.NotSerializableException:
>> org.apache.spark.streaming.examples.clickstream.PageView"... can you
>> check the PageView class in the examples and make sure it has the
>> @serializable annotation? I seem to remember having to add it.
>>
>> Good luck,
>> Thunder
>>
>>
>> On Tue, Oct 29, 2013 at 6:54 AM, dachuan <hdc1...@gmail.com> wrote:
>>> Hi,
>>>
>>> I have tried the clickstream example and it runs into an exception. Has
>>> anybody seen this before?
>>>
>>> Since the program mentions "local[2]", I ran it on my local machine.
>>>
>>> Thanks in advance,
>>> dachuan.
>>>
>>> Log Snippet 1:
>>>
>>> 13/10/29 08:50:25 INFO scheduler.DAGScheduler: Submitting 46 missing tasks
>>> from Stage 12 (MapPartitionsRDD[63] at combineByKey at
>>> ShuffledDStream.scala:41)
>>> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Size of task 75 is 4230
>>> bytes
>>> 13/10/29 08:50:25 INFO local.LocalScheduler: Running 75
>>> 13/10/29 08:50:25 INFO spark.CacheManager: Cache key is rdd_9_0
>>> 13/10/29 08:50:25 INFO spark.CacheManager: Computing partition
>>> org.apache.spark.rdd.BlockRDDPartition@0
>>> 13/10/29 08:50:25 WARN storage.BlockManager: Putting block rdd_9_0 failed
>>> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Loss was due to
>>> java.io.NotSerializableException
>>> java.io.NotSerializableException:
>>> org.apache.spark.streaming.examples.clickstream.PageView
>>>
>>> Log Snippet 2:
>>>
>>> org.apache.spark.SparkException: Job failed: Task 12.0:0 failed more than 4
>>> times; aborting job java.io.NotSerializableException:
>>> org.apache.spark.streaming.examples.clickstream.PageView
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
>>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
>>>         at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
>>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
>>>
>>> Two commands that run this app:
>>>
>>> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewGenerator 44444 10
>>> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewStream errorRatePerZipCode localhost 44444
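For reference, the @serializable annotation mentioned above was the older Scala spelling of the same fix; it was deprecated in Scala 2.9 in favor of extending Serializable. A sketch only, with the same illustrative field names as before:

    // Pre-2.10 Scala idiom (deprecated): annotate the class instead of
    // extending Serializable. Field names here are assumptions.
    @serializable
    class PageView(val url: String, val status: Int, val zipCode: Int,
        val userID: Int)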