Hello, Spark community! My name is Paul. I'm a Spark newbie, evaluating version 0.9.0 without any Hadoop at all, and I need some help. I'm running into the following error with the StatefulNetworkWordCount example (and similarly in my prototype app when I use the updateStateByKey operation). I get it when running against my small cluster, but not (so far) when running with local[2].
61904 [spark-akka.actor.default-dispatcher-2] ERROR org.apache.spark.streaming.scheduler.JobScheduler - Error running job streaming job 1396905956000 ms.0
org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[310] at take at DStream.scala:586(0) has different number of partitions than original RDD MapPartitionsRDD[309] at mapPartitions at StateDStream.scala:66(2)
    at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:99)
    at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:989)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:855)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:870)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:884)
    at org.apache.spark.rdd.RDD.take(RDD.scala:844)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:586)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:585)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)

Please let me know what other information would be helpful; I didn't find any question submission guidelines.

Thanks,
Paul
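P.S. In case it helps, here is roughly the pattern my prototype follows (a minimal sketch modeled on the StatefulNetworkWordCount example; the master URL, checkpoint path, and socket host/port below are placeholders, not my actual configuration):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._  // adds updateStateByKey to pair DStreams

object StatefulSketch {
  // State update function: add this batch's counts to the running total per key.
  def updateFunc(values: Seq[Int], state: Option[Int]): Option[Int] =
    Some(values.sum + state.getOrElse(0))

  def main(args: Array[String]) {
    // Placeholder master URL and batch interval.
    val ssc = new StreamingContext("spark://master:7077", "StatefulSketch", Seconds(1))
    ssc.checkpoint("/tmp/spark-checkpoint")  // required by updateStateByKey; placeholder path

    val lines = ssc.socketTextStream("localhost", 9999)
    val words = lines.flatMap(_.split(" ")).map(w => (w, 1))
    val counts = words.updateStateByKey[Int](updateFunc)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since I'm running without Hadoop, the checkpoint directory is a plain local path; I don't know whether that matters on a multi-node cluster, but I mention it in case it's relevant.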