Hi, Thank you or your help. With the new code I am getting the following error in the driver. What is going wrong here?
14/08/07 13:22:28 ERROR JobScheduler: Error running job streaming job 1407450148000 ms.0 org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[4528] at apply at List.scala:318(0) has different number of partitions than original RDD MappedValuesRDD[4526] at mapValues at ReducedWindowedDStream.scala:169(1) at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:98) at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1164) at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1166) at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1166) at scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1166) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1054) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1069) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1083) at org.apache.spark.rdd.RDD.take(RDD.scala:989) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:593) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:592) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:172) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-reduceByWindow-reduceFunc-invReduceFunc-windowDuration-slideDuration-tp11591p11727.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org