Hi, I'm using Spark Streaming 1.0. I create a DStream with KafkaUtils and apply some operations to it. The last operation is a reduceByWindow, so I expected the checkpoint interval to be set automatically to 10 seconds or more. However, checkpointing still happens every 2 seconds (my batch interval), and the log shows:
[2014-09-17 16:43:25,096] INFO Checkpoint interval automatically set to 12000 ms (org.apache.spark.streaming.dstream.ReducedWindowedDStream)
[2014-09-17 16:43:25,105] INFO Checkpoint interval = null (org.apache.spark.streaming.kafka.KafkaInputDStream)
[2014-09-17 16:43:25,107] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.MappedDStream)
[2014-09-17 16:43:25,108] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.MappedDStream)
[2014-09-17 16:43:25,108] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FilteredDStream)
[2014-09-17 16:43:25,109] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FlatMappedDStream)
[2014-09-17 16:43:25,110] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FlatMappedDStream)
[2014-09-17 16:43:25,110] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.ShuffledDStream)
[2014-09-17 16:43:25,111] INFO Checkpoint interval = 12000 ms (org.apache.spark.streaming.dstream.ReducedWindowedDStream)
[2014-09-17 16:43:25,111] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.ForEachDStream)

Does this mean I have to set the checkpoint interval on every DStream explicitly?

Thanks.
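
For context, here is a minimal sketch of how I understand the automatic interval to be chosen. The rule it encodes (the smallest multiple of the DStream's slide duration that is at least 10 seconds) is my assumption about Spark's behaviour, and `auto_checkpoint_interval_ms` is a hypothetical helper written for illustration, not a Spark API.

```python
import math

# Assumption: the automatic checkpoint interval is the smallest multiple of
# the DStream's slide duration that is at least 10 seconds.
MIN_CHECKPOINT_MS = 10_000


def auto_checkpoint_interval_ms(slide_ms: int) -> int:
    """Hypothetical helper mirroring the assumed rule; not part of Spark."""
    return slide_ms * math.ceil(MIN_CHECKPOINT_MS / slide_ms)


# Compare a few candidate slide durations (in milliseconds).
for slide_ms in (2_000, 4_000, 6_000, 12_000):
    print(slide_ms, "->", auto_checkpoint_interval_ms(slide_ms))
```

If that assumption holds, the 12000 ms in the log would correspond to a slide duration of 4, 6 or 12 seconds on the ReducedWindowedDStream, whereas a 2-second slide would give 10000 ms.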