Need some clarification about the documentation. According to the Spark docs:
"the default interval is a multiple of the batch interval that is at least 10
seconds. It can be set by using dstream.checkpoint(checkpointInterval).
Typically, a checkpoint interval of 5 - 10 sliding intervals of a DStream
is a good setting to try."
You are right. "checkpointInterval" applies only to data checkpointing.
Metadata checkpointing is done for every batch. Feel free to send a PR to
add the missing doc.
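
For reference, here is a minimal sketch of the default-interval rule quoted from the docs ("a multiple of the batch interval that is at least 10 seconds"). It is plain Python, not Spark code: the function name and the millisecond units are assumptions made for illustration, and it only mirrors the wording of the documentation rather than Spark's actual implementation. It covers data checkpointing only, since metadata checkpointing happens on every batch regardless of this value.

```python
def default_checkpoint_interval_ms(batch_interval_ms: int, minimum_ms: int = 10_000) -> int:
    """Return the smallest multiple of the batch interval that is >= minimum_ms.

    Hypothetical helper: it restates the documented rule, it is not a Spark API.
    """
    if batch_interval_ms <= 0:
        raise ValueError("batch interval must be positive")
    # Integer ceiling division: how many batches are needed to reach the minimum.
    batches = -(-minimum_ms // batch_interval_ms)
    # Always checkpoint at least once per batch interval.
    return batch_interval_ms * max(1, batches)


if __name__ == "__main__":
    for batch_ms in (2_000, 3_000, 10_000, 15_000):
        print(batch_ms, "->", default_checkpoint_interval_ms(batch_ms))
```

To override the default for a stateful DStream, the docs quoted above point to dstream.checkpoint(checkpointInterval); the number computed here is only what one would expect if no explicit interval is set.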
Best Regards,
Shixiong Zhu
2015-12-18 8:26 GMT-08:00 Lan Jiang:
> Need some clarification about the documentation.