You need to trigger an action on that RDD to actually checkpoint it — `checkpoint()` only marks the RDD; the data is materialized the next time a job runs on it.

```
scala> spark.sparkContext.setCheckpointDir(".")

scala> val df = spark.createDataFrame(List(("Scala", 35), ("Python", 30), ("R", 15), ("Java", 20)))
df: org.apache.spark.sql.DataFrame = [_1: string, _2: int]

scala> df.rdd.checkpoint()

scala> df.rdd.isCheckpointed
res2: Boolean = false

scala> df.show()
+------+---+
|    _1| _2|
+------+---+
| Scala| 35|
|Python| 30|
|     R| 15|
|  Java| 20|
+------+---+

scala> df.rdd.isCheckpointed
res4: Boolean = false

scala> df.rdd.count()
res5: Long = 4

scala> df.rdd.isCheckpointed
res6: Boolean = true
```

On Thu, Jul 13, 2017 at 11:35 AM, Bernard Jesop <bernard.je...@gmail.com> wrote:

> Hi everyone, I just tried this simple program:
>
>     import org.apache.spark.sql.SparkSession
>
>     object CheckpointTest extends App {
>       val spark = SparkSession
>         .builder()
>         .appName("Toto")
>         .getOrCreate()
>       spark.sparkContext.setCheckpointDir(".")
>       val df = spark.createDataFrame(List(("Scala", 35), ("Python", 30), ("R", 15), ("Java", 20)))
>       df.show()
>       df.rdd.checkpoint()
>       println(if (df.rdd.isCheckpointed) "checkpointed" else "not checkpointed")
>     }
>
> But the result is still "not checkpointed".
> Do you have any idea why? (knowing that the checkpoint file is created)
>
> Best regards,
> Bernard JESOP
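As an aside, if you are on Spark 2.1 or later, `Dataset` has its own `checkpoint()` method, which is eager by default: it runs a job to materialize the data immediately and returns a new checkpointed Dataset, so you don't need to trigger a separate action yourself. A minimal sketch (assuming an existing `SparkSession` named `spark`):

```scala
// Sketch, assuming Spark 2.1+ and an active SparkSession `spark`.
// Dataset.checkpoint() is eager by default: it materializes the data
// to the checkpoint directory right away and returns a new Dataset
// backed by the checkpointed data, truncating the logical plan.
spark.sparkContext.setCheckpointDir(".")

val df = spark.createDataFrame(
  List(("Scala", 35), ("Python", 30), ("R", 15), ("Java", 20)))

val checkpointed = df.checkpoint() // eager: no extra count()/show() needed

// Use the returned Dataset from here on; the original `df` is unchanged.
checkpointed.show()
```

Note that, unlike `df.rdd.checkpoint()`, this returns a *new* Dataset; you must keep working with the returned value rather than the original `df`.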