Jacek Laskowski created SPARK-11051:
---------------------------------------
Summary: NullPointerException when action called on localCheckpointed RDD (that was checkpointed before)
Key: SPARK-11051
URL: https://issues.apache.org/jira/browse/SPARK-11051
Project: Spark
Issue Type: Bug
Affects Versions: 1.6.0
Environment: Spark built from the sources as of today - Oct, 10th
Reporter: Jacek Laskowski

While toying with {{RDD.checkpoint}} and {{RDD.localCheckpoint}} methods, the following NullPointerException was thrown:

{code}
scala> lines.count
java.lang.NullPointerException
  at org.apache.spark.rdd.RDD.firstParent(RDD.scala:1587)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1927)
  at org.apache.spark.rdd.RDD.count(RDD.scala:1115)
  ... 48 elided
{code}

To reproduce the issue do the following:

{code}
$ ./bin/spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0-SNAPSHOT
      /_/

Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val lines = sc.textFile("README.md")
lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:24

scala> sc.setCheckpointDir("checkpoints")

scala> lines.checkpoint

scala> lines.count
res2: Long = 98

scala> lines.localCheckpoint
15/10/10 22:59:20 WARN MapPartitionsRDD: RDD was already marked for reliable checkpointing: overriding with local checkpoint.
res4: lines.type = MapPartitionsRDD[1] at textFile at <console>:24

scala> lines.count
java.lang.NullPointerException
  at org.apache.spark.rdd.RDD.firstParent(RDD.scala:1587)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1927)
  at org.apache.spark.rdd.RDD.count(RDD.scala:1115)
  ... 48 elided
{code}