Nick Pritchard created SPARK-5934:
-------------------------------------

             Summary: DStreamGraph.clearMetadata attempts to unpersist the same RDD multiple times
                 Key: SPARK-5934
                 URL: https://issues.apache.org/jira/browse/SPARK-5934
             Project: Spark
          Issue Type: Bug
          Components: Block Manager, Streaming
    Affects Versions: 1.2.1
            Reporter: Nick Pritchard
            Priority: Minor
It seems that because DStream.clearMetadata calls itself recursively on its dependencies, it attempts to unpersist the same RDD more than once, which results in warn logs like this:
{quote}
WARN BlockManager: Asked to remove block rdd_2_1, which does not exist
{quote}
or this:
{quote}
WARN BlockManager: Block rdd_2_1 could not be removed as it was not found in either the disk, memory, or tachyon store
{quote}
This is preceded by logs like:
{quote}
DEBUG TransformedDStream: Unpersisting old RDDs: 2
DEBUG QueueInputDStream: Unpersisting old RDDs: 2
{quote}
Here is a reproducible case:
{code:scala}
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object Test {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("Test")
    val ssc = new StreamingContext(conf, Seconds(1))

    // A queue stream that is cached, then passed through an identity transform.
    val queue = new mutable.Queue[RDD[Int]]
    val input = ssc.queueStream(queue)
    val output = input.cache().transform(x => x)
    output.print()

    ssc.start()
    for (i <- 1 to 5) {
      val rdd = ssc.sparkContext.parallelize(Seq(i))
      queue.enqueue(rdd)
      Thread.sleep(1000)
    }
    ssc.stop()
  }
}
{code}
It doesn't seem to be a fatal error, but the WARN messages are a bit unsettling.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
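The suspected mechanism can be sketched without Spark at all. The names below ({{Stream}}, {{rddId}}, {{clearMetadataNaive}}, {{clearMetadataDedup}}) are hypothetical, chosen only for illustration; the point is that with {{input.cache().transform(x => x)}} the identity transform hands back the same cached RDD, so two DStreams in the graph hold the same RDD id, and a recursive metadata-clearing walk that does not deduplicate issues two removal requests for it:
{code:scala}
// Minimal, dependency-free sketch of the suspected behaviour (NOT Spark source).
import scala.collection.mutable

// Each stream records the id of the RDD it generated for a batch. Because of
// the identity transform, both the input stream and the transformed stream
// hold the same RDD id (2, matching the rdd_2_1 block in the warnings).
case class Stream(name: String, rddId: Int, dependencies: Seq[Stream])

object ClearMetadataSketch {
  // Naive recursion: every stream unpersists its own RDD, then recurses into
  // its dependencies. A shared RDD id is therefore unpersisted twice; the
  // second request is what would produce the "Asked to remove block" WARN.
  def clearMetadataNaive(s: Stream, calls: mutable.Buffer[Int]): Unit = {
    calls += s.rddId
    s.dependencies.foreach(clearMetadataNaive(_, calls))
  }

  // Guarded recursion: tracking already-seen RDD ids unpersists each id once.
  def clearMetadataDedup(s: Stream, calls: mutable.Buffer[Int],
                         seen: mutable.Set[Int]): Unit = {
    if (seen.add(s.rddId)) calls += s.rddId
    s.dependencies.foreach(clearMetadataDedup(_, calls, seen))
  }

  def main(args: Array[String]): Unit = {
    val input = Stream("QueueInputDStream", 2, Nil)
    val transformed = Stream("TransformedDStream", 2, Seq(input))

    val naive = mutable.Buffer.empty[Int]
    clearMetadataNaive(transformed, naive)
    println(s"naive: $naive")   // rdd 2 requested twice

    val dedup = mutable.Buffer.empty[Int]
    clearMetadataDedup(transformed, dedup, mutable.Set.empty)
    println(s"dedup: $dedup")   // rdd 2 requested once
  }
}
{code}
If this is the mechanism, one possible fix is for the clearing walk (or BlockManager caller) to deduplicate RDD ids before issuing removal requests, as the guarded variant above does.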