Nick Pritchard created SPARK-5934:
-------------------------------------

             Summary: DStreamGraph.clearMetadata attempts to unpersist the same 
RDD multiple times
                 Key: SPARK-5934
                 URL: https://issues.apache.org/jira/browse/SPARK-5934
             Project: Spark
          Issue Type: Bug
          Components: Block Manager, Streaming
    Affects Versions: 1.2.1
            Reporter: Nick Pritchard
            Priority: Minor


It seems that because DStream.clearMetadata calls itself recursively on its 
dependencies, it attempts to unpersist the same RDD multiple times, which 
results in WARN logs like this:
{quote}
WARN BlockManager: Asked to remove block rdd_2_1, which does not exist
{quote}

or this:
{quote}
WARN BlockManager: Block rdd_2_1 could not be removed as it was not found in 
either the disk, memory, or tachyon store
{quote}

This is preceded by logs like:
{quote}
DEBUG TransformedDStream: Unpersisting old RDDs: 2
DEBUG QueueInputDStream: Unpersisting old RDDs: 2
{quote}
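
These logs suggest that both the TransformedDStream and its parent 
QueueInputDStream unpersist RDD 2. A simplified model of the recursive cleanup 
(illustrative only, not the actual Spark source; SketchDStream is a made-up 
name, though the real DStream does keep a generatedRDDs map and recurse into 
dependencies):
{code:scala}
import scala.collection.mutable
import org.apache.spark.rdd.RDD

// Illustrative sketch of the cleanup recursion, not Spark's implementation.
abstract class SketchDStream {
  // Batch time -> RDD generated for that batch. With cache().transform(x => x),
  // the child's entry can be the very same object as the parent's cached RDD.
  val generatedRDDs = mutable.HashMap.empty[Long, RDD[_]]
  def dependencies: List[SketchDStream]

  def clearMetadata(time: Long, rememberMs: Long): Unit = {
    val old = generatedRDDs.filter { case (t, _) => t <= (time - rememberMs) }
    old.values.foreach(_.unpersist())   // this DStream unpersists its RDDs...
    generatedRDDs --= old.keys
    // ...then each dependency does the same, which can hit the same RDD again.
    dependencies.foreach(_.clearMetadata(time, rememberMs))
  }
}
{code}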

Here is a reproducible case:
{code:scala}
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object Test {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("Test")
    val ssc = new StreamingContext(conf, Seconds(1))
    val queue = new mutable.Queue[RDD[Int]]

    val input = ssc.queueStream(queue)
    // Cache the input stream, then apply an identity transform.
    val output = input.cache().transform(x => x)
    output.print()

    ssc.start()
    for (i <- 1 to 5) {
      val rdd = ssc.sparkContext.parallelize(Seq(i))
      queue.enqueue(rdd)
      Thread.sleep(1000)
    }
    ssc.stop()
  }
}
{code}
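
The identity transform presumably just hands back the parent's RDD, so the 
cached RDD generated by the QueueInputDStream and the one tracked by the 
TransformedDStream are the same object, and each stream's cleanup unpersists 
it. A standalone analogy with plain RDDs (sc is assumed to be an existing 
SparkContext):
{code:scala}
// Analogy only: identity(parent) stands in for transform(x => x).
val parent = sc.parallelize(1 to 10).cache()
val child = identity(parent)
assert(child eq parent) // same object, so two holders share one set of cached blocks
{code}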

It doesn't seem to be a fatal error, but the WARN messages are a bit unsettling.


