Andrei Taleanu created SPARK-20323:
--------------------------------------

             Summary: Calling stop in a transform stage causes the app to hang
                 Key: SPARK-20323
                 URL: https://issues.apache.org/jira/browse/SPARK-20323
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.0
            Reporter: Andrei Taleanu
I'm not sure if this is a bug or just the way it needs to happen, but I've run into this issue with the following code:

{noformat}
import scala.collection.mutable
import scala.util.Random

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ImmortalStreamingJob extends App {
  val conf = new SparkConf().setAppName("fun-spark").setMaster("local[*]")
  val ssc = new StreamingContext(conf, Seconds(1))

  // Feed the stream from a queue of pre-built RDDs, ten elements at a time.
  val elems = (1 to 1000).grouped(10)
    .map(seq => ssc.sparkContext.parallelize(seq))
    .toSeq
  val stream = ssc.queueStream(mutable.Queue[RDD[Int]](elems: _*))

  val transformed = stream.transform { rdd =>
    try {
      // Fail roughly one batch in six.
      if (Random.nextInt(6) == 5) throw new RuntimeException("boom")
      else println("lucky bastard")
      rdd
    } catch {
      case e: Throwable =>
        println(("stopping streaming context", e))
        ssc.stop(stopSparkContext = true, stopGracefully = false)
        throw e
    }
  }

  transformed.foreachRDD { rdd =>
    println(rdd.collect().mkString(","))
  }

  ssc.start()
  ssc.awaitTermination()
}
{noformat}

There are two things I can note here:
* if the exception is thrown in the first transformation (when the first RDD is processed), the Spark context is stopped and the app dies
* if the exception is thrown after at least one RDD has been processed, the app hangs after printing the error message and never stops

I think there's some sort of deadlock in the second case. Is that normal? I also asked this [here|http://stackoverflow.com/questions/43273783/immortal-spark-streaming-job/43373624#43373624] but up to this point there's no answer pointing exactly to what happens, only guidelines.
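
One workaround worth noting (a sketch of what seems to avoid the suspected deadlock, not a confirmed fix): instead of calling ssc.stop() inside the transform closure, let the exception propagate out of the batch and stop the context from the driver's main thread, after awaitTermination() rethrows the reported failure. The object name MortalStreamingJob below is illustrative, not from the original report:

{noformat}
import scala.collection.mutable
import scala.util.Random

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical reworking of the job above: no ssc.stop() inside the closure.
object MortalStreamingJob extends App {
  val conf = new SparkConf().setAppName("fun-spark").setMaster("local[*]")
  val ssc = new StreamingContext(conf, Seconds(1))

  val elems = (1 to 1000).grouped(10)
    .map(seq => ssc.sparkContext.parallelize(seq))
    .toSeq
  val stream = ssc.queueStream(mutable.Queue[RDD[Int]](elems: _*))

  // Let the failure escape the closure instead of stopping the context here.
  val transformed = stream.transform { rdd =>
    if (Random.nextInt(6) == 5) throw new RuntimeException("boom")
    rdd
  }

  transformed.foreachRDD { rdd =>
    println(rdd.collect().mkString(","))
  }

  ssc.start()
  try {
    // awaitTermination() rethrows errors reported by the streaming jobs,
    // so stop() below runs on the main thread, not on a batch thread.
    ssc.awaitTermination()
  } catch {
    case e: Throwable =>
      println(s"stopping streaming context after: $e")
      ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
{noformat}

If my reading is right, the hang comes from stop() blocking on completion of the very batch that invoked it; moving the stop() call outside the streaming machinery sidesteps that, but it would be good to have this confirmed.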