[ https://issues.apache.org/jira/browse/SPARK-20323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968943#comment-15968943 ]

Sean Owen commented on SPARK-20323:
-----------------------------------

Not being able to use a context in a remote operation? I think it's kind of 
implicit, because it doesn't make sense in the Spark model, but I don't know if 
it's explicitly stated. In most other cases where you try it, it will throw an 
obvious error.
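
For illustration only (not from the ticket), here is a minimal sketch of what 
"using a context in a remote operation" looks like: the closure below captures 
the SparkContext and nests an RDD operation inside a map, which Spark typically 
rejects up front (e.g. with a "Task not serializable" error, since SparkContext 
is not serializable) rather than hanging. The names here are hypothetical.

{noformat}
import org.apache.spark.{SparkConf, SparkContext}

// Anti-pattern sketch: the SparkContext is referenced inside a closure that
// would be shipped to executors, and Spark refuses it with an obvious error.
object RemoteContextAntiPattern {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("remote-context-demo").setMaster("local[*]"))
    try {
      val data = sc.parallelize(1 to 10)
      // `sc` is captured by the task closure; this fails fast instead of running.
      data.map(x => sc.parallelize(Seq(x)).count()).collect()
    } finally {
      // Stopping the context belongs on the driver, outside any RDD operation.
      sc.stop()
    }
  }
}
{noformat}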

The examples show stopping the context at the end of the program. They don't 
consistently show wrapping it in a try-finally block, but that is good practice. 
You just do it outside of streaming operations, not from inside a transform.
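
As a rough sketch of that advice (assumptions: Spark 2.1.x Streaming API, an 
arbitrary socket source standing in for the real input), the stop call lives in 
a try-finally around the driver's blocking awaitTermination() rather than inside 
a transform; a failed batch is generally rethrown there, and the context is shut 
down from the driver thread:

{noformat}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StopOutsideStreamingOps {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("fun-spark").setMaster("local[*]")
    val ssc  = new StreamingContext(conf, Seconds(1))

    // Any input DStream; failures in the pipeline are allowed to propagate.
    val stream = ssc.socketTextStream("localhost", 9999)
    stream.foreachRDD { rdd =>
      println(rdd.count())       // no ssc.stop() anywhere inside the operations
    }

    ssc.start()
    try {
      ssc.awaitTermination()     // a batch error generally surfaces here
    } finally {
      // The one place the context is stopped: on the driver, outside streaming ops.
      ssc.stop(stopSparkContext = true, stopGracefully = false)
    }
  }
}
{noformat}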

> Calling stop in a transform stage causes the app to hang
> --------------------------------------------------------
>
>                 Key: SPARK-20323
>                 URL: https://issues.apache.org/jira/browse/SPARK-20323
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Andrei Taleanu
>
> I'm not sure if this is a bug or just the way it has to happen, but I've run 
> into this issue with the following code:
> {noformat}
> import scala.collection.mutable
> import scala.util.Random
>
> import org.apache.spark.SparkConf
> import org.apache.spark.rdd.RDD
> import org.apache.spark.streaming.{Seconds, StreamingContext}
>
> object ImmortalStreamingJob extends App {
>   val conf = new SparkConf().setAppName("fun-spark").setMaster("local[*]")
>   val ssc  = new StreamingContext(conf, Seconds(1))
>
>   // Queue of small RDDs used as the streaming source.
>   val elems = (1 to 1000).grouped(10)
>     .map(seq => ssc.sparkContext.parallelize(seq))
>     .toSeq
>   val stream = ssc.queueStream(mutable.Queue[RDD[Int]](elems: _*))
>
>   val transformed = stream.transform { rdd =>
>     try {
>       if (Random.nextInt(6) == 5) throw new RuntimeException("boom")
>       else println("lucky bastard")
>       rdd
>     } catch {
>       case e: Throwable =>
>         // Stopping the context from inside transform is what triggers the hang.
>         println(("stopping streaming context", e))
>         ssc.stop(stopSparkContext = true, stopGracefully = false)
>         throw e
>     }
>   }
>
>   transformed.foreachRDD { rdd =>
>     println(rdd.collect().mkString(","))
>   }
>
>   ssc.start()
>   ssc.awaitTermination()
> }
> {noformat}
> There are two things I can note here:
> * if the exception is thrown in the first transformation (when the first RDD 
> is processed), the Spark context is stopped and the app dies
> * if the exception is thrown after at least one RDD has been processed, the 
> app hangs after printing the error message and never stops
> I think there's some sort of deadlock in the second case; is that normal? I 
> also asked this 
> [here|http://stackoverflow.com/questions/43273783/immortal-spark-streaming-job/43373624#43373624]
> but up to this point there's no answer pointing exactly to what happens, only 
> guidelines.



