[ 
https://issues.apache.org/jira/browse/SPARK-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603001#comment-14603001
 ] 

yuemeng commented on SPARK-8663:
--------------------------------

I think the reason is:
1) In DAGScheduler.submitJob:
    eventProcessActor ! JobSubmitted(
      jobId, rdd, func2, partitions.toArray, allowLocal, callSite, waiter,
      properties)
    waiter
  }
Here eventProcessActor has already been stopped, so the JobSubmitted message
goes to the dead-letter mailbox and is lost; the returned waiter is never
completed.
2) In JobWaiter.awaitResult:
def awaitResult(): JobResult = synchronized {
    while (!_jobFinished) {
      this.wait()
    }
    return jobResult
  }
Since the completion event never arrives, _jobFinished is never set and this
loop waits forever.
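The pattern above can be sketched outside Spark. Below is a minimal,
self-contained sketch (not Spark's actual JobWaiter; the class and method
names are illustrative) of the same wait/notify handshake, plus a bounded
wait showing how a deadline turns a lost completion message into a visible
failure instead of an infinite loop:

```scala
// Illustrative sketch of the wait/notify pattern described above.
object WaiterSketch {
  class Waiter {
    private var finished = false
    private var result: Option[String] = None

    // Called by the scheduler on job completion; never called when the
    // completion message went to the dead-letter mailbox.
    def taskSucceeded(r: String): Unit = synchronized {
      result = Some(r)
      finished = true
      notifyAll()
    }

    // Like awaitResult above, but with a deadline: if no completion ever
    // arrives, we return None instead of blocking forever.
    def awaitResult(timeoutMs: Long): Option[String] = synchronized {
      val deadline = System.currentTimeMillis() + timeoutMs
      while (!finished) {
        val remaining = deadline - System.currentTimeMillis()
        if (remaining <= 0) return None // scheduler is gone; give up
        wait(remaining)
      }
      result
    }
  }

  def main(args: Array[String]): Unit = {
    // Case 1: completion arrives -> waiter returns the result.
    val w1 = new Waiter
    new Thread(() => { Thread.sleep(50); w1.taskSucceeded("done") }).start()
    println(w1.awaitResult(1000)) // Some(done)

    // Case 2: completion message is lost -> bounded wait returns None
    // instead of hanging the caller.
    val w2 = new Waiter
    println(w2.awaitResult(200)) // None
  }
}
```

With the unbounded `wait()` in the original code, case 2 is exactly the
driver hang reported here: the waiting thread can only be woken by a
notification that will never come.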


> Driver will hang if a job is submitted during the SparkContext stop interval
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-8663
>                 URL: https://issues.apache.org/jira/browse/SPARK-8663
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.1, 1.2.0
>         Environment: SUSE Linux Enterprise Server 11 SP3  (x86_64)
>            Reporter: yuemeng
>             Fix For: 1.0.0, 1.1.1, 1.2.2
>
>
> The driver process will hang if a job is submitted during the sc.stop()
> interval, i.e. the window from when SparkContext.stop() starts until it
> finishes.
> The probability of hitting this window is very small, but when it happens the
> driver process never exits.
> Steps to reproduce:
> 1) Modify the source code so that SparkContext's stop() takes about 2s;
> in my case I made the DAGScheduler stop method sleep for 2s.
> 2) Submit an application with code like:
> object DriverThreadTest {
>   def main(args: Array[String]) {
>     val sconf = new SparkConf().setAppName("TestJobWaitor")
>     val sc = new SparkContext(sconf)
>     Thread.sleep(5000)
>     // Thread t keeps submitting jobs in a loop.
>     val t = new Thread {
>       override def run() {
>         while (true) {
>           try {
>             val rdd = sc.parallelize(1 to 1000)
>             var i = 0
>             println("calcfunc start")
>             while (i < 10) {
>               i += 1
>               rdd.count()
>             }
>             println("calcfunc end")
>           } catch {
>             case e: Exception =>
>               e.printStackTrace()
>           }
>         }
>       }
>     }
>     t.start()
>
>     // Thread t2 stops the SparkContext while jobs are still being submitted.
>     val t2 = new Thread {
>       override def run() {
>         Thread.sleep(2000)
>         println("stop sc thread")
>         sc.stop()
>         println("sc already stopped")
>       }
>     }
>     t2.start()
>   }
> }
> The driver will never exit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
