[ https://issues.apache.org/jira/browse/SPARK-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603001#comment-14603001 ]
yuemeng commented on SPARK-8663:
--------------------------------

I think the reason is:

1) The submission path sends a JobSubmitted message to the event processing actor:

    eventProcessActor ! JobSubmitted(
      jobId, rdd, func2, partitions.toArray, allowLocal, callSite, waiter, properties)
    waiter

   If eventProcessActor has already died (because SparkContext.stop() is in progress), this message is delivered to the dead-letter mailbox and is lost, so the waiter is never completed.

2) The caller then blocks on the waiter:

    def awaitResult(): JobResult = synchronized {
      while (!_jobFinished) {
        this.wait()
      }
      return jobResult
    }

   Since the JobSubmitted message was lost, no completion event will ever set _jobFinished, and this loop never exits.

> Driver will hang if a job is submitted during the SparkContext stop interval
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-8663
>                 URL: https://issues.apache.org/jira/browse/SPARK-8663
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.1, 1.2.0
>        Environment: SUSE Linux Enterprise Server 11 SP3 (x86_64)
>            Reporter: yuemeng
>             Fix For: 1.0.0, 1.1.1, 1.2.2
>
> The driver process will hang if a job is submitted during the sc.stop() interval, i.e. between the start of SparkContext.stop() and its completion.
> The probability of this situation is very small, but when it occurs the driver process never exits.
> Reproduce steps:
> 1) Modify the source code so that SparkContext's stop() method sleeps 2s
>    (in my case, I made the DAGScheduler stop method sleep 2s).
> 2) Submit an application with code like:
>
>     object DriverThreadTest {
>       def main(args: Array[String]) {
>         val sconf = new SparkConf().setAppName("TestJobWaitor")
>         val sc = new SparkContext(sconf)
>         Thread.sleep(5000)
>         val t = new Thread {
>           override def run() {
>             while (true) {
>               try {
>                 val rdd = sc.parallelize(1 to 1000)
>                 var i = 0
>                 println("calcfunc start")
>                 while (i < 10) {
>                   i += 1
>                   rdd.count
>                 }
>                 println("calcfunc end")
>               } catch {
>                 case e: Exception =>
>                   e.printStackTrace()
>               }
>             }
>           }
>         }
>
>         t.start()
>
>         val t2 = new Thread {
>           override def run() {
>             Thread.sleep(2000)
>             println("stop sc thread")
>             sc.stop()
>             println("sc already stopped")
>           }
>         }
>         t2.start()
>       }
>     }
>
> The driver will never exit.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
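The wait-loop hang described in the comment can be shown with a minimal, self-contained sketch. The `Waiter` class below is a hypothetical stand-in for Spark's JobWaiter, not Spark code; a timeout is added only so the demo can terminate, whereas the real awaitResult loops with no timeout and therefore hangs forever once the completion message is lost.

```scala
// Minimal sketch; Waiter is a hypothetical stand-in for Spark's JobWaiter.
object LostCompletionDemo {
  class Waiter {
    private var finished = false

    // Called when the scheduler delivers the job-completion event.
    def jobFinished(): Unit = synchronized {
      finished = true
      notifyAll()
    }

    // Like awaitResult's wait loop, but with a timeout added ONLY so this
    // demo can terminate; Spark's real awaitResult has no timeout.
    def awaitResult(timeoutMs: Long): Boolean = synchronized {
      val deadline = System.currentTimeMillis() + timeoutMs
      while (!finished && System.currentTimeMillis() < deadline) {
        val remaining = deadline - System.currentTimeMillis()
        if (remaining > 0) wait(remaining)
      }
      finished
    }
  }

  def main(args: Array[String]): Unit = {
    val w = new Waiter
    // The completion message is "lost": jobFinished() is never called,
    // just as when eventProcessActor is dead and JobSubmitted goes to the
    // dead-letter mailbox. Without the demo timeout this wait would never
    // return -- the driver hang reported in this issue.
    val done = w.awaitResult(200)
    println(s"job finished = $done")
  }
}
```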
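Until the race itself is fixed in Spark, one application-side workaround is to make sure no thread can submit a job once shutdown has begun, by guarding both submission and stop() with a shared flag. This is only an illustrative sketch under that assumption; `GuardedShutdown`, `submitIfRunning`, and `beginStop` are the author's own names, not Spark API.

```scala
// Illustrative workaround sketch (not Spark API): serialize job submission
// against stop() so no job can enter the scheduler during the stop interval.
object GuardedShutdown {
  private val lock = new Object
  private var stopRequested = false

  // Run a job-submitting body only if shutdown has not begun.
  // Returns false (and skips the body) once stop has been requested.
  def submitIfRunning(body: => Unit): Boolean = lock.synchronized {
    if (stopRequested) false
    else { body; true }
  }

  // Mark shutdown and run the stop action while holding the lock, so any
  // concurrent submitIfRunning call either finishes first or is rejected.
  def beginStop(stop: => Unit): Unit = lock.synchronized {
    stopRequested = true
    stop
  }
}
```

In the reproduce program above, the worker thread would wrap each `rdd.count` in `submitIfRunning { ... }` and exit its loop when it returns false, while the second thread would call `beginStop { sc.stop() }`; the count loop then stops submitting instead of hanging on a lost completion message.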