[ https://issues.apache.org/jira/browse/SPARK-15606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shixiong Zhu updated SPARK-15606:
---------------------------------
    Fix Version/s: 1.6.2

> Driver hang in o.a.s.DistributedSuite on 2 core machine
> -------------------------------------------------------
>
>                 Key: SPARK-15606
>                 URL: https://issues.apache.org/jira/browse/SPARK-15606
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.2, 2.0.0
>         Environment: AMD64 box with only 2 cores
>            Reporter: Pete Robbins
>            Assignee: Pete Robbins
>             Fix For: 1.6.2, 2.0.0
>
>
> repeatedly failing task that crashes JVM *** FAILED ***
>   The code passed to failAfter did not complete within 100000 milliseconds. (DistributedSuite.scala:128)
> This test started failing, and DistributedSuite started hanging, following https://github.com/apache/spark/pull/13055
> It looks like the extra message to remove the BlockManager causes a deadlock, as there are only 2 message-processing loop threads. Related to https://issues.apache.org/jira/browse/SPARK-13906
> {code}
>   /** Thread pool used for dispatching messages. */
>   private val threadpool: ThreadPoolExecutor = {
>     val numThreads = nettyEnv.conf.getInt("spark.rpc.netty.dispatcher.numThreads",
>       math.max(2, Runtime.getRuntime.availableProcessors()))
>     val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, "dispatcher-event-loop")
>     for (i <- 0 until numThreads) {
>       pool.execute(new MessageLoop)
>     }
>     pool
>   }
> {code}
> Setting a minimum of 3 threads alleviates this issue, but I'm not sure there isn't another underlying problem.
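
The suspected failure mode can be illustrated outside Spark with a plain fixed-size pool. The sketch below is not Spark's dispatcher: `DispatcherStarvationSketch` and `completes` are hypothetical names, and it assumes (as the report suspects but does not confirm) that every loop thread ends up blocked waiting on a follow-up message that only the same pool can process. Under that assumption, 2 threads hang while 3 threads finish.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

// Hypothetical stand-alone illustration; none of these names exist in Spark.
object DispatcherStarvationSketch {

  /** Returns true if all blocking handlers finish within timeoutMs. */
  def completes(numThreads: Int, blockingHandlers: Int, timeoutMs: Long): Boolean = {
    val pool = Executors.newFixedThreadPool(numThreads)
    val reply = new CountDownLatch(1)               // released by the follow-up message
    val done = new CountDownLatch(blockingHandlers) // counts handlers that got unblocked
    try {
      // Handlers that occupy a pool thread while waiting for the follow-up message.
      for (_ <- 0 until blockingHandlers) {
        pool.execute(new Runnable {
          override def run(): Unit = {
            try {
              reply.await()
              done.countDown()
            } catch {
              case _: InterruptedException => () // interrupted by shutdownNow below
            }
          }
        })
      }
      // The follow-up message: it can only run if a pool thread is still free.
      pool.execute(new Runnable {
        override def run(): Unit = reply.countDown()
      })
      done.await(timeoutMs, TimeUnit.MILLISECONDS)
    } finally {
      pool.shutdownNow() // interrupt any handlers that are still blocked
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"2 threads complete: ${completes(2, 2, 500)}")
    println(s"3 threads complete: ${completes(3, 2, 5000)}")
  }
}
```

With two blocking handlers, a pool of 2 leaves the follow-up message stuck in the queue, which matches the observation that a minimum of 3 threads alleviates the hang. It does not rule out another underlying problem in the real dispatcher.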