[ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624231#comment-16624231 ]
Ankur Gupta commented on SPARK-24523:
-------------------------------------

That is concerning, actually. You should also see the following message at the start of your application logs: "Dropping event from queue appStatus. This likely means one of the listeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler."

I am not sure why this occurs, though. It seems that your application is creating a lot of events. Based on the code, all the dropped events (7937) were created in the previous 60 seconds, and maybe even more than that, as some events may have been processed as well.

You can increase the capacity of the queue via spark.scheduler.listenerbus.eventqueue.capacity (the default is 10000) to ensure that events are not dropped (see the configuration sketch at the end of this message). But I am not sure how to speed up the listener that processes these events. The only way I know of is to increase the number of cores, which allows the thread processing these events to run more often.

> InterruptedException when closing SparkContext
> ----------------------------------------------
>
>                 Key: SPARK-24523
>                 URL: https://issues.apache.org/jira/browse/SPARK-24523
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.0, 2.3.1
>         Environment: EMR 5.14.0, S3/HDFS inputs and outputs; EMR 5.17
>            Reporter: Umayr Hassan
>            Priority: Major
>         Attachments: spark-stop-jstack.log.1, spark-stop-jstack.log.2, spark-stop-jstack.log.3
>
>
> I'm running a Scala application in EMR with the following properties:
> {{--master yarn --deploy-mode cluster --driver-memory 13g --executor-memory 30g --executor-cores 5 --conf spark.default.parallelism=400 --conf spark.dynamicAllocation.enabled=true --conf spark.dynamicAllocation.maxExecutors=20 --conf spark.eventLog.dir=hdfs:///var/log/spark/apps --conf spark.eventLog.enabled=true --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 --conf spark.scheduler.listenerbus.eventqueue.capacity=20000 --conf spark.shuffle.service.enabled=true --conf spark.sql.shuffle.partitions=400 --conf spark.yarn.maxAppAttempts=1}}
> The application runs fine till SparkContext is (automatically) closed, at which point the SparkContext object throws.
> {{18/06/10 10:44:43 ERROR Utils: Uncaught exception in thread pool-4-thread-1
> java.lang.InterruptedException
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Thread.join(Thread.java:1252)
>   at java.lang.Thread.join(Thread.java:1326)
>   at org.apache.spark.scheduler.AsyncEventQueue.stop(AsyncEventQueue.scala:133)
>   at org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>   at org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at org.apache.spark.scheduler.LiveListenerBus.stop(LiveListenerBus.scala:219)
>   at org.apache.spark.SparkContext$$anonfun$stop$6.apply$mcV$sp(SparkContext.scala:1915)
>   at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
>   at org.apache.spark.SparkContext.stop(SparkContext.scala:1914)
>   at org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:572)
>   at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>   at scala.util.Try$.apply(Try.scala:192)
>   at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)}}
>
> I've not seen this behavior in Spark 2.0.2 and Spark 2.2.0 (for the same application), so I'm not sure which change is causing Spark 2.3 to throw. Any ideas?
>
> best,
> Umayr
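To make the queue-capacity suggestion above concrete, here is a minimal sketch (not taken from the issue) of setting spark.scheduler.listenerbus.eventqueue.capacity programmatically on SparkConf before the SparkContext is created. The object name and the value 20000 are illustrative; the reporter already passes the same value via --conf on spark-submit.

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object ListenerQueueCapacityExample {
  def main(args: Array[String]): Unit = {
    // Raise the listener bus event queue capacity (default 10000) so bursts of
    // task events are less likely to be dropped. This must be set before the
    // SparkContext is created; it is equivalent to passing
    // --conf spark.scheduler.listenerbus.eventqueue.capacity=20000 to spark-submit.
    val conf = new SparkConf()
      .setAppName("listener-queue-capacity-example") // illustrative app name
      .set("spark.scheduler.listenerbus.eventqueue.capacity", "20000")

    val sc = new SparkContext(conf)
    try {
      // A trivial job just to exercise the scheduler; the setting only controls
      // how many pending listener events can be buffered, not how fast the
      // listeners process them.
      println(sc.parallelize(1 to 1000, 10).count())
    } finally {
      // The InterruptedException reported in this issue surfaces on this
      // stop()/shutdown-hook path while AsyncEventQueue threads are joined.
      sc.stop()
    }
  }
}
{code}

Raising the capacity only adds headroom for bursts of events; it does not make the listeners themselves any faster, which is why the comment also mentions giving the driver more cores.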