[jira] [Commented] (SPARK-24523) InterruptedException when closing SparkContext

Umayr Hassan (JIRA) Fri, 21 Sep 2018 14:22:27 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624202#comment-16624202
 ]


Umayr Hassan commented on SPARK-24523:
--------------------------------------

[~irashid] The event logs are written to HDFS. 

[~ankur.gupta] Would the event logs have this information. Also, increasing 
driver cores to 4 didn't seem to help. BTW, I also see a warning message like:

{{18/09/21 20:15:47 WARN AsyncEventQueue: Dropped 7937 events from appStatus 
since Fri Sep 21 20:14:47 UTC 2018. }}

The one thing that, for some jobs seem to help somewhat is reducing the number 
of partitions. E.g. one job (that also uses LinearRegression) runs in 30min 
instead of 2hrs when the number of partitions is reduced from 10 to 1 (the job 
took 15min in Spark 2.0.2).

> InterruptedException when closing SparkContext
> ----------------------------------------------
>
>                 Key: SPARK-24523
>                 URL: https://issues.apache.org/jira/browse/SPARK-24523
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.0, 2.3.1
>         Environment: EMR 5.14.0, S3/HDFS inputs and outputs; EMR 5.17
>  
>  
>  
>            Reporter: Umayr Hassan
>            Priority: Major
>         Attachments: spark-stop-jstack.log.1, spark-stop-jstack.log.2, 
> spark-stop-jstack.log.3
>
>
> I'm running a Scala application in EMR with the following properties:
> {{--master yarn --deploy-mode cluster --driver-memory 13g --executor-memory 
> 30g --executor-cores 5 --conf spark.default.parallelism=400 --conf 
> spark.dynamicAllocation.enabled=true --conf 
> spark.dynamicAllocation.maxExecutors=20 --conf 
> spark.eventLog.dir=hdfs:///var/log/spark/apps --conf 
> spark.eventLog.enabled=true --conf 
> spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 --conf 
> spark.scheduler.listenerbus.eventqueue.capacity=20000 --conf 
> spark.shuffle.service.enabled=true --conf spark.sql.shuffle.partitions=400 
> --conf spark.yarn.maxAppAttempts=1}}
> The application runs fine till SparkContext is (automatically) closed, at 
> which point the SparkContext object throws. 
> {{18/06/10 10:44:43 ERROR Utils: Uncaught exception in thread pool-4-thread-1 
> java.lang.InterruptedException at java.lang.Object.wait(Native Method) at 
> java.lang.Thread.join(Thread.java:1252) at 
> java.lang.Thread.join(Thread.java:1326) at 
> org.apache.spark.scheduler.AsyncEventQueue.stop(AsyncEventQueue.scala:133) at 
> org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>  at 
> org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>  at scala.collection.Iterator$class.foreach(Iterator.scala:893) at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1336) at 
> scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at 
> scala.collection.AbstractIterable.foreach(Iterable.scala:54) at 
> org.apache.spark.scheduler.LiveListenerBus.stop(LiveListenerBus.scala:219) at 
> org.apache.spark.SparkContext$$anonfun$stop$6.apply$mcV$sp(SparkContext.scala:1915)
>  at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357) at 
> org.apache.spark.SparkContext.stop(SparkContext.scala:1914) at 
> org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:572) 
> at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216) 
> at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>  at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>  at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>  at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988) at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>  at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>  at 
> org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>  at scala.util.Try$.apply(Try.scala:192) at 
> org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
>  at 
> org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)}}
>  
> I've not seen this behavior in Spark 2.0.2 and Spark 2.2.0 (for the same 
> application), so I'm not sure which change is causing Spark 2.3 to throw. Any 
> ideas?
> best,
> Umayr



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-24523) InterruptedException when closing SparkContext

Reply via email to