Those are daemon threads and not the cause of the problem. The main
thread is waiting for the "org.apache.hadoop.util.ShutdownHookManager"
thread, but I don't see that one in your list.
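
In case it helps with the digging: the blockers are exactly the live non-daemon
threads, and you can list them from inside the application instead of eyeballing
a full jstack dump. A minimal sketch (not from the original code; `blockers` is
just an illustrative name):

```scala
import scala.collection.JavaConverters._

// Live non-daemon threads are the only ones that can keep the JVM alive;
// daemon threads (like the dispatcher-event-loop ones below) cannot.
val blockers = Thread.getAllStackTraces.keySet.asScala
  .filter(t => t.isAlive && !t.isDaemon)

blockers.foreach(t => println(s"${t.getName} (${t.getState})"))
```

Running this right after spark.stop() should print whichever non-daemon thread
is still holding the JVM open.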
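
And since non-daemon pool threads are what prevent a JVM from exiting in the
first place (as noted further down in the thread), one way to rule a custom
executor pool out entirely is to build it from daemon threads. A sketch under
that assumption (`daemonFactory` is an illustrative name, not from the original
code):

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.ExecutionContext

// A factory that marks every pool thread as a daemon, so leftover tasks can
// never block JVM shutdown (the trade-off: they are killed abruptly on exit).
val daemonFactory = new ThreadFactory {
  def newThread(r: Runnable): Thread = {
    val t = new Thread(r)
    t.setDaemon(true)
    t
  }
}

val pool = Executors.newFixedThreadPool(3, daemonFactory)
implicit val xc = ExecutionContext.fromExecutorService(pool)
```

With this, even a task that never completes cannot keep the process alive
after the main thread returns.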

On Wed, Jan 16, 2019 at 12:08 PM Pola Yao <pola....@gmail.com> wrote:
>
> Hi Marcelo,
>
> Thanks for your response.
>
> I have dumped the threads on the server where I submitted the spark 
> application:
>
> '''
> ...
> "dispatcher-event-loop-2" #28 daemon prio=5 os_prio=0 tid=0x00007f56cee0e000 
> nid=0x1cb6 waiting on condition [0x00007f5699811000]
>    java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00000006400161b8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> "dispatcher-event-loop-1" #27 daemon prio=5 os_prio=0 tid=0x00007f56cee0c800 
> nid=0x1cb5 waiting on condition [0x00007f5699912000]
>    java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00000006400161b8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> "dispatcher-event-loop-0" #26 daemon prio=5 os_prio=0 tid=0x00007f56cee0c000 
> nid=0x1cb4 waiting on condition [0x00007f569a120000]
>    java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00000006400161b8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> "Service Thread" #20 daemon prio=9 os_prio=0 tid=0x00007f56cc12d800 
> nid=0x1ca5 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "C1 CompilerThread14" #19 daemon prio=9 os_prio=0 tid=0x00007f56cc12a000 
> nid=0x1ca4 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> ...
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f56cc0ce000 nid=0x1c93 in 
> Object.wait() [0x00007f56ab3f2000]
>    java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
> - locked <0x00000006400cd498> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f56cc0c9800 
> nid=0x1c92 in Object.wait() [0x00007f55cfffe000]
>    java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0x00000006400a2660> (a java.lang.ref.Reference$Lock)
> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
>
> "main" #1 prio=5 os_prio=0 tid=0x00007f56cc021000 nid=0x1c74 in Object.wait() 
> [0x00007f56d344c000]
>    java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Thread.join(Thread.java:1249)
> - locked <0x000000064056f6a0> (a org.apache.hadoop.util.ShutdownHookManager$1)
> at java.lang.Thread.join(Thread.java:1323)
> at 
> java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
> at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
> at java.lang.Shutdown.runHooks(Shutdown.java:123)
> at java.lang.Shutdown.sequence(Shutdown.java:167)
> at java.lang.Shutdown.exit(Shutdown.java:212)
> - locked <0x00000006404e65b8> (a java.lang.Class for java.lang.Shutdown)
> at java.lang.Runtime.exit(Runtime.java:109)
> at java.lang.System.exit(System.java:971)
> at scala.sys.package$.exit(package.scala:40)
> at scala.sys.package$.exit(package.scala:33)
> at 
> actionmodel.ParallelAdvertiserBeaconModel$.main(ParallelAdvertiserBeaconModel.scala:252)
> at 
> actionmodel.ParallelAdvertiserBeaconModel.main(ParallelAdvertiserBeaconModel.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> "VM Thread" os_prio=0 tid=0x00007f56cc0c1800 nid=0x1c91 runnable
> ...
> '''
>
> I have no clear idea what went wrong. I did call awaitTermination to
> terminate the thread pool. Is there any way to force-close all of those
> 'WAITING' threads associated with my Spark application?
>
> On Wed, Jan 16, 2019 at 8:31 AM Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> If System.exit() doesn't work, you may have a bigger problem
>> somewhere. Check your threads (using e.g. jstack) to see what's going
>> on.
>>
>> On Wed, Jan 16, 2019 at 8:09 AM Pola Yao <pola....@gmail.com> wrote:
>> >
>> > Hi Marcelo,
>> >
>> > Thanks for your reply! It made sense to me. However, I've tried many ways
>> > to exit Spark (e.g., System.exit()), but they all failed. Is there an
>> > explicit way to shut down all of the live threads in the Spark
>> > application and then quit?
>> >
>> >
>> > On Tue, Jan 15, 2019 at 2:38 PM Marcelo Vanzin <van...@cloudera.com> wrote:
>> >>
>> >> You should check the active threads in your app. Since your pool uses
>> >> non-daemon threads, that will prevent the app from exiting.
>> >>
>> >> spark.stop() should have stopped the Spark jobs in other threads, at
>> >> least. But if something is blocking one of those threads, or if
>> >> something is creating a non-daemon thread that stays alive somewhere,
>> >> you'll see that.
>> >>
>> >> Or you can force quit with sys.exit.
>> >>
>> >> On Tue, Jan 15, 2019 at 1:30 PM Pola Yao <pola....@gmail.com> wrote:
>> >> >
>> >> > I submitted a Spark job through the ./spark-submit command. The code
>> >> > executed successfully; however, the application got stuck when trying
>> >> > to quit Spark.
>> >> >
>> >> > My code snippet:
>> >> > '''
>> >> > import java.util.concurrent.Executors
>> >> >
>> >> > import scala.concurrent.{Await, ExecutionContext, Future}
>> >> > import scala.concurrent.duration._
>> >> >
>> >> > val spark = SparkSession.builder.master(...).getOrCreate
>> >> >
>> >> > val pool = Executors.newFixedThreadPool(3)
>> >> > implicit val xc = ExecutionContext.fromExecutorService(pool)
>> >> > // train1, train2 and train3 are Futures that each wrap data reading,
>> >> > // feature engineering and machine-learning steps
>> >> > val taskList = List(train1, train2, train3)
>> >> > val results = Await.result(Future.sequence(taskList), 20.minutes)
>> >> >
>> >> > println("Shutting down pool and executor service")
>> >> > pool.shutdown()
>> >> > xc.shutdown()
>> >> >
>> >> > println("Exiting spark")
>> >> > spark.stop()
>> >> > '''
>> >> >
>> >> > After I submitted the job, from the terminal I could see the code being
>> >> > executed and printing "Exiting spark". However, after printing that
>> >> > line it never exited Spark; it just got stuck.
>> >> >
>> >> > Does anybody know what the reason is? Or how can I force it to quit?
>> >> >
>> >> > Thanks!
>> >> >
>> >> >
>> >>
>> >>
>> >> --
>> >> Marcelo
>>
>>
>>
>> --
>> Marcelo



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
