[ https://issues.apache.org/jira/browse/KAFKA-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712225#comment-14712225 ]
Ashish K Singh edited comment on KAFKA-2468 at 8/26/15 12:13 AM:
-----------------------------------------------------------------

[~ewencp] Unlike exit(), halt() forcibly terminates the JVM. Below is an excerpt from [here|http://geekexplains.blogspot.com/2008/06/runtimeexit-vs-runtimehalt-in-java.html].

{quote}
You might have noticed so far that the difference between the two methods is that Runtime.exit() invokes the shutdown sequence of the underlying JVM whereas Runtime.halt() forcibly terminates the JVM process. So, Runtime.exit() causes the registered shutdown hooks to be executed and then also lets all the uninvoked finalizers to be executed before the JVM process shuts down whereas Runtime.halt() simply terminates the JVM process immediately and abruptly.
{quote}

I have tested that the current PR resolves the issue. I did think about protecting the exit() calls with a flag. However, exit() can be called in Kafka.scala as well. Also, it does not make much sense to wait for anything in the catch block of shutdown.
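To make the exit()/halt() distinction concrete, here is a minimal Java sketch of the flag idea mentioned above. It is not the code in the PR and not Kafka code: the class `ExitOrHalt` and its methods are hypothetical names made up for illustration. The only real APIs used are `Runtime.exit(int)` and `Runtime.halt(int)`. The two calls are injected so the choice between them can be exercised without terminating the JVM.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

/**
 * Hypothetical helper (illustration only, not part of Kafka).
 * Once the JVM shutdown sequence has started, calling exit() from a
 * shutdown hook blocks on the lock already held by the thread running
 * the hooks, so this helper falls back to halt() in that case.
 */
final class ExitOrHalt {
    private final AtomicBoolean shutdownStarted = new AtomicBoolean(false);
    private final IntConsumer exit; // normally Runtime.getRuntime()::exit
    private final IntConsumer halt; // normally Runtime.getRuntime()::halt

    ExitOrHalt(IntConsumer exit, IntConsumer halt) {
        this.exit = exit;
        this.halt = halt;
    }

    /** Wires the helper to the real JVM calls. */
    static ExitOrHalt forJvm() {
        Runtime rt = Runtime.getRuntime();
        return new ExitOrHalt(rt::exit, rt::halt);
    }

    /** Assumed to be called as the first statement of the shutdown hook. */
    void markShutdownStarted() {
        shutdownStarted.set(true);
    }

    /** Uses exit() before shutdown has begun and halt() afterwards. */
    void terminate(int status) {
        if (shutdownStarted.get()) {
            halt.accept(status); // skips hooks and finalizers, cannot block on the shutdown lock
        } else {
            exit.accept(status); // runs the normal shutdown sequence
        }
    }
}
```

In real use one would presumably create the helper with `ExitOrHalt.forJvm()`, call `markShutdownStarted()` at the top of the hook, and call `terminate(1)` wherever the code currently exits on error. As noted above, this only helps if every exit() call site goes through the helper, which is one reason a plain halt() in the catch block of shutdown is the simpler change.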
> SIGINT during Kafka server startup can leave server deadlocked
> --------------------------------------------------------------
>
>                 Key: KAFKA-2468
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2468
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Ashish K Singh
>            Assignee: Ashish K Singh
>
> KafkaServer on receiving a SIGINT will try to shutdown and if this happens while the server is starting up, it will get into deadlock.
> Thread dump after deadlock
> {code}
> 2015-08-24 22:03:52
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode):
>
> "Attach Listener" daemon prio=5 tid=0x00007fc08e827800 nid=0x5807 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Thread-2" prio=5 tid=0x00007fc08b9de000 nid=0x6b03 waiting for monitor entry [0x000000011ad3a000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.lang.Shutdown.exit(Shutdown.java:212)
>         - waiting to lock <0x00000007bae86ac0> (a java.lang.Class for java.lang.Shutdown)
>         at java.lang.Runtime.exit(Runtime.java:109)
>         at java.lang.System.exit(System.java:962)
>         at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:46)
>         at kafka.Kafka$$anon$1.run(Kafka.scala:65)
>
> "SIGINT handler" daemon prio=5 tid=0x00007fc08ca51800 nid=0x6503 in Object.wait() [0x000000011aa31000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000007bcb40610> (a kafka.Kafka$$anon$1)
>         at java.lang.Thread.join(Thread.java:1281)
>         - locked <0x00000007bcb40610> (a kafka.Kafka$$anon$1)
>         at java.lang.Thread.join(Thread.java:1355)
>         at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
>         at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
>         at java.lang.Shutdown.runHooks(Shutdown.java:123)
>         at java.lang.Shutdown.sequence(Shutdown.java:167)
>         at java.lang.Shutdown.exit(Shutdown.java:212)
>         - locked <0x00000007bae86ac0> (a java.lang.Class for java.lang.Shutdown)
>         at java.lang.Terminator$1.handle(Terminator.java:52)
>         at sun.misc.Signal$1.run(Signal.java:212)
>         at java.lang.Thread.run(Thread.java:745)
>
> "RMI TCP Accept-0" daemon prio=5 tid=0x00007fc08c164000 nid=0x5c07 runnable [0x0000000119fe8000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>         at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>         at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:388)
>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:360)
>         at java.lang.Thread.run(Thread.java:745)
>
> "Service Thread" daemon prio=5 tid=0x00007fc08d015000 nid=0x5503 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "C2 CompilerThread1" daemon prio=5 tid=0x00007fc08c82b000 nid=0x5303 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "C2 CompilerThread0" daemon prio=5 tid=0x00007fc08c82a000 nid=0x5103 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Signal Dispatcher" daemon prio=5 tid=0x00007fc08c829800 nid=0x4f03 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Surrogate Locker Thread (Concurrent GC)" daemon prio=5 tid=0x00007fc08d002000 nid=0x400b waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Finalizer" daemon prio=5 tid=0x00007fc08d012800 nid=0x3b03 in Object.wait() [0x0000000117ee6000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000007bae05568> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>         - locked <0x00000007bae05568> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>         at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)
>
> "Reference Handler" daemon prio=5 tid=0x00007fc08c803000 nid=0x3903 in Object.wait() [0x0000000117de3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000007bae050f0> (a java.lang.ref.Reference$Lock)
>         at java.lang.Object.wait(Object.java:503)
>         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>         - locked <0x00000007bae050f0> (a java.lang.ref.Reference$Lock)
>
> "main" prio=5 tid=0x00007fc08d000800 nid=0x1303 waiting for monitor entry [0x000000010f353000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.lang.Shutdown.exit(Shutdown.java:212)
>         - waiting to lock <0x00000007bae86ac0> (a java.lang.Class for java.lang.Shutdown)
>         at java.lang.Runtime.exit(Runtime.java:109)
>         at java.lang.System.exit(System.java:962)
>         at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:35)
>         at kafka.Kafka$.main(Kafka.scala:69)
>         at kafka.Kafka.main(Kafka.scala)
>
> "VM Thread" prio=5 tid=0x00007fc08b83b000 nid=0x3703 runnable
>
> "Gang worker#0 (Parallel GC Threads)" prio=5 tid=0x00007fc08d00f800 nid=0x2103 runnable
>
> "Gang worker#1 (Parallel GC Threads)" prio=5 tid=0x00007fc08b80e000 nid=0x2303 runnable
>
> "Gang worker#2 (Parallel GC Threads)" prio=5 tid=0x00007fc08c801000 nid=0x2503 runnable
>
> "Gang worker#3 (Parallel GC Threads)" prio=5 tid=0x00007fc08c801800 nid=0x2703 runnable
>
> "Gang worker#4 (Parallel GC Threads)" prio=5 tid=0x00007fc08c804000 nid=0x2903 runnable
>
> "Gang worker#5 (Parallel GC Threads)" prio=5 tid=0x00007fc08c804800 nid=0x2b03 runnable
>
> "Gang worker#6 (Parallel GC Threads)" prio=5 tid=0x00007fc08c805000 nid=0x2d03 runnable
>
> "Gang worker#7 (Parallel GC Threads)" prio=5 tid=0x00007fc08c806000 nid=0x2f03 runnable
>
> "Concurrent Mark-Sweep GC Thread" prio=5 tid=0x00007fc08c806800 nid=0x3503 runnable
>
> "Gang worker#0 (Parallel CMS Threads)" prio=5 tid=0x00007fc08c0bd800 nid=0x3103 runnable
>
> "Gang worker#1 (Parallel CMS Threads)" prio=5 tid=0x00007fc08c0be800 nid=0x3303 runnable
>
> "VM Periodic Task Thread" prio=5 tid=0x00007fc08c155000 nid=0x5d03 waiting on condition
>
> JNI global references: 239
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)