[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2718 +1 ---
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user agresch commented on the issue: https://github.com/apache/storm/pull/2718 If it helps, here is the callstack from our original bug report:

2018-05-24 09:27:05.636 o.a.s.d.n.Nimbus main [INFO] Starting nimbus server for storm version '2.0.0.y'
2018-05-24 09:27:06.012 o.a.s.d.n.Nimbus timer [ERROR] Error while processing event
java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$37(Nimbus.java:2685) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.StormTimer$1.run(StormTimer.java:111) ~[storm-client-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:227) ~[storm-client-2.0.0.y.jar:2.0.0.y]
Caused by: java.lang.NullPointerException
    at org.apache.storm.daemon.nimbus.Nimbus.readAllSupervisorDetails(Nimbus.java:1814) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.daemon.nimbus.Nimbus.computeNewSchedulerAssignments(Nimbus.java:1906) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2057) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2003) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$37(Nimbus.java:2681) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    ... 2 more
2018-05-24 09:27:06.023 o.a.s.u.Utils timer [ERROR] Halting process: Error while processing event
java.lang.RuntimeException: Halting process: Error while processing event
    at org.apache.storm.utils.Utils.exitProcess(Utils.java:469) ~[storm-client-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.daemon.nimbus.Nimbus.lambda$new$17(Nimbus.java:484) ~[storm-server-2.0.0.y.jar:2.0.0.y]
    at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:252) ~[storm-client-2.0.0.y.jar:2.0.0.y]
2018-05-24 09:27:06.032 o.a.s.d.n.Nimbus Thread-12 [INFO] Shutting down master
2018-05-24 09:27:06.032 o.a.s.u.Utils Thread-13 [INFO] Halting after 10 seconds

No further shutdown was processed. It's entirely reproducible by forcing an NPE from the timer. ---
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user agresch commented on the issue: https://github.com/apache/storm/pull/2718 @danny0405 - The case I saw was the Timer throwing a null pointer exception from mkAssignments() - fixed in https://github.com/apache/storm/pull/2693. This causes the onKill callback to be called and locks up the timer thread. ---
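One plausible reading of the lockup described above, offered as an assumption rather than a confirmed diagnosis: the timer thread that triggers the exit blocks until the shutdown work completes, while the shutdown work waits (joins) on that same timer thread. The sketch below simulates this with plain threads and a latch instead of calling `System.exit()`, so it is safe to run. The class `ShutdownJoinSketch` and the method `hookFinishes` are hypothetical names for illustration and are not part of Storm.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not Storm code.
class ShutdownJoinSketch {

    // Simulates a worker ("timer") thread that starts the shutdown work and then
    // blocks until that work is done, much as a thread inside System.exit() would
    // wait for shutdown hooks. Returns true if the hook finished within limitMillis.
    static boolean hookFinishes(boolean hookJoinsWorker, long limitMillis) {
        CountDownLatch hookDone = new CountDownLatch(1);
        Thread[] worker = new Thread[1];

        Thread hook = new Thread(() -> {
            try {
                if (hookJoinsWorker) {
                    // Waits for the worker, which is itself waiting for this hook.
                    worker[0].join();
                }
                hookDone.countDown();
            } catch (InterruptedException e) {
                // Interrupted during cleanup of the simulated deadlock; just exit.
            }
        });

        worker[0] = new Thread(() -> {
            hook.start();
            try {
                hookDone.await(); // stands in for blocking inside the exit call
            } catch (InterruptedException e) {
                // Interrupted during cleanup; just exit.
            }
        });

        hook.setDaemon(true);
        worker[0].setDaemon(true);
        worker[0].start();

        boolean finished;
        try {
            finished = hookDone.await(limitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }

        // Break the simulated deadlock so no threads are left hanging.
        hook.interrupt();
        worker[0].interrupt();
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("hook joins worker: finished=" + hookFinishes(true, 500));
        System.out.println("hook skips join:   finished=" + hookFinishes(false, 500));
    }
}
```

In the first call the two threads wait on each other until the time limit expires; in the second the hook completes at once. Whether skipping the join on the calling thread matches the actual change in this PR is not something the sketch claims.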
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2718 @agresch Yes, [https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/StormTimer.java#L173](https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/StormTimer.java#L173) may throw an InterruptedException, but in `} catch (Throwable e) { if (!(Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)) && !(Utils.exceptionCauseIsInstanceOf(ClosedByInterruptException.class, e))) { this.onKill.uncaughtException(this, e); this.setActive(false); } }` the code has already decided that, if the exception is not caused by an InterruptedException or a ClosedByInterruptException, the onKill uncaughtException handler is invoked. So in what situation do you mean the deadlock will occur? ---
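To make the check in that catch block concrete, here is a minimal sketch of a cause-chain test. It assumes that `Utils.exceptionCauseIsInstanceOf` walks the chain of causes looking for the given type; `CauseChainSketch` and `causeIsInstanceOf` are hypothetical stand-ins, not Storm's implementation. Under that assumption, the `RuntimeException` wrapping a `NullPointerException` from the callstack above does not count as an interrupt, so the onKill handler would be invoked.

```java
// Hypothetical stand-in for a cause-chain check; not Storm's actual implementation.
class CauseChainSketch {

    // Returns true if t, or any exception in its chain of causes, is an instance of type.
    static boolean causeIsInstanceOf(Class<? extends Throwable> type, Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (type.isInstance(cur)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable wrappedNpe = new RuntimeException(new NullPointerException());
        Throwable wrappedInterrupt = new RuntimeException(new InterruptedException());

        // A wrapped NPE is not an interrupt, so a catch block like the one quoted
        // above would go on to call the kill handler.
        System.out.println(causeIsInstanceOf(InterruptedException.class, wrappedNpe));
        // A wrapped InterruptedException is recognised and would be ignored.
        System.out.println(causeIsInstanceOf(InterruptedException.class, wrappedInterrupt));
    }
}
```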
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user kishorvpatil commented on the issue: https://github.com/apache/storm/pull/2718 Good catch @agresch ---
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user agresch commented on the issue: https://github.com/apache/storm/pull/2718 It seems not to, as Nimbus was able to continue the shutdown process cleanly. ---
[GitHub] storm issue #2718: STORM-3103 allow nimbus to shutdown properly
Github user Ethanlm commented on the issue: https://github.com/apache/storm/pull/2718 The code change makes sense to me. I just wonder whether the whole JVM is terminated immediately after exitProcess() is called. ---