[ https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529017#comment-15529017 ]
Alex Petrov commented on CASSANDRA-12461: ----------------------------------------- Thank you for the detailed review. I've made the corresponding fixes and rebased. I didn't squash commits, mostly to make it easier to look over. bq. shall we rename inShutdownHook to something like draining or drained? Sure, renamed. bq. there is still one unprotected call to logger.warn in drain(), line 4462, and a call in runShutdownHooks() The one in {{4462}} (I assume that's a line number after rebase, [this one in original file|https://github.com/ifesdjeen/cassandra/blob/c49dd44710f5ba816eb0cd414ef31efa914f7776/src/java/org/apache/cassandra/service/StorageService.java#L4449]). I've only "turned off" the logging that is related strictly to draining. I might be missing something however. bq. let's update the documentation for drain() True, it was incorrect. Fixed now. Stating differences with a normal shutdown hook now is too technical (both logging and windows timers are unrelated to core of what Cassandra does), so I left it out. bq. shall we catch any exceptions in drain() to ensure the post shutdown hooks are run also if there is an exception? You're right, it's a good idea to do that. I've also changed catching {{Exception}} to catching {{Throwable}} (with {{Throwables}} as you described) when running hooks in order to avoid unintended errors breaking drain. bq. the documentation says that the post shutdown hooks should only be called on final shutdown, not on drain, is this still the case? Since there's just one process: either shutdown or drain (you can't re-run the drain code after termination), it's not true anymore. Actually, that was the reason to provide the second part of patch. bq. here is a proposal for some extra work (feel free to turn it down): refactor setMode() to accept an optional log level instead of a boolean, when the optional is empty it should not log, so we could call this method also from the shutdown hook and possibly replace inShutdownHook with operationMode, provided this becomes volatile. I would also add an override since most of the times it is called with the boolean set to true, so the override would have the logging level set to INFO. I like the idea a lot, and I've implemented at least a part of it (using {{DRAINED}} instead of an additional boolean). I think that transitioning to {{DRAINING}} / {{DRAINED}} state on shutdown is also correct. As regards log level, I could not find a good intuitive way to pass log level, as slf4j API uses method calls instead of levels, both native, log4j and apache logging levels do not translate fully to slf4j method calls. Hope it's still ok.. bq. given that drain() is synchronized, can we not just look at inShutdownHook (or operationMode) at the beginning of drain(), to solve CASSANDRA-12509? Is there anything else I am missing about it? Actually my intention was that this issue fully contains #12509 already. I will add the corresponding link. bq. the StorageService import is unused in the Enabled*.java files Cleaned up. bq. we could replace runShutdownHooks(); with Throwables.perform(null, postShutdownHooks.stream().map(h -> h::run));, logging any non-null returned exceptions, if not in final shutdown. This would not only avoid one method implementation, but also make the log call visible in drain(). I've made it log exceptions even if it's a final shutdown. It might be useful to have them logged in both cases. I've triggered a CI, will report as soon as I have passed tests. > Add hooks to StorageService shutdown > ------------------------------------ > > Key: CASSANDRA-12461 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12461 > Project: Cassandra > Issue Type: Bug > Reporter: Anthony Cozzie > Assignee: Anthony Cozzie > Fix For: 3.x > > Attachments: > 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch > > > The JVM will usually run shutdown hooks in parallel. This can lead to > synchronization problems between Cassandra, services that depend on it, and > services it depends on. This patch adds some simple support for shutdown > hooks to StorageService. > This should nearly solve CASSANDRA-12011 -- This message was sent by Atlassian JIRA (v6.3.4#6332)