[ https://issues.apache.org/jira/browse/SLING-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093738#comment-15093738 ]
Thomas Mueller commented on SLING-5416: --------------------------------------- > I guess the danger here is that the shutdown blocks indefinitely. Yes, I thought about that as well. It is very tempting to use a showdown wait time (you have used 5 seconds, somebody else might suggest 1 minute). However, I think it's not a good idea because it is not deterministic, hard to test, and can be fatal. In most cases, the thread probably stops within 5 seconds, but not in all cases. Most likely, unit tests (with a small repository) will only test the "good" case, but not the bad case where it takes more than 5 seconds. But in production, it can take more than 5 seconds, so that Thread.interrupt is called. And Thread.interrupt can cause all kinds of trouble: If a thread is reading from a file, the file is _closed_ (so other threads that use the file are affected). I think this is not just for files, but also for sockets. This affects not just our code, but also library code (it affects Apache Lucene with FileChannel, it affected MapDB, it affected the persistent cache we use in Oak, it can affect Jetty, and others (libraries we use in the future). Handling Thread.interrupt in code is quite complex and very often done incorrectly (for example, you should call "Thread.interrupt" when handling a InterruptedException - this is rarely done; most of then time the exception is ignored). In case something is _really_ blocked, it's much, much safer and better to kill a process than to use Thread.interrupt. > Thread Pool should stop "gracefully" > ------------------------------------ > > Key: SLING-5416 > URL: https://issues.apache.org/jira/browse/SLING-5416 > Project: Sling > Issue Type: Improvement > Components: Commons > Reporter: Thomas Mueller > > By default, the Sling thread pool manager calls "Thread.interrupt" to close > the thread pool. This is because it is configured to be "not graceful" by > default: > {noformat} > ./bundles/commons/threads/src/main/java/org/apache/sling/commons/threads/ModifiableThreadPoolConfig.java > ... > * The default values for this configuration are: > ... > * - shutdown graceful: false > * - shutdown wait time: -1 > ... > > ./bundles/commons/threads/src/main/java/org/apache/sling/commons/threads/impl/DefaultThreadPool.java > public void shutdown() { > this.logger.info("Shutting down thread pool [{}] ...", name); > if ( this.executor != null ) { > if (this.configuration.isShutdownGraceful()) { > this.executor.shutdown(); > } else { > this.executor.shutdownNow(); > } > try { > if (this.configuration.getShutdownWaitTimeMs() > 0) { > if > (!this.executor.awaitTermination(this.configuration.getShutdownWaitTimeMs(), > TimeUnit.MILLISECONDS)) { > logger.warn("Running commands have not terminated > within " > + this.configuration.getShutdownWaitTimeMs() > + "ms. Will shut them down by interruption"); > this.executor.shutdownNow(); // TODO: shouldn't this > be outside the if statement?! > } > } > } catch (final InterruptedException ie) { > this.logger.error("Cannot shutdown thread pool [" + this.name > + "]", ie); > } > {noformat} > I think Sling should change the default to be graceful (not call > Thread.interrupt()). Thread.interrupt can can result in many problems, > because it closes channels, possibly TCP/IP connections, and so on. When > using external libraries (JDBC, MongoDB, Apache Lucene,...) it is hard to > ensure that Thread.interrupt does not cause problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)