[ https://issues.apache.org/jira/browse/SOLR-17744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated SOLR-17744:
--------------------------------------
    Attachment: SOLR-17744-1.patch
        Status: Open  (was: Open)

I've updated my patch to include a test w/concurrency guarantees (via Semaphores used by a custom request handler) to reliably demonstrate the problem and the fix.

(Unfortunately, since we never followed through with SOLR-14903, the "fix" had to be made redundantly to JettySolrRunner)

This updated patch also fixes a bug in SplitShardTest: the test was previously only waiting for the replicas to be created, w/o ever testing that they were able to finish recovery and become active. Once the jetties started to gracefully wait for shutdown instead of just abruptly terminating existing connections, the PrepRecovery commands that were still in progress started causing the MiniSolrCloudCluster.shutdown executor to time out.

> Solr shutdown does not graceful close Jetty requests/connections
> ----------------------------------------------------------------
>
>                 Key: SOLR-17744
>                 URL: https://issues.apache.org/jira/browse/SOLR-17744
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-17744-1.patch, SOLR-17744.patch
>
>
> Solr does a lot of work internally (via things like SolrCore reference
> counting) to ensure that we "finish" in-flight requests on orderly shutdown
> (ie: when the user has issued a "stop" command), but it does not appear that
> we are doing anything to ensure that *Jetty* managed resources will also wait
> for in process requests to finish.
> In particular, Jetty seems to abruptly close any existing & active network
> connections to clients, even as Solr continues to process those requests and
> try to write out the responses.
> There are Jetty features to ensure that shutdown is genuinely "graceful"
> (refusing new requests while letting existing ones finish) but Solr doesn't
> appear to use/enable these features:
> * In Jetty 10 & 11, this is apparently done using the {{StatisticsHandler}}
> (as a wrapper around the main handler collection I think?)
> ** [https://github.com/jetty/jetty.project/issues/2076#issuecomment-353578130]
> ** [https://javadoc.jetty.org/jetty-11/org/eclipse/jetty/server/handler/StatisticsHandler.html]
> ** [https://jetty.org/docs/jetty/11/programming-guide/server/http.html#handler-use-util-stats-handler]
> * In Jetty 12+ there is a {{graceful}} module that provides a
> {{GracefulHandler}} (which seems like a slightly more robust version of what
> {{StatisticsHandler}} does in jetty-10, but with less statistics tracking
> overhead)
> ** [https://jetty.org/docs/jetty/12/operations-guide/start/index.html#stop-graceful]
> ** [https://jetty.org/docs/jetty/12/operations-guide/modules/standard.html#graceful]
>
> The net result is that even during planned shutdown (or restart) of Solr
> nodes, clients can get lots of errors.
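
For readers who want to see the "graceful" contract described above (refuse new requests, let existing ones finish) together with the Semaphore trick for pinning a request in flight, here is a minimal JDK-only sketch. It is not code from the patch: a plain ExecutorService stands in for Jetty, and the class, method, and variable names are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: an ExecutorService plays the role of the server.
class GracefulShutdownSketch {

    /** Returns { in-flight response, late request refused?, server drained in time? }. */
    static Object[] run() throws Exception {
        ExecutorService server = Executors.newFixedThreadPool(2);
        Semaphore requestStarted = new Semaphore(0); // released once the request is in flight
        Semaphore allowFinish = new Semaphore(0);    // released when the request may complete

        // A "request handler" that blocks until the test lets it finish.
        Future<String> inFlight = server.submit(() -> {
            requestStarted.release();
            allowFinish.acquire();
            return "response";
        });

        // Guarantee the request is actually running before shutdown begins.
        requestStarted.acquire();

        // Graceful stop: no new work is accepted, running work is not interrupted.
        server.shutdown();

        boolean refused = false;
        try {
            server.submit(() -> "late");
        } catch (RejectedExecutionException e) {
            refused = true; // new requests are turned away during shutdown
        }

        // Let the in-flight request complete, then wait for the server to drain.
        allowFinish.release();
        boolean drained = server.awaitTermination(10, TimeUnit.SECONDS);

        return new Object[] { inFlight.get(), refused, drained };
    }

    public static void main(String[] args) throws Exception {
        Object[] result = run();
        System.out.println("response=" + result[0]
            + " refused=" + result[1] + " drained=" + result[2]);
    }
}
```

Swapping shutdown() for shutdownNow() in this sketch interrupts the blocked handler, which is roughly the abrupt behavior the issue describes: the in-flight request never gets to deliver its response.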