[ https://issues.apache.org/jira/browse/SOLR-17744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris M. Hostetter updated SOLR-17744:
--------------------------------------
Attachment: SOLR-17744-1.patch
Status: Open (was: Open)
I've updated my patch to include a test w/concurrency guarantees (via
Semaphores used by a custom request handler) to reliably demonstrate the
problem and the fix.
(Unfortunately, since we never followed through with SOLR-14903, the "fix" had
to be duplicated in JettySolrRunner.)
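To give a rough idea of the semaphore approach (this is NOT the code in the patch, just a stdlib-only sketch): a plain ExecutorService stands in for Jetty, and a hypothetical BlockingHandler stands in for the custom request handler. ExecutorService.shutdown() plays the role of a graceful stop (no new work accepted, in-flight work finishes) and shutdownNow() plays the abrupt stop. Every class and method name below is made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class GracefulShutdownSketch {

    /** Hypothetical stand-in for a request handler whose progress the test controls. */
    static final class BlockingHandler {
        final Semaphore entered = new Semaphore(0);   // released once the request is in flight
        final Semaphore mayFinish = new Semaphore(0); // released by the test to let it complete

        String handle() throws InterruptedException {
            entered.release();
            if (!mayFinish.tryAcquire(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("test never allowed the request to finish");
            }
            return "response";
        }
    }

    /**
     * Starts one request and begins shutdown only once that request is provably in flight.
     * The ExecutorService is an assumption standing in for the real server.
     */
    static String runScenario(boolean graceful) throws Exception {
        ExecutorService server = Executors.newFixedThreadPool(2);
        BlockingHandler handler = new BlockingHandler();
        Callable<String> request = handler::handle;
        Future<String> inFlight = server.submit(request);

        // No sleeps or guessing: block until the request is inside the handler.
        if (!handler.entered.tryAcquire(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("request never reached the handler");
        }

        if (graceful) {
            server.shutdown();           // stop accepting work, let the request finish
            handler.mayFinish.release(); // the request completes after shutdown began
        } else {
            server.shutdownNow();        // interrupts the worker while it is still blocked
        }

        try {
            return inFlight.get(10, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            return "aborted: " + e.getCause().getClass().getSimpleName();
        } finally {
            server.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runScenario(true));  // prints "response"
        System.out.println(runScenario(false)); // prints "aborted: InterruptedException"
    }
}
```

The point of the two semaphores is that the shutdown is only triggered once the request is known to be mid-flight, so the graceful and abrupt outcomes are reproducible instead of depending on timing.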
This updated patch also fixes a bug in SplitShardTest: the test previously
waited only for the replicas to be created, without ever checking that they
were able to finish recovery and become active. Once the jetties started to
wait gracefully on shutdown instead of abruptly terminating existing
connections, the PrepRecovery commands that were still in progress started
causing the MiniSolrCloudCluster.shutdown executor to time out.
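The difference between "replicas exist" and "replicas are active" can be sketched with a small polling helper. This is a stdlib-only illustration, not what the patch does (Solr's test framework has its own wait helpers); ReplicaState, waitForAllActive and the state supplier are all hypothetical names.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class WaitForActiveSketch {

    /** Hypothetical, simplified replica states. */
    enum ReplicaState { DOWN, RECOVERING, ACTIVE }

    /**
     * Polls until the expected number of replicas exist AND all of them are ACTIVE.
     * Checking only the count is the weaker condition: it can be satisfied while
     * recovery is still in progress.
     */
    static void waitForAllActive(Supplier<List<ReplicaState>> states,
                                 int expectedReplicas,
                                 long timeout,
                                 TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            List<ReplicaState> current = states.get();
            boolean created = current.size() == expectedReplicas;
            boolean allActive = current.stream().allMatch(s -> s == ReplicaState.ACTIVE);
            if (created && allActive) {
                return;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("replicas not active in time: " + current);
            }
            Thread.sleep(10); // short poll interval, chosen arbitrarily for the sketch
        }
    }
}
```

A test that returns as soon as the replica count matches would pass the "created" check on the first poll even if a replica is still RECOVERING, which is the situation that left recovery work running at cluster shutdown.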
> Solr shutdown does not graceful close Jetty requests/connections
> ----------------------------------------------------------------
>
> Key: SOLR-17744
> URL: https://issues.apache.org/jira/browse/SOLR-17744
> Project: Solr
> Issue Type: New Feature
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-17744-1.patch, SOLR-17744.patch
>
>
> Solr does a lot of work internally (via things like SolrCore reference
> counting) to ensure that we "finish" in-flight requests on orderly shutdown
> (i.e., when the user has issued a "stop" command) – but it does not appear that
> we are doing anything to ensure that *Jetty* managed resources will also wait
> for in-flight requests to finish.
> In particular, Jetty seems to abruptly close any existing & active network
> connections to clients, even as Solr continues to process those requests and
> try to write out the responses.
> There are Jetty features to ensure that shutdown is genuinely "graceful"
> (refusing new requests while letting existing ones finish) but Solr doesn't
> appear to use/enable these features:
> * In Jetty 10 & 11, this is apparently done using the {{StatisticsHandler}}
> (as a wrapper around the main handler collection, I think?)
> ** [https://github.com/jetty/jetty.project/issues/2076#issuecomment-353578130]
> ** [https://javadoc.jetty.org/jetty-11/org/eclipse/jetty/server/handler/StatisticsHandler.html]
> ** [https://jetty.org/docs/jetty/11/programming-guide/server/http.html#handler-use-util-stats-handler]
> * In Jetty 12+ there is a {{graceful}} module that provides a
> {{GracefulHandler}} (which seems like a slightly more robust version of what
> {{StatisticsHandler}} does in jetty-10, but with less statistics tracking
> overhead)
> ** [https://jetty.org/docs/jetty/12/operations-guide/start/index.html#stop-graceful]
> ** [https://jetty.org/docs/jetty/12/operations-guide/modules/standard.html#graceful]
>
> The net result is that even during planned shutdown (or restart) of Solr
> nodes, clients can get lots of errors.