[ https://issues.apache.org/jira/browse/SOLR-17744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated SOLR-17744:
--------------------------------------
    Attachment: SOLR-17744-1.patch
        Status: Open  (was: Open)

I've updated my patch to include a test w/concurrency guarantees (via 
Semaphores used by a custom request handler) to reliably demonstrate the 
problem and the fix.
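
For readers who have not looked at the patch, the coordination idea can be 
sketched with plain java.util.concurrent types. This is only a rough 
illustration, not code from SOLR-17744-1.patch: the class and method names are 
hypothetical, and an ExecutorService stands in for Jetty (its shutdown() 
refuses new work but lets already-running tasks finish).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: semaphores make "request is in flight during shutdown" deterministic. */
final class InFlightShutdownSketch {

  /** Returns the response body if the in-flight request survived a graceful shutdown. */
  static String runScenario() {
    Semaphore requestStarted = new Semaphore(0); // released by the handler once it is running
    Semaphore allowFinish = new Semaphore(0);    // released by the test to let the handler finish
    ExecutorService server = Executors.newSingleThreadExecutor(); // stand-in for Jetty (assumption)
    try {
      // The "custom request handler": signals that it started, then blocks until told to finish.
      Future<String> response = server.submit(() -> {
        requestStarted.release();
        if (!allowFinish.tryAcquire(10, TimeUnit.SECONDS)) {
          throw new IllegalStateException("handler was never released");
        }
        return "OK";
      });

      // Block until the request is definitely in flight; no sleeps, no race.
      if (!requestStarted.tryAcquire(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("request never started");
      }

      server.shutdown(); // graceful stop begins while the request is still running

      try {
        server.submit(() -> "late");
        throw new IllegalStateException("new request was accepted after shutdown");
      } catch (RejectedExecutionException expected) {
        // New work is refused once shutdown has begun.
      }

      allowFinish.release();                      // now let the in-flight request complete
      return response.get(10, TimeUnit.SECONDS);  // it should still produce its response
    } catch (Exception e) {
      throw new IllegalStateException("scenario failed", e);
    } finally {
      server.shutdownNow(); // cleanup only; a no-op if everything already finished
    }
  }

  public static void main(String[] args) {
    System.out.println(runScenario());
  }
}
```

Swapping shutdown() for shutdownNow() in the middle of the scenario would 
interrupt the blocked handler instead, which is roughly the "abrupt" behavior 
the test is meant to catch.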

(Unfortunately, since we never followed through with SOLR-14903, the "fix" had 
to be duplicated in JettySolrRunner as well.)

This updated patch also fixes a bug in SplitShardTest: the test was previously 
only waiting for the replicas to be created, w/o ever verifying that they were 
able to finish recovery and become active. Once the jetties started to wait 
for in-flight requests on shutdown instead of just abruptly terminating 
existing connections, the PrepRecovery commands that were still in progress 
started causing the MiniSolrCloudCluster.shutdown executor to time out.

> Solr shutdown does not graceful close Jetty requests/connections
> ----------------------------------------------------------------
>
>                 Key: SOLR-17744
>                 URL: https://issues.apache.org/jira/browse/SOLR-17744
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-17744-1.patch, SOLR-17744.patch
>
>
> Solr does a lot of work internally (via things like SolrCore reference 
> counting) to ensure that we "finish" in-flight requests on orderly shutdown 
> (i.e. when the user has issued a "stop" command) – but it does not appear that 
> we are doing anything to ensure that *Jetty* managed resources will also wait 
> for in-flight requests to finish.
> In particular, Jetty seems to abruptly close any existing & active network 
> connections to clients, even as Solr continues to process those requests and 
> try to write out the responses.
> There are Jetty features to ensure that shutdown is genuinely "graceful" 
> (refusing new requests while letting existing ones finish) but Solr doesn't 
> appear to use/enable these features:
>  * In Jetty 10 & 11, this is apparently done using the {{StatisticsHandler}} 
> (as a wrapper around the main handler collection, I think?)
>  ** [https://github.com/jetty/jetty.project/issues/2076#issuecomment-353578130]
>  ** [https://javadoc.jetty.org/jetty-11/org/eclipse/jetty/server/handler/StatisticsHandler.html]
>  ** [https://jetty.org/docs/jetty/11/programming-guide/server/http.html#handler-use-util-stats-handler]
>  * In Jetty 12+ there is a {{graceful}} module that provides a 
> {{GracefulHandler}} (which seems like a slightly more robust version of what 
> {{StatisticsHandler}} does in jetty-10, but with less statistics tracking 
> overhead)
>  ** [https://jetty.org/docs/jetty/12/operations-guide/start/index.html#stop-graceful]
>  ** [https://jetty.org/docs/jetty/12/operations-guide/modules/standard.html#graceful]
>
> The net result is that even during planned shutdown (or restart) of Solr 
> nodes, clients can get lots of errors.
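
The "refuse new requests while letting existing ones finish" behavior described 
in the issue can be illustrated with a small in-flight counter. This is a 
minimal sketch under stated assumptions: GracefulGateSketch and its methods are 
hypothetical names, and it is neither how Jetty's StatisticsHandler or 
GracefulHandler are implemented nor what the attached patches do.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of graceful shutdown: stop admitting requests, then drain in-flight ones. */
final class GracefulGateSketch {
  private int inFlight = 0;
  private boolean stopping = false;

  /** Called when a request arrives; returns false once shutdown has begun. */
  synchronized boolean tryBegin() {
    if (stopping) {
      return false;
    }
    inFlight++;
    return true;
  }

  /** Called when a previously admitted request has written its response. */
  synchronized void end() {
    inFlight--;
    notifyAll(); // wake a shutdown() caller waiting for the count to reach zero
  }

  /**
   * Stops admitting new requests, then waits up to timeoutMillis for in-flight
   * requests to finish. Returns true if everything drained in time.
   */
  synchronized boolean shutdown(long timeoutMillis) {
    stopping = true;
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    try {
      while (inFlight > 0) {
        long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
        if (remainingMillis <= 0) {
          return false; // timed out with requests still running
        }
        wait(remainingMillis);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    }
  }
}
```

The timeout matters in practice: as noted in the comment above about 
SplitShardTest, a long-running request that is now waited on can cause a 
shutdown to exceed whatever time budget the caller allows.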



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
