[ 
https://issues.apache.org/jira/browse/SOLR-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-18087:
----------------------------------
    Labels: pull-request-available  (was: )

> HTTP/2 Struggles With Streaming Large Responses
> -----------------------------------------------
>
>                 Key: SOLR-18087
>                 URL: https://issues.apache.org/jira/browse/SOLR-18087
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Luke Kot-Zaniewski
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: flow-control-stall.log, index-recovery-tests.md, 
> stream-benchmark-results.md
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There appear to be some severe HTTP/2 regressions since at least 9.8, most 
> notably with the stream handler as well as index recovery. The impact is at 
> the very least slowness and in some cases outright response stalling. The 
> response stalling appears to be caused by HTTP/2's flow control. The obvious 
> thing these two very different workloads have in common is that they stream 
> large responses. This means, among other things, that they may be more 
> directly impacted by HTTP/2's flow control mechanism.
> In my testing I have tweaked the following parameters:
>  # HTTP/1 vs HTTP/2 - as stated, HTTP/1 seems to be strictly better, i.e. 
> faster and more stable.
>  # shards per node - the greater the number of shards per node the more 
> (large, simultaneous) responses share a single connection during inter-node 
> communication. This has generally resulted in poorer performance.
>  # maxConcurrentStreams - reducing this to, say, 1 can effectively 
> circumvent multiplexing. Circumventing multiplexing does seem to improve 
> index recovery somewhat (still slower than HTTP/1). On the other hand, this 
> seems antithetical to the point of HTTP/2. It's also interesting this doesn't 
> help 
>  # initialSessionRecvWindow - This is the amount of buffer the client gets 
> for each connection. It is shared by all of the responses multiplexed over 
> that connection.
>  # initialStreamRecvWindow - This is the amount of buffer each stream gets 
> within a single HTTP/2 session. I've found that when this is too big relative 
> to initialSessionRecvWindow it can lead to stalling because of flow control 
> enforcement.
> I’m attaching summaries of my findings, some of which can be reproduced by 
> running the appropriate benchmark in this branch: 
> [https://github.com/kotman12/solr/tree/http2-shenanigans].
> My next step is to solicit some feedback from the community. Absent anything 
> else I may try reproducing this in a pure Jetty example. I am beginning to 
> think multiple large responses getting streamed simultaneously between the 
> same client and server may be some kind of edge case the library doesn't handle 
> well. It may have something to do with how Jetty's 
> InputStreamResponseListener is implemented although according to the docs it 
> _should_ be compatible with HTTP/2.
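
The interaction between initialSessionRecvWindow and initialStreamRecvWindow
described above can be illustrated with a toy model. The sketch below is not
Jetty or Solr code. It assumes, purely for illustration, that the application
is reading only one response (stream 0) while the other streams on the same
connection sit unread in their buffers, and that the sender round-robins
frames across streams. The class name, method name, and numbers are all
hypothetical. Under those assumptions, unread data on the idle streams holds
the shared session window, so the stream that is being read can stop making
progress once (streams - 1) * stream window reaches the session window.

```java
import java.util.Arrays;

/** Toy model of HTTP/2 receive windows; an illustration, not Jetty's implementation. */
final class FlowControlSketch {
    // Default HTTP/2 maximum frame size (2^14 bytes).
    static final int FRAME = 16 * 1024;

    /**
     * Returns how many bytes reach stream 0 before the sender can make no
     * further progress. Stream 0 is consumed immediately (its windows are
     * replenished at once); every other stream is assumed to be left unread,
     * so its bytes keep occupying both its stream window and the session window.
     */
    static long deliveredToActiveStream(long sessionWindow, long streamWindow,
                                        int streams, long responseSize) {
        long sessionAvail = sessionWindow;
        long[] streamAvail = new long[streams];
        long[] remaining = new long[streams];
        Arrays.fill(streamAvail, streamWindow);
        Arrays.fill(remaining, responseSize);

        long delivered = 0;
        boolean progress = true;
        while (progress && remaining[0] > 0) {
            progress = false;
            for (int s = 0; s < streams; s++) {
                // A frame is limited by both the stream and the session window.
                long n = Math.min(Math.min(FRAME, remaining[s]),
                                  Math.min(streamAvail[s], sessionAvail));
                if (n <= 0) {
                    continue;
                }
                progress = true;
                remaining[s] -= n;
                if (s == 0) {
                    // Read right away: the window update cancels the debit.
                    delivered += n;
                } else {
                    // Unread: the bytes stay charged to both windows.
                    streamAvail[s] -= n;
                    sessionAvail -= n;
                }
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long response = 10 * mib;
        // Hypothetical settings: 1 MiB session window, 4 streams.
        long big = deliveredToActiveStream(mib, 512 * 1024, 4, response);
        long small = deliveredToActiveStream(mib, 256 * 1024, 4, response);
        System.out.println("512 KiB stream window: delivered " + big + " of " + response);
        System.out.println("256 KiB stream window: delivered " + small + " of " + response);
    }
}
```

In this model the larger stream window lets three unread streams claim the
whole session window, while the smaller one always leaves headroom for the
stream that is being read. Whether this is the mechanism behind the stalls
reported here is an open question; the sketch only shows why the ratio of the
two settings can matter.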



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

