[jira] [Commented] (SOLR-18087) HTTP/2 Struggles With Streaming Large Responses

David Smiley (Jira) Wed, 28 Jan 2026 19:34:07 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055022#comment-18055022
 ]


David Smiley commented on SOLR-18087:
-------------------------------------

Fantastic write-up, Luke!
 
Do you know if we can easily configure IndexFetcher in particular for HTTP 1.1? 
 Not the whole server, which I know is easy.

> HTTP/2 Struggles With Streaming Large Responses
> -----------------------------------------------
>
>                 Key: SOLR-18087
>                 URL: https://issues.apache.org/jira/browse/SOLR-18087
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Luke Kot-Zaniewski
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: flow-control-stall.log, index-recovery-tests.md, 
> stream-benchmark-results.md
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There appear to be some severe regressions after expansion of HTTP/2 client 
> usage since at least 9.8, most notably with the stream handler as well as 
> index recovery. The impact is at the very least slowness and in some cases 
> outright response stalling. The obvious thing these two very different 
> workloads share in common is that they stream large responses. This means, 
> among other things, that they may be more directly impacted by HTTP2's flow 
> control mechanism. More specifically, the response stalling appears to be 
> caused by session window "cannibalization", i.e.  shards 1 and 2's responses 
> occupy the entirety of the session window *but* haven't been consumed yet, 
> and then, say, TupleStream calls next on shard N (because it is at the top of 
> the priority queue) but the server has nowhere to put this response since 
> shards 1 and 2 have exhausted the client buffer.
> In my testing I have tweaked the following parameters:
>  # http1 vs http2 - as stated, http1 seems to be strictly better as in faster 
> and more stable.
>  # shards per node - the greater the number of shards per node the more 
> (large, simultaneous) responses share a single connection during inter-node 
> communication. This has generally resulted in poorer performance.
>  # maxConcurrentStreams - reducing this to, say 1, can effectively circumvent 
> multiplexing. Circumventing multiplexing does seem to improve index recovery 
> in HTTP/2 but this is not a good setting to keep for production use because 
> it is global and affects *everything*, not just recovery or streaming.
>  # initialSessionRecvWindow - This is the amount of buffer the client gets 
> initially for each connection. This gets shared by the many responses that 
> share the multiplexed connection.
>  #  initialStreamRecvWindow - This is the amount of buffer each stream gets 
> initially within a single HTTP/2 session. I've found that when this is too 
> big relative to initialSessionRecvWindow it can lead to stalling because of 
> flow control enforcement
> # Simple vs Buffering Flow Control Strategy - Controls how frequently the 
> client sends a WINDOW_UPDATE frame to signal the server to send more data. 
> "Simple" sends the frame after consuming any amount of bytes while 
> "Buffering" waits until a consumption threshold is met. So far "Simple" has 
> NOT worked reliably for me and probably why the default is "Buffering".
> I’m attaching summaries of my findings, some of which can be reproduced by 
> running the appropriate benchmark in this 
> [branch|https://github.com/kotman12/solr/tree/http2-shenanigans|https://github.com/kotman12/solr/tree/http2-shenanigans].
>  The stream benchmark results md file includes the command I ran to achieve 
> the result described. 
> Next steps:
> Reproduce this in a pure jetty example. I am beginning to think multiple 
> large responses getting streamed simultaneously between the same client and 
> server may some kind of edge case in the library or protocol, itself. It may 
> have something to do with how Jetty's InputStreamResponseListener is 
> implemented although according to the docs it _should_ be compatible with 
> HTTP/2. Furthermore, there may be some other levers offered by HTTP/2 which 
> are not yet exposed by the Jetty API.
> On the other hand, we could consider having separate connection pools for 
> HTTP clients that stream large responses. There seems to be at least [some 
> precedent|https://www.akamai.com/site/en/documents/research-paper/domain-sharding-for-faster-http2-in-lossy-cellular-networks.pdf]
>  for doing this.
> > We investigate and develop a new domain-sharding technique that isolates 
> > large downloads on separate TCP connections, while keeping downloads of 
> > small objects on a single connection.
> HTTP/2 seems designed for [bursty, small 
> traffic|https://hpbn.co/http2/?utm_source=chatgpt.com#one-connection-per-origin]
>  which is why flow-control may not impact it as much. Also, if your payload 
> is small relative to your header then HTTP/2's header compression might be a 
> big win for you but in the case of large responses, not as much. 
> > Most HTTP transfers are short and bursty, whereas TCP is optimized for 
> > long-lived, bulk data transfers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-18087) HTTP/2 Struggles With Streaming Large Responses

Reply via email to