Luke Kot-Zaniewski created SOLR-18087:
-----------------------------------------
Summary: HTTP/2 Struggles With Streaming Large Responses
Key: SOLR-18087
URL: https://issues.apache.org/jira/browse/SOLR-18087
Project: Solr
Issue Type: Bug
Reporter: Luke Kot-Zaniewski
Attachments: flow-control-stall.log, index-recovery-tests.md,
stream-benchmark-results.md
There appear to be some severe HTTP/2 regressions since at least 9.8, most
notably with the stream handler and with index recovery. The impact is at the
very least slowness and in some cases outright response stalling. The obvious
thing these two very different workloads have in common is that they stream
large responses, which means, among other things, that they may be more
directly exposed to HTTP/2's flow control mechanism. The stalling itself
appears to be caused by that flow control.
In my testing I have tweaked the following parameters:
# HTTP/1 vs HTTP/2 - HTTP/1 seems to be strictly better, i.e. both faster and
more stable.
# shards per node - the greater the number of shards per node the more (large,
simultaneous) responses share a single connection during inter-node
communication. This has generally resulted in poorer performance.
# maxConcurrentStreams - reducing this to, say, 1 effectively circumvents
multiplexing. Doing so does seem to improve index recovery somewhat (still
slower than HTTP/1). On the other hand, this seems antithetical to the point
of HTTP/2. It's also interesting this doesn't help
# initialSessionRecvWindow - This is the amount of receive buffer (the
flow-control window) the client gets for each connection. It is shared by all
of the responses multiplexed over that connection.
# initialStreamRecvWindow - This is the amount of buffer each stream gets
within a single HTTP/2 session. I've found that when this is too big relative
to initialSessionRecvWindow it can lead to stalling because of flow control
enforcement.
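To illustrate how a stream window that is large relative to the session window
could produce a stall, here is a toy model of HTTP/2 flow-control credit. It is
only a sketch under stated assumptions and is not Jetty's or Solr's
implementation: the class FlowControlSketch and the method stalls are
hypothetical names, the sender is assumed to fill the other streams before
stream 0 (a worst case), and the reader is assumed to consume only stream 0, so
window credit is returned only for bytes read from that stream.

```java
import java.util.Arrays;

/** Toy model of HTTP/2 flow-control credit; NOT Jetty's actual implementation. */
public class FlowControlSketch {

    /**
     * Returns true if a reader that only consumes stream 0 gets stuck before
     * it has read bytesNeeded bytes, false if it reads them all.
     */
    public static boolean stalls(int sessionWindow, int streamWindow,
                                 int streams, int bytesNeeded) {
        int sessionCredit = sessionWindow;      // credit shared by the whole connection
        int[] streamCredit = new int[streams];  // per-stream credit
        int[] buffered = new int[streams];      // bytes received but not yet read
        Arrays.fill(streamCredit, streamWindow);

        int consumed = 0;
        while (consumed < bytesNeeded) {
            // Sender: push as much as both windows allow, serving stream 0 last
            // (assumed worst case for a reader waiting on stream 0).
            for (int s = streams - 1; s >= 0; s--) {
                int n = Math.min(streamCredit[s], sessionCredit);
                streamCredit[s] -= n;
                sessionCredit -= n;
                buffered[s] += n;
            }
            // Reader: blocked on stream 0. If nothing arrived for it, no credit
            // will ever be returned, because the other streams are never read.
            if (buffered[0] == 0) {
                return true;
            }
            // Reading returns credit to both the stream and the session window.
            int n = buffered[0];
            buffered[0] = 0;
            consumed += n;
            streamCredit[0] += n;
            sessionCredit += n;
        }
        return false;
    }

    public static void main(String[] args) {
        // Stream window equals session window: one unread stream can use up
        // all of the connection's credit.
        System.out.println(stalls(100, 100, 2, 1000)); // prints true
        // Session window covers every stream's window: the reader keeps going.
        System.out.println(stalls(200, 100, 2, 1000)); // prints false
    }
}
```

In this model the reader gets stuck exactly when the unread streams can use up
the whole session window, i.e. when (streams - 1) * streamWindow >=
sessionWindow. Whether the real stalls follow the same pattern is an open
question.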
I’m attaching summaries of my findings, some of which can be reproduced by
running the appropriate benchmark in this branch:
https://github.com/kotman12/solr/tree/http2-shenanigans
My next step is to solicit some feedback from the community. Absent anything
else I may try reproducing this in a pure Jetty example. I am beginning to
think multiple large responses getting streamed simultaneously between the same
client and server may be some kind of edge case the library doesn't handle
well. It may have something to do with how Jetty's InputStreamResponseListener
is implemented, although according to the docs it _should_ be compatible with
HTTP/2.