Re: mod_proxy_http2 questions (was: [VOTE] backport mod_proxy_http2 to 2.4.x as experimental)

Yann Ylavic Thu, 10 Mar 2016 13:25:23 -0800

On Thu, Mar 10, 2016 at 5:38 PM, Stefan Eissing
<[email protected]> wrote:
>
> Iterative, the Common Case
> --------------------------
[]
>
> As to the input/output handling for that request_rec, that is basic mod_http2
> stuff. The core filters have been replaced with ones that shuffle the data
> to h2 internal things (mainly through the h2_mplx) and the response headers
> are taken directly from the request_rec->headers_out. The http2 usual way.


OK, got that, h2_filter_stream_output() is the network filter of the
slave (http/1) connections.

>
> Concurrent Handling
> -------------------
> I was then thinking: how do I get all the concurrent proxy_http2 requests to
> use the same connection? I was originally thinking about some sort of de-mux,
> a sort of concentrator that sits on a single connection and communicates with
> all the worker threads that have proxy requests...
>
> But that would be insane! There would be one worker sitting on the master
> connection, several workers for each single proxy request and one more
> worker that sits on the backend connection. And all the proxy requests
> would do is move data from the master connection thread to the backend
> one and vice versa. But what we need is just 2 workers: one for the master
> connection and one for the backend connection. They should handle as many
> concurrent requests as configured!

Hmm, I'm not sure I follow here.
I read this as: no workers are needed because, after all, we don't need
h1 requests to forward h2.

But proxying is not only about routing, it's also rewriting, caching,
filtering, securing..., so who would do that actual work per request
if not workers?

I understand the need for an efficient pure h2 proxy which would only
route or balance requests, but that's not the only use case, we need
to handle the other cases too...

Sure there could be a single pollset and queue(s) on each side
(actually listeners, even if that may sound weird on the backend
side), but I don't see how we would fill in those without (a pool of)
workers that do the h1 work (if any).
That's pretty much what the MPM (event) does, let's make it available
for backend connections (there is some code like this already in
mod_proxy_wstunnel which makes use of SUSPENDED and MPM callbacks,
that may be a way).
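
To illustrate the shape I have in mind, here is a minimal sketch in
Python (not httpd code): one queue filled by whoever polls the
connections, drained by a small pool of workers that do the
per-request work. The names run_pool and handle are hypothetical, and
the real MPM (event) machinery is of course far more involved.

```python
import queue
import threading


def run_pool(events, handle, n_workers=2):
    """Feed events into one queue and let a pool of workers process them.

    `events` stands in for what a listener/pollset would produce and
    `handle` for the per-request work (rewriting, caching, filtering...).
    """
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this worker
                return
            outcome = handle(item)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for event in events:  # the "listener" side fills the queue
        work.put(event)
    for _ in threads:  # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return results
```

The order of the results depends on scheduling, so callers should not
rely on it.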

>
> Backend Engines
> ---------------
> How can we do that? Let's discuss one master http2 connection with
> no proxy request ongoing. The next request which is handled by proxy_http2
> is the "first" - it sees no other ongoing request against the same
> backend. It registers itself at the master http2 session for the backend.
> It performs the request in the usual way, de-registers itself again
> and returns. Same as with HTTP/1, plus some registration dance of what
> I named a "request engine".
>
> If another worker handles a proxy_http2 request, while the "first"
> engine is ongoing, it sees that someone has registered for the backend
> that it also needs. What it then does is "hand over" the request_rec
> to the "first" engine and immediately returns as if done.

Which sends out the EOR since the core thinks the request is done, right?
If so, wouldn't returning SUSPENDED be more appropriate (that would
free the worker thread for some other work)?
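
To check that I read the registration dance correctly, here is a
minimal model of it in Python (not httpd code). EngineRegistry,
submit, next_request and proxy_handler are hypothetical names for
illustration only; the actual mod_http2 structures are different.

```python
import threading
from collections import deque


class EngineRegistry:
    """Hypothetical per-master-connection map: backend -> pending requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = {}

    def submit(self, backend, request):
        """Return True if the caller became the engine for `backend`,
        False if the request was handed over to an existing engine."""
        with self._lock:
            pending = self._queues.get(backend)
            if pending is None:
                self._queues[backend] = deque([request])
                return True
            pending.append(request)
            return False

    def next_request(self, backend):
        """Engine side: take the next request, or de-register and
        return None when there is nothing left to do."""
        with self._lock:
            pending = self._queues[backend]
            if pending:
                return pending.popleft()
            del self._queues[backend]
            return None


def proxy_handler(registry, backend, request, processed):
    if not registry.submit(backend, request):
        # Handed over: this is the point where the handler returns
        # "as if done" (or, as suggested above, could return SUSPENDED).
        return "handed over"
    while (req := registry.next_request(backend)) is not None:
        processed.append(req)  # stands in for talking to the backend
    return "engine done"
```

Checking the queue and de-registering happen under the same lock, so a
request cannot be handed to an engine that has already decided to shut
down.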

>
> So, the "first" proxy_http2 worker collects all request_recs against
> the same backend and continues running until it is out of requests. Then
> it also shuts down.
>
> So, while several requests against the same backend are ongoing, there
> are two main workers handling it, plus, very shortly, some workers that
> create request_recs, find out that they are proxy_http2, hand them over
> and return.
>
> So, how do we hand request_recs over to another thread?
>
> Request Transfer
> ----------------
> First of all, the request_recs inside a HTTP/2 connection are all created
> on top of slave connections and mod_http2 has a tight grip of the lifetime
> of those connections. So, all that has been allocated from that slave
> connection directly will still be there when the mod_proxy handler
> returns.
>
> But the request_rec itself and all filters that have been installed
> use a child pool and the usual runtime gets rid of that pool at the
> end of the request. How does it know that the request has ended? When
> the EOR bucket gets destroyed.
>
> So, mod_http2, when handing over request_recs, first *freezes* it
> by adding a top output filter that sets all buckets it gets aside. Which
> basically is the EOR bucket. It passes nothing on while frozen, so
> output filters further down never get called at this time.

That possibly shouldn't be needed: nothing would go out until the
backend provides the h1 response (or chunks of it, via h2 of course)
corresponding to an h1 request (both kept in relation with a
context/baton), and then the response bytes could go out through normal
filtering (finally h1_to_h2 on the client side if that's h2 there
too).

>
> When the proxy_http2 worker for that request returns, mod_http2
> notices the frozen request, thaws it and signals the backend connection
> worker that the request is ready for processing. The backend worker
> picks it up, handles it and *closes the output* which will write
> the buckets set aside before. This will write the EOR and the
> request deallocates itself. During this time, all the input/output
> filter etc. will remain in place and work.

With my description above, that would translate to: before the worker
thread available to handle the last bytes of a response sends them, it
also flushes the EOR.

Each worker thread could still do as much work as possible (as you
describe) to save some queuing in the MPM, but that's possibly further
optimization.
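
For reference, the freeze/thaw sequence quoted above, as I understand
it, in a minimal Python model (not httpd code). FreezeFilter and its
methods are hypothetical names, and buckets are plain strings here.

```python
class FreezeFilter:
    """Top output filter that sets buckets aside while frozen."""

    def __init__(self, next_filter):
        self._next = next_filter  # callable taking a list of buckets
        self._frozen = True
        self._aside = []

    def write(self, buckets):
        if self._frozen:
            # Nothing is passed on, so filters further down are not called.
            self._aside.extend(buckets)
            return
        self._next(buckets)

    def thaw(self):
        """Called once the handing-over worker has returned."""
        self._frozen = False

    def close_output(self):
        """Write whatever was set aside (basically the EOR) last."""
        pending, self._aside = self._aside, []
        self._next(pending)


# Usage: the EOR written while frozen goes out only after the response.
out = []
f = FreezeFilter(out.extend)
f.write(["EOR"])             # handler returned "as if done"
f.thaw()
f.write(["headers", "body"])  # backend worker produces the response
f.close_output()              # out is now ["headers", "body", "EOR"]
```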

>
> Conclusion, TODOs
> -----------------
> As to your question re filters on the backend connection: those are
> used unchanged. However they will be used for HTTP/2 data and while
> the byte counters will do, any request counter will not. I do not
> know what else is done there...
>
> Also I have not had the time to look at the logging of such requests.
> There might be some more work done there.

Yes, that's another story (and mod_logio would better count h2
connections/bytes if that's the underlying protocol after all...).

>
> I hope this gives an explanation of how mod_proxy_http2 works
> together with mod_http2. If not, please ask. Happy to explain and
> get feedback.

Thanks for explaining Stefan, same here :)

Regards,
Yann.
