Re: mod_proxy_http2 questions (was: [VOTE] backport mod_proxy_http2 to 2.4.x as experimental)

Stefan Eissing Fri, 11 Mar 2016 02:34:03 -0800

> On 10.03.2016 at 22:24, Yann Ylavic <ylavic....@gmail.com> wrote:
> 
> [...]
>> 
>> Concurrent Handling
>> -------------------
>> I was then thinking: how do I get all the concurrent proxy_http2 requests to
>> use the same connection? I was originally thinking about some sort of de-mux,
>> a sort of concentrator that sits on a single connection and communicates with
>> all the worker threads that have proxy requests...
>> 
>> But that would be insane! There would be one worker sitting on the master
>> connection, several workers for each single proxy request and one more
>> worker that sits on the backend connection. And all the proxy requests
>> would do is move data from the master connection thread to the backend
>> one and vice versa. But what we need is just 2 workers: one for the master
>> connection and one for the backend connection. They should handle as many
>> concurrent requests as configured!
> 
> Hmm, I'm not sure I follow here.
> I read this as no need for workers because after all we don't need h1
> requests to forward h2.


That is not what I meant. I'd like to draw some pictures, but ASCII art
in emails... I'll give this a try ;-)

Let's say we have an http/2 connection running on a regular mpm worker. Call
it "m1". It hands requests off to h2_workers on slave connections
s1, s2, etc., which create and process the requests r1-3.
These requests are processed in the usual http/1.1 infrastructure
of httpd. It looks like this:

mpm-worker: m1
 h2-worker: |--- s1[r1]
 h2-worker: |--- s2[r2]
 h2-worker: |--- s3[r3]

This happens for *all* requests incoming on m1 - always.

If s1-3 encounter a typical mod_proxy_http configuration, the requests
would open backend connections b1-3 (let's keep caching etc. out of 
this for now):

mpm-worker: m1
 h2-worker: |--- s1[r1] (proxy_http) ---> b1
 h2-worker: |--- s2[r2] (proxy_http) ---> b2
 h2-worker: |--- s3[r3] (proxy_http) ---> b3

Now, with an h1 backend connection, we need three of them to process
three requests in parallel. Not so with h2. With proxy_http2, we'd like
b1-3 to be only a single connection.

So my original idea was to build something like this:

mpm-worker: m1
 h2-worker: |--- s1[r1] (proxy_http2) ----> h2-multiplexer <--> b1
 h2-worker: |--- s2[r2] (proxy_http2) ------|
 h2-worker: |--- s3[r3] (proxy_http2) ------|

But having 3 h2-workers occupied the whole time only to shuffle
request/response bodies between m1 and b1 is a waste. The current
design therefore establishes this:

mpm-worker: m1
 h2-worker: |--- proxy_http2_engine (s1[r1],s2[r2],s3[r3]) <--> b1

When a new request comes in, slave s4 is created and started
on a worker:

mpm-worker: m1
 h2-worker: |--- proxy_http2_engine (s1[r1],s2[r2],s3[r3]) <--> b1
 h2-worker: |--- s4[r4]

When that runs into the handler of mod_proxy_http2 and is about
to allocate a new backend connection, it first checks with m1 if
there is already some h2-worker processing requests for the same
backend. If there is, we get (assuming, for fun, that r1 and
r2 have finished in the meantime):

A. same backend
mpm-worker: m1
 h2-worker: |--- proxy_http2_engine (s3[r3],s4[r4]) <--> b1

If b1 goes to a different backend, however, we will establish this:

B. different backend
mpm-worker: m1
 h2-worker: |--- proxy_http2_engine (s3[r3]) <--> b1
 h2-worker: |--- proxy_http2_engine (s4[r4]) <--> b2
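
The lookup behind cases A and B can be sketched roughly as follows. This
is only an illustration of the idea, not the mod_proxy_http2 code: the
names Master, Engine, submit and done are hypothetical, backends are
plain strings, and the locking a real master connection would need is
left out.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Hypothetical stand-in for one proxy_http2_engine and its backend connection."""
    backend: str
    requests: list = field(default_factory=list)


class Master:
    """Hypothetical stand-in for the master connection m1."""

    def __init__(self):
        self.engines = {}  # backend -> Engine currently serving it

    def submit(self, backend, request):
        """Hand the request to the engine for this backend, creating one if needed.

        Returns True if a new engine had to be created (the caller's worker
        stays busy running it), False if an existing engine took the request.
        """
        engine = self.engines.get(backend)
        created = engine is None
        if created:
            engine = Engine(backend)
            self.engines[backend] = engine
        engine.requests.append(request)
        return created

    def done(self, backend, request):
        """Remove a finished request; drop the engine once it has no work left."""
        engine = self.engines[backend]
        engine.requests.remove(request)
        if not engine.requests:
            del self.engines[backend]


m1 = Master()
for r in ("r1", "r2", "r3"):
    m1.submit("backend-a", r)
m1.done("backend-a", "r1")
m1.done("backend-a", "r2")

same = m1.submit("backend-a", "r4")   # case A: joins the existing engine
other = m1.submit("backend-b", "r5")  # case B: needs an engine of its own
```

In case A the worker that carried r4 is free again as soon as the
existing engine has taken the request; in case B it stays occupied as
the engine for the new backend.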

> But proxying is not only about routing, it's also rewriting, caching,
> filtering, securing..., so who would do that actual work per request
> if not workers?

If I understand you correctly, all this is still happening. The things
I describe above are only happening when the handler of mod_proxy_http2
is invoked to open the backend connection.

I assume caching, rewriting etc. happens before that? 

> I understand the need for an efficient pure h2 proxy which would only
> route or balance requests, but that's not the only use case, we need
> to handle the other cases too...

No, I did not build that. All request and response data are passed
through the normal request_rec->in/out filters. 

> Sure there could be a single pollset and queue(s) on each side
> (actually listeners, even if that may sound weird on the backend
> side), but I don't see how we would fill in those without (a pool of)
> workers that do the h1 work (if any).

I think a "proxy_http_engine", similar to http2 but with a separate
backend connection for each slave could be built and that could use
a pollset.
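
To make that a bit more concrete, here is a toy sketch of one thread
watching several backend connections through a single pollset. It
assumes nothing about httpd: the backends are simulated with local
socket pairs, Python's selectors module plays the role of the pollset,
and the names b1-3 and the payloads are made up.

```python
import selectors
import socket

# Three hypothetical backend connections, simulated with local socket pairs.
sel = selectors.DefaultSelector()
backends = {}
for name in ("b1", "b2", "b3"):
    engine_side, backend_side = socket.socketpair()
    engine_side.setblocking(False)
    sel.register(engine_side, selectors.EVENT_READ, data=name)
    backends[name] = (engine_side, backend_side)

# Two backends answer; b2 stays silent.
backends["b1"][1].sendall(b"response for r1")
backends["b3"][1].sendall(b"response for r3")

# A single thread collects whatever is ready, without blocking on b2.
received = {}
while len(received) < 2:
    events = sel.select(timeout=1.0)
    if not events:
        break  # nothing became readable within the timeout
    for key, _mask in events:
        received[key.data] = key.fileobj.recv(4096)

# Clean up all sockets and the selector.
for engine_side, backend_side in backends.values():
    sel.unregister(engine_side)
    engine_side.close()
    backend_side.close()
sel.close()
```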

> That's pretty much what the MPM (event) does, let's make it available
> for backend connections (there is some code like this already in
> mod_proxy_wstunnel which makes use of SUSPENDED and MPM callbacks,
> that may be a way).

Yes, that makes sense. Maybe we can go even further and make the
whole request processing not rely on the stack so much. Then we could
suspend/resume more often and, as I understand it, that is what
Graham is working on as well.

(Comments on the rest in another mail)

-Stefan
