On Sat, Mar 28, 2020 at 11:24 AM Graham Leggett <[email protected]> wrote:
>
> On 27 Mar 2020, at 14:01, Yann Ylavic <[email protected]> wrote:
>
> >> We need to find the reason that in a non-async case, data is being
> >> setaside, and we need to fix it.
> >
> > Connection and network output filters shouldn't care about async or
> > not, they just feed the pipe as much as possible, and setaside what
> > they can't (up to reinstate limits). For non-async this is pipelining,
> > let's not disable that.
> >
> > For non-async (though async would also benefit from it), what we need
> > to do is make sure that each request (worth handling in httpd) gets
> > finally EOR-ed, so that it's accounted (pipelining for non-async, or
> > blocking for async).
>
> Async shouldn’t block, that’s the whole point of being async.
>
> It only blocks in pathological cases, where the choice is between block, or
> die because out-of-ram or out-of-file-handles.
That's precisely what we are accounting for in
ap_filter_reinstate_brigade(), right?
And when the core/network output filter reaches its limits, what else
can it do but block (if you don't want setaside RAM or FDs to grow
unbounded)?
The current model is that if a handler (or WRITE_COMPLETION) doesn't
want the network filter to block, it shouldn't give the output filters
more data than they can setaside, should first test should_yield, and
should never FLUSH.
This is easy for WRITE_COMPLETION because it adds nothing (an empty
brigade), so the output filters won't start to block all of a sudden
if they previously released the thread without (or after) blocking.
But what are handlers supposed to do when should_yield? Existing
filters know nothing about this; they need to be rewritten to go async
and be called back. I see no automagic way we can do this for them.
In the meantime, output filters will block unless their setaside data
fits within the limits.
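For illustration, the pattern a should_yield-aware handler would need
looks something like this (a sketch against the async filter API in
trunk; send_chunk and the APR_INCOMPLETE return convention are made up
for the example):

```c
/* Hypothetical handler helper (send_chunk is an invented name): only
 * pass data down while the filter chain says it can take more, never
 * FLUSH, and yield back to the MPM otherwise. */
static apr_status_t send_chunk(request_rec *r, apr_bucket_brigade *bb)
{
    if (ap_filter_should_yield(r->output_filters)) {
        /* The filters already hold pending/setaside data beyond their
         * limits; an async-aware handler would arrange to be called
         * back later instead of passing more (and thus blocking). */
        return APR_INCOMPLETE;
    }
    return ap_pass_brigade(r->output_filters, bb);
}
```

Existing handlers do nothing like this, which is the "rewritten to go
async" part above.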
>
> I plan to add EAGAIN support to the core (that was the point behind a recent
> POC patch), at this point all the blocking vanishes. Adding blocking back in
> breaks this.
By core do you mean the default_handler()? mod_proxy handlers?
All ap_pass_brigade() callers could get EAGAIN?
>
> In my original design (network and connection filters only),
(and TRANSCODE with ap_request_core_filter)
> In your design (add request filters)
Huh? No, I didn't add ap_request_core_filter(); it was there from the
start (r1706669).
I just tried to fix it several times, and failed (getting a better
picture each time ;)
Anyway, the proposal to simply remove it should work for you then;
care to comment on the proposed patch (in this thread) which does
that?
Could you also tell us what ap_request_core_filter()'s original point was?
I think it was (and still is) to not let morphing buckets through, and
if so I (now) think it's unnecessary.
Provided we trust the caller to pass either "request pooled" morphing
buckets (which we trust to precede the EOR), or "connection pooled"
morphing buckets (it's not necessarily leaky to do so), setaside can
be a move from the user brigade to the filter brigade and vice versa.
And then the request core filter is useless, connection/network ones
can do their job.
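Concretely, the move-only setaside/reinstate I have in mind is just
brigade splicing, no copy and no read into the heap (a sketch; ctx->bb
as the filter's own brigade and bb as the caller's are assumed names):

```c
/* setaside: move what we couldn't send into the filter's brigade
 * (bb is left empty, nothing is copied) */
APR_BRIGADE_CONCAT(ctx->bb, bb);

/* reinstate (on the next call): move it back in front of the new data
 * (ctx->bb is left empty again) */
APR_BRIGADE_PREPEND(bb, ctx->bb);
```

With trusted bucket lifetimes this is all the connection/network
filters need, hence no request-level core filter.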
Let's see how it works for both the event and worker MPMs when e.g. a
2MB file is requested (suppose CONN_STATE_ is always READ_REQUEST_LINE
for mpm_worker; the rest of this common part is really identical for
the two MPMs, so it's easier to write it like this):
0.1 process_socket()
1.1 CONN_STATE_READ_REQUEST_LINE?
2.1 process_connection()
3.1 process_request()
4.1 _handler()
5.0 ap_pass_brigade(next, 2MB.file.bucket, EOS)
6.0 core_output_filter(2MB.file.bucket, EOS)
6.1 send_brigade_nonblocking() = EAGAIN/SUCCESS
6.2 remaining > flush_max_threshold (say 512KB)
block (POLLOUT)
goto 6.1
6.3 setaside EAGAIN data (say 256KB)
4.2 ap_pass_brigade(c->output_filters, EOR)
Now with mpm_event:
3.2 CONN_STATE_WRITE_COMPLETION!
1.2 CONN_STATE_WRITE_COMPLETION?
1.3.1 some ap_run_output_pending()?
WRITE_COMPLETION queue (POLLOUT)
1.3.2 else
CONN_STATE_READ_REQUEST_LINE! (almost)
KEEPALIVE queue (POLLIN)
1.4 CONN_STATE_LINGER?
LINGER queue
1.5 <thread available for another event>
0.2 time passes
0.3 socket event
0.4 goto 0
Whereas with mpm_worker:
4.3 no ap_run_input_pending()?
ap_pass_brigade(c->output_filters, FLUSH)
3.2 AP_CONN_KEEPALIVE?
goto 2.1
0.2 ap_lingering_close(), including ap_flush_conn()
0.3 <thread available for another connection>
So both MPMs might block in 6.2 (depending on whether the 1.5MB can be
sent nonblocking in a single pass or not), but at 6.3 they are sure
they can setaside what remains, and do so.
After that, with mpm_event the pending 256KB will be sent nonblocking
in WRITE_COMPLETION scheduling (by any thread available), and any next
request on the connection will be processed later (likely yet another
thread).
With mpm_worker it depends on whether a pipelined request is available
at 4.3: if not, we will FLUSH (thus ask the core_output_filter to
block) before reading the next request, but if the next request is
already in the pipe then we will just pipeline the responses too.
In this latter case, the core output filter will continue to block
when appropriate (max threshold, max EORs), so we are good too.
> you added the additional behaviour to let all connection filters be async.
I never did anything like this, AFAICT...
> Trouble is, this doesn’t work on non async MPMs, and here we are.
I think everything works when there is no ap_request_core_filter(), see above.
Regards,
Yann.