On 07 Oct 2015, at 6:23 PM, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:

>> Can you explain "non-multithreadability of apr_buckets" in more detail? I 
>> take it this is the problem with passing a bucket from one allocator to 
>> another?
>> 
>> If so then the copy makes more sense.
> 
> Yes, I wrote about this on the list a while ago. When the bucket is 
> destroyed, its allocator tries to put it on the free list. There is no 
> protection for that.

It would be nice to fix this for the future, but that would be an APR fix.

> 
>>> Stream pool destruction is synched with 
>>> 1. slave connection being done and no longer writing to it
>> 
>> How do you currently know the slave connection is done?
>> 
>> Normally a connection is cleaned up by the MPM that spawned the connection, 
>> I suspect you’ll need to replicate the same logic the MPMs use to tear down 
>> the connection using the c->aborted and c->keepalive flags.
>> 
>> Crucially the slave connection needs to tell you that it’s done. If you kill 
>> a connection early, data will be lost.
>> 
>> I suspect part of the problem is not implementing the algorithm that async 
>> MPMs used to kick filters with data in them. Without this kick, data in the 
>> slave stacks will never be sent. In theory, when the http2 filter receives a 
>> kick, it should pass the kick on to all slave connections.
> 
> I am not sure what you mean by that "kick". I'd have to look at your async 
> filter design some more…

What the core network filter used to do was the following:

- Apply an algorithm to determine how far into the brigade we should write 
using blocking writes. Flush buckets and safety limits get applied here.
- Actually do the write.
- As soon as the write returns EAGAIN, setaside the brigade in a buffer and 
leave
- The MPM “kicks” the core network filter by passing NULL to the filter and we 
repeat the above

We now do this for any filter:

- Apply the same safety algorithm to determine the flush-to point, up to which 
we must do blocking writes.
- Do writes until we reach the flush-to point.
- Continue to do writes, calling ap_filter_should_yield() as a proxy for EAGAIN.
- Setaside remaining data in a buffer, add ourselves to the set of filters 
that should be “kicked”, and leave.
- The MPM “kicks” all filters with setaside data in the c->filters set exactly 
once on each pass and we repeat the above.

Your code is effectively emulating an MPM, so it would need to implement the 
“kick” above.

> I think you misunderstood me. mod_h2 uses ap_process_connection() just like 
> core.

I found that right at the start and confirmed you were doing so correctly.

> Maybe this async changes just shines the light on a bug that has always been 
> there, but never happened due to timing. I will look some more tomorrow. 
> Originally, I planned to do something else, but I am running out of 
> subversion branches where I can work…

Branches are cheap, creating more of them is not a problem.

Can you confirm what happens when things go wrong?

Do we see missing data, or do the requests hang?

Regards,
Graham
—
