OK, found the bug. Seems an update to the latest nghttp2 lib coincided with 
the check-in of your async filter changes. Everything is fine now. 

At least I learned some more about core filters, which cannot hurt. Thanks for 
the help.

//Stefan

> On 07.10.2015 at 18:40, Graham Leggett <minf...@sharp.fm> wrote:
> 
> On 07 Oct 2015, at 6:23 PM, Stefan Eissing <stefan.eiss...@greenbytes.de> 
> wrote:
> 
>>> Can you explain “non-multithreadability of apr_buckets” in more detail? I 
>>> take it this is the problem with passing a bucket from one allocator to 
>>> another?
>>> 
>>> If so then the copy makes more sense.
>> 
>> Yes, I wrote about this on the list a while ago. When a bucket is 
>> destroyed, its allocator tries to put it back on its free list, and that 
>> free list is not protected by any lock, so it must only ever be touched 
>> from one thread at a time.
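>> 
>> The copy workaround amounts to something like this (untested sketch, 
>> copy_bucket_across() is an invented name):
>> 
>>     static apr_status_t copy_bucket_across(apr_bucket *src,
>>                                            apr_bucket_alloc_t *dst_alloc,
>>                                            apr_bucket_brigade *dst_bb)
>>     {
>>         const char *data;
>>         apr_size_t len;
>>         apr_status_t rv;
>> 
>>         /* read on the thread that owns src's allocator */
>>         rv = apr_bucket_read(src, &data, &len, APR_BLOCK_READ);
>>         if (rv != APR_SUCCESS) {
>>             return rv;
>>         }
>>         /* free_func == NULL makes apr_bucket_heap_create() take its
>>          * own copy, so the new bucket belongs entirely to dst_alloc
>>          * and can be destroyed safely on the other thread */
>>         APR_BRIGADE_INSERT_TAIL(dst_bb,
>>             apr_bucket_heap_create(data, len, NULL, dst_alloc));
>>         return APR_SUCCESS;
>>     }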
> 
> It would be nice to fix this for the future, but that would be an APR fix.
> 
>> 
>>>> Stream pool destruction is synched with 
>>>> 1. slave connection being done and no longer writing to it
>>> 
>>> How do you currently know the slave connection is done?
>>> 
>>> Normally a connection is cleaned up by the MPM that spawned the connection, 
>>> I suspect you’ll need to replicate the same logic the MPMs use to tear down 
>>> the connection using the c->aborted and c->keepalive flags.
>>> 
>>> Crucially the slave connection needs to tell you that it’s done. If you 
>>> kill a connection early, data will be lost.
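>>> 
>>> Something along these lines, after ap_process_connection() returns on 
>>> the slave (rough shape only, the real MPM logic has more states; 
>>> slave_c stands for the slave conn_rec):
>>> 
>>>     if (slave_c->aborted || slave_c->keepalive == AP_CONN_CLOSE) {
>>>         /* the slave is done, nothing more will be written; only
>>>          * now is it safe to destroy the stream's resources */
>>>     }
>>>     else {
>>>         /* keepalive: the slave may still produce data, keep the
>>>          * stream pool alive */
>>>     }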
>>> 
>>> I suspect part of the problem is not implementing the algorithm that async 
>>> MPMs use to kick filters that still hold data. Without this kick, data in 
>>> the slave filter stacks will never be sent. In theory, when the http2 
>>> filter receives a kick, it should pass the kick on to all slave connections.
>> 
>> I am not sure what you mean by that "kick". I'd have to look at your async 
>> filter design some more…
> 
> What the core network filter used to do was the following (rough sketch 
> after the list):
> 
> - Apply an algorithm to determine how far into the brigade we should write 
> using blocking writes. Flush buckets and safety limits get applied here.
> - Actually do the write.
> - As soon as the write returns EAGAIN, setaside the brigade in a buffer and 
> leave
> - The MPM “kicks” the core network filter by passing NULL to the filter and 
> we repeat the above
> 
> We now do this for any filter (again, a sketch follows the list):
> 
> - Apply the same safety algorithm, determining the flush-to point up to 
> which we must do blocking writes.
> - Do writes until we reach the flush-to point.
> - Continue to do writes, calling ap_filter_should_yield() as a proxy for 
> EAGAIN.
> - Setaside remaining data in a buffer and leave, and add us to the set of 
> filters that should be “kicked”.
> - The MPM “kicks” all filters with setaside data in the c->filters set 
> exactly once on each pass and we repeat the above.
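> 
> As a filter author you follow roughly this pattern (sketch; my_ctx_t and 
> the scratch brigade ctx->tmp are made-up, error paths trimmed):
> 
>     apr_status_t my_out_filter(ap_filter_t *f, apr_bucket_brigade *bb)
>     {
>         my_ctx_t *ctx = f->ctx;
>         apr_bucket *flush_upto = NULL;
>         int past_flush = 0;
>         apr_status_t rv;
> 
>         /* prepend anything set aside on the previous pass and work
>          * out the flush-to point */
>         rv = ap_filter_reinstate_brigade(f, bb, &flush_upto);
>         if (rv != APR_SUCCESS) {
>             return rv;
>         }
>         while (!APR_BRIGADE_EMPTY(bb)) {
>             apr_bucket *b = APR_BRIGADE_FIRST(bb);
> 
>             if (b == flush_upto) {
>                 past_flush = 1;  /* blocking no longer required */
>             }
>             if (past_flush && ap_filter_should_yield(f)) {
>                 break;           /* our stand-in for EAGAIN */
>             }
>             APR_BUCKET_REMOVE(b);
>             APR_BRIGADE_INSERT_TAIL(ctx->tmp, b);
>             rv = ap_pass_brigade(f->next, ctx->tmp);
>             apr_brigade_cleanup(ctx->tmp);
>             if (rv != APR_SUCCESS) {
>                 return rv;
>             }
>         }
>         /* buffer what's left and register ourselves for a "kick" */
>         return ap_filter_setaside_brigade(f, bb);
>     }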
> 
> Your code is effectively emulating an MPM, so would need to implement the 
> “kick” above.
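> 
> If the trunk helpers are exported far enough for you to call them (I 
> haven't checked from a module's point of view), each pass of your 
> pseudo-MPM over a slave connection would look roughly like:
> 
>     /* slave_c is the slave conn_rec, after its output filters ran */
>     if (ap_filter_output_pending(slave_c) == OK) {
>         /* some filter on slave_c still holds setaside data: keep the
>          * slave in your write-completion set and kick it again once
>          * the master connection can take more data */
>     }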
> 
>> I think you misunderstood me. mod_h2 uses ap_process_connection() just like 
>> core.
> 
> I found that right at the start and confirmed you were doing so correctly.
> 
>> Maybe these async changes just shine a light on a bug that has always been 
>> there but never triggered due to timing. I will look some more tomorrow. 
>> Originally, I planned to do something else, but I am running out of 
>> subversion branches where I can work…
> 
> Branches are cheap; creating more of them is not a problem.
> 
> Can you confirm what happens when things go wrong?
> 
> Do we see missing data, or do the requests hang?
> 
> Regards,
> Graham
> —
> 
