On 2/25/2015 2:31 AM, Alex Bligh wrote:
> Yes. I've seen this.
>
> There were once some issues with the brigade allocators. You may want
> to look at my fork here:
>
> https://github.com/abligh/apache-websocket
Hi, Alex. Your blog post [1] -- or rather its comments section -- was
actually one of the few other places I could find a reference to the
crash I'm seeing! Glad to see you're watching this list.
It looks like half your fixes were pulled upstream; the remainder
involves a new allocator and mutex [2]. Anything else I'm missing?
> Despite my fixes, it still dies occasionally, normally because one
> of the bucket brigades becomes corrupt.
Like you mention in your post, I'm primarily seeing a crash when using
TLS. For posterity's sake, here's the full stack I see (the commenter on
your post had what appeared to be the same trace, but it was missing
symbols):
<SEGV from what appears to be the apr_bucket_destroy macro?>
libhttpd!writev_nonblocking
libhttpd!send_brigade_nonblocking
libhttpd!ap_random_parent_after_fork
libhttpd!ap_pass_brigade
mod_ssl!bio_filter_out_pass
mod_ssl!bio_filter_out_write
<libeay32>!BIO_write
...
This is a Windows 64-bit build. I can reproduce this easily, usually
within seconds of running my tests. I can try to work up a minimal
reproduction case if anyone else turns up and is interested.
My suspicion is that the two-brigade approach clashes with the fact that
OpenSSL can read from the socket during its writes and vice-versa. But
that's only a suspicion -- for all I know, mod_ssl and/or Apache might
have synchronization techniques that make parallel brigades safe.
> I spent many many hours on this, ultimately unsuccessfully (I've
> moved to mod_proxy_wstunnel plus a libwebsockets C based thing).
Tangent: I tried libwebsockets about six months back. I had trouble with
it under heavy load too -- the architecture didn't seem to handle the
case where the network stack could only accept a partial write, which
led to a lot of spurious disconnects when streaming massive amounts of
data. Have you run into that as well?
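
To make the partial-write case concrete, here is a minimal sketch of the
handling I would expect, assuming a POSIX non-blocking descriptor. The
name write_all is hypothetical; this is not libwebsockets' API or its
actual code, just an illustration of advancing past a short write rather
than treating it as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes to a descriptor that may be
 * non-blocking. Returns 0 on success, -1 if the write cannot complete. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    assert(buf != NULL || len == 0);

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);

        if (n > 0) {
            /* Partial write: remember how far we got and keep going. */
            off += (size_t)n;
        } else if (n < 0 && errno == EINTR) {
            continue;               /* interrupted by a signal, retry */
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Kernel buffer is full: wait until the fd is writable. */
            struct pollfd p = { .fd = fd, .events = POLLOUT };
            if (poll(&p, 1, -1) < 0 && errno != EINTR)
                return -1;
        } else {
            return -1;              /* real error, or an unexpected 0 */
        }
    }
    return 0;
}
```

A real server would probably queue the unsent remainder and return to
its event loop instead of blocking in poll(), but the bookkeeping is the
same: a short write is not a reason to disconnect.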
> FWIW if I hadn't made the above move, my plan was to eliminate
> doing most of the work in the apache thread by using a bucket
> brigade that ended in a socketpair (I can't remember the correct
> apache terminology here, but the point was to have apache
> do the read/write from one end of the socketpair), then set
> up two new threads to read and write from the other end
> of the socketpair, encoding/decoding as we go. This means that
> once the connection is live the module code itself wouldn't
> actually touch any apache memory-managed data.
>
> IE:
>
> Apache <==> Socket+Socket <===> Decode/Encode <===> Whatever
Interesting. So, if I've got this right, you'd try to artificially
terminate the chain in a pair of file descriptors, and then handle those
directly using select/poll/whatever from more threads? Are there any
existing modules in Apache that take a similar approach?
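
For the sake of discussion, this is roughly how I picture the far side
of the socketpair. It is a sketch under several assumptions: POSIX
socketpair() and pthreads (so not the Windows build above), a single
worker thread rather than the two you describe, an uppercase transform
standing in for WebSocket frame encode/decode, and an echo standing in
for "Whatever". The names codec_worker and start_codec are hypothetical,
and nothing here touches Apache or APR.

```c
#include <assert.h>
#include <ctype.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Worker: owns one end of the socketpair. It reads raw bytes, applies a
 * placeholder transform (uppercase, standing in for frame encode/decode),
 * and writes the result back on the same descriptor. */
static void *codec_worker(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[4096];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);

        ssize_t off = 0;
        while (off < n) {           /* tolerate short writes */
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w <= 0)
                goto done;          /* sketch: give up on any error */
            off += w;
        }
    }
done:
    close(fd);
    return NULL;
}

/* Creates the socketpair and starts the worker on one end. Returns the
 * other end (the one the server side would read and write), or -1. */
static int start_codec(pthread_t *tid)
{
    int sv[2];

    assert(tid != NULL);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (pthread_create(tid, NULL, codec_worker,
                       (void *)(intptr_t)sv[1]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    return sv[0];
}
```

The point of the design, as I understand it, is that once the worker is
running it only ever sees a plain file descriptor, so none of its
buffers come from pools or bucket allocators owned by the server.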
Thanks!
Jacob Champion
LabVIEW R&D
National Instruments
[1] http://blog.alex.org.uk/2012/09/11/apache-websockets-and-tcp-vnc-proxy/
[2] https://github.com/abligh/apache-websocket/commit/2d824f989aac196f42ad5127a290df04720bc2da