On 2/25/2015 2:31 AM, Alex Bligh wrote:
> Yes. I've seen this.
>
> There were once some issues with the brigade allocators. You may want
> to look at my fork here:
>
> https://github.com/abligh/apache-websocket

Hi, Alex. Your blog post [1] -- or rather its comments section -- was
actually one of the only other places I could find a reference to the
crash I'm seeing! Glad to see you're watching this list.

It looks like half your fixes were pulled upstream; the remainder
involves a new allocator and mutex [2]. Anything else I'm missing?

> Despite my fixes, it still dies occasionally, normally because one
> of the bucket brigades becomes corrupt.

Like you mention in your post, I'm primarily seeing a crash when using
TLS. For posterity's sake, here's the full stack I see (the commenter on
your post had what appeared to be the same trace, but it was missing
symbols):

    <SEGV from what appears to be the apr_bucket_destroy macro?>
    libhttpd!writev_nonblocking
    libhttpd!send_brigade_nonblocking
    libhttpd!ap_random_parent_after_fork
    libhttpd!ap_pass_brigade
    mod_ssl!bio_filter_out_pass
    mod_ssl!bio_filter_out_write
    <libeay32>!BIO_write
    ...

This is a Windows 64-bit build. I can reproduce this easily, usually
within seconds of running my tests. I can try to work up a minimal
reproduction case if anyone else turns up and is interested.

My suspicion is that the two-brigade approach clashes with the fact that
OpenSSL can read from the socket during its writes and vice-versa. But
that's only a suspicion -- for all I know, mod_ssl and/or Apache might
have synchronization techniques that make parallel brigades safe.

> I spent many many hours on this, ultimately unsuccessfully (I've
> moved to mod_proxy_wstunnel plus a libwebsockets C based thing).

Tangent: I tried libwebsockets about six months back. I had trouble with
heavy load with it too -- the architecture didn't seem to handle the
case where the network stack could only accept a partial write, which
led to a lot of spurious disconnects when streaming massive amounts of
data. Have you run into that as well?
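
To make the partial-write case concrete, here is a minimal Python sketch
of the handling I'd expect from a sender on a non-blocking socket. It is
not libwebsockets code and says nothing about how that library is
actually structured; `send_all` and `demo` are hypothetical names made
up for this illustration, and the 5-second timeout is an arbitrary
assumption.

```python
import select
import socket
import threading


def send_all(sock: socket.socket, data: bytes, timeout: float = 5.0) -> int:
    """Send all of `data` on a non-blocking socket, tolerating partial writes.

    Returns the number of send() calls that accepted at least one byte.
    (Hypothetical helper for illustration only.)
    """
    view = memoryview(data)
    offset = 0
    calls = 0
    while offset < len(data):
        try:
            # send() may accept fewer bytes than offered.
            sent = sock.send(view[offset:])
        except BlockingIOError:
            # The kernel buffer is full. Wait until the socket is writable
            # again instead of treating this as a dead connection.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("peer stopped reading")
            continue
        offset += sent  # resume from the first unsent byte
        calls += 1
    return calls


def demo(payload_size: int = 4 * 1024 * 1024):
    """Push a large payload through a socketpair and check it arrives intact."""
    a, b = socket.socketpair()
    a.setblocking(False)
    received = bytearray()

    def reader():
        while len(received) < payload_size:
            chunk = b.recv(65536)
            if not chunk:  # EOF
                break
            received.extend(chunk)

    t = threading.Thread(target=reader)
    t.start()
    payload = bytes(range(256)) * (payload_size // 256)
    try:
        calls = send_all(a, payload)
    finally:
        a.close()  # EOF lets the reader exit even if send_all raised
        t.join()
        b.close()
    return calls, bytes(received) == payload


if __name__ == "__main__":
    calls, intact = demo()
    print(f"send() calls: {calls}, payload intact: {intact}")
```

With a payload larger than the socket buffer, the send() count is
usually well above one, which is the situation where dropping the
connection on a short write shows up as a spurious disconnect.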

> FWIW if I hadn't made the above move, my plan was to eliminate
> doing most of the work in the apache thread by using a bucket
> brigade that ended in a socketpair (I can't remember the correct
> apache terminology here, but the point was to have apache
> do the read/write from one end of the socketpair), then set
> up two new threads to read and write from the other end
> of the socketpair, encoding/decoding as we go. This means that
> once the connection is live the module code itself wouldn't
> actually touch any apache memory-managed data.
>
> IE:
>
> Apache <==> Socket+Socket <===> Decode/Encode <===> Whatever

Interesting. So, if I've got this right, you'd try to artificially
terminate the chain in a pair of file descriptors, and then handle those
directly using select/poll/whatever from more threads? Are there any
existing modules in Apache that take a similar approach?

Thanks!

Jacob Champion
LabVIEW R&D
National Instruments

[1] http://blog.alex.org.uk/2012/09/11/apache-websockets-and-tcp-vnc-proxy/
[2] https://github.com/abligh/apache-websocket/commit/2d824f989aac196f42ad5127a290df04720bc2da