On 8/9/2016 12:07 PM, Jacob Champion wrote:
> At this point, my primary suspect is our use of recycled OVERLAPPED
> structs without reinitializing them to zero. To make matters worse,
> we're setting the OVERLAPPED's internal .Pointer field in the
> AcceptFilter 'data' case -- which we're not supposed to be doing to
> begin with [1]. We don't do that in the 'connect' filter.
>
> This is all just theorycrafting, though. I'll try to reproduce on my end
> too.
>
> --Jacob
>
> [1]
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms684342(v=vs.85).aspx
> (the Members > Pointer section)

I think I've finally had some success finding a reproduction of this issue, though it's somewhat involved. I set up an instance of Apache 2.4.16 64-bit (built from source) on a Windows 7 machine and spun up an instance of WANem (http://wanem.sourceforge.net/) in a VirtualBox VM hosted on my client machine (also Windows 7).

WANem configuration (Advanced Mode):
Bandwidth - 100Mbps
Random Disconnect Type - tcp-reset
Random Disconnect MTTF Low - 1
Random Disconnect MTTF High - 3
Random Disconnect MTTR Low - 0
Random Disconnect MTTR High - 0

This instructs WANem to inject a TCP RST into connections that pass through it every 1 to 3 seconds, then recover immediately (after 0 seconds).

Then on my client machine, I added a route to the server that passes through the WANem gateway (cmd prompt: ROUTE ADD <server-ip> <WANem-ip>).

Finally, I ran a program on the client that repeatedly makes 10 cURL requests in parallel, each performing a GET on a simple 28 KB index.html page. Eventually, even requests made to localhost on the server machine stop responding (they hang until the client times out).
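
For anyone who wants to try something similar, here is a minimal sketch of that kind of client. It is not the program I ran: it assumes Python's urllib in place of cURL, and the names fetch and run_rounds, the round count, and the timeout are all illustrative.

```python
import concurrent.futures
import urllib.request


def fetch(url, timeout=5.0):
    """GET url once; return the HTTP status as a string, or the exception name."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            return str(resp.status)
    except Exception as exc:
        # Resets and timeouts are expected while WANem is injecting RSTs.
        return type(exc).__name__


def run_rounds(url, rounds=100, parallel=10, timeout=5.0):
    """Issue `parallel` GETs at once, `rounds` times; return a tally of outcomes."""
    tally = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=parallel) as pool:
        for _ in range(rounds):
            outcomes = pool.map(lambda _: fetch(url, timeout), range(parallel))
            for outcome in outcomes:
                tally[outcome] = tally.get(outcome, 0) + 1
    return tally


# Example (substitute the server's address, routed through WANem):
#   print(run_rounds("http://<server-ip>/index.html"))
```

With something like this, a hung server should show up as the tally shifting from "200" (plus occasional reset errors) to nothing but timeout-type errors.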

Nothing shows in the error logs (I tried up to debug verbosity), and once it reproduces, no more entries appear in the access logs. I have to restart the server, though I haven't tried letting it sit for a period of time to see if it recovers on its own.

When I do the whole process again with "AcceptFilter http connect", it does not reproduce, and requests continue to work (when not being reset by WANem).

Not easy to set up, but at least it doesn't involve a browser or specific content on the server. It usually reproduces within 10 seconds or so, and I've seen it happen almost immediately.

I'll see if Wireshark shows anything interesting going on around the RSTs.
--
Paul Spangler
LabVIEW R&D
National Instruments
