Hello there,

You've certainly found a nasty bug. Great catch!!
I'm applying your patch right away. Good stuff! :-)

On 23/01/2010, at 21:51, Juan J. Martínez wrote:

> I can confirm there's an fd leak under some conditions in the proxy
> handler.
>
> When the keep-alive timeout expires in the proxied application, the
> socket is closed on the application's side. Cherokee keeps this
> connection cached to reuse it in later calls (as expected).
>
> A new request arrives, and then cherokee_socket_is_connected gets a
> ret_deny error, sets respined to true and tries to reconnect.
>
> The problem is that the socket is cleaned *before* closing, thus the
> close call does nothing and we have an fd leak :(
>
> I've repeated the fdtest [1] with the attached patch applied (it just
> swaps the clean and close calls in the reconnection with respin
> enabled :D).
>
> Before the patch:
>
> URL: ***, Loops: 100 -- FD Before: 79, After: 177, Diff: 98
>
> After the patch:
>
> URL: ***, Loops: 100 -- FD Before: 15, After: 14, Diff: -1
>
> Keep in mind that the site is in production, so the test results can
> have some noise.
>
> Although the change seems pretty obvious, I'm a newbie with the code.
> The test confirms the patch, though.
>
> Regards,
>
> Juanjo
>
> [1]: http://gist.github.com/281767
>
> On Wed, 20-01-2010 at 12:09 +0100, Juan J. Martínez wrote:
>> On Wed, 20-01-2010 at 11:07 +0100, Alvaro Lopez Ortega wrote:
>>> [...]
>>> But before I continue speculating, it'd be important to know whether
>>> the leak actually exists. Does making a hundred thousand requests to
>>> the proxy make any difference? Are more descriptors being left open?
>>
>> I'd have liked to take a look at the proxy code, but I haven't had
>> the time yet :(
>>
>> Anyway, I did a test following the steps to reproduce the problem:
>>
>> 1. Get a URL through the proxy
>> 2. Wait for timeout
>> 3. Go to 1
>>
>> The script for the test is here: http://gist.github.com/281767
>>
>> At the end it shows the difference in open fds according to fstat
>> (lsof), and my results for 100 loops are:
>>
>> URL: ***, Loops: 100 -- FD Before: 79, After: 177, Diff: 98
>>
>> It can be easily ported to Linux using lsof instead of fstat.
>>
>> Please notice that the server is in production, so the results may
>> have some noise from real requests (but not very much, because it's
>> a very low traffic site).
>>
>> I don't know if that's OK, or if it's related to some dark behavior
>> in OpenBSD, or if I'm totally wrong :D
>>
>> Regards,
>>
>> Juanjo
>>
>
> --
> jjm's home: http://www.usebox.net/jjm/
> blackshell: http://blackshell.usebox.net/
> ramble on: http://rambleon.usebox.net/
>
> <handler_proxy_leak.patch>

--
Octality
http://www.octality.com/
_______________________________________________
Cherokee mailing list
[email protected]
http://lists.octality.com/listinfo/cherokee
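
For anyone reading this thread in the archive, here is a minimal sketch of why the order of the clean and close calls matters. It is a Python model, not Cherokee's C code: the CachedSocket class, its clean()/close() methods and the reconnect() helper are hypothetical names made up for illustration. The only assumption taken from the report above is that "clean" forgets the descriptor without releasing it, while "close" does nothing once the descriptor has been forgotten. It also assumes a POSIX system.

```python
import os
import socket


class CachedSocket:
    """Hypothetical stand-in for a cached backend connection."""

    def __init__(self):
        self.fd = -1

    def connect(self):
        # socketpair() stands in for connecting to the proxied application.
        ours, peer = socket.socketpair()
        peer.close()
        # Keep only the raw descriptor, as C code would.
        self.fd = ours.detach()

    def clean(self):
        # Resets the bookkeeping only; the descriptor is NOT released.
        self.fd = -1

    def close(self):
        # Nothing to do if the wrapper no longer knows its descriptor.
        if self.fd < 0:
            return
        os.close(self.fd)
        self.fd = -1


def is_open(fd):
    """Return True if fd still refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def reconnect(sock, close_first):
    """Drop the stale connection, then reconnect.

    Returns (leaked, old_fd), where leaked tells whether the old
    descriptor was still open after the teardown.
    """
    old_fd = sock.fd
    if close_first:
        sock.close()   # order used by the patch: close, then clean
        sock.clean()
    else:
        sock.clean()   # order described in the report: clean, then close
        sock.close()   # no-op, because fd is already -1
    # Check before reconnecting, since the kernel may reuse the number.
    leaked = is_open(old_fd)
    sock.connect()
    return leaked, old_fd
```

With close_first=False every reconnection leaves one descriptor behind, which would be consistent with the roughly one-fd-per-loop growth in the numbers above; with close_first=True the old descriptor is released before the wrapper forgets it.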
