Hello,

I'm experiencing strange but very bad behavior with Mongrel 0.3.13.4 and Pound 1.8. Every 6 hours or so, one of our nine application servers (not consistently the same one; each runs several mongrel processes) starts leaving lots of socket connections with pound open. This leads to "Too many open files" errors.

I've set pound to close connections after 60 seconds (after analyzing our Rails logs and finding that all requests finish in under 60 seconds, and only 0.12% take over 1 second). Pound closes these, but mongrel apparently isn't getting the message, and I end up with a rapidly increasing number of socket connections on the app server, all left in the CLOSE_WAIT state (which is when the remote end, pound, has closed the connection and the OS is waiting for the app, mongrel, to close it on its end). Before I set Pound to kill these, one app server could effectively take out our web server because of "Too many open files".

The plagued app server still responds, and has low load, CPU usage, and memory usage. I can even access the mongrels on their individual ports. Nothing (that I could see) shows up in the mongrel log, and the Rails logs all show requests completing in a very timely manner. Therefore I don't think it's a code issue or a resource issue. But it's very odd and disturbing.
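
For anyone who wants to check for the same symptom, here is a minimal sketch of counting CLOSE_WAIT sockets per local port, so a leaking mongrel can be spotted by the port it listens on. The helper name close_wait_by_port is made up for illustration, and the parsing assumes the column layout printed by the Linux net-tools `netstat -tan` (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); other platforms may need different field positions.

```ruby
# Hypothetical helper (not part of Mongrel or Pound): counts sockets in
# CLOSE_WAIT per local port, given the text output of `netstat -tan`.
# Assumes the Linux net-tools layout:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
def close_wait_by_port(netstat_output)
  counts = Hash.new(0)
  netstat_output.each_line do |line|
    fields = line.split
    # Skip headers and anything that is not a TCP socket line.
    next unless fields.size >= 6 && fields[0].start_with?("tcp")
    next unless fields[5] == "CLOSE_WAIT"
    # The local address looks like "10.0.0.5:8000" (or ":::8000" for IPv6);
    # the port is whatever follows the last colon.
    port = fields[3].split(":").last.to_i
    counts[port] += 1
  end
  counts
end

# Possible usage on an app server (the threshold is an arbitrary example):
#   close_wait_by_port(`netstat -tan`).each do |port, n|
#     puts "port #{port}: #{n} sockets in CLOSE_WAIT" if n > 100
#   end
```

A count that keeps growing on one mongrel's port, while the others stay near zero, would point at the process to restart.
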
One other person (http://poocs.net/articles/2006/03/27/the-adventures-of-scaling-stage-3) had a similar problem with lighttpd and FastCGI, so I'm not sure whether the problem is Mongrel or maybe just Ruby. That person just set up something to restart the plagued FastCGI processes, and that's what I'm doing for Mongrel right now.

Has anyone else had this problem? Is there something I can do to fix it? Thanks so much for any insight.

-John Butler
