I'm not sure whether this is relevant to the issue, but I figured I'd
throw it out there in case it is.
I added a new server to our production environment running a build of
mongrel2 with my fix for filters getting the CLOSE transition. My filter
increments a counter on the HANDLER event and decrements it on the CLOSE
event, and I then send that count to statsd. Looking at my stats, I can
see that CLOSE is happening much more frequently than HANDLER, so it
seems the same connection is getting closed multiple times.
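
To make the counting logic concrete, here is a minimal sketch of it in
Python. It is not the filter itself and does not use mongrel2's filter
API: the function name audit_events, the (event, connection id) tuples,
and the connection ids are all hypothetical, made up for illustration.
It keeps the same counter (up on HANDLER, down on CLOSE) and also
records which connections saw a second CLOSE, which is one way to check
the "closed multiple times" theory rather than infer it from the total.

```python
from collections import Counter


def audit_events(events):
    """Replay (event_name, conn_id) pairs and audit the CLOSE events.

    Hypothetical helper for illustration only. Returns a tuple of:
      gauge    -- the running count (HANDLER adds 1, CLOSE subtracts 1),
                  i.e. the value that would be reported to statsd
      repeated -- Counter of conn_id -> number of extra CLOSE events
                  seen after that connection had already been closed
    """
    gauge = 0
    closed = set()        # connections that have already seen a CLOSE
    repeated = Counter()  # extra CLOSE events per connection

    for name, conn_id in events:
        if name == "HANDLER":
            gauge += 1
            # Assumption: a new HANDLER means the id is live again.
            closed.discard(conn_id)
        elif name == "CLOSE":
            gauge -= 1
            if conn_id in closed:
                repeated[conn_id] += 1
            else:
                closed.add(conn_id)

    return gauge, repeated


if __name__ == "__main__":
    sample = [
        ("HANDLER", 1), ("CLOSE", 1), ("CLOSE", 1),  # conn 1 closed twice
        ("HANDLER", 2), ("CLOSE", 2),                # conn 2 closed once
    ]
    gauge, repeated = audit_events(sample)
    print(gauge, dict(repeated))
```

With the sample above the counter ends below zero and connection 1 is
flagged. Note that a negative total on its own does not prove a double
close: a connection that never reaches a handler would presumably also
produce a CLOSE with no matching HANDLER, so tracking per connection is
the safer check.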
-Rob
On 6/22/12 1:40 PM, Rob LaRubbio wrote:
> Thanks for looking into this.
>
> We aren't using websockets or proxies, just mongrel2 and Tir. We have
> 4 mongrel2 servers behind a load balancer; each has 300 handlers. The
> handlers are not shared across servers (I have a pull request into Tir
> to make it easier to run Tir on a server other than mongrel2).
>
> ==== mongrel2.conf =====
> houston = Filter(
>     name="/opt/mongrel2-1.8-dev/lib/mongrel2/filters/houston.so",
>     settings = {
>         <removed>
>     }
> )
>
> apollo = Handler(send_spec='tcp://127.0.0.1:9999',
>     send_ident='38f857b8-cbaa-4b58-9271-0d36c27813c4',
>     recv_spec='tcp://127.0.0.1:9998', recv_ident='',
>     protocol='tnetstring')
>
> static = Dir(base='static/',
>     index_file='index.html',
>     default_ctype='text/plain')
>
> main = Server(
>     uuid="505417b8-1de4-454f-98b6-07eb9225cca1",
>     access_log="/logs/access.log",
>     error_log="/logs/error.log",
>     chroot="/opt/mongrel2-1.8-dev",
>     default_host="(.+)",
>     name="main",
>     pid_file="/run/mongrel2.pid",
>     port=6767,
>     hosts = [
>         Host(name="(.+)",
>             routes={ '/(.*/.*)': apollo,
>                      '/([^/]*)$': static })
>     ],
>     filters = [
>         houston
>     ]
> )
>
> settings = {
>     "limits.content_length": 20480000
> }
>
> On 6/22/12 1:11 PM, Tordek wrote:
>> On 22/06/12 13:12, Rob LaRubbio wrote:
>>> Is the dev branch ready for a release? We're running it in
>>> production, and at least three times a week it starts spinning and
>>> writing this to the logs in an endless loop:
>>>
>>> Fri, 22 Jun 2012 16:04:50 GMT [ERROR] (src/task/fd.c:217: errno:
>>> None) Attempt to wait on a dead socket/fd: (nil) or -1
>>>
>>> The server fills up a 500G disk in about 11 hours and we need to
>>> kill the server to get it handling requests again.
>> Jason and I are looking into this; could you show us your
>> mongrel2.conf? Are you using websockets or proxies?
>>
>>> -Rob
>
>