Hello

The ViewMTN server running on www.ada-france.org (revision
ff6f92608b076dabc1da2f37b4aa326f47a8a7eb) leaks file descriptors and
eventually stops responding.  The log file ends with the following
stack trace:

xxx.xxx.xxx.xxx:yyyyy - - [15/Mar/2009 01:03:45] "HTTP/1.1 GET /branch/changes/org.debian.libxmlada2/from/10/to/20" - 500 Internal Server Error
http://xxx.xxx.xxx.xxx:xxxx/
Traceback (most recent call last):
  File "/var/lib/monotone/net.angrygoats.viewmtn/viewmtn.py", line 173, in ?
  File "/var/lib/monotone/net.angrygoats.viewmtn/web/request.py", line 153, in run
  File "/var/lib/monotone/net.angrygoats.viewmtn/web/wsgi.py", line 54, in runwsgi
  File "/var/lib/monotone/net.angrygoats.viewmtn/web/httpserver.py", line 222, in runsimple
  File "/var/lib/monotone/net.angrygoats.viewmtn/web/wsgiserver/__init__.py", line 869, in start
  File "/var/lib/monotone/net.angrygoats.viewmtn/web/wsgiserver/__init__.py", line 896, in tick
  File "socket.py", line 161, in accept
socket.error: (24, 'Too many open files')


The process itself is still running, though.  When I run
"lsof -p $pid_of_viewmtn" I see hundreds of entries like these:

COMMAND   PID     USER   FD   TYPE   DEVICE    SIZE     NODE NAME
python  20746 monotone    5u  IPv4 36327705              TCP sd-12156:tproxy->crawl-66-249-71-208.googlebot.com:61384 (CLOSE_WAIT)
python  20746 monotone    6u  IPv4 36328012              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:44377 (CLOSE_WAIT)
python  20746 monotone    7u  IPv4 36328029              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:54333 (CLOSE_WAIT)
python  20746 monotone    8u  IPv4 36328034              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:57328 (CLOSE_WAIT)
python  20746 monotone    9u  IPv4 36328375              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:60411 (CLOSE_WAIT)
python  20746 monotone   10u  IPv4 36328397              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:34835 (CLOSE_WAIT)
python  20746 monotone   11u  IPv4 36328398              TCP sd-12156:tproxy->crawl-66-249-71-208.googlebot.com:43420 (CLOSE_WAIT)
python  20746 monotone   12u  IPv4 36328410              TCP sd-12156:tproxy->llf520097.crawl.yahoo.net:45122 (CLOSE_WAIT)

Right now there are 885 such open sockets, most of which are from web
spiders.  The process has reached its limit of 1024 open file
descriptors and is unresponsive (the other file descriptors include
sockets and established connections).
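
For anyone who wants to reproduce the count, here is a rough sketch of
how the lsof output could be tallied by TCP state.  The function name
count_tcp_states is my own invention for illustration and is not part
of ViewMTN; it only assumes that lsof prints the state in parentheses
at the end of each entry, as in the listing above.

```python
import re
from collections import Counter

# lsof prints the TCP state in parentheses, e.g. "(CLOSE_WAIT)".
# Only upper-case names are matched, so "(deleted)" is ignored.
STATE_RE = re.compile(r"\(([A-Z_0-9]+)\)")


def count_tcp_states(lsof_output):
    """Return a Counter mapping each TCP state name to its number of entries.

    Works on the raw text of `lsof -p PID`, whether or not long
    lines have been wrapped, because it scans the whole text.
    """
    return Counter(STATE_RE.findall(lsof_output))
```

Comparing the CLOSE_WAIT count against the output of "ulimit -n" for
the user running the server shows how close the process is to its
descriptor limit.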

I was briefly tempted to write a couple of firewall rules but I
realize that this is futile; I cannot reliably distinguish between a
spider and a human connection.

According to [1], this looks like a bug whereby ViewMTN fails to
actually close its end of a connection after being notified that the
client has closed the other end.  A socket stays in CLOSE_WAIT until
the local application closes it.

[1] http://www.sunmanagers.org/pipermail/summaries/2006-January/007068.html
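
To illustrate what I think is missing, here is a minimal sketch of the
usual pattern for a connection handler.  This is not ViewMTN's or
web.py's actual code, and handle_connection is a hypothetical name; it
only shows that an empty read means the peer has closed, and that the
server must then close its own socket to release the descriptor.

```python
import socket


def handle_connection(conn):
    """Read from conn until the peer closes, then release the descriptor.

    An empty result from recv() means the peer has shut down its side.
    On a TCP socket the local end is then in CLOSE_WAIT, and it stays
    there until this process calls close().
    """
    received = []
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break  # peer closed its end of the connection
            received.append(chunk)
    finally:
        # Always close, even if recv() raised, so the descriptor
        # is returned to the process.
        conn.close()
    return b"".join(received)
```

If the handler skipped the close() in the finally clause, each
finished request would leave one descriptor behind, which would match
the symptoms above.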

Thoughts?

-- 
Ludovic Brenta.

_______________________________________________
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel
