On Mon, May 25, 2009 at 01:38:41AM +0200, Eduard Bloch wrote:
> > I have noticed that acng eats all the virtual memory available
> > to a process after some days of work, and it starts
> > to return 503 to all requests.
> > It spawns a lot of threads and keeps them running.
> 
> How many exactly? (ps -L ...)

UID        PID  PPID   LWP  C NLWP    SZ   RSS PSR STIME TTY      STAT   TIME CMD
119      26449     1 26449  1  377 782533 10164  1 07:51 ?        Ssl    0:10 /usr/sbin/apt-cacher-ng -c /etc/apt-cacher-ng pidfile=/var/run/apt-cacher-ng/pid SocketPath=/var/run/apt-cacher-ng/socket foreground=0

3 GB of virtual address space divided by the 8 MB thread stack size gives
about 384 threads, which is roughly the 377 shown above.
The process uses up all of the available VM.
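
As a quick sanity check of that estimate, here is the arithmetic as a small sketch. The 3 GiB address space and the 4 KiB page size are assumptions about this i386 box, and maxThreads and the constant names are made up for illustration only.

```cpp
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Upper bound on the number of threads if each one reserves `stackBytes`
// of address space for its stack and nothing else is mapped.
constexpr std::uint64_t maxThreads(std::uint64_t addressSpaceBytes,
                                   std::uint64_t stackBytes) {
    return addressSpaceBytes / stackBytes;
}

// Assumed figures: 3 GiB of user address space, 8 MiB per thread stack.
static_assert(maxThreads(3072 * MiB, 8 * MiB) == 384, "3 GiB / 8 MiB");

// SZ in the ps output is counted in pages; 4 KiB pages are an assumption.
constexpr std::uint64_t szBytes = 782533ull * 4096ull;
static_assert(szBytes / MiB == 3056, "SZ is about 3056 MiB");

// 377 stacks of 8 MiB account for 3016 MiB of those 3056 MiB.
static_assert(377ull * 8ull == 3016ull, "377 threads * 8 MiB");
```

Under those assumptions the thread stacks alone explain nearly all of the SZ value.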

> What exactly is in the HTTP status line (after 503)?
> 
> Further, reaching thread limit would have different symptoms (not
> throwing 503... just grep for "503", it's not used in conserver.cc at all).

Quote from telnet session.

  $ telnet localhost 3142
  Trying 127.0.0.1...
  Connected to localhost.localdomain.
  Escape character is '^]'.
  GET http://ftp.ru.debian.org/debian/dists/lenny/main/binary-i386/Packages.gz HTTP/1.1
  Host: ftp.ru.debian.org
  
  HTTP/1.1 503 Server overload, try later
  Date: Mon May 25 04:07:05 2009
  Server: Debian Apt-Cacher NG/0.3.12
  X-Original-Source: debrep/dists/lenny/main/binary-i386/Packages.gz
  
  Connection closed by foreign host.

Tail of log.

  Tue May 12 17:51:25 2009|Error resolving ftp.fi.debian.org: 503 DNS error for hostname ftp.fi.debian.org: Temporary failure in name resolution
  Tue May 12 17:51:39 2009|Error resolving volatile.debian.org: 503 DNS error for hostname volatile.debian.org: Temporary failure in name resolution
  Tue May 12 17:51:39 2009|Error resolving security.debian.org: 503 DNS error for hostname security.debian.org: Temporary failure in name resolution
  Tue May 12 17:51:39 2009|Error resolving ftp.fi.debian.org: 503 DNS error for hostname ftp.fi.debian.org: Temporary failure in name resolution
  Tue May 12 17:52:05 2009|Error resolving backports.org: 503 DNS error for hostname backports.org: Temporary failure in name resolution
  Tue May 12 17:52:19 2009|Error resolving security.debian.org: 503 DNS error for hostname security.debian.org: Temporary failure in name resolution
  Tue May 12 17:52:19 2009|Error resolving ftp.fi.debian.org: 503 DNS error for hostname ftp.fi.debian.org: Temporary failure in name resolution
  Tue May 12 17:52:45 2009|Error resolving backports.org: 503 DNS error for hostname backports.org: Temporary failure in name resolution
  Tue May 12 17:52:59 2009|Error resolving security.debian.org: 503 DNS error for hostname security.debian.org: Temporary failure in name resolution
  Tue May 12 17:52:59 2009|Error resolving ftp.fi.debian.org: 503 DNS error for hostname ftp.fi.debian.org: Temporary failure in name resolution


> However, your problem might be somehow connected to 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=529744 . That problem
> looks like many downloader objects not being released (according to
> pipe/socket ratio) which might also be caused by hanging user connection
> threads. And receiving two heavy bug reports within one week after no such
> problem has been reported for months, that's very suspicious.

> I just don't have a good idea yet. Version 0.3.12 was released few
> minutes ago and should appear on incoming.debian.org now. It adds proper
> handling for EINTR on close(). Please take that one for further
> tests. If the problem disappears -> great, if not: please provide thread
> count and status of file handles (lsof) and last lines of apt-cacher.err
> file.

I did not find any descriptor leaks.

  $ ls -l /proc/26449/fd
  total 0
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 0 -> /dev/null
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 1 -> /dev/null
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 2 -> /dev/null
  l-wx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 3 -> /var/log/apt-cacher-ng/apt-cacher.err
  l-wx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 4 -> /var/log/apt-cacher-ng/apt-cacher.log
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 5 -> socket:[56579922]
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 6 -> socket:[56579924]
  lrwx------ 1 apt-cacher-ng apt-cacher-ng 64 2009-05-25 07:55 7 -> socket:[56579925]


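A patch for the thread leak is attached. It makes each worker thread
decrement nSpareThreads itself once it wakes up, instead of leaving that
to SetupConAndGo.
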
commit e03531102c06f4cbac29e1968b3513d842e3ea75
Author: Alexander Inyukhin <shur...@sectorb.msk.ru>
Date:   Mon May 25 07:44:16 2009 +0400

    Fix race

diff --git a/source/conserver.cc b/source/conserver.cc
index a89a138..0dfd824 100644
--- a/source/conserver.cc
+++ b/source/conserver.cc
@@ -66,10 +66,11 @@ void * ThreadAction(void *)
 			break;
 
 		if(myq.empty())
-		{ // to be decreased by the pool client
+		{
 			nSpareThreads++;
 			while (myq.empty())
 				cond.wait();
+			nSpareThreads--;
 		}
 				
 		con *c=myq.front();
@@ -138,9 +139,7 @@ void SetupConAndGo(int fd, const char *szClientName=NULL)
 		goto failure_mode;
 	}
 
-	if (nSpareThreads>0)
-		nSpareThreads--; // one is ours
-	else if(!SpawnThread(0))
+	if (nSpareThreads==0 && !SpawnThread(0))
 		goto failure_mode;
 	
 	qForWorker.push_back(c);
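
To illustrate the accounting the patch moves to, here is a minimal standalone sketch using std::thread instead of acng's own wrappers. MiniPool and all of its members are hypothetical names, not acng code. My reading of the old behaviour (an assumption, not something verified in a debugger) is that a busy worker could grab the queued connection before the signalled one woke up, so the decrement done on the idle worker's behalf was lost, the counter stayed at zero while a thread was still idle, and every new connection spawned another thread.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical miniature of the pool; not the actual acng classes.
class MiniPool {
public:
    ~MiniPool() {
        {
            std::lock_guard<std::mutex> lk(mx_);
            stopping_ = true;
        }
        cond_.notify_all();
        for (auto &t : threads_) t.join();
    }

    // Rough counterpart of SetupConAndGo: spawn only if nobody is idle.
    // The dispatcher reads the counter but never modifies it.
    void submit(std::function<void()> job) {
        std::lock_guard<std::mutex> lk(mx_);
        if (spare_ == 0)
            threads_.emplace_back([this] { worker(); });
        queue_.push_back(std::move(job));
        cond_.notify_one();
    }

    int spare() { std::lock_guard<std::mutex> lk(mx_); return spare_; }
    int completed() { std::lock_guard<std::mutex> lk(mx_); return completed_; }
    std::size_t threadCount() {
        std::lock_guard<std::mutex> lk(mx_);
        return threads_.size();
    }

private:
    // Rough counterpart of ThreadAction: the waiter owns both the
    // increment and the decrement, all under the same lock.
    void worker() {
        std::unique_lock<std::mutex> lk(mx_);
        for (;;) {
            if (queue_.empty() && !stopping_) {
                ++spare_;
                while (queue_.empty() && !stopping_)
                    cond_.wait(lk);
                --spare_;
            }
            if (queue_.empty())
                return;  // only reached when stopping with nothing left
            std::function<void()> job = std::move(queue_.front());
            queue_.pop_front();
            lk.unlock();
            job();  // run the job without holding the lock
            lk.lock();
            ++completed_;
        }
    }

    std::mutex mx_;
    std::condition_variable cond_;
    std::deque<std::function<void()>> queue_;
    std::vector<std::thread> threads_;
    int spare_ = 0;
    int completed_ = 0;
    bool stopping_ = false;
};
```

Submitting two jobs one after the other, waiting for the worker to go idle in between, should leave threadCount() at 1 because the second job reuses the idle thread. One trade-off of this scheme is that two connections arriving before the idle worker wakes up both see a nonzero counter, so the second one queues behind the first instead of getting its own thread.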
