Hi,

Some time ago I started running into problems with amavisd hanging on a server that I maintain. In the typical case, everything works fine for several hours, after which one child hangs. The system then continues to run in this diminished fashion until the second child hangs. After that, Postfix starts reporting the following on all connection attempts:

    status=deferred (delivery temporarily suspended: connect to 127.0.0.1 [127.0.0.1]: read timeout)

At this point no further mail is processed until daemons are restarted. The current configuration (aside from security updates) worked flawlessly for many months before these problems started. I have pretty much exhausted my ideas on how to proceed and decided maybe it was time to beg for help.

I have increased the amavisd debug level and it is clear that the problem results after 'CALLING SA check', from which amavis never returns. When this occurs, the child appears to be in a sleep state. An strace shows the following.

    elm:~# strace -p 7002
    Process 7002 attached - interrupt to quit
    recvfrom(11,

A netstat shows the connections are in a CLOSE_WAIT state. The lsof output includes the following (I have the full output available if needed).

    amavisd-n 7002 amavis 3u unix 0xf1cea6a0 91865013 socket
    amavisd-n 7002 amavis 4u IPv4 91865015 TCP localhost:10024 (LISTEN)
    amavisd-n 7002 amavis 5u REG 8,3 6399 5227052 /var/lib/amavis/amavis-20060728T101157-07002/email.txt
    amavisd-n 7002 amavis 6u IPv4 92321291 UDP *:41070
    amavisd-n 7002 amavis 7u IPv4 92321247 TCP localhost:10024->localhost:58311 (CLOSE_WAIT)

I have looked at several of the lsof reported email.txt files and haven't seen anything interesting there. I have run a manual expiry on the Bayes database several times. I have tried disabling AWL. I have tried disabling all plugins in init.pre. No help there.

I have also tried running amavisd via 'amavis debug-sa'. However when I do this it usually takes at least a day for the first child to hang, leaving a mountain of data to sift through. The only time I managed to get both children to fail in this mode, the last debug messages were as follows.

    debug: leaving helper-app run mode
    debug: Running tests for priority: 500

I was unable to draw any useful conclusions from this information. I didn't see any other obvious problems being reported in the debug-sa output.

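The hung state described above leaves a recognizable trace in the lsof output: a child holding a connection on localhost:10024 in CLOSE_WAIT. As a rough illustration, that check could be scripted. This is only a minimal Python sketch, assuming lsof output in exactly the column layout quoted above (other lsof versions may print additional columns that this pattern does not handle). The function name close_wait_pids and the regular expression are hypothetical helpers, not part of amavisd-new, Postfix, or lsof.

```python
import re

# Assumed line layout (copied from the lsof excerpt in this post):
# amavisd-n 7002 amavis 7u IPv4 92321247 TCP localhost:10024->localhost:58311 (CLOSE_WAIT)
LSOF_TCP = re.compile(
    r"^(?P<cmd>\S+)\s+(?P<pid>\d+)\s+\S+\s+\S+\s+IPv4\s+\S+\s+TCP\s+"
    r"(?P<local>\S+)->(?P<remote>\S+)\s+\((?P<state>[A-Z_0-9]+)\)\s*$"
)


def close_wait_pids(lsof_text, local_port=10024):
    """Return sorted PIDs holding a CLOSE_WAIT connection on local_port."""
    pids = set()
    for line in lsof_text.splitlines():
        m = LSOF_TCP.match(line.strip())
        if not m:
            continue  # not an established-style TCP line (e.g. LISTEN, unix, REG)
        if m.group("state") != "CLOSE_WAIT":
            continue
        # The local endpoint looks like "localhost:10024"; compare the port part.
        if m.group("local").rsplit(":", 1)[-1] == str(local_port):
            pids.add(int(m.group("pid")))
    return sorted(pids)


# Usage with the lines quoted in this post:
SAMPLE = """\
amavisd-n 7002 amavis 3u unix 0xf1cea6a0 91865013 socket
amavisd-n 7002 amavis 4u IPv4 91865015 TCP localhost:10024 (LISTEN)
amavisd-n 7002 amavis 6u IPv4 92321291 UDP *:41070
amavisd-n 7002 amavis 7u IPv4 92321247 TCP localhost:10024->localhost:58311 (CLOSE_WAIT)
"""
print(close_wait_pids(SAMPLE))
```

A script like this only reports which children look stuck; it does not explain why the SpamAssassin call never returns.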
The server in question is running Debian stable with the standard, up-to-date packages.

    amavisd-new    20030616p10-5
    spamassassin   3.0.3-2sarge1
    postfix        2.1.5-9
    razor          2.670-1sarge2
    pyzor          0.4.0+cvs20030
    dcc            1.2.74-2
    kernel-image   2.4.27-10sarge

Any help would be greatly appreciated. I would really like to find a better solution than hourly postfix/amavisd restarts via cron.

Thanks.

Jim