Learned friends, my spamd is ill. It dies so often I have a cron job check it every three minutes. Over the past week it has averaged about one death per day, but it's not regular: on Saturday it died twice, an hour apart, but has been fine for the 36 hours since.
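
A process-list check only shows that spamd exists, not that it answers. Below is a minimal sketch of a check that instead sends the PING command from the spamd protocol and waits for a PONG reply. The function names, the ten-second timeout and the "SPAMC/1.2" version string are assumptions for illustration, and the exact reply line sent by 3.1.0 has not been verified here, so the parser accepts any "SPAMD/<version> 0 PONG".

```python
import socket


def is_pong(reply: bytes) -> bool:
    """Return True if the reply looks like 'SPAMD/<version> 0 PONG'."""
    parts = reply.decode("ascii", "replace").split()
    return (
        len(parts) >= 3
        and parts[0].startswith("SPAMD/")
        and parts[1] == "0"
        and parts[2] == "PONG"
    )


def spamd_alive(host: str = "127.0.0.1", port: int = 783,
                timeout: float = 10.0) -> bool:
    """Connect to spamd, send PING, and report whether a PONG came back.

    Any connection error or timeout counts as "not alive", which covers
    both a dead daemon and one that accepts nothing or never replies.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Assumed request line; the version string is illustrative.
            sock.sendall(b"PING SPAMC/1.2\r\n")
            # The reply is a single short line, so one small read suffices
            # for this sketch.
            reply = sock.recv(256)
    except OSError:
        return False
    return is_pong(reply)
```

A cron job could call spamd_alive() and restart spamd whenever it returns False, rather than only when the process is missing.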

Logs follow (apologies for the line length):

Dec 3 23:12:24 jess spamd[12331]: spamd: result: Y 27 - BAYES_99,DNS_FROM_RFC_ABUSE,HTML_50_60,HTML_MESSAGE,RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,URIBL_AB_SURBL,URIBL_JP_SURBL,URIBL_OB_SURBL,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL,URI_NOVOWEL scantime=6.7,size=52775,user=user1,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=45705,mid=<[EMAIL PROTECTED]>,bayes=1,autolearn=unavailable
[...many lines elided...]
Dec 3 23:13:15 host spamd[10689]: prefork: child states: BBBIIIIIIIIIIIIIIIIII
Dec 3 23:13:19 host spamd[2000]: spamd: clean message (0.8/5.0) for user2:0 in 4.5 seconds, 7454 bytes.
Dec 3 23:13:19 host spamd[2000]: spamd: result: . 0 - AWL,BAYES_50,FORGED_RCVD_HELO,HTML_MESSAGE,MIME_HTML_ONLY,UNPARSEABLE_RELAY scantime=4.5,size=7454,user=user2,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46290,mid=<[EMAIL PROTECTED]>,bayes=0.479493261214263,autolearn=no
Dec 3 23:13:19 host spamd[10689]: prefork: child states: BIBIIIIIIIIIIIIIIIIII
Dec 3 23:13:19 host spamd[1998]: spamd: clean message (4.8/5.0) for user3:0 in 5.6 seconds, 12978 bytes.
Dec 3 23:13:19 host spamd[1998]: spamd: result: . 4 - ALL_TRUSTED,AWL,BAYES_99 scantime=5.6,size=12978,user=user3,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46287,mid=<[EMAIL PROTECTED]>,bayes=1,autolearn=no
Dec 3 23:13:19 host spamd[4084]: spamd: clean message (0.8/5.0) for user4:0 in 5.3 seconds, 26114 bytes.
Dec 3 23:13:19 host spamd[4084]: spamd: result: . 0 - AWL,BAYES_50,FORGED_RCVD_HELO,HTML_MESSAGE,MIME_HTML_ONLY,UNPARSEABLE_RELAY scantime=5.3,size=26114,user=user4,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46292,mid=<[EMAIL PROTECTED]>,bayes=0.50000000000005,autolearn=no
Dec 3 23:13:19 host spamd[10689]: prefork: child states: IIBIIIIIIIIIIIIIIIIII
Dec 3 23:13:20 host spamd[10689]: prefork: child states: IIIIIIIIIIIIIIIIIIIII
Dec 3 23:13:20 host spamd[10689]: spamd: handled cleanup of child pid 12331 due to SIGCHLD
Dec 3 23:13:23 host spamc[12302]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused
Dec 3 23:13:24 host spamc[12302]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#2 of 3): Connection refused

These errors continued until the three-minute check noticed spamd's absence from the process list and restarted it at 23:15:

Dec 3 23:15:01 host spamc[13236]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused
Dec 3 23:15:02 host spamd[13232]: logger: removing stderr method
Dec 3 23:15:07 host spamd[13239]: spamd: server started on port 783/tcp (running version 3.1.0)
Dec 3 23:15:07 host spamd[13239]: spamd: server pid: 13239
Dec 3 23:15:07 host spamd[13239]: spamd: server successfully spawned child process, pid 13287
Dec 3 23:15:07 host spamd[13239]: spamd: server successfully spawned child process, pid 13288
[...18 more children spawned...]

Spamd is started as follows (IPs obfuscated):

/usr/bin/spamd --daemonize --sql-config --nouser-config --listen-ip=0.0.0.0 --allowed-ips=127.0.0.1,a.b.c.d,a.b.c.e --max-children=30 --min-spare=10 --max-spare=20

spamc is called from procmail, either running as the recipient or as root with "-u recipient".

One night last week, spamd stopped answering queries but remained alive. The three-minute sanity checker didn't see the need to restart it and 35,000 incoming messages went unchecked before I arrived in the morning.
That may not be related to this problem - it's just a grumble, and an explanation for why I'm feeling a bit sideways about spamd at the moment.

This three-minute checker, by the way, was originally written to slap dccifd 1.2 back into action. dccifd 1.2 was worse than spamd is now. I'm happy to say that dccifd 1.3 is much better, though I still check it every three minutes and kill it every 24 hours.

The box has 1.5G of RAM free, and oodles of empty disk. I don't think spamd's health concerns are environmental.

Because SA checks about 100,000 messages per day, I need to be selective about the debugging I turn on. Can you recommend a -D option that will help me diagnose this problem?

Many thanks in advance.

-- 
_________________________________________________________________________
Andrew Donkin
Waikato University, Hamilton, New Zealand