Hi Folks,

it's been a while since I asked here how to solve Bayes timeout and spamd child timeout problems. Well, at least for our environment I have found a solution that seems to work. I also have a theory about the reason for these timeouts, and I'd like to know whether it is correct.

Symptoms:

child processing timeout at spamd line 1086, <GEN786> line 108.
child processing timeout at spamd line 1086, <GEN73> line 209.
...
bayes: child processing timeout at spamd line 1086.

Reason:

spamc timeout set to 220s
spamd timeout set to 240s
procmail timeout set to 300s

First I did what everybody suggested: I disabled bayes_auto_expire in local.cf and ran the expiry manually per user. I wrote a script that extracted from the maillog the users with a scantime of more than 220s and ran sa-learn -u $user --force-expire --sync for each of them. The problem stayed unsolved.

Then I raised the timeouts to more than twice their previous values. Result: for nearly 2 days I had no timeout errors. Then I checked the logs once more and saw a lot of users with scantimes well above 300s but below the new timeout values. Those were users who had never before shown up in my logs with such high scantimes. I then basically ran --force-expire --sync the whole day.

I realized that the manual force-expire job was not practical for 2700 users and a 2.5GB Bayes DB in MySQL (MyISAM engine). I also realized that running --force-expire manually would probably mess up some or most of the users' Bayes DBs.

I changed back to bayes_auto_expire 1 in local.cf and restarted spamd. This is what happened next for a number of users:

bayes: expire_old_tokens: child processing timeout at spamd line 1086,

This was on Tuesday, March 14. Since then I have had no more problems with spamd child timeouts.

I have not looked into the spamd code, and I think I shouldn't, as I am no Perl coder. Nevertheless I have a theory why the short timeout values could have such a heavy impact: if the timeouts are too short, spamd under some circumstances cannot finish the Bayes expire job when bayes_auto_expire is enabled in local.cf. I hope I correctly understand the expire job as a database cleanup job.
Thus, if it can't be finished, it turns from a cleanup job into a mess-up job; the problem gets worse, or at least stays as bad as it is.

Now I hope that by changing the MySQL engine from MyISAM to InnoDB, which supports transactions and is suggested by the SpamAssassin people in the Bayes manpages, the expire job gets finished even if spamd suffers a timeout.

Your comments?

Philipp
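
P.S. A minimal sketch, in Python, of the kind of log-scanning script described above. It is an illustration only, not the original script. It assumes that spamd writes its result lines with comma-separated key=value fields such as scantime=231.4 and user=alice; if your log format differs, the two regular expressions need adjusting. The function names and the dry_run switch are made up for this example.

```python
import re
import subprocess

# Assumption: spamd "result:" log lines carry fields like
# "scantime=231.4,size=2015,user=alice,...". Adjust if your logs differ.
SCANTIME_RE = re.compile(r"\bscantime=(?P<secs>\d+(?:\.\d+)?)")
USER_RE = re.compile(r"\buser=(?P<user>[^,\s]+)")


def slow_users(log_lines, threshold=220.0):
    """Return a sorted list of users with at least one scan slower than threshold seconds."""
    users = set()
    for line in log_lines:
        scantime = SCANTIME_RE.search(line)
        user = USER_RE.search(line)
        if scantime and user and float(scantime.group("secs")) > threshold:
            users.add(user.group("user"))
    return sorted(users)


def expire_command(user):
    """Build the sa-learn call from the post for a single user."""
    return ["sa-learn", "-u", user, "--force-expire", "--sync"]


def expire_users(users, dry_run=True):
    """Print the commands (dry_run=True) or actually run them one by one."""
    for user in users:
        cmd = expire_command(user)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=False: one failing user should not stop the whole loop.
            subprocess.run(cmd, check=False)
```

To try it, one could pass an open log file to slow_users(), for example the maillog (its path depends on the system), and hand the result to expire_users() with dry_run=True first to see which commands would be run. With 2700 users and a large Bayes DB, running the expiry this way can take a very long time, which is exactly the problem described above.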