Linda Walsh wrote:
Matt Kettler wrote:
The fact that they keep lying around is a problem. This suggests SA
keeps getting killed before the expire can complete. Do you have any
kind of limits set such as CPU time or memory that SA might be
running against and dying?
You can try kicking off an expire manually using sa-learn
--force-expire (add -D if you want some debug output).
Note: this could run for a long time, particularly if bayes_toks is
really large.
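
If you'd rather script the manual run, here is a minimal Python sketch. It assumes only that sa-learn is on the PATH; the helper names build_expire_command and run_expire are made up for illustration. No timeout is applied, since the run may take a long time.

```python
import subprocess


def build_expire_command(debug=False):
    """Return the argv list for a manual Bayes expiry run."""
    cmd = ["sa-learn", "--force-expire"]
    if debug:
        cmd.append("-D")  # debug output, as suggested above
    return cmd


def run_expire(debug=False):
    """Run the expiry and return sa-learn's exit status.

    No timeout is passed on purpose: the run can take a long time
    when bayes_toks is large.
    """
    return subprocess.run(build_expire_command(debug)).returncode
```

Calling run_expire(debug=True) should behave the same as typing the command by hand, so it needs to run as the user whose Bayes files are affected.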
----
Another one of the files appeared -- 17M long,
while bayes_toks is 8.8M.
That's quite reasonable. Should be able to crank through that reasonably
quickly. It strikes me as odd that the .expire got larger than the toks
file, but I might be missing something about how the process works.
auto-whitelist is 78M -- that seems a bit excessive...
Yeah, the AWL has no expire process, other than manually running the
check-whitelist script on it. (found at:
http://svn.apache.org/repos/asf/spamassassin/branches/3.2/tools/ )
Don't know what really large means -- bayes_toks isn't that large
compared to some of the other files. No limits that I know of...
Ahh...seeing some oddness in the log though:
(Interrupted? timeouts?...weird...)...
Jul 25 15:24:48 Ishtar spamd[2443]: bayes: cannot open bayes databases
/home/user/.spamassassin/bayes_* R/W: lock failed: Interrupted system
call
Jul 25 15:28:21 Ishtar spamd[2355]: bayes: expire_old_tokens: child
processing timeout at /usr/bin/spamd line 1085, <GEN6> line 22.
The R/W lock fails are not surprising at all. If an expire process is
running, other scanners will fail to get a write-lock on the bayes DB.
That's not really that big a deal, all it means is that autolearning
can't occur during the expire run. Rather than logjam your mail queue,
SA just moves on and skips the autolearning, treating it as better to
process mail in a timely fashion than to wait to perform every automatic
learning possible. Same thing happens when two messages try to autolearn
at the same time: only one gets the R/W lock, and the other moves on.
Now if it can't get a R/O lock (read only), start to worry. But R/W lock
failures happen fairly often.
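
To illustrate the "skip rather than wait" policy described above, here is a rough Python sketch. It is not how SA's locking is implemented (SA is Perl and has its own locker, which may retry for a while); fcntl.flock and the name try_autolearn are stand-ins chosen for illustration, and the sketch tries exactly once.

```python
import fcntl


def try_autolearn(lock_path, learn):
    """Call learn() only if an exclusive lock is free; never wait."""
    with open(lock_path, "a") as lock_file:
        try:
            # LOCK_NB makes the attempt fail immediately instead of blocking.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # someone else holds the lock: skip learning
        try:
            learn()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A False return here corresponds to the harmless case: the message still gets scanned, it just isn't learned from.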
The child processing timeouts are a bigger deal; that's what's causing
all the expire files to be left around. I can only guess this is caused
by the fact that --timeout-child defaults to 300 seconds. However, I
wouldn't expect that to apply to expire runs... Of course I'm no expert
here, as I don't use spamd (I use MailScanner, which loads the API
directly).
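
To see how often each kind of message shows up, something like the following Python sketch could be run over the mail log. The patterns are taken from the two log lines quoted above; the function name count_bayes_events and the category labels are made up here, and the log file location is left to you since it varies by system.

```python
import re
from collections import Counter

# Patterns based on the two spamd messages quoted in this thread.
PATTERNS = {
    "rw_lock_failed": re.compile(
        r"bayes: cannot open bayes databases .* R/W: lock failed"
    ),
    "expire_timeout": re.compile(
        r"bayes: expire_old_tokens: child processing timeout"
    ),
}


def count_bayes_events(lines):
    """Count R/W lock failures and expire timeouts in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing an open log file to count_bayes_events gives a quick picture: many rw_lock_failed entries are expected, while each expire_timeout entry likely corresponds to a leftover expire file.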
What version are you running? Reading around, the child processing
timeout seems to have been a common problem in the 3.1.x series, but
I've not seen it reported in the 3.2.x series.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4650