On Apr 21, 2008, at 8:40 AM, Chris St. Pierre wrote:
On Mon, 21 Apr 2008, Michael Parker wrote:
select * from bayes_vars;
...
2289 rows in set (0.00 sec)
What user do you run bayes under on your MXs?
I think you've found the issue. We run as spamd.
# sa-learn -u spamd --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 1492123 0 non-token data: nspam
0.000 0 660634 0 non-token data: nham
0.000 0 73178711 0 non-token data: ntokens
0.000 0 1189775610 0 non-token data: oldest atime
0.000 0 1208785034 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
That leads to two issues:
1. I need to straighten things out and figure out why I've got a
strange mix of per-user and global data in my Bayes DB. Whee.
If you want a global database, you should set the Bayes override
username (bayes_sql_override_username) and then just run
sa-learn -u <username> --clear for everything else (PITA, I know).
Personally, I don't believe individual Bayes DBs are an issue if you've
got the space and CPU on your database machine. See below for some
solutions.
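
The cleanup described above could be sketched roughly as follows. This is only an illustration: it assumes the global data lives under the username spamd, and the function name, GLOBAL_USER, and DRY_RUN are hypothetical names, not SpamAssassin settings. By default it only prints the sa-learn commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: clear Bayes data for every user except the global one.
# Reads usernames, one per line, on stdin.
GLOBAL_USER="${GLOBAL_USER:-spamd}"   # assumed name of the global Bayes user
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands only, 0 = run them

clear_other_users() {
    while IFS= read -r user; do
        # Skip blank lines and the account that holds the global data.
        [ -z "$user" ] && continue
        [ "$user" = "$GLOBAL_USER" ] && continue
        if [ "$DRY_RUN" = "1" ]; then
            printf 'sa-learn -u %s --clear\n' "$user"
        else
            # Keep sa-learn from consuming the username list on stdin.
            sa-learn -u "$user" --clear </dev/null
        fi
    done
}
```

The username list could come from the bayes_vars table, for example with something like mysql -N -B -e 'SELECT username FROM bayes_vars' DBNAME | clear_other_users, where DBNAME stands in for whatever the Bayes database is actually called.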
2. Does this mean that, if I use per-user Bayes, I have to run
expiration as each user individually?
Manual expiration was recommended to me a long time ago as a way to
increase database performance, but it seems like it may not be worth
it if I have to run N forced expirations, for potentially large values
of N.
This is true for DBM-based Bayes databases, but generally (with an
exception I'll talk about in a second) MySQL-based Bayes expiration is
very fast (just a few seconds). I would go ahead and turn auto-expire
on, after running a manual expire to clear out the current backlog.
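
As a sketch of those two steps: the backlog can be cleared once with sa-learn -u spamd --force-expire, and auto-expire is controlled by the bayes_auto_expire option in the SpamAssassin configuration. The helper below is a hypothetical convenience function, not part of SpamAssassin; it assumes the config file path is passed as its only argument and makes sure that file contains bayes_auto_expire 1.

```shell
#!/bin/sh
# Hypothetical helper: ensure "bayes_auto_expire 1" is set in a config file.
# Usage: enable_auto_expire /path/to/local.cf   (path is an assumption)
# Run "sa-learn -u spamd --force-expire" once beforehand to clear the backlog.
enable_auto_expire() {
    cf="$1"
    if grep -q '^[[:space:]]*bayes_auto_expire[[:space:]]' "$cf"; then
        # An existing setting is rewritten via a temp file for portability.
        sed 's/^[[:space:]]*bayes_auto_expire[[:space:]].*/bayes_auto_expire 1/' \
            "$cf" > "$cf.tmp" && mv "$cf.tmp" "$cf"
    else
        # No setting present yet, so append one.
        echo 'bayes_auto_expire 1' >> "$cf"
    fi
}
```

spamd would need to be restarted (or reloaded) afterwards for the changed setting to take effect.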
One reason that expiration slows down is an unoptimized DB. I've
found that, for my small uses, if I run optimization every couple of
weeks I get much better performance. It looks like you get a lot more
traffic, so I would recommend running it more often. With frequent
optimizations and auto-expire, your database will stay in much better
shape.
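
One way the periodic optimization might be scripted, assuming "optimization" here means MySQL's OPTIMIZE TABLE run against the standard SpamAssassin Bayes tables. The function name is hypothetical; it only prints the SQL, so the statement can be reviewed before it is piped to the mysql client.

```shell
#!/bin/sh
# Hypothetical helper: print an OPTIMIZE TABLE statement for the Bayes tables.
# Table names are the ones from the stock SpamAssassin MySQL schema.
bayes_optimize_sql() {
    printf 'OPTIMIZE TABLE %s;\n' \
        "bayes_token, bayes_seen, bayes_vars, bayes_expire"
}
```

From cron this could be run as bayes_optimize_sql | mysql DBNAME, with DBNAME again standing in for the real database name, at whatever interval suits the traffic level.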
Michael
Thanks for your help.
Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University