Hi,

The Bayes database files on one of my servers have grown very large:
bayes_seen is 160 MB and bayes_toks is 8 MB. This mail server works as a relay
and processes around 30000 mails a day.

I did not configure bayes_expiry_max_db_size, so it should be at its default
(150000). The only Bayes directives in my local.cf are:

bayes_auto_learn                        1
bayes_auto_learn_threshold_nonspam      0.1
bayes_auto_learn_threshold_spam         12.0

I do not understand how these Bayes files can be so large. The fine manual says
that with such settings the file size should stay around 8 MB. Or do these 8 MB
refer to the "normal" size of the bayes_toks file, not the bayes_seen one?

Some more info:
su spam -s /bin/sh -c "sa-learn --dump magic -D"
(...)
debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_toks
debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 2 chosen.
0.000          0          3          0  non-token data: bayes db version
0.000          0     405891          0  non-token data: nspam
0.000          0     948334          0  non-token data: nham
0.000          0     287829          0  non-token data: ntokens
0.000          0 1103037764          0  non-token data: oldest atime
0.000          0 1103107296          0  non-token data: newest atime
0.000          0 1103107219          0  non-token data: last journal sync atime
0.000          0 1103105595          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0     161098          0  non-token data: last expire reduction count
debug: bayes: 6765 untie-ing
debug: bayes: 6765 untie-ing db_toks
debug: bayes: 6765 untie-ing db_seen
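
To make the dump above easier to read, here is a minimal Python sketch that
parses the "non-token data" lines, converts the atime values from epoch seconds
to UTC, and compares ntokens with the default bayes_expiry_max_db_size of
150000 mentioned earlier. The helper names parse_magic and to_utc are purely
illustrative and are not part of SpamAssassin; the input is simply the output
pasted above, and the sketch assumes the numeric value is always the third
column.

```python
from datetime import datetime, timezone

# Lines pasted from the "sa-learn --dump magic" output above.
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     405891          0  non-token data: nspam
0.000          0     948334          0  non-token data: nham
0.000          0     287829          0  non-token data: ntokens
0.000          0 1103037764          0  non-token data: oldest atime
0.000          0 1103107296          0  non-token data: newest atime
0.000          0 1103107219          0  non-token data: last journal sync atime
0.000          0 1103105595          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0     161098          0  non-token data: last expire reduction count
"""

# Default bayes_expiry_max_db_size as quoted in this mail.
DEFAULT_MAX_DB_SIZE = 150000


def parse_magic(text):
    """Return {label: integer value} for every 'non-token data' line."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue  # skip debug lines and anything else
        fields, label = line.split("non-token data:", 1)
        # Assumption: the value is the third whitespace-separated column.
        values[label.strip()] = int(fields.split()[2])
    return values


def to_utc(epoch):
    """Format epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


magic = parse_magic(DUMP)

for key in ("oldest atime", "newest atime", "last expiry atime"):
    print(f"{key:18} {to_utc(magic[key])} UTC")

print("messages learned (nspam + nham):", magic["nspam"] + magic["nham"])
print("ntokens above default limit:", magic["ntokens"] - DEFAULT_MAX_DB_SIZE)
```

Read this way, the dump shows that the last expiry ran on the same day as the
newest token access, and that ntokens is still well above the default limit.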


Today, spamd stopped working with the following error:

Dec 15 04:25:15 server spamc[18803]: connect(AF_INET) to spamd at 127.0.0.1
failed, retrying (#1 of 3): Connection refused

I do not understand why it died. Manually restarting spamd solved the problem,
but I think it could happen again. Could it be related to a lack of resources
caused by the size of the Bayes files?

I am using postfix 1.1.12, SA 3.0.1, MIME-Base64-3.05, DB_File-1.809, and
db4-4.0.14-20 (RedHat 9) on a postfix+SA relay. The bayes database is shared by
all users and located in the "spam" user's home directory.

SA is invoked with "spamd -d -c -u spam" and "/usr/bin/spamc -t 180 -s 500000 -e
/usr/sbin/sendmail -i -f ${sender} -- ${recipient}"



Many thanks to anyone who has a clue on how I could shrink the Bayes files
without losing them. I would be particularly interested in the right
bayes_expiry_max_db_size setting for a server handling around 30000 mails
daily.

