What is the expected speed of sa-learn when the Bayes DB is a local
flatfile (non-NFS, non-SQL)? The machine has a 2.26 GHz Celeron, 512 MB RAM,
and a modern SATA drive formatted as ext3, and is otherwise idle; learning
runs at 2-3 messages/second (SpamAssassin 3.1.7-1 from Debian Testing) when
learning batches of 50-200 emails.
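One variable worth isolating is journal syncing: sa-learn syncs the database
and journal after each invocation, which on repeated small batches adds fixed
overhead. A sketch of deferring that work to a single sync at the end, using
the stock --no-sync and --sync options documented in sa-learn(1) for 3.1
(mailbox path is just this example's):

```shell
# Learn a batch but skip the per-run database/journal sync...
sa-learn --spam --no-sync --mbox ~/mail/SPAM-certainly
# ...then sync the journal into the database once, afterwards.
sa-learn --sync
```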
[EMAIL PROTECTED]:~$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      23314          0  non-token data: nspam
0.000          0      10547          0  non-token data: nham
0.000          0     267116          0  non-token data: ntokens
0.000          0 1167831004          0  non-token data: oldest atime
0.000          0 1169061004          0  non-token data: newest atime
0.000          0 1169061270          0  non-token data: last journal sync atime
0.000          0 1169040677          0  non-token data: last expiry atime
0.000          0     172800          0  non-token data: last expire atime delta
0.000          0      14141          0  non-token data: last expire reduction count
[EMAIL PROTECTED]:~/.spamassassin$ ls -l
total 14756
-rw------- 1 bwindle bwindle 5251072 2007-01-17 14:10 auto-whitelist
-rw------- 1 bwindle bwindle 4997120 2007-01-17 14:14 bayes_seen
-rw------- 1 bwindle bwindle 10375168 2007-01-17 14:14 bayes_toks
-rw-r--r-- 1 bwindle bwindle 5367 2007-01-03 09:28 user_prefs
[EMAIL PROTECTED]:~$ time sa-learn --spam --mbox --progress ~/mail/SPAM-certainly
100% [===================================================================================================]
2.51 msgs/sec 00m11s DONE
Learned tokens from 25 message(s) (29 message(s) examined)
real    0m14.374s
user    0m4.040s
sys     0m0.170s
--
Burton Windle [EMAIL PROTECTED]