Hi,
I recently upgraded to SA 3.4.0-rsvnunknown (using
https://launchpad.net/~spamassassin/+archive/spamassassin-old on Ubuntu
10.04 LTS) from SA 3.3.2 on a different machine running Arch Linux. I use
MySQL to store user preferences as well as Bayesian data. There is no AWL
and no autolearning for the Bayesian filter, and both machines run
sa-update as a daily cron job.
I migrated my MySQL database containing all settings along with my
/etc/spamassassin directory with my static settings/rules to the new
machine, ran sa-update and sa-compile, and restarted spamd. I was curious to
see if 3.4.0 scored a certain message differently than 3.3.2, so I ran
"cat spam | spamc -u jes...@ifconfig.se -R" in order to see the result.
To my surprise, the Bayesian filter only scored 60-80% (BAYES_60) where
it previously scored 90-95% (BAYES_95). Have there been any major
changes to the Bayesian engine in 3.4 (and/or its SQL storage
backend)? I copied my spam/ham corpus to the new machine and ran
sa-learn on top of the current database to see if that helped.
Shockingly, it now scored 1-5% (BAYES_05), so I decided to start over.
I ran "sa-learn --clear" to wipe out the old database and then
re-ran sa-learn. Now the message scores 99-100% (BAYES_99).
I also noticed that my old database only had 11k tokens while the new
one has about 60k (both the old and new servers have hapaxes enabled and
were trained using a corpus of about 600 spam and 200 ham messages).
Any thoughts or ideas what might have caused this?
Regards,
Jesper Wallin