Jim C. Nasby wrote:
>> Are there any autolearn strings? Are they all "autolearn=no"? are there any
>> decent number that are autolearn=failed or autolearn=disabled?
>>
>
> grep -r autolearn caughtspam/ | grep -v 'Binary file' | sed -e
> 's/.*autolearn=\([^ ]*\).*/\1/'|sort|uniq -c
>    1545 no
>     140 spam
>       4 unavailable

Fair enough, that at least suggests that the autolearner is working.

However, that learning ratio is pretty low (140 of 1689, about 8%). Are you using network tests? Without DNSBLs it's often hard to get enough header points to trigger spam learning.

(Note that I use MailScanner, hence the odd log syntax.)

grep "is spam," /var/log/maillog | wc -l
3434
grep "is spam," /var/log/maillog | grep "autolearn=spam" | wc -l
2766
grep "is spam," /var/log/maillog | grep "autolearn=not spam" | wc -l
0

So I'm autolearning about 80% of my tagged spam as spam, and none as ham. I'm also autolearning about 38% of my nonspam as ham.

I'm using the default bayes_auto_learn_threshold_spam (12.0) and a modified bayes_auto_learn_threshold_nonspam (-0.01). I couple the latter with a series of custom rules with tiny negative scores (all > -0.1). This makes nonspam learning something that has to be minimally earned, not just granted by virtue of a low score: a message that hits no rules at all scores 0 and so no longer qualifies.
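
If you want to track that ratio over time, the greps above can be wrapped into a small helper. This is only a rough sketch: the function name autolearn_ratio is made up for illustration, and it assumes MailScanner-style log lines containing "is spam," and an "autolearn=..." token, so the patterns would need adjusting for other log formats.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin or MailScanner):
# report what share of tagged spam in a log was autolearned as spam.
# Assumes lines containing "is spam," and "autolearn=<result>".
autolearn_ratio() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep -c prints 0 but exits non-zero on no match; "|| true" keeps
    # the function usable under "set -e".
    total=$(grep -c "is spam," "$log" || true)
    learned=$(grep "is spam," "$log" | grep -c "autolearn=spam" || true)
    if [ "$total" -eq 0 ]; then
        echo "no tagged spam found"
        return 1
    fi
    # awk does the floating-point division portably.
    awk -v l="$learned" -v t="$total" \
        'BEGIN { printf "%d of %d tagged spam autolearned (%.1f%%)\n", l, t, 100 * l / t }'
}
```

Calling it as autolearn_ratio /var/log/maillog should reproduce the arithmetic above; with my counts (2766 of 3434) that works out to roughly 80.5%.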