Re: Spamassassin Learn

Matt Kettler Tue, 07 Feb 2006 15:18:52 -0800

Jim C. Nasby wrote:
>> Are there any autolearn strings? Are they all "autolearn=no"? are there any
>> decent number that are autolearn=failed or autolearn=disabled?
>>
> 
> grep -r autolearn caughtspam/ | grep -v 'Binary file' | sed -e
> 's/.*autolearn=\([^ ]*\).*/\1/'|sort|uniq -c
> 1545 no
>  140 spam
>    4 unavailable


Fair enough, that at least suggests that the autolearner is working. However,
that learning ratio is pretty low: only 140 of 1689 messages (about 8%) were
learned as spam.
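
The tally pipeline quoted above can be extended to print percentages next to
the counts. This is only a minimal sketch: the function name autolearn_tally is
made up for illustration, and it assumes a grep that supports -o and -a (GNU and
BSD grep do).

```shell
# Hypothetical helper: tally autolearn= results in a directory of saved mail.
autolearn_tally() {
    # $1: directory of saved messages, e.g. caughtspam/
    # -r recurse, -h no filenames, -o print only the match, -a treat binary as text
    grep -rhoa 'autolearn=[^ ]*' "$1" |
    sed 's/^autolearn=//' |
    sort | uniq -c |
    awk '{ count[$2] = $1; total += $1 }
         END {
             # one line per status: name, count, share of all tallied messages
             for (s in count)
                 printf "%-12s %6d %5.1f%%\n", s, count[s], 100 * count[s] / total
         }' |
    sort -k2,2nr
}

# Usage: autolearn_tally caughtspam/
```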

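Are you using network tests? Without DNSBLs it's often hard to get enough header
points to cause spam learning (autolearning as spam needs at least 3 points from
header rules and 3 from body rules, on top of the overall threshold).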

(Note I use MailScanner, hence the odd log syntax)

 grep "is spam," /var/log/maillog |wc -l
   3434
 grep "is spam," /var/log/maillog|grep "autolearn=spam" |wc -l
   2766
 grep "is spam," /var/log/maillog|grep "autolearn=not spam" | wc -l
      0

So I'm autolearning about 80% of my tagged spam as spam, and none as ham.
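
The same counts and the resulting percentage can be produced in one go. A rough
sketch, assuming MailScanner-style log lines like the ones grepped above; the
function name spam_learn_rate is hypothetical.

```shell
# Hypothetical helper: share of tagged spam that was also autolearned as spam.
spam_learn_rate() {
    # $1: mail log file, e.g. /var/log/maillog
    log=$1
    # grep -c prints the number of matching lines (0 if none)
    tagged=$(grep -c 'is spam,' "$log")
    learned=$(grep 'is spam,' "$log" | grep -c 'autolearn=spam')
    awk -v t="$tagged" -v l="$learned" 'BEGIN {
        # skip the division when nothing was tagged
        if (t > 0)
            printf "%d of %d tagged spam autolearned as spam (%.1f%%)\n", l, t, 100 * l / t
    }'
}

# Usage: spam_learn_rate /var/log/maillog
```

With the counts above (2766 of 3434) this works out to 80.5%.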

I'm also autolearning about 38% of my nonspam as ham.

I'm using the default bayes_auto_learn_threshold_spam (12.0).

I'm also using a modified bayes_auto_learn_threshold_nonspam (-0.01; the default
is 0.1), coupled with a series of custom rules with tiny negative scores (each
between -0.1 and 0). This makes nonspam learning something that has to be
minimally earned, not just granted by virtue of a low score.
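
As a sketch of what such a setup might look like in the site configuration
(for example local.cf): the two threshold values are the ones given above, but
the rule itself (its name, header pattern, and score) is purely hypothetical
and only illustrates the idea of a tiny negative score.

```
# Thresholds as described above
bayes_auto_learn_threshold_spam     12.0
bayes_auto_learn_threshold_nonspam  -0.01

# Hypothetical example rule: name, pattern and score are illustrative only.
# A message must hit at least one rule like this to drop below -0.01.
header   LOCAL_KNOWN_LIST  List-Id =~ /<users\.example\.org>/i
describe LOCAL_KNOWN_LIST  Mail from a mailing list I subscribe to
score    LOCAL_KNOWN_LIST  -0.05
```

With the threshold just below zero, a message that scores exactly 0 because it
hit no rules at all is not learned as ham; it needs at least one of the small
negative rules to fire.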
