Re: Bayes autolearning: logarithmic?

Karsten Bräckelmann Wed, 22 May 2013 11:45:28 -0700

On Wed, 2013-05-22 at 11:34 -0400, Andrew Talbot wrote:
> I set up Bayes with autolearning a few weeks ago. It took forever to
> get started, but now it seems like the learning speed has accelerated.
> 
> Is the autolearning supposed to accelerate? I can't help but feel like
> it may just be feeding itself its own data or something.


There is no feedback loop in the learning process. Automatic learning is
based on non-Bayes scores; in particular, it entirely ignores certain
rules such as BAYES_nn. See the AutoLearnThreshold plugin [1].

Additionally, quite a few constraints must be met for auto-learning to
happen. Besides the score thresholds, there are (non-configurable)
requirements on the header and body rules involved, etc.
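
A rough sketch of that decision, in Python rather than SpamAssassin's
Perl, and not the plugin's actual code. The threshold values are the
documented defaults as I understand them, and the 3-point header and
body minimums are what the plugin documentation describes for learning
as spam. The Hit class, its "kind" labels, and filtering by the BAYES_
name prefix are hypothetical simplifications. The real plugin uses a
non-Bayes score set and skips rules based on their tflags.

```python
# Simplified sketch of the auto-learn decision; NOT SpamAssassin's code.
from dataclasses import dataclass

# Assumed values: defaults described in the AutoLearnThreshold docs.
THRESHOLD_NONSPAM = 0.1   # bayes_auto_learn_threshold_nonspam
THRESHOLD_SPAM = 12.0     # bayes_auto_learn_threshold_spam
MIN_HEADER_POINTS = 3.0   # non-configurable minimum from header rules
MIN_BODY_POINTS = 3.0     # non-configurable minimum from body rules


@dataclass
class Hit:
    """One rule hit (hypothetical structure, for illustration only)."""
    name: str     # rule name, e.g. "BAYES_99"
    score: float  # points the rule contributed
    kind: str     # "header", "body", or "other"


def autolearn_decision(hits):
    """Return "spam", "ham", or "no" for a list of rule hits."""
    # Bayes rules are dropped first, so the classifier's own verdict
    # never influences what it learns from.
    hits = [h for h in hits if not h.name.startswith("BAYES_")]
    total = sum(h.score for h in hits)
    header = sum(h.score for h in hits if h.kind == "header")
    body = sum(h.score for h in hits if h.kind == "body")

    if total >= THRESHOLD_SPAM:
        # Learning as spam additionally needs header AND body evidence.
        if header >= MIN_HEADER_POINTS and body >= MIN_BODY_POINTS:
            return "spam"
        return "no"
    if total < THRESHOLD_NONSPAM:
        return "ham"
    return "no"
```

Note that a message scoring 12.5 only because of BAYES_99 is not learned
in this sketch: without the Bayes points it stays under the spam
threshold.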

Since there is no feedback here, I'd guess the "acceleration" is most
likely perceived only -- "it seems" to have accelerated, you said. Tried
backing that up with numbers?
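
One way to get such numbers is to count the autolearn outcomes per day
from the spamd logs. The sketch below is Python and rests on two
assumptions about your setup: each spamd result line carries an
"autolearn=<value>" field, and lines start with a syslog-style
"Mon DD" timestamp. Both function names are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Assumption: result lines contain e.g. "autolearn=spam" or "autolearn=no".
AUTOLEARN_RE = re.compile(r"\bautolearn=([\w-]+)")


def autolearn_counts_by_day(lines):
    """Map "Mon DD" -> Counter of autolearn values seen on that day."""
    counts = defaultdict(Counter)
    for line in lines:
        match = AUTOLEARN_RE.search(line)
        if not match:
            continue  # not a scan result line
        # Assumption: syslog-style prefix, e.g. "May 22 11:45:28 host ..."
        day = " ".join(line.split()[:2])
        counts[day][match.group(1)] += 1
    return counts


def learned_ratio(counter):
    """Fraction of scanned messages that were learned as spam or ham."""
    total = sum(counter.values())
    learned = counter["spam"] + counter["ham"]
    return learned / total if total else 0.0
```

Feed it the lines of your mail log and compare learned_ratio() across
days. If that ratio is roughly flat while the absolute counts grow, the
"acceleration" is just more mail.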

The learning speed (or rather, the ratio of learned messages to overall
messages) can be influenced by a few factors.  (a) Changed scores
(sa-update run) might have an impact, due to different scores of
matching header and body rules.  (b) The incoming spam stream: changes,
spikes, or certain spam patterns can make a huge difference.  (c) And of
course, whether you are using bayes_auto_learn_on_error. There are
likely others I just forgot.


In a nutshell: No feedback, thus no inherent acceleration. And most
definitely not logarithmic.


[1] http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
