On 8 Nov 2008, at 00:09, Matt Kettler wrote:
Matt Kettler wrote:
Neil wrote:
So maybe this is moving slightly off on a tangent, but:
Why does auto-learn sometimes learn spam with a rating of X, but not
spam with a rating of X+Y? What's its methodology?
First, there are several rules involved here.
To autolearn as spam, *ALL* of the following must be met:
-must have at least 3 points from header-type rules
-must have at least 3 points from body-type rules
-must not already match a low-scoring bayes rule in the existing
 training (e.g. BAYES_00). This prevents autolearning from contradicting
 existing training.
-After recomputing the score of the message as if bayes and all userconf
 rules were disabled (including changing the scoreset! This makes a big
 difference in some cases.), that score must be over the spam learning
 threshold. This prevents bayes from engaging in self-feedback or
 feedback based on manual whitelists (which, if misconfigured, would
 cause a "bayes hangover" of mis-learned mail).
Generally speaking, the score you see in the message header has only a
loose correlation with the score used for learning checks.
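
To make the checks listed above concrete, here is a rough Python sketch.
It is not SpamAssassin's actual code (which is Perl); the class, the
function, and the 12.0 threshold value are made up for illustration, and
treating "over the threshold" as strictly greater is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LearnCheck:
    """Hypothetical summary of one scanned message."""
    header_points: float   # points from header-type rules
    body_points: float     # points from body-type rules
    hits_low_bayes: bool   # already matches a low-scoring bayes rule, e.g. BAYES_00
    learning_score: float  # score recomputed with bayes and userconf rules disabled


def should_autolearn_spam(msg: LearnCheck, spam_threshold: float) -> bool:
    """Return True only if every condition from the list above holds."""
    return (
        msg.header_points >= 3.0
        and msg.body_points >= 3.0
        and not msg.hits_low_bayes
        # Assumption: "over" is read as strictly greater than the threshold.
        and msg.learning_score > spam_threshold
    )


# Example value only; the real threshold comes from your configuration.
EXAMPLE_THRESHOLD = 12.0

# Clears the threshold with enough header and body points.
balanced = LearnCheck(header_points=4.0, body_points=4.0,
                      hits_low_bayes=False, learning_score=13.0)

# Higher learning score, but almost all of it comes from header rules.
header_heavy = LearnCheck(header_points=9.0, body_points=1.0,
                          hits_low_bayes=False, learning_score=14.0)

balanced_result = should_autolearn_spam(balanced, EXAMPLE_THRESHOLD)
header_heavy_result = should_autolearn_spam(header_heavy, EXAMPLE_THRESHOLD)
```

The second message scores higher yet is not learned, because it falls
short on body points. That is the "X is learned but X+Y is not" case.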
Oh, one more rule I missed:
-The write lock for the bayes DB must be free (i.e. no other learning or
 expiry going on at the time). It will not block and wait for the lock; it
 will simply move on, but it will report autolearn=failed instead of
 autolearn=no. This prevents autolearning from log-jamming your mail
 queue.
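
The "try the lock, don't wait" behaviour can be sketched like this in
Python. Again, this is only an illustration: the in-process
threading.Lock stands in for the real bayes DB write lock, and the
function name and the learn callback are hypothetical. Only the
"no" and "failed" results come from the description above.

```python
import threading

# Stand-in for the bayes DB write lock (assumption: the real lock is
# held on the database, not on an in-process object like this one).
bayes_write_lock = threading.Lock()


def try_autolearn(passes_checks: bool, learn) -> str:
    """Return the autolearn result for one message without ever blocking."""
    if not passes_checks:
        return "no"          # message did not qualify for learning
    # Non-blocking attempt: returns False immediately if the lock is held.
    if not bayes_write_lock.acquire(blocking=False):
        return "failed"      # someone else is learning or expiring
    try:
        learn()              # hypothetical callback that updates the DB
        return "spam"
    finally:
        bayes_write_lock.release()


learned = []
free_result = try_autolearn(True, lambda: learned.append("msg-1"))

# Simulate another learner or an expiry run holding the lock.
bayes_write_lock.acquire()
busy_result = try_autolearn(True, lambda: learned.append("msg-2"))
bayes_write_lock.release()
```

While the lock is held, the second call returns right away instead of
queueing up behind it, so mail delivery is not stalled by learning.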
Thanks for that in-depth description; it helps me have a (less) vague
idea of what I'm doing.
-N.