On 8 Nov 2008, at 00:09, Matt Kettler wrote:

Matt Kettler wrote:
Neil wrote:

So maybe this is moving slightly off on a tangent, but:
Why does auto-learn sometimes learn spam with a rating of X, but not
spam with a rating of X+Y?  What's its methodology?


First, there are several rules involved here.

To autolearn as spam, *ALL* of the following must be met:

- Must have at least 3 points from header-type rules.
- Must have at least 3 points from body-type rules.
- Must not already match a low-scoring Bayes rule from the existing
  training (i.e. BAYES_00). This prevents autolearning from contradicting
  existing training.
- After recomputing the score of the message as if Bayes and all userconf
  rules were disabled (including changing the scoreset, which makes a big
  difference in some cases), that score must be over the spam learning
  threshold. This prevents Bayes from engaging in self-feedback, or in
  feedback based on manual whitelists (which, if misconfigured, would
  cause a "bayes hangover" of mis-learned mail).

Generally speaking, the score you see in the message header has only a
loose correlation with the score used for learning checks.
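
As a rough illustration of why a message scoring X+Y can be skipped while one scoring X is learned, the checks described above can be sketched in Python. This is not SpamAssassin's actual code (which is Perl): the Hit class, the "kind" labels, the function name and the threshold value of 12.0 are all assumptions made up for this example, and the handling of a score exactly equal to the threshold is a guess.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One rule hit on a message (hypothetical structure for this sketch)."""
    name: str     # rule name, e.g. "BAYES_00"
    score: float  # score under the scoreset used for the learning check
    kind: str     # "header", "body", "bayes" or "userconf"


def should_autolearn_spam(hits, spam_threshold,
                          min_header=3.0, min_body=3.0,
                          low_bayes_rules=frozenset({"BAYES_00"})):
    """Return True only if every autolearn-as-spam condition holds."""
    # Recompute the score as if Bayes and user-configured rules were off.
    counted = [h for h in hits if h.kind not in ("bayes", "userconf")]
    learn_score = sum(h.score for h in counted)
    header_pts = sum(h.score for h in counted if h.kind == "header")
    body_pts = sum(h.score for h in counted if h.kind == "body")
    # Existing training that says "ham" vetoes learning as spam.
    contradicts = any(h.name in low_bayes_rules for h in hits)
    return (header_pts >= min_header
            and body_pts >= min_body
            and not contradicts
            and learn_score > spam_threshold)  # boundary case is an assumption


# Illustrative messages; rule names other than BAYES_* are invented.
msg_x = [Hit("HDR_RULE", 4.0, "header"), Hit("BODY_RULE", 9.0, "body"),
         Hit("BAYES_99", 3.5, "bayes")]
msg_x_plus_y = [Hit("HDR_RULE", 1.0, "header"), Hit("BODY_RULE", 20.0, "body"),
                Hit("BAYES_99", 3.5, "bayes")]

learn_x = should_autolearn_spam(msg_x, spam_threshold=12.0)
learn_x_plus_y = should_autolearn_spam(msg_x_plus_y, spam_threshold=12.0)
```

Here msg_x_plus_y has the higher total, but only 1 point comes from header rules, so learn_x_plus_y is False while learn_x is True.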


Oh, one more rule I missed:

- The write lock for the Bayes DB must be free (i.e. no other learning or
  expiry is going on at the time). Autolearning will not block and wait
  for the lock; it will simply move on, but it will report
  autolearn=failed instead of autolearn=no. This prevents autolearning
  from logjamming your mail queue.
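
The "don't wait for the lock" behaviour can be sketched with a non-blocking lock attempt. This is only an analogy in Python: SpamAssassin locks its Bayes database on disk, not with an in-process threading.Lock, and the function and variable names below are invented for the example.

```python
import threading

# Stand-in for the Bayes DB write lock (an assumption for this sketch).
bayes_write_lock = threading.Lock()


def try_autolearn(learn, lock=bayes_write_lock):
    """Run learn() only if the lock is free right now; never wait for it."""
    if not lock.acquire(blocking=False):
        # Someone else is learning or expiring: skip and report the failure.
        return "autolearn=failed"
    try:
        learn()
        return "autolearn=spam"
    finally:
        lock.release()
```

Because the attempt returns immediately, a long-running expiry only costs the message its learning opportunity; scanning and delivery are not held up.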

Thanks for that in-depth description; it helps me have a (less) vague idea of what I'm doing.

-N.
