Since upgrading to 3.1.4, when I turn on Bayes auto-learn with:

bayes_auto_learn 1

and I set the learn boundaries with:

bayes_auto_learn_threshold_nonspam    -3.5
bayes_auto_learn_threshold_spam       15.5

I get unexpected auto-learning.  For example, I just saw a spam come
through that scored 9.9.  That is enough for it to be tagged as spam,
but it is below 15.5, so it should not be auto-learned as spam.  Yet
the header clearly reads:

X-Spam-Status: Yes, score=9.9 required=5.0 tests=AWL,BAYES_99,
    DATE_IN_PAST_03_06,DCC_CHECK,DIGEST_MULTIPLE,HTML_40_50,HTML_MESSAGE,
    MIME_HTML_ONLY,RAZOR2_CHECK,RCVD_IN_WHOIS_INVALID autolearn=spam
    version=3.1.4


Any ideas?

SA does not autolearn based on the final message score. So, toss the 9.9
out the window. That's not the number SA compares to the 15.5.

For learning, SA uses what the message score would have been if: 1) the
AWL were off; 2) Bayes were disabled, which also switches all the other
rules to the non-Bayes scoreset; 3) all white/blacklists were disabled.
This is often *quite* different from the final score.

However, in this case I don't entirely understand... The default SA 3.1
scores are:

score DATE_IN_PAST_03_06 0.736 0 1.122 0.478
score DCC_CHECK 0 1.37 0 2.17
score DIGEST_MULTIPLE 0 0.233 0 0.765
score HTML_40_50 0.611 0 0.497 0.496
score HTML_MESSAGE 0.001
score MIME_HTML_ONLY 0.414 0.001 0.389 0.001
score RAZOR2_CHECK 0 0.5 0 0.5
score RCVD_IN_WHOIS_INVALID 0 2.151 0 2.234

Adding up the set 1 scores (the second column), the learning score
should have been 4.256, which is nowhere near 15.5.
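If you want to check the arithmetic yourself, here is a rough Python sketch of that sum. It is only a simplified model of the idea, not SA's actual code: the names `SCORES`, `TESTS_HIT`, `is_excluded` and `learning_score` are made up for illustration, and it only filters out AWL and the BAYES_* rules because no white/blacklist rules fired on this message.

```python
# Per-rule scores copied from the list above, one value per scoreset (0..3).
# A rule with a single value uses that value in every scoreset.
SCORES = {
    "DATE_IN_PAST_03_06": [0.736, 0, 1.122, 0.478],
    "DCC_CHECK": [0, 1.37, 0, 2.17],
    "DIGEST_MULTIPLE": [0, 0.233, 0, 0.765],
    "HTML_40_50": [0.611, 0, 0.497, 0.496],
    "HTML_MESSAGE": [0.001],
    "MIME_HTML_ONLY": [0.414, 0.001, 0.389, 0.001],
    "RAZOR2_CHECK": [0, 0.5, 0, 0.5],
    "RCVD_IN_WHOIS_INVALID": [0, 2.151, 0, 2.234],
}

# Tests listed in the X-Spam-Status header of the message in question.
TESTS_HIT = [
    "AWL", "BAYES_99", "DATE_IN_PAST_03_06", "DCC_CHECK", "DIGEST_MULTIPLE",
    "HTML_40_50", "HTML_MESSAGE", "MIME_HTML_ONLY", "RAZOR2_CHECK",
    "RCVD_IN_WHOIS_INVALID",
]

SPAM_THRESHOLD = 15.5  # bayes_auto_learn_threshold_spam from the config


def is_excluded(rule):
    """Assumption: AWL and Bayes rules are ignored for the learning score."""
    return rule == "AWL" or rule.startswith("BAYES_")


def learning_score(tests, scoreset=1):
    """Sum the scores of the non-excluded rules in the given scoreset."""
    total = 0.0
    for rule in tests:
        if is_excluded(rule):
            continue
        values = SCORES[rule]
        # Single-value rules apply the same score in every scoreset.
        total += values[scoreset] if len(values) > 1 else values[0]
    return round(total, 3)


score = learning_score(TESTS_HIT)
print("learning score (set 1):", score)
print("at or above spam threshold:", score >= SPAM_THRESHOLD)
```

Swapping in your own rule scores (if you have overridden any) shows quickly whether the learning score could plausibly reach the spam threshold.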

Have you modified any rule scores?


Thanks for trying to help, Matt. No, I don't think I have changed any
of those scores. I understand the basics of how autolearn works. For a
long time, with the settings above, it would usually only autolearn
spams with extremely high scores (well over 15). Now, basically EVERY
mail tagged as spam is being autolearned as spam, whether it scored 30
or 5.2.

The other weird issue is that anything that is not tagged as spam is
also being autolearned as ham (e.g. mails with scores of 3.5), which is
absolutely not what I want.

Thanks,
Devin
