Re: bayes autolearn acting up

lists Thu, 24 Aug 2006 10:22:29 -0700

Since upgrading to 3.1.4, when I turn on bayes auto-learn with:


bayes_auto_learn 1

and I set the learn boundaries with:

bayes_auto_learn_threshold_nonspam    -3.5
bayes_auto_learn_threshold_spam       15.5

I get unexpected auto-learning.  Example:  I just saw a spam come
through that scored 9.9, which is enough for it to be tagged as spam,
but it should not be auto-learned as spam.  But, in the header it
clearly reads:

X-Spam-Status: Yes, score=9.9 required=5.0 tests=AWL,BAYES_99,
    DATE_IN_PAST_03_06,DCC_CHECK,DIGEST_MULTIPLE,HTML_40_50,HTML_MESSAGE,
    MIME_HTML_ONLY,RAZOR2_CHECK,RCVD_IN_WHOIS_INVALID autolearn=spam
    version=3.1.4


Any ideas?

SA does not autolearn based on the final message score. So, toss the 9.9
out the window. That's not the number SA compares to the 15.5.

For learning, SA uses what the message score would have been if: 1) the
AWL were off, 2) Bayes were disabled, including shifting what scoreset is
used for all the other rules, and 3) all white/blacklists were disabled.
This is often *quite* different from the final score.

However, in this case I don't entirely understand... The default SA 3.1
scores are:

score DATE_IN_PAST_03_06 0.736 0 1.122 0.478
score DCC_CHECK 0 1.37 0 2.17
score DIGEST_MULTIPLE 0 0.233 0 0.765
score HTML_40_50 0.611 0 0.497 0.496
score HTML_MESSAGE 0.001
score MIME_HTML_ONLY 0.414 0.001 0.389 0.001
score RAZOR2_CHECK 0 0.5 0 0.5
score RCVD_IN_WHOIS_INVALID 0 2.151 0 2.234

Adding the set1 scores up (the second value on each line, or the single
value where only one is given), the learning score should have been 4.256.
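
As a quick cross-check of that sum, here is a minimal Python sketch. It is
not SpamAssassin code: the helper names parse_scores and learning_score are
made up for illustration, and it assumes that a score line with a single
value applies to every scoreset and that set1 is the second column. It only
adds up the score lines quoted above for the tests named in the header.

```python
# Score lines exactly as quoted above: up to four values per rule,
# one per scoreset (set0, set1, set2, set3).
SCORE_LINES = """\
score DATE_IN_PAST_03_06 0.736 0 1.122 0.478
score DCC_CHECK 0 1.37 0 2.17
score DIGEST_MULTIPLE 0 0.233 0 0.765
score HTML_40_50 0.611 0 0.497 0.496
score HTML_MESSAGE 0.001
score MIME_HTML_ONLY 0.414 0.001 0.389 0.001
score RAZOR2_CHECK 0 0.5 0 0.5
score RCVD_IN_WHOIS_INVALID 0 2.151 0 2.234
"""

# Tests listed in the X-Spam-Status header of the example message.
HITS = [
    "AWL", "BAYES_99", "DATE_IN_PAST_03_06", "DCC_CHECK",
    "DIGEST_MULTIPLE", "HTML_40_50", "HTML_MESSAGE", "MIME_HTML_ONLY",
    "RAZOR2_CHECK", "RCVD_IN_WHOIS_INVALID",
]


def parse_scores(text):
    """Hypothetical helper: map rule name -> list of four per-set scores."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "score":
            continue
        values = [float(v) for v in parts[2:]]
        if len(values) == 1:
            # Assumption: a single value is used for all four scoresets.
            values = values * 4
        table[parts[1]] = values
    return table


def learning_score(hits, table, scoreset=1):
    """Hypothetical helper: sum one scoreset column over the hit rules.

    Rules missing from the table (here AWL and BAYES_99) are skipped,
    which mirrors leaving AWL and Bayes out of the learning score.
    """
    total = sum(table[name][scoreset] for name in hits if name in table)
    return round(total, 3)


if __name__ == "__main__":
    print(learning_score(HITS, parse_scores(SCORE_LINES)))
```

With the eight lines as quoted, the second column comes to 4.256
(DATE_IN_PAST_03_06 and HTML_40_50 both contribute 0 in that set), which is
still far below the 15.5 spam threshold.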

Have you modified any rule scores?

Thanks for trying to help, Matt. No, I don't think I have changed any
of those scores. I understand the basics of how the autolearn works.
For a long time, with the settings above, it would usually only
autolearn spams with extremely high scores (well over 15). Now,
basically EVERY mail tagged as spam is being autolearned as spam,
whether it has scored 30 or 5.2. The other weird issue is that
anything that is not being tagged as spam is also being autolearned
as ham (e.g. mails with scores of 3.5), which is absolutely not
what I want.


Thanks,
Devin
