Jack Gostl wrote:
Well... I'm convinced. I turned off autolearn a week ago, and things have never been smoother. It's a shame, really; that's a nice feature, but for some reason it waters down the Bayes resolution until it's almost useless.


Most likely because the autolearn thresholds are too generous. The chance of autolearning spam as ham, or ham as spam, is too great. I have been running with autolearn enabled and my thresholds set to:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0

without any problems for almost 3 years now. My bayes database has never been better. I think too many people have problems with it because of the defaults, and instead of trying to figure out how to make it work better they just turn it off and call it "broken".
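
To make the effect of those two settings concrete, here is a rough Python sketch of the decision they control. It is an illustration only, not SpamAssassin's actual code: the function name autolearn_decision is made up, the handling of a score that lands exactly on a threshold is an assumption, and it leaves out the extra checks SpamAssassin applies (for example, the autolearn score is computed without the Bayes rules' own points, and learning as spam also requires some points from both header and body rules).

```python
from typing import Optional


def autolearn_decision(
    score: float,
    nonspam_threshold: float = -0.1,  # value from the settings above
    spam_threshold: float = 12.0,     # value from the settings above
) -> Optional[str]:
    """Return "ham", "spam", or None (message is not autolearned).

    Hypothetical helper for illustration; boundary handling is assumed.
    """
    if score < nonspam_threshold:
        return "ham"   # confidently clean: learn as non-spam
    if score >= spam_threshold:
        return "spam"  # confidently spam: learn as spam
    return None        # in between: leave the Bayes database alone


# A message scoring 0.05 is learned as ham with a nonspam threshold of 0.1,
# but is left alone with the stricter -0.1 used above.
print(autolearn_decision(0.05, nonspam_threshold=0.1))
print(autolearn_decision(0.05))
```

The point of the negative nonspam threshold is that a low-scoring spam that slips through at, say, 0.05 is no longer fed back into the database as ham.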

-Jim


----- Original Message ----- From: "Jack Gostl" <[EMAIL PROTECTED]>
To: "Anthony Peacock" <[EMAIL PROTECTED]>; "SpamAssassin" <users@spamassassin.apache.org>
Sent: Monday, February 05, 2007 7:06 AM
Subject: Re: Bayes resolution gettin weaker



----- Original Message ----- From: "Anthony Peacock" <[EMAIL PROTECTED]>
To: "SpamAssassin" <users@spamassassin.apache.org>
Sent: Monday, February 05, 2007 3:56 AM
Subject: Re: Bayes resolution gettin weaker


Hi,

Jack Gostl wrote:
I've been watching this for a while, and there is now a pattern to what I'm seeing.

I'm running a configuration with multiple users sharing one set of bayes files. This is an interim move to facilitate the spamassassin upgrades, and like many interim moves it's been going on for a long time.

When I first built the bayes files from my personal folders and my spam archives, things were great: 99.8% of the spam caught, or better. Then, usually after a week or so, the number starts to drop. Right now it's down to 97%; in another day or two it will be below 95%. With the amount of spam we receive, that is a lot of missed junk mail.

So I blow away my bayes* files, rebuild, and I'm back up to darn near 100% caught. For about a week. Then the deterioration begins again.

Has anyone else encountered this? Is this an artifact of too many users sharing a spam file?

Also.... I retrain each night, feeding any missed spams plus any new hams received back through sa-learn. I can't see how that makes it worse, but who knows.

Do you have autolearn enabled?

Uh... yes? You are suggesting that I turn it off? I had always assumed that if the Bayes learned something as ham that it shouldn't, sa-learn was smart enough to undo it.

Change the thresholds for auto learning. Mine are:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0

I'm willing to try. I made the change in my user_prefs and we'll see what the next week brings.

Thanks






