I use Bayes site-wide and love it.  I have a recipe in my Procmail config
that grabs any message scoring between 6 and 10 and stores it in a "suspect"
MBOX.  Once a week I look through it, move the false positives to a
"ConfirmedHam" MBOX, and move the rest to a "ConfirmedSPAM" MBOX.  I also
look at the false positives to see why they got caught.  Since I'm running
site-wide, I'd rather let a few spams through than have any false positives.
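
For anyone who wants to prototype that score-band sorting outside Procmail,
here is a minimal Python sketch.  It is not the actual recipe.  It assumes
the message has already been through SpamAssassin and carries an
X-Spam-Status header with a "hits=" (or, in later versions, "score=") field.
The "inbox" and "spam" mailbox names, and what happens outside the 6 to 10
band, are assumptions made for illustration; only "suspect" comes from the
setup described above.

```python
import re
from email import message_from_string

# Matches "hits=7.2" (SA 2.x style) or "score=7.2" (later style).
SCORE_RE = re.compile(r"\b(?:hits|score)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the SpamAssassin score from X-Spam-Status, or None if absent."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = SCORE_RE.search(status)
    return float(match.group(1)) if match else None


def choose_mailbox(score, low=6.0, high=10.0):
    """Pick a mailbox name for a score.

    Scores inside [low, high] go to "suspect" for weekly manual review.
    The handling outside that band is an assumption, not from the post.
    """
    if score is None or score < low:
        return "inbox"    # hypothetical name: normal delivery
    if score <= high:
        return "suspect"  # borderline band, reviewed by hand
    return "spam"         # hypothetical name: clearly spam


if __name__ == "__main__":
    raw = (
        "From: someone@example.com\n"
        "X-Spam-Status: Yes, hits=7.2 required=6.5 tests=BAYES_99\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    print(choose_mailbox(spam_score(raw)))
```

The band deliberately straddles the 6.5 threshold, so the weekly review sees
both near misses and borderline catches.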

I run Bayes, RBLs, and DCC. 

I use a threshold of 6.5, as this gives a 0.01% FP rate.

I run this on a machine with an 800MHz CPU and 384MB RAM running RH7.3, with
Postfix and John Hardin's Sanitizer script for Procmail.  I run SA through
Procmail.


My daily processing stats are:
Total number of emails processed by the spam filter : 43724
Number of spams                         :     29792 ( 68.14%)
Number of clean messages                :     13932 ( 31.86%)
Average message analysis time           :      4.08 seconds
Average spam analysis time              :      3.74 seconds
Average clean message analysis time     :      4.70 seconds
Average message score                   :     12.38
Average spam score                      :     21.93
Average clean message score             :     -5.08

I never have a performance problem.
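
A quick sanity check on the figures above, as a small Python sketch.  It
recomputes the spam/clean shares from the counts and estimates how many
messages are being analyzed at once on average.  That second number assumes
the quoted analysis time is wall-clock time per message (including waits on
RBL and DCC lookups) and that mail arrives evenly over 24 hours; neither is
stated in the stats themselves.

```python
# Daily figures copied from the stats above.
total = 43724
spams = 29792
clean = 13932
avg_analysis_seconds = 4.08

# The two categories should account for every processed message.
assert spams + clean == total

spam_share = 100 * spams / total
clean_share = 100 * clean / total

# Total analysis time per day, spread over the 86400 seconds in a day,
# gives the average number of messages in flight at any moment.
total_analysis_seconds = total * avg_analysis_seconds
avg_concurrency = total_analysis_seconds / 86400

print(f"spam share:      {spam_share:.2f}%")
print(f"clean share:     {clean_share:.2f}%")
print(f"analysis hours:  {total_analysis_seconds / 3600:.1f}")
print(f"avg concurrency: {avg_concurrency:.2f}")
```

Under those assumptions the load works out to roughly two messages in flight
on average, which is consistent with modest hardware keeping up.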

My users are very sensitive to P0rn and V1agra ads, so I've jacked up the
scores on those rules at the risk of catching real messages that contain
foul language.  We're a company, so that's not a problem.

As a question for the original poster: why not build a corpus and run the GA
scoring engine yourself?  That's what I'm working on now to improve my
scores for local, real-world mail.  No more guesswork.  I will know which
rules produce few FPs, and can crank those up.

<<Dan>>


 

| -----Original Message-----
| From: Terry Milnes [mailto:[EMAIL PROTECTED] 
| Sent: Tuesday, November 11, 2003 7:08 AM
| To: David B Funk
| Cc: [EMAIL PROTECTED]
| Subject: Re: [SAtalk] scoring system and values...
| 
| I have been considering using the bayes site wide, however I 
| have seen a lot of opinions that oppose its use this way. 
| Furthermore I did/do have doubts as to how well it would work.
| 


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk