On Wed, Jul 14, 2004 at 11:43:03AM -0700, Kelson Vibber wrote:
> >Meaning what? That there are razor blacklisted messages that are false
> >positives ?
>
> Yes. Yesterday's announcement of the first Fedora Core 3 test release, for
> instance.

For questions like this, looking at the rules/STATISTICS* files from the SA
distro is useful. In this case, set1 (network, no bayes) shows:

OVERALL%    SPAM%     HAM%    S/O   RANK  SCORE  NAME
  495260   343948   151312  0.694   0.00   0.00  (all messages)
 100.000  69.4480  30.5520  0.694   0.00   0.00  (all messages as %)
  47.918  68.8645   0.3033  0.996   1.00   1.55  RAZOR2_CF_RANGE_51_100
  54.243  77.7344   0.8459  0.989   0.99   0.90  RAZOR2_CHECK
   6.294   8.8711   0.4362  0.953   0.81   0.56  RAZOR2_CF_RANGE_11_50

which basically says that out of the 344k spam checked, Razor caught ~77.7%
of them, but also caught ~0.85% of the 151k ham, for a rough overall
accuracy (S/O) of ~98.9%.

So, much like Bayes, the messages that are really spam probably hit a bunch
of other rules as well, so the score generator can lower the Razor score to
remove some false positives while still keeping the summed score for those
mails above the threshold of 5.

In short: so far, no anti-spam method is 100% accurate for everyone. If you
have good luck (aka: no false positives), then feel free to up the score to
whatever you want. But don't complain when this bites you in the a**.

Example: a mailing list I'm on, but rarely post to, decided to do
"score HABEAS_SWE 20", which is up to them, since they apparently don't
receive mails with the SWE in them. However, I use the SWE in my mails, and
so my posts are always flagged as spam when posting to the list.

-- 
Randomly Generated Tagline:
"Don't bite the mailman." - Dave Matthews
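
As a footnote on the S/O column in the table above: the listed values can be reproduced from the SPAM% and HAM% columns as SPAM% / (SPAM% + HAM%). That formula is an assumption inferred from the numbers in this post, not something taken from the SA sources, and the function names below are made up for illustration. A minimal Python sketch:

```python
# Rows copied from the set1 table above: (rule name, SPAM%, HAM%).
ROWS = [
    ("RAZOR2_CF_RANGE_51_100", 68.8645, 0.3033),
    ("RAZOR2_CHECK", 77.7344, 0.8459),
    ("RAZOR2_CF_RANGE_11_50", 8.8711, 0.4362),
]

# Total ham messages in the corpus, from the "(all messages)" row.
HAM_TOTAL = 151312


def spam_over_overall(spam_pct, ham_pct):
    """Assumed S/O formula: share of a rule's hits that were spam."""
    return spam_pct / (spam_pct + ham_pct)


def so_table(rows=ROWS):
    """Return {rule name: S/O rounded to 3 places} for the given rows."""
    return {name: round(spam_over_overall(s, h), 3) for name, s, h in rows}


def ham_hits(ham_pct, ham_total=HAM_TOTAL):
    """Approximate number of ham messages a rule fired on."""
    return round(ham_total * ham_pct / 100)


if __name__ == "__main__":
    for name, so in so_table().items():
        print(f"{so:.3f}  {name}")
    print("RAZOR2_CHECK ham hits:", ham_hits(0.8459))
```

Under that assumption the three Razor rows come out at the same S/O values the table lists, and the 0.85% ham hit rate for RAZOR2_CHECK works out to roughly 1,280 of the 151k ham messages, which is why the generated score stays well below the threshold of 5.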