Re: A New Approach: Find the Ham

2007-02-12 Thread michael moncur

I agree that this isn't going to be the best approach. Detecting ham
is simply more difficult:

1. New types of ham emerge more often than new types of spam. Spammers
generally stick to tried-and-true subjects while ham is all over the
place.

2. Ham is more personalized than spam. Everyone gets very similar
spam, but nobody gets the same mix of ham messages that I get.

3. Ham has a much greater range of potential subjects and patterns
than spam. For all the spam, nobody's doing anything creative like
trying to sell fountain pens or beverage dispensers or books of poetry
with spam - it's all fake Rolexes and cheap pharmaceuticals. Ham, on
the other hand, has a million potential subjects and you get
one-of-a-kind messages every day.

4. Spammers will have an easier time faking ham characteristics than
removing spam characteristics, which may be endemic to their methods
(spamming software, botnets, etc.).

5. Network effects are very helpful with spam (DNS blacklists, Razor,
etc.) but not very helpful with ham.

Of course, ham rules are helpful - especially personalized ones. I use
a bunch. But they're best used with the existing framework of spam
detection.
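
As a minimal sketch of what a personalized ham rule might look like
(the rule name, the List-Id pattern and the score below are
hypothetical, not taken from any real setup), something along these
lines could go in local.cf or user_prefs:

```
# Hypothetical example: give a small negative score to mail from a
# mailing list the recipient is known to read. The name, pattern and
# score are illustrative assumptions only.
header   LOCAL_HAM_MYLIST  List-Id =~ /<mylist\.example\.org>/i
describe LOCAL_HAM_MYLIST  Mail from a list I subscribe to
score    LOCAL_HAM_MYLIST  -2.0
```

The score is deliberately modest, so the rule only nudges the total and
the usual spam tests still decide the outcome if a spammer forges the
header.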


Configuration tool updated

2005-10-07 Thread michael moncur

My SpamAssassin Configuration Tool, which is linked from the
SpamAssassin site and hasn't worked with 3.0 or 3.1, has finally been
updated:

http://www.yrex.com/spam/spamconfig.php

It now works with 3.0 and 3.1, although 3.1 also needs some edits to
v310.pre for razor/dcc/textcat, in addition to the output of my script.
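
For illustration, the v310.pre edits probably amount to making sure the
relevant loadplugin lines are present and uncommented. This is a sketch
only: which lines ship commented out is an assumption here and varies
between 3.1.x releases and distributions, so check your own copy of the
file.

```
# v310.pre (sketch; check your installed file for the actual defaults)
# Remove the leading '#' from the plugins you want to enable.
loadplugin Mail::SpamAssassin::Plugin::Razor2
loadplugin Mail::SpamAssassin::Plugin::DCC
loadplugin Mail::SpamAssassin::Plugin::TextCat
```

Razor2 and DCC also need their client software installed before the
plugins will do anything.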

The old (SpamAssassin 2.5) version is still available here:

http://www.yrex.com/spam/spamconfig25.php

Please let me know if this doesn't work for anyone, or if there's a commonly-used setting that it lacks.

--
Michael Moncur - mgm at starlingtech dot com