> -----Original Message-----
> From: Sean C Truman [mailto:[EMAIL PROTECTED]]
[snip]
> The CRC idea sounded good, But the possibility of it
> working is fading.
[snip]
For another simple approach: something that appears to work very well on one of the mail servers I run is checking the subject header against a list of known spam subjects and tagging the message as spam when it matches.

You can't just test the subject for equality against the entries in the list, as spammers tend to put random words/numbers into the mail they're sending to foil this (this also foils the CRC approach). What you can do, with a small overhead, is use a pattern matching approach similar to the unix diff command to give you a similarity score against each known subject. A score over a certain threshold gets the email tagged as spam by adding a header to the message, which the user can then use to do some filtering.

The overhead is probably too high to run the same check against the body of the message. I haven't used this method long enough to trust it with deleting messages!

If you're a Python fan you can use the difflib library (http://www.python.org/doc/current/lib/module-difflib.html) - I'm sure there are equivalents in other languages (Perl especially).

Marcus

--
Marcus Williams - http://www.onq2.com
Quintic Ltd, 39 Newnham Rd, Cambridge, CB3 9EY
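
P.S. For anyone who wants to try the subject-similarity check described above, here is a rough sketch using difflib.SequenceMatcher. It is not the code running on my server: the subject list, the 0.8 threshold, the header name X-Spam-Subject-Score and the function names are all made up for illustration, and the threshold in particular would need tuning against real mail.

```python
import difflib

# Hypothetical entries; a real list would be built from mail already
# identified as spam.
KNOWN_SPAM_SUBJECTS = [
    "Make money fast from home",
    "Lose weight now, guaranteed results",
]

# Assumed cut-off between 0.0 and 1.0; tune it against your own mail.
SPAM_THRESHOLD = 0.8


def spam_score(subject, known_subjects=KNOWN_SPAM_SUBJECTS):
    """Return the best similarity ratio (0.0 to 1.0) against the known subjects."""
    subject = subject.lower().strip()
    best = 0.0
    for known in known_subjects:
        matcher = difflib.SequenceMatcher(None, subject, known.lower())
        best = max(best, matcher.ratio())
    return best


def tag_header(subject, threshold=SPAM_THRESHOLD):
    """Return an illustrative header line if the score is over the threshold, else None."""
    score = spam_score(subject)
    if score > threshold:
        return "X-Spam-Subject-Score: %.2f" % score
    return None


if __name__ == "__main__":
    # The random number tacked on the end defeats an equality or CRC check,
    # but the subject still scores close to the known one.
    print(tag_header("Make money fast from home 48213"))
    print(tag_header("Meeting agenda for Tuesday"))
```

Each incoming subject is compared against every known subject, so the cost grows with the size of the list. That is tolerable for short subject lines and is the same overhead that makes the check unattractive for whole message bodies.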