On Thu, 2002-09-26 at 22:11, Ben Beeson wrote:
> Aloha,
>
> I am pondering SpamAssassin for my box. The volume I now get after
> becoming a moderator on a mailing list is pretty disgusting. I find it
> even more depressing that most of the UCE does not have opt-outs. Since
> I can't opt-out, I need a better filter.

Please be aware that you should NEVER opt out! In most cases it may stop
that particular source of spam, but many spammers sell their opt-out lists
to other spammers, because the addresses on them are known and confirmed to
be active. This is also why many Linux mail clients do not display images in
incoming mail by default. An image URL can easily carry a uniquely
identifiable code that tells the spammer "I'm an active, legit address.
Spam me more!"

> I read the fine docs and understand that I need procmail and a few Perl
> modules etc to work with SpamAssassin. What I am not sure of is whether
> or not I need to use a tool like fetchmail to fetch my mail from the ISP
> before I can filter it. Anybody familiar with this process?

1) fetchmail downloads mail from your POP3 accounts.
2) procmail does general filtering. You can use simple regular expression
   matching to sort messages into different mailboxes (for mailing lists,
   for example).
3) procmail can pass all messages left over after your mailing list rules
   to spamassassin.
4) spamassassin analyzes the headers and body of each message to calculate
   a spam "score". If the score is above a configurable threshold, procmail
   can filter the message.

I set my procmail to put anything with a score of 5.5 or higher into my SPAM
folder, which I review every once in a while just to make sure spamassassin
didn't guess wrong. If you set your spam threshold too low, there is a
chance that legit mail will be identified as spam, so be careful. Most folks
should probably set their threshold to perhaps 10 or 15, and add Vipul's
Razor to spamassassin's checks to increase spam detection accuracy. I've
seen perhaps 99.99% effective spam filtering, with only 2 out of 2000
filtered messages being false positives (legit mail incorrectly identified
as spam). Nobody should use SpamAssassin without studying how it works and
carefully adjusting the settings.

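To make the threshold step concrete, here is a rough Python sketch of the decision made after spamassassin has scored a message. It is only an illustration, not how procmail or spamassassin actually implement it. It assumes the message already carries an X-Spam-Status header of the form "Yes, score=7.2 required=5.5" (older releases write "hits=" instead of "score="), and the function and folder names are made up for the example.

```python
import re
from email import message_from_string

# Assumed header shape: "X-Spam-Status: Yes, score=7.2 required=5.5 ..."
# (older SpamAssassin releases write "hits=" instead of "score=").
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")

def spam_score(raw_message):
    """Return the score from X-Spam-Status, or None if the header is absent."""
    msg = message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None

def choose_folder(raw_message, threshold=5.5):
    """Hypothetical router: 'SPAM' at or above the threshold, else 'inbox'."""
    score = spam_score(raw_message)
    if score is not None and score >= threshold:
        return "SPAM"
    return "inbox"

tagged = "X-Spam-Status: Yes, score=7.2 required=5.5\nSubject: Buy now\n\nbody\n"
print(choose_folder(tagged))                  # SPAM
print(choose_folder(tagged, threshold=10.0))  # inbox
```

The same message lands in SPAM at 5.5 but stays in the inbox at 10, which is the trade-off described above: a higher threshold catches less spam but is less likely to misfile legit mail.
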
I plan on deploying it site-wide on several organizations' e-mail servers
this year. I will use more liberal settings that may be only 95% effective
at filtering spam, but that should reduce the chance of false positives to
practically nothing.

http://www-106.ibm.com/developerworks/linux/library/l-spam/?t=gr%2clnxw03=StampSpam

Here's a very helpful IBM article about SpamAssassin, with links to more
helpful information.

http://razor.sourceforge.net/

You should also take a look at how Vipul's Razor works. It is another very
interesting spam filtering system that uses a distributed checksum network.
Very advanced stuff, and it is fairly easy to add Razor filtering to
SpamAssassin's other checks.
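
For a feel of what a "checksum" filter does, here is a toy Python sketch. It is NOT Razor's actual protocol or signature scheme. It simply assumes a SHA-1 digest over the whitespace-normalized body, with a local set standing in for the shared catalogue that a real network would host. All names are hypothetical.

```python
import hashlib

def body_checksum(body):
    """SHA-1 hex digest of the body with whitespace runs collapsed."""
    normalized = " ".join(body.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

# Stand-in for the shared catalogue that a real network would host.
reported_spam = set()

def report(body):
    """Record a message body as spam so later copies can be recognized."""
    reported_spam.add(body_checksum(body))

def is_known_spam(body):
    """True if an identical (after normalization) body was reported before."""
    return body_checksum(body) in reported_spam

report("Make money fast!\n  Click here.")
print(is_known_spam("Make money fast! Click here."))  # True
print(is_known_spam("Lunch on Friday?"))              # False
```

The point of sharing the catalogue is that once one recipient reports a message, everyone else checking the same checksum can reject their copy without analyzing it.
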