Matt Kettler wrote to Mike Samba and [EMAIL PROTECTED]:

> FWIW I use a combination of two sources for HAM training:
>
> 1) some selected chunks of my own email (ie: mailing lists not
> involving SA, personal email, etc)
>
> 2) I set up a "nonspamtrap" account, and I've subscribed this to a few
> of the newsletters my users commonly subscribe to.

Good sources. We provide "spam" and "nonspam" accounts so that our more proactive clients can forward spam and ham, particularly messages that were incorrectly classified. As long as they're instructed to forward such messages as attachments, the messages (attachments) come through unmolested.

I'm fortunate enough to personally own a domain that is now very close in spelling (same name, different TLD) to a domain used by a large ISP in our region. After seeing the postmaster logs on our email server, I set up an account to catch all of the incoming email on my domain. There are enough mistypes that I get several hundred messages per day for different recipients, including ham, spam, and viruses. It's the closest thing to broadly varied user email that we can get without violating our own privacy policy.

I have a staff member (otherwise known as our Resident SpamQueen) go through that, as well as our shared mailboxes (sales, support, etc), and train the filter. She has no problem finding 1000+ SPAM and HAM messages weekly. It's done wonders for our filtering.

If we didn't have such a good source of email, I guess I'd ask a small percentage of our customers to *voluntarily* allow us to use their accounts to train the filter... at which point we could just have the server FCC all of their messages to another shared mailbox on our system for our bodacious SpamQueen to traverse. That's trivial to implement on most systems.

Yes, filtering can be configured on a per-user basis, but we chose to make it as simple as possible for our clients (and for us), and go site-wide. So, the filtering may not be quite as precise, but at least *we* control the QoS, and we err on the side of caution.

It's worked remarkably well. We've been sustaining about 95% correctly filtered, with no false positives. Server-wide, our HAM:SPAM ratio is about 1.5:1.
With many personal accounts, though, it's more like 1:15 (90-95% SPAM), after viruses are taken out of the equation (but that's another tangent).

We'd be sunk without SpamAssassin.

- Ryan

-- 
Ryan Thompson <[EMAIL PROTECTED]>
SaskNow Technologies - http://www.sasknow.com
901-1st Avenue North - Saskatoon, SK - S7K 1Y4

Tel: 306-664-3600 Fax: 306-244-7037 Saskatoon
Toll-Free: 877-727-5669 (877-SASKNOW) North America
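
P.S. On forwarding as attachments: a message forwarded that way travels as a message/rfc822 part, so the original headers ride along inside it and can be pulled back out before training. What follows is only a minimal sketch, assuming Python's standard email package. The function name extract_forwarded, the example.com addresses, and the X-Original-Marker header are all made up for illustration, and this is not necessarily how any particular site handles its "spam" and "nonspam" accounts.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_forwarded(raw: bytes) -> list[bytes]:
    """Return every message/rfc822 attachment found in `raw`, serialized."""
    outer = message_from_bytes(raw, policy=policy.default)
    originals = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the forwarded message, with its own headers.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals


# Demo with invented data: an "original" message forwarded as an attachment
# to a hypothetical training address.
original = EmailMessage()
original["From"] = "sender@example.com"
original["To"] = "user@example.com"
original["Subject"] = "Cheap pills"
original["X-Original-Marker"] = "kept"
original.set_content("Buy now!\n")

forward = EmailMessage()
forward["From"] = "user@example.com"
forward["To"] = "spam@example.com"
forward["Subject"] = "Fwd: Cheap pills"
forward.set_content("Misclassified, please train on this.\n")
forward.add_attachment(original)  # attached as message/rfc822

found = extract_forwarded(forward.as_bytes())
recovered = message_from_bytes(found[0], policy=policy.default)
print(recovered["Subject"], recovered["X-Original-Marker"])
```

Each extracted message could then be written to a folder or mbox for whatever training tool is in use. An inline forward, by contrast, would leave only the forwarding user's headers for the filter to learn from.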
