Hi Tom,

I don't post to the list very often, since there are others around who have more expertise and time than I do to offer suggestions and fixes. I also haven't used any of what I learned in my university statistics courses for at least 10 years :) But I thought I might weigh in on this idea of multiple statistical filters to try to achieve better accuracy.

I like the idea of a multi-layered approach, and we use one ourselves: first some of the built-in features of Postfix, then Amavis, and then DSPAM. 90% of the spam at our site (175 users) is blocked before it even reaches DSPAM. It works pretty well and we're way over 99% accuracy.

The problem, I think, with a multi-DSPAM setup as you described is that DSPAM (and perhaps any statistical filter?) relies on being able to "see" a fairly balanced set of spam and ham messages in order to get good accuracy. If it sees too much spam (a high spam ratio) it causes problems, and vice versa. This would probably happen to any subsequent DSPAM filter you put in place: it would receive hardly any spam, so long as the first DSPAM instance was doing its job well. So, for example, if the first layer of your multi-DSPAM approach was getting, say, 99% accuracy, that would make the second layer's spam ratio 1/100, which would make statistical filtering difficult.

I suppose you could try to chain together some kind of script that would corpus train the second DSPAM instance with the spam caught by the first, but that seems like a lot of extra work for minimal gains. Just my opinion.

If you do decide to give it a try and get any significant gains, please drop me a line. I'd be interested to see how you did it.

Jeff

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Allison
Sent: January 15, 2007 8:35 AM
To: [email protected]
Subject: Re: [dspam-users] multiple use?

> > But if it was truly a bad idea then why do so many people use
> > multiple filters to capture spam?
>
> Do they? Is recycling the same message base repeatedly through the
> same badly configured filter using "multiple filters"? If you want to
> use multiple filters, then use multiple filters.
>
> --Tonni

Lighten up just a little bit, OK?

Yes, there are a LOT of people who use more than one tool for filtering spam, and with good reason. Each has a slightly different approach to solving the problem, and together they can be 99.99% accurate by any measure. And many of these people know a hell of a lot more than you give them credit for. How do you know that they are using badly configured filters?
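
For anyone who wants to play with the spam-ratio arithmetic in the reply above, here is a rough Python sketch. It is only a back-of-the-envelope model and not anything DSPAM provides: the function name and its parameters are made up for illustration, "99% accuracy" is read as "the first layer catches 99% of spam", and the share of spam in the incoming mail is an assumption you supply yourself.

```python
def second_layer_spam_share(incoming_spam_share,
                            first_layer_catch_rate,
                            first_layer_false_positive_rate=0.0):
    """Estimate the fraction of spam in the mail that reaches a second filter.

    All three arguments are fractions between 0 and 1. This is a hypothetical
    helper for illustration only; it is not part of DSPAM.
    """
    # Spam that slips past the first layer.
    spam_through = incoming_spam_share * (1.0 - first_layer_catch_rate)
    # Ham that the first layer correctly lets through.
    ham_through = (1.0 - incoming_spam_share) * (1.0 - first_layer_false_positive_rate)

    total_through = spam_through + ham_through
    if total_through == 0:
        raise ValueError("no mail reaches the second layer under these rates")
    return spam_through / total_through


# Assumed scenario: half of the incoming mail is spam and the first layer
# catches 99% of it with no false positives.
balanced = second_layer_spam_share(0.5, 0.99)

# Assumed scenario: 90% of the incoming mail is spam, same first layer.
spam_heavy = second_layer_spam_share(0.9, 0.99)
```

With a balanced incoming stream the second layer sees about 1% spam, which matches the 1/100 figure above. With a spam-heavy stream (90% spam coming in) it still sees only about 8% spam, so the second filter would be training on a badly skewed corpus either way.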
