It doesn't really work that way. Bayes is just one part of the picture, and to get good results you have to turn the full toolkit loose on the problem; I'm not sure Bayes by itself should be expected to achieve 95% recognition anyway. The main flaw in your current plan is that once you re-activate the blacklists, your Bayes data will start to go stale, and effectiveness is then likely to decline over time. Bayes tends to work better when trained continuously on current traffic. Rather than switching off your other tools just to collect some spam to train with, perhaps you should focus on training Bayes more actively with the spam that still gets through.
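
For what it's worth, here is a minimal sketch of that kind of ongoing training, assuming you save missed spam and known-good mail into two separate directories. The function name and both paths are placeholders I made up for illustration; only the sa-learn options (--spam, --ham, --dump magic) are the real ones. It defaults to a dry run that just prints the commands.

```python
import shlex
import subprocess


def train_bayes(missed_spam_dir, ham_dir, dry_run=True):
    """Feed hand-sorted mail to sa-learn and return the commands used.

    Both directory arguments are hypothetical placeholders; point them at
    wherever you actually save missed spam and known-good mail.
    """
    commands = [
        ["sa-learn", "--spam", missed_spam_dir],  # spam that slipped through
        ["sa-learn", "--ham", ham_dir],           # legitimate mail, for balance
        ["sa-learn", "--dump", "magic"],          # show Bayes DB counters (nspam, nham, ...)
    ]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True raises if sa-learn exits with an error
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only prints the commands; pass dry_run=False to execute them.
    train_bayes("/path/to/missed-spam", "/path/to/ham")
```

Run from cron or by hand every so often, something like this keeps Bayes learning from current traffic while the blacklists stay switched on.
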

You're not likely ever to detect ALL the spam traffic, no matter what combination of tools you deploy - there will always be clever spammers working on ways to bypass them.

>>> tonjg <t...@freeuk.com> 03/18/10 11:04 AM >>>
Matus UHLAR - fantomas wrote:
>
>> DNS available?
>> no
>
> well, why? DNS helps very much for catching spam. all blacklists use DNS
> (afaik)

sorry, when you said dns I didn't know you were referring to the dnsbl's. I know the black lists are excellent for filtering spam but I've got those switched off so I can actually accumulate some spam for the sa-learn. I figured if I get spamassassin working really well first (ie: a 95% success rate) I would then switch the bl's back on and use both.