On Wed, 18 Mar 2015 23:57:13 +0100
Reindl Harald wrote:

> On 18.03.2015 at 23:34, RW wrote:
> > On Wed, 18 Mar 2015 22:46:14 +0100
> > Reindl Harald wrote:
> >>
> >> frankly i trained over months with *hand chosen* mail samples and
> >> spent nearly two weeks day and night to remove bayes-poisoning from
> >> the samples and rebuild bayes from scratch leading in reduce the
> >> ntokens from 1700000 to 1500000
> >
> > Why did you remove the Bayes-poison?
>
> because now BAYES_00 in case of legit mail is at 87% of all scanned
> messages, BAYES_50 dropped from 10% to 4% and the milter-rejects are
> still at around 8-10% with just 10 instead 150 flagged message on a
> userbase with 1200 valid RCPT's
>
> because finally the bayes has a quality that it needs few to no
> further training at all in combination with other filters
>
> over the long the poison leads in more and more legit mail becoming
> a higher score as deserved, the FP rate increases and at the end you
> need to lower the reject score passing more junk because user
> complaints - at that point the spammers won, you need to reset bayes
> sooner or later and start from scratch with training
>
> that's not theory, i observed that behavior over many years with
> commercial appliances using SA behind the scenes and enabled
> auto-learning

This has nothing to do with auto-learning. There is a difference between mistraining and training on spam that contains so-called "Bayes poison". Bayes is best trained on what actually appears in real-world spam, not on what we would prefer spammers put in it.