On Wed, 18 Mar 2015 23:57:13 +0100
Reindl Harald wrote:

> 
> On 18.03.2015 at 23:34, RW wrote:
> > On Wed, 18 Mar 2015 22:46:14 +0100
> > Reindl Harald wrote:
> >>
> >> frankly i trained over months with *hand chosen* mail samples and
> >> spent nearly two weeks day and night removing bayes-poisoning from
> >> the samples and rebuilding bayes from scratch, which reduced the
> >> ntokens from 1700000 to 1500000
> >
> > Why did you remove the Bayes-poison?
> 
> because now BAYES_00 for legit mail is at 87% of all scanned
> messages, BAYES_50 dropped from 10% to 4% and the milter-rejects are
> still at around 8-10% with just 10 instead of 150 flagged messages on
> a userbase with 1200 valid RCPTs
> 
> because finally the bayes has a quality where it needs little to no
> further training at all in combination with other filters
> 
> over the long run the poison leads to more and more legit mail getting
> a higher score than deserved, the FP rate increases and in the end you
> need to lower the reject score, passing more junk, because of user
> complaints - at that point the spammers won, you need to reset bayes
> sooner or later and start training from scratch
> 
> that's not theory, i observed that behavior over many years with
> commercial appliances using SA behind the scenes with auto-learning
> enabled

This has nothing to do with auto-learning. There is a difference between
mistraining and training with spam that contains so-called "Bayes
poison". Bayes is best trained on what is in real-world spam, not
what we would prefer that spammers put in spam.
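
To make the distinction concrete, here is a toy sketch. It is not
SpamAssassin's actual Bayes code; the corpus, the token names and the
function token_spam_probability are all made up for illustration, and
the estimate is a simple Graham-style frequency ratio. It only shows
the general mechanism: a "poison" word that turns up in trained spam
and in trained ham alike ends up near 0.5, so it carries little weight
either way, while a word seen only in spam stays a strong indicator.

```python
from collections import Counter


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Hypothetical per-token estimate: spam frequency over total frequency."""
    spam_freq = spam_counts[token] / n_spam
    ham_freq = ham_counts[token] / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen in training: treat as neutral
    return spam_freq / (spam_freq + ham_freq)


# Made-up training corpus: each message is reduced to its set of tokens.
# "meeting", "report" and "invoice" play the role of poison words that
# spammers copy from ordinary mail.
spam_msgs = [
    {"viagra", "cheap", "meeting", "report"},
    {"viagra", "offer", "meeting", "invoice"},
]
ham_msgs = [
    {"meeting", "report", "invoice"},
    {"meeting", "lunch", "report"},
]

# Count in how many messages of each class a token appears.
spam_counts = Counter(t for m in spam_msgs for t in m)
ham_counts = Counter(t for m in ham_msgs for t in m)

p_meeting = token_spam_probability(
    "meeting", spam_counts, ham_counts, len(spam_msgs), len(ham_msgs))
p_viagra = token_spam_probability(
    "viagra", spam_counts, ham_counts, len(spam_msgs), len(ham_msgs))
p_report = token_spam_probability(
    "report", spam_counts, ham_counts, len(spam_msgs), len(ham_msgs))

print(p_meeting)  # 0.5: appears in every spam and every ham, so neutral
print(p_viagra)   # 1.0: appears only in spam
print(p_report)   # below 0.5: more common in ham than in spam here
```

Mistraining would be a different thing entirely: filing a ham message
under spam_msgs (or the reverse), which skews the counts for every
token in it.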
