Re: Bayes overtraining

David Jones Wed, 25 Jul 2018 11:00:01 -0700

On 07/25/2018 12:49 PM, Daniele Duca wrote:

Hi,
I'm evaluating incorporating CRM114 in my current setup and I was reading the FAQ about training the filter here: http://crm114.sourceforge.net/src/FAQ.txt
What made me rethink my current strategy were the following lines:

...

If you train in only on an error, that's close to the minimal change
necessary to obtain correct behavior from the filter.

If you train in something that would have been classified correctly
anyway, you have now set up a prejudice (an inappropriately strong
reaction) to that particular text.

Now, that prejudice will make it _harder_ to re-learn correct behavior on
the next piece of text that isn't right.  Instead of just learning
the correct behavior, we first have to unlearn the prejudice, and
_then_ learn the correct behavior.
...
In my current SA setup I use bayes_auto_learn along with some custom poison pills (autolearn_force on some rules), and I'm currently wondering if overtraining SA's Bayes could lead to the same "prejudice" problem as CRM114.
I'm thinking that maybe it would be better to use "bayes_auto_learn_on_error 1".
What is your preferred strategy? Train everything you can, or train only errors?
Daniele

I personally found in our customer mail flow that CRM114 and Bogofilter didn't help that much. A well-trained Bayesian DB with good meta rules and RBLs (Invaluement is a must), along with MTA checks/blocks, has worked out to be spot on for my mail flow.
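
For reference, the train-on-error setup asked about above might look roughly like the following in local.cf. This is only a sketch: the threshold values shown are the stock SpamAssassin defaults rather than tuned recommendations, and LOCAL_POISON_PILL is a hypothetical rule name standing in for whatever custom rules carry autolearn_force.

```
# local.cf (sketch; adjust to your own mail flow)
use_bayes 1
bayes_auto_learn 1

# Autolearn only when Bayes disagreed with the verdict being taught,
# i.e. train on error rather than on everything.
bayes_auto_learn_on_error 1

# Autolearn thresholds (these are the stock defaults).
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0

# Hypothetical "poison pill" rule: a hit forces autolearning.
# tflags LOCAL_POISON_PILL autolearn_force
```

Running "spamassassin --lint" after changing local.cf should catch syntax mistakes before the new settings go live.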


--
David Jones
