On Fri, 9 Mar 2012 16:38:49 +0100
Matus UHLAR - fantomas wrote:

> You can of course configure mailer to train automatically on anything 
> received/delivered.  However this would apparently cause much more
> FP's and FN's rate than letting user train only those that misfire.

The use of the word "apparently" never inspires much confidence. I'm
guessing that you don't have any real evidence.


> >If you're going to train on error then train on the right error, not
> >a rarer, correlated error.
> 
> The only error that really matters is the one that causes misfiring.

No, it isn't. Bayes is a statistical filter; it needs to learn a lot
of diverse spam and ham to reach its optimum accuracy. It's been
demonstrated on Bogofilter that "train-on-everything" outperforms
"train-on-error" on the same corpora. They both end up with similar
accuracy, but "train-on-everything" gets there very much faster.
Bogofilter is almost identical to BAYES; they just differ in the
details of the tokenizer and the Robinson parameters.
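
To make the two regimes concrete, here is a rough Python sketch. It is
only an illustration under loud assumptions: ToyBayes is a made-up toy
token counter, not the algorithm used by BAYES or Bogofilter, and the
five-message stream is invented. It shows only how many messages each
policy ends up learning from, not anything about accuracy.

```python
import math
from collections import Counter


class ToyBayes:
    """Hypothetical toy classifier; NOT SpamAssassin's or Bogofilter's algorithm."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.trained = {"spam": 0, "ham": 0}

    def learn(self, tokens, label):
        # Count each token once per message.
        self.counts[label].update(set(tokens))
        self.trained[label] += 1

    def classify(self, tokens):
        # Sum of smoothed log-likelihood ratios; positive means spam.
        score = 0.0
        for tok in set(tokens):
            s = (self.counts["spam"][tok] + 1) / (self.trained["spam"] + 2)
            h = (self.counts["ham"][tok] + 1) / (self.trained["ham"] + 2)
            score += math.log(s / h)
        return "spam" if score > 0 else "ham"


def train_on_everything(clf, stream):
    # Learn every message, whether or not it was classified correctly.
    for tokens, label in stream:
        clf.learn(tokens, label)


def train_on_error(clf, stream):
    # Learn a message only when the current classifier gets it wrong.
    for tokens, label in stream:
        if clf.classify(tokens) != label:
            clf.learn(tokens, label)


# Invented example messages, already labelled by a human.
stream = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("buy cheap watches now".split(), "spam"),
    ("monday lunch meeting".split(), "ham"),
    ("cheap pills cheap watches".split(), "spam"),
]

tee, toe = ToyBayes(), ToyBayes()
train_on_everything(tee, stream)
train_on_error(toe, stream)

tee_total = sum(tee.trained.values())  # messages learned, train-on-everything
toe_total = sum(toe.trained.values())  # messages learned, train-on-error
print(tee_total, toe_total)
```

On the same stream, the train-on-error database sees far fewer
messages, which is the point about it taking longer to build up a
diverse token database.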

Training on SA misclassifications is going to be glacially slow.
