On Jan 20, 2009, at 9:39, Karsten Bräckelmann <guent...@rudersport.de> wrote:

On Tue, 2009-01-20 at 16:52 +0100, Matus UHLAR - fantomas wrote:
On 20-Jan-2009, at 08:04, Karsten Bräckelmann wrote:
You should also train low scoring (tagged) spam. Or, even better, train
those identified spam with a "low" Bayes score. Similar for ham.

On 20.01.09 08:19, LuKreme wrote:
I thought tagged spam was automatically learned by bayes?

Isn't that what bayes_auto_learn does?

Only if you set it up so, and only if it fulfills some expectations, e.g.
some minimal score, some minimal score by header checks, some minimal score
by body checks...

Total score of at least 12, body and header 3 each -- by default. This
is a safety measure. Also this is without Bayes and AWL. Plus some more
esoteric constraints I forgot.

Thus, yes, it makes perfect sense to manually learn low scoring spam.
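
As a rough illustration of the default thresholds quoted above, the auto-learn-as-spam check could be sketched in Python. This is a simplified model, not SpamAssassin's actual code: the function name and its parameters are hypothetical, the scores are assumed to already exclude Bayes and AWL, and the "more esoteric constraints" are left out.

```python
def would_autolearn_as_spam(total, header_points, body_points,
                            threshold=12.0, min_header=3.0, min_body=3.0):
    """Simplified model of the default auto-learn-as-spam check.

    All scores are assumed to be computed without Bayes and AWL rules.
    The additional constraints mentioned in the thread are not modelled.
    """
    return (total >= threshold
            and header_points >= min_header
            and body_points >= min_body)


# A message tagged as spam at 7.5 points is not auto-learned,
# so it is a candidate for manual training.
print(would_autolearn_as_spam(7.5, 3.5, 4.0))    # False
print(would_autolearn_as_spam(14.0, 5.0, 9.0))   # True
print(would_autolearn_as_spam(14.0, 1.0, 13.0))  # False: header points too low
```

Anything that returns False here but is clearly spam would, under this model, only reach Bayes through manual training.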


Manual training on any FPs/FNs that were not correctly autolearned is a
good idea.

Of course. Though those are the extremes only. Again, it also makes
sense to learn *correctly* classified mail, if it isn't auto-learned.
Even more so, if the Bayes value is close to 0.5 -- that's BAYES_50.

Teaching your Bayes about what's ham and spam, especially in the gray
area, will improve the results.
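
A minimal sketch of picking such gray-area mail for manual training might look like the following. The assumptions are mine, not from the thread: each message is a hypothetical record carrying its Bayes probability and whether it was already auto-learned, and the `margin` around 0.5 is an arbitrary illustrative value, not a SpamAssassin setting.

```python
from dataclasses import dataclass


@dataclass
class Message:
    path: str            # hypothetical location of the mail file
    bayes_prob: float    # Bayes probability, 0.0 (ham) to 1.0 (spam)
    autolearned: bool    # True if the message was already auto-learned


def manual_training_candidates(messages, margin=0.1):
    """Return messages that were not auto-learned and sit near Bayes 0.5."""
    return [m for m in messages
            if not m.autolearned and abs(m.bayes_prob - 0.5) <= margin]


inbox = [
    Message("spam/1", 0.99, True),    # already auto-learned, skipped
    Message("spam/2", 0.52, False),   # gray area, selected
    Message("ham/1", 0.45, False),    # gray area, selected
    Message("ham/2", 0.01, False),    # far from 0.5, skipped
]
for m in manual_training_candidates(inbox):
    print(m.path)
```

The selected messages would still need to be checked by hand before being passed to sa-learn as ham or spam.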

Thanks everyone. I will reässess how I throw spam and ham at sa-learn.
Learn something every day, right?

--
20-Jan-2009: It's a Brand New Day
