bayes

Bret Miller Mon, 22 May 2006 08:29:05 -0700

> On Monday, 22 May 2006 01:12 Sergei Gerasenko wrote:
> > But I'm reading everywhere that it's not a particularly good idea.
>
> There's no single answer as to which method is best. I use a
> sitewide bayes, and it works well.
>
> sitewide:
> + spam learned helps all users
> + good when trained with 100% correct spam/ham
> - dangerous when learning false spam/ham
>
> user:
> + each user can train themselves
> - users most often train badly, incorrectly, or not at all (YMMV)
> - performance
> - disk space


We use site-wide bayes here too. While users can report FNs and FPs,
IT staff review the submissions before anything is actually learned.
This prevents people from training mailing lists they've signed up for
as spam; we just send the report back and say, try unsubscribing first.
The approach has worked fairly well for us here. The number of users
that actually report anything is probably around 5%, so I'd say that a
per-user system would be less effective for our users. (Either that or
the other 95% of users get no spam ever.)
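
For anyone wanting to script the review step, here is a minimal sketch
of the idea, not our actual setup. The Report class and the
learn_commands function are hypothetical names made up for
illustration; the only real pieces are sa-learn and its --spam and
--ham options. The sketch only builds the command lines for reports
that staff have approved, and leaves running them to you.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """A user submission waiting for staff review (hypothetical structure)."""
    path: str                        # file holding the reported message
    claimed: str                     # "spam" (reported FN) or "ham" (reported FP)
    approved: Optional[bool] = None  # None = not reviewed yet, False = rejected


def learn_commands(reports: List[Report]) -> List[List[str]]:
    """Build sa-learn command lines for staff-approved reports only."""
    commands = []
    for report in reports:
        if report.claimed not in ("spam", "ham"):
            raise ValueError(f"unknown claim: {report.claimed!r}")
        # Rejected and unreviewed reports never reach the site-wide database.
        if report.approved is not True:
            continue
        flag = "--spam" if report.claimed == "spam" else "--ham"
        commands.append(["sa-learn", flag, report.path])
    return commands
```

Each returned list could be handed to subprocess.run under whatever
account owns the site-wide bayes database. A rejected report (say, a
mailing list the user subscribed to) simply stays out of the output.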

Bret
