"Jason R. Mastaler" <[EMAIL PROTECTED]> writes:

> User Witr <[EMAIL PROTECTED]> writes:

> > The reason I wanted to do it in the middle of the TMDA pipeline is
> > that the (potentially expensive) classification operation could be
> > avoided in all but the relatively few cases where confirmation was
> > indicated (about 10% of the time for me).

> The success of the Bayesian technique that Paul Graham describes
> depends on a rich history of both spam and non-spam messages being
> assembled.  Thus, you'd want it to see all your mail, not just 10% of
> it.

Good point, but I'm afraid that Paul Graham's technique also relies on
having a "delete-as-spam" button, in addition to an ordinary "delete"
button.

So, if it were filter-then-TMDA, the problem could be that TMDA saves
you from ever seeing the incoming spam.  You would never hit the
delete-as-spam button, and thus you would never improve the filter.

Worse, it might even be the case that the filter would assume, since
you never selected delete-as-spam, that the spam that got through
wasn't even spam.  The filter might get worse and worse...
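
To make the two failure modes concrete, here is a toy sketch in
Python.  It is not Paul Graham's code and not part of TMDA; the names
Corpora, train, reaches_user and unflagged_means_ham are all
hypothetical, and the token counting is only a stand-in for a real
filter's spam and non-spam corpora.  It assumes the filter learns
solely from what the user does with mail that reaches the inbox.

```python
from collections import Counter


class Corpora:
    """Toy token counts standing in for the spam and non-spam corpora."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()


def train(corpora, messages, unflagged_means_ham):
    """Update the corpora from (text, is_spam, reaches_user) tuples.

    reaches_user is False when TMDA holds the message for confirmation,
    so the user never gets the chance to press delete-as-spam.
    """
    for text, is_spam, reaches_user in messages:
        tokens = text.lower().split()
        if reaches_user and is_spam:
            # User sees the spam and presses delete-as-spam.
            corpora.spam.update(tokens)
        elif reaches_user:
            # User sees legitimate mail and uses the ordinary delete.
            corpora.ham.update(tokens)
        elif unflagged_means_ham:
            # Held by TMDA and never flagged: assumed to be non-spam.
            corpora.ham.update(tokens)
        # Otherwise the held message gives the filter no feedback at all.


mail = [
    ("cheap pills now", True, False),   # spam, held by TMDA
    ("lunch at noon", False, True),     # real mail, delivered
]

starved = Corpora()
train(starved, mail, unflagged_means_ham=False)

poisoned = Corpora()
train(poisoned, mail, unflagged_means_ham=True)
```

In the first run the spam corpus stays empty, so the filter never
improves.  In the second run the spam tokens are counted as non-spam,
which is the "worse and worse" case.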

Does this make sense?

I guess you could periodically "reseed" your filter's repositories of
spam and non-spam, but that could cause other problems that Paul
mentions in his article.

- Sam
_____________________________________________
tmda-users mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-users
