Re: How to disable autolearn for FuzzyOcr?

Marc Perkel Mon, 16 Oct 2006 14:48:06 -0700

Daniel T. Staal wrote:

On Mon, October 16, 2006 3:07 pm, Marc Perkel said:

What needs to be done with messages that are spam is to learn only the
headers and not the body of the message. What is also needed is some
detection of deliberate Bayes poisoning and removal of the poison before
learning.


In all honesty: Why?  Bayes, by design, handles that by learning any of
the words that are preferentially in spam or ham, and tossing the rest.
It is highly unlikely that their attempts at poisoning the database are
going to do anything other than give them a *higher* spam score, while
affecting your ham little or not at all.


Even if you could decide which words would be Bayes poison, it would vary
with each email and each user/database.

Ignore it.  Let Bayes do what it is supposed to do.  The only thing I've
seen that is at all effective against SA's Bayes implementation is empty
messages.  Which are pretty useless, and screenable with other rules.
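
To illustrate the argument above, that Bayes keeps only the tokens that
lean clearly toward spam or ham and discards the rest, here is a minimal
Python sketch. It is not SpamAssassin's actual code: the smoothing is a
Robinson-style estimate, and the function names, the message counts, and
the 0.1 cutoff are all assumptions chosen for illustration.

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham,
                    strength=1.0, prior=0.5):
    """Smoothed estimate of P(spam | token) (Robinson-style; illustrative)."""
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    total = spam_freq + ham_freq
    # With no evidence at all, fall back to the neutral prior.
    raw = spam_freq / total if total > 0 else prior
    n = spam_hits + ham_hits
    # Pull rarely seen tokens toward the prior.
    return (strength * prior + n * raw) / (strength + n)


def significant_tokens(probs, min_distance=0.1):
    """Keep only tokens whose probability is clearly away from neutral 0.5."""
    return {t: p for t, p in probs.items() if abs(p - 0.5) >= min_distance}


# Hypothetical counts: (messages containing token in spam, in ham),
# out of 1000 learned spam and 1000 learned ham messages.
counts = {
    "viagra": (400, 2),     # strongly spammy
    "meeting": (3, 350),    # strongly hammy
    "weather": (200, 210),  # "poison" word that is just as common in ham
}
probs = {t: token_spam_prob(s, h, 1000, 1000) for t, (s, h) in counts.items()}
kept = significant_tokens(probs)

for token, p in sorted(probs.items()):
    status = "kept" if token in kept else "ignored"
    print(f"{token:8s} {p:.3f} {status}")
```

In this sketch the poison word lands near 0.5 and is ignored, because it
shows up about equally often in both corpora. The sketch also assumes that
the messages were learned under the correct label; it says nothing about
what happens when poisoned mail is learned as the wrong class.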

On my system I was getting so much poison email that it had actually reversed the Bayes filter: nonspam was getting a 1.0 score and spam was getting a 0.
