Re: How to disable autolearn for FuzzyOcr?

Frank Bures Tue, 17 Oct 2006 06:17:04 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 16 Oct 2006 15:16:19 -0400 (EDT), Daniel T. Staal wrote:


>On Mon, October 16, 2006 3:07 pm, Marc Perkel said:
>> What needs to be done with messages that are spam is to only learn the
>> headers and not the body of the message. What needs to be done is some
>> detection of deliberate bayes poisoning and removal of the poison before
>> learning.
>
>In all honesty: Why?  Bayes, by design, handles that by learning any of
>the words that are preferentially in spam or ham, and tossing the rest.
>It is highly unlikely that their attempts at poisoning the database are
>going to do anything other than give them a *higher* spam score, while
>not affecting your ham much, if at all.
>
>Even if you could decide which words would be bayes-poison, it would vary
>by each email and each user/database.
>
>Ignore it.  Let Bayes do what it is supposed to do.  The only thing I've
>seen that is at all effective against SA's Bayes implementation is empty
>messages.  Which are pretty useless, and screenable with other rules.
>
>Daniel T. Staal


After a week of running FuzzyOCR I have to agree.  I take back my original
query :-)  Everything seems to be perfectly fine with Bayes, and we are
processing some 100k messages a day.
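
To illustrate Daniel's point about poison words, here is a rough sketch of
a Robinson-style token probability estimate.  It is a simplified toy, not
SpamAssassin's actual Bayes code: the function names, the sample messages
and the smoothing values s=1.0 and x=0.5 are all assumptions made for the
example.

```python
from collections import Counter

def train(messages):
    """Count in how many messages each token appears (hypothetical helper)."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def token_prob(token, spam, ham, n_spam, n_ham, s=1.0, x=0.5):
    """Smoothed estimate of P(spam | token); s and x are assumed values."""
    n = spam[token] + ham[token]
    if n == 0:
        return x  # never seen in training: stays neutral
    spam_freq = spam[token] / n_spam
    ham_freq = ham[token] / n_ham
    p = spam_freq / (spam_freq + ham_freq)
    # Pull the raw estimate toward the neutral value x when evidence is thin.
    return (s * x + n * p) / (s + n)

# Toy corpora. The second spam carries "poison": a rare word ("xylophone")
# and a word common in ham ("meeting").
spam_msgs = [
    "buy cheap pills now",
    "cheap pills online xylophone meeting",
]
ham_msgs = [
    "meeting agenda for monday",
    "notes from the monday meeting",
    "lunch after the meeting",
]

spam = train(spam_msgs)
ham = train(ham_msgs)

probs = {
    t: token_prob(t, spam, ham, len(spam_msgs), len(ham_msgs))
    for t in ("pills", "xylophone", "meeting", "zebra")
}

for token, prob in probs.items():
    print(f"{token:10s} {prob:.3f}")
```

In this toy data the rare poison word ends up leaning towards spam because
it has only ever been seen in spam, the common ham word still leans towards
ham, and a word that was never trained on stays neutral.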


Frank Bures, Dept. of Chemistry, University of Toronto, M5S 3H6
[EMAIL PROTECTED]
http://www.chem.utoronto.ca
PGP public key: http://pgp.mit.edu:11371/pks/lookup?op=index&search=Frank+Bures
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 5.0 OS/2 for non-commercial use
Comment: PGP 5.0 for OS/2
Charset: cp850

wj8DBQFFNMmrih0Xdz1+w+wRAjGXAJsErRRwkrV9OSDUo8QkrKVYJUtIugCfbolD
v+79zSpDu27WPsxtD0ohHqs=
=cVPK
-----END PGP SIGNATURE-----
