Jim Maul wrote:
Marc Perkel wrote:
Perhaps what I need to do is get rid of autolearn and write my own
learning system that strips out the body of messages with images and
learns just the headers. My problem is that when users get image spam
they put it in their spam folders and it gets learned. But the text in
the image spam causes ham-type text to be learned as spam, which
causes ham to get higher scores.
Are you sure of this? Have you also trained these ham messages to
counter this effect? Not too long ago we were in the same situation.
I have autolearn enabled, but I have adjusted the thresholds to avoid
learning false positives/negatives. We were getting ham (arguably
ham, anyway: it was newsletter-type mail) that was hitting BAYES_99.
As soon as I started training those messages as ham, the problem went
away. Spam is still detected correctly by Bayes, and these newsletters
no longer hit BAYES_99.
-Jim
What I think my problem might be is that I have done so much work
prescreening messages with Exim that what's left isn't good stock for
autolearn. I think what I need is a separate dedicated learner server
that is selective and smart about what it learns.
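
For what it's worth, the header-only idea could be prototyped in a few lines. This is only a rough sketch, not a tested learner: the function names `has_image` and `headers_only` are made up for illustration, and it assumes that "image spam" simply means any message carrying an image/* MIME part. It uses nothing but Python's standard email package.

```python
from email import message_from_bytes


def has_image(msg):
    """Return True if any MIME part of the message is an image."""
    return any(part.get_content_maintype() == "image" for part in msg.walk())


def headers_only(raw):
    """Return the message with its body emptied if it carries an image part.

    Assumption: any image/* part marks the message as image spam.
    Messages without image parts are returned unchanged.
    """
    msg = message_from_bytes(raw)
    if not has_image(msg):
        return raw
    # Drop the headers that describe the body we are about to discard,
    # so the result is a plain message with headers and an empty body.
    del msg["Content-Type"]
    del msg["Content-Transfer-Encoding"]
    msg.set_payload("")
    return msg.as_bytes()
```

The stripped result would then be handed to whatever does the actual learning (sa-learn --spam, for instance) in place of the original message, so that only the header tokens are learned.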