Jim Maul wrote:
Marc Perkel wrote:
Perhaps what I need to do is get rid of autolearn and write my own learning system that strips the body from messages with images and learns only the headers. My problem is that when users get image spam, they put it in the spam folders and it gets learned. But the text in the image spam causes ham-type text to be learned as spam, which gives ham higher scores.
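
A minimal sketch of the header-only idea described above, in Python. It is only an illustration, not anyone's actual setup: the function names `has_image_part` and `headers_only` are hypothetical, and it assumes that "messages with images" means any message carrying an `image/*` MIME part. The output would then be passed to whatever learner is in use.

```python
import re
from email import message_from_bytes


def has_image_part(msg):
    """Return True if any MIME part of the parsed message is an image."""
    return any(part.get_content_maintype() == "image" for part in msg.walk())


def headers_only(raw_bytes):
    """Return the raw message, reduced to its header block if it carries an image.

    Messages without an image part are returned unchanged, so ordinary
    text spam and ham would still be learned in full.
    """
    msg = message_from_bytes(raw_bytes)
    if not has_image_part(msg):
        return raw_bytes
    # The header block ends at the first blank line (LF or CRLF endings).
    header_block = re.split(rb"\r?\n\r?\n", raw_bytes, maxsplit=1)[0]
    # Assumption: an empty body is acceptable to the learner. The headers
    # still describe a multipart message, which a stricter tool may dislike.
    return header_block + b"\n\n"
```

Cutting the raw bytes at the first blank line, rather than rebuilding the message, keeps the headers byte-for-byte as received, which seems preferable when the headers are the only thing being learned.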



Are you sure of this? Have you also trained these ham messages to counter this effect? Not too long ago we were in the same situation. I have autolearn enabled, but I have adjusted the thresholds to avoid learning false positives/negatives. We were getting ham (although arguably newsletter-type ham) that was hitting BAYES_99. As soon as I started training those messages as ham, the problem went away. Spam is still detected correctly by Bayes, and these newsletters no longer hit BAYES_99.

-Jim


What I think my problem might be is that I have done so much work prescreening messages with Exim that what's left isn't good stock for autolearn. I think what I need is a separate dedicated learner server that is selective and smart about what it learns.
