Matt Kettler wrote:

At 03:56 PM 7/17/2004, Rakesh wrote:

But the feedback mechanism I have implemented does not read from any inbox folder. It works like this: any mail sent to [EMAIL PROTECTED] is fed to sa-learn by a Perl wrapper script. When I checked my logs for possible HAM feedback during that time period, I didn't find a single entry, which left me even more puzzled.


What about autolearning? Did you check for that? Recent versions of MailScanner will insert autolearn flags into the spam-hits header.
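
One way to check for this is to tally the autolearn flags in a mail log or a saved mailbox. The following is only a minimal sketch: it assumes the flag shows up as text of the form `autolearn=ham`, `autolearn=spam` or `autolearn=no`, as SpamAssassin reports it, and the function name `count_autolearn` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "autolearn=<value>" field, e.g. autolearn=ham,
# autolearn=spam, autolearn=no (assumed format, see lead-in).
AUTOLEARN_RE = re.compile(r"autolearn=([A-Za-z_]+)")


def count_autolearn(lines):
    """Tally autolearn outcomes found in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        for value in AUTOLEARN_RE.findall(line):
            counts[value.lower()] += 1
    return counts
```

Calling `count_autolearn(open(path))` on a log or mbox file (the path is whatever your setup uses) gives a count per flag value. A large `ham` count for a period in which no manual HAM feedback was logged would point at autolearning.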

Yes, autolearn may be the primary factor in the Bayes mess-up. Would it be OK if I autolearn only SPAM mails and not HAM mails? I think that may be wrong, though, since autolearning only SPAM messages will give rise to a lot of false positives now that spammers have started using words and phrases that make their mails look more HAMMY. I don't know what to do; I think I am getting confused. Should I force-expire my Bayes database and start building a new database afresh? Suggestions, please?



My next suspect is the Bayes DB expiry. I have read in a lot of documentation that we need to expire and rebuild the Bayes DB so that old tokens don't eat up disk space. But since I had plenty of hard drive space, I decided not to expire the database, and now my database size is 39 MB.


OUCH.. don't circumvent the expiry mechanism if you don't understand its full purpose. It's actually rather important because it weeds out garbage tokens.

A bayes DB that never expires is *highly* vulnerable to bayes poisoning.
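
To see how large the database has grown before deciding anything, the counters printed by `sa-learn --dump magic` can be pulled out with a few lines of Python. This is a rough sketch, not a definitive parser: it assumes each line has four whitespace-separated columns followed by `non-token data: <name>`, with the value in the third column, and `parse_magic` is a hypothetical helper name.

```python
def parse_magic(text):
    """Extract the 'non-token data' counters from `sa-learn --dump magic` output.

    Assumed layout per line: four whitespace-separated columns, then
    'non-token data: <name>', with the value in the third column.
    """
    stats = {}
    marker = "non-token data:"
    for line in text.splitlines():
        if marker not in line:
            continue  # skip anything that is not a counter line
        columns, name = line.split(marker, 1)
        fields = columns.split()
        if len(fields) >= 3 and fields[2].isdigit():
            stats[name.strip()] = int(fields[2])
    return stats
```

Comparing the `ntokens` value before and after running `sa-learn --force-expire` shows how many tokens an expiry run actually removed.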

Well, I am sorry about my wrong approach to the expiry mechanism.

regards
Rakesh


