> Recently I've been getting some false negatives - essentially,
> email from friends that is suddenly placed in the questionable, or
> suspect folder when many other emails from these friends have been
> processed and seen as ham.
>
> I began by training as I went, no prior collection of email on
> which to train.

How balanced is your database? SpamBayes works best with a roughly
equal number of ham and spam trained. Unfortunately, retaining this
balance can be difficult to do, depending on the incoming mail stream.
Since SpamBayes learns quickly, retraining from scratch periodically is
sometimes a good idea, or adjusting the thresholds so fewer messages
are trained might help.

> I'm wondering - should the box that says "rebuild entire database"
> be checked by default?

That only has any effect when you are training via the "Training" tab
in the SpamBayes Manager. If you tick it, then the training that you do
will completely replace any existing training. If you don't, then the
training will be added on to whatever previous training has been done.

> Also, my database is now 2,536 kb - is this too large?

How many messages is that? (The General tab of the SpamBayes Manager
dialog tells you this.) It doesn't sound that large, but I don't use
the same database system, so I'm a bit rusty on what a normal bsddb
database size is.

=Tony.Meyer

--
Please always include the list (spambayes at python.org) in your
replies (reply-all), and please don't send me personal mail about
SpamBayes. http://www.massey.ac.nz/~tameyer/writing/reply_all.html
explains this.
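
As a rough illustration of the balance advice above, here is a minimal sketch that takes the ham and spam counts (read by hand from the General tab) and reports how lopsided the training is. The function name `training_balance` and the `max_ratio` cut-off of 2.0 are assumptions made for this example; they are not part of SpamBayes, and "roughly equal" has no official numeric definition.

```python
def training_balance(num_ham, num_spam, max_ratio=2.0):
    """Return (ratio, balanced) for a pair of training counts.

    ratio is the larger count divided by the smaller one.
    max_ratio is an assumed cut-off for "roughly equal";
    it is not a SpamBayes setting.
    """
    if num_ham <= 0 or num_spam <= 0:
        # With nothing trained on one side there is no meaningful ratio.
        return float("inf"), False
    ratio = max(num_ham, num_spam) / min(num_ham, num_spam)
    return ratio, ratio <= max_ratio


if __name__ == "__main__":
    # Hypothetical counts, typed in by hand from the Manager dialog.
    for ham, spam in [(500, 450), (120, 1400)]:
        ratio, ok = training_balance(ham, spam)
        print(f"ham={ham} spam={spam} ratio={ratio:.1f} balanced={ok}")
```

If the ratio comes out well above the cut-off you chose, that may be a hint that retraining from scratch, or training on fewer messages of the over-represented kind, is worth trying.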
