> From: Monte
> Sent: Monday, January 09, 2006 10:16 PM

<...>
> So, does it do any good to retrain on the correct ones, or
> does it just take a little extra time.

As Tim said, this is more complicated than it first looks. One thing to keep in mind is to avoid a large imbalance between the number of trained ham and the number of trained spam. Though there have been some notable exceptions, it seems that most people get the best results when the two counts are similar.

I don't think anyone has answered the question of how "balanced" the counts need to be before performance suffers. My best SWAG (sophisticated *-* guess) is that if you can keep them within a 2:1 ratio, you are probably in good shape. I try to keep it closer than that, but I can't prove it helps.

--
Seth Goodman
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
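
For anyone who wants to keep an eye on the 2:1 rule of thumb above, here is a minimal sketch in plain Python. It does not use any SpamBayes API: the function names and the `max_ratio` parameter are made up for illustration, and you would have to supply the trained ham and spam counts yourself. The 2.0 default is only the guess from this message, not a proven threshold.

```python
def training_ratio(num_ham, num_spam):
    """Return the larger trained count divided by the smaller one (>= 1.0)."""
    if num_ham < 0 or num_spam < 0:
        raise ValueError("counts must be non-negative")
    larger = max(num_ham, num_spam)
    smaller = min(num_ham, num_spam)
    if smaller == 0:
        # Nothing trained at all counts as balanced; one empty class does not.
        return 1.0 if larger == 0 else float("inf")
    return larger / smaller


def is_balanced(num_ham, num_spam, max_ratio=2.0):
    """True if the ham and spam counts are within max_ratio of each other.

    max_ratio=2.0 is an assumption taken from the 2:1 guess in this thread.
    """
    return training_ratio(num_ham, num_spam) <= max_ratio


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    print(is_balanced(300, 200))
    print(is_balanced(1000, 300))
```

The check is symmetric on purpose: too much ham relative to spam is treated the same as too much spam relative to ham.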
