On Tue, April 7, 2009 07:10, Thomas Hruska wrote:
> [email protected] wrote:
>>     Keith> How many hams and spams have you trained on?
>>     Keith> -Quite a few , around 350 spam mails, hams around 4500.
>>
>> This is way out-of-balance. Typically SpamBayes works best with
>> roughly equal numbers of ham and spam.

I have 14005 spam and 2679 ham. That's way out-of-balance too, but I can't
say that Spambayes isn't working well enough for me. No complaints here.

> While I agree that this is out of balance, Spambayes seriously needs to
> get its act together and stop allowing users to train on imbalances or
> messages classified correctly and allows users to reset the database
> periodically (the POP3 proxy server seriously needs a feature that allows
> you to do a complete reset of the database within the UI itself).

I use the procmail filter on Linux. No fancy GUI for me. I have never reset
my db, but I think it's just a simple matter of rm'ing the db file.

> The rule of thumb I follow is: Train on only one spam in ham and one
> ham in unsure. Skip training on messages I plan on filtering using my
> e-mail client (i.e. no point in training on messages I'm going to
> whitelist in the first place).

That's what I do too. I have some procmail rules that come before the
Spambayes incantation.

> Once I reach about 300 of each type, reset
> the database and start over.
>
> My problem is that 99.9% of my incoming mail is spam, so there is an
> imbalance by default. I am forced to delete unsures because the imbalance
> is so great. IMO, 'unsure' is an inappropriate word choice for the
> category. It causes many users to feel they need to tell Spambayes what
> is ham and spam. This, in turn, creates the imbalances they then
> experience.
>
> When was the last update to Spambayes? Time for a new version!

Are you using the stable version or the beta version?

--
Amedee
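
For illustration, the rule of thumb quoted above (train only on spam that landed in ham and ham that landed in unsure, skip anything a whitelist or client filter handles anyway, reset at about 300 of each) could be sketched roughly as below. This is only a sketch of one reading of that rule: every function and constant name here is hypothetical, and none of it is SpamBayes' own API.

```python
# Hypothetical sketch of the "train on mistakes only" rule of thumb quoted
# in this thread. None of these names come from SpamBayes itself.

RESET_THRESHOLD = 300  # "about 300 of each type", per the quoted message


def should_train(actual, classified_as, prefiltered=False):
    """Return True if a message should be trained on under the quoted rule.

    actual:        "ham" or "spam" (what the message really is)
    classified_as: "ham", "spam" or "unsure" (what the filter said)
    prefiltered:   True if a whitelist or mail-client rule handles it anyway
    """
    if prefiltered:
        return False  # no point training on mail that is filtered elsewhere
    if actual == "spam" and classified_as == "ham":
        return True   # spam that ended up in ham
    if actual == "ham" and classified_as == "unsure":
        return True   # ham that ended up in unsure
    return False      # everything else is left untrained


def needs_reset(n_ham, n_spam, threshold=RESET_THRESHOLD):
    """True once both trained counts have reached the threshold."""
    return n_ham >= threshold and n_spam >= threshold


def imbalance(n_ham, n_spam):
    """Ratio of the larger trained class to the smaller one."""
    return max(n_ham, n_spam) / min(n_ham, n_spam)
```

With the counts mentioned in this thread, `imbalance(2679, 14005)` is a little over 5 and `imbalance(4500, 350)` is nearly 13, so both databases are well away from the roughly equal numbers recommended earlier.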
