Thanks, Skip.

> "default_bayes_customize.ini".
> It's in "C:\Program Files\SpamBayes\bin\"

> Once you find it, just add the options I mentioned
> to the [Tokenizer] section and restart.

Is there any means of directly testing that the settings applied are actually taking effect?

> Were you the person with, like, 60,000 spams and a
> similar number of hams in your training set? Maybe
> try retraining from scratch. I have a total of
> about 400 emails in my training set and it works
> fine.

Yes. I'm concerned about the volume of spam I might receive if I were to try starting with a clean database. I get over 4,000 messages a day, with well over half of that being spam that I receive with the express purpose of analyzing spam to train my server to more effectively filter it. Starting with a blank database, even if it were significantly fine-tuned within the first day, would leave literally thousands of spam messages untrained in a single week.

Right now I'm having about 20-25 spam messages make it to my inbox each day after training with the 60k message ham+spam archives. At 4k messages per day, with probably 2500-3000 of them being spam, 20-25 getting through is at or less than 1%. I can live with that. It's far better than having a couple hundred per day.

I am considering backing up my current database and trying a new one (with the current one available as a fallback).

On a very timely related note, the following article was publicized by Frisk Software today:
http://www.secureworks.com/analysis/spamthru/

It discusses the use of virus-infected botnets for spamming.

Regards,

Shawn K. Hall
http://12PointDesign.com/
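P.S. On the question of checking the settings: one partial check, sketched below, is to parse the customize file and list what actually ended up in its [Tokenizer] section. This is only a rough sketch using Python's standard configparser module. The helper name read_tokenizer_options is made up for illustration, and the script only confirms that the file parses and contains the options. It does not prove that SpamBayes itself loaded them after the restart.

```python
import configparser
from pathlib import Path


def read_tokenizer_options(ini_path):
    """Return the options found in the [Tokenizer] section as a dict.

    Hypothetical helper for illustration; it is not part of SpamBayes.
    Note that configparser lower-cases option names by default.
    """
    # interpolation=None so a literal '%' in a value is not treated specially
    parser = configparser.ConfigParser(interpolation=None)
    with open(ini_path, encoding="utf-8") as f:
        parser.read_file(f)
    if not parser.has_section("Tokenizer"):
        return {}
    return dict(parser.items("Tokenizer"))


if __name__ == "__main__":
    # Path as given in the quoted message above
    ini = Path(r"C:\Program Files\SpamBayes\bin\default_bayes_customize.ini")
    if ini.exists():
        for name, value in read_tokenizer_options(ini).items():
            print(f"{name} = {value}")
    else:
        print(f"not found: {ini}")
```

If the options you added show up in the output with the values you expect, the file itself is at least well formed. Whether they change how messages are tokenized would still need to be confirmed from within SpamBayes.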
