On Monday 30 August 2004 09:24 pm, Rob Blomquist wrote:
> On Monday 30 August 2004 9:34 pm, John Andersen wrote:
> > On Monday 30 August 2004 05:40 pm, Rob Blomquist wrote:
> > > I am getting a about 2% of my mail that is clean HAM marked as spam by
> > > BAYES_99 as being 99-100% spam.
> > >
> > > What can I do about it? How does BAYES_99 pick spam? I assume its a
> > > baysian filter....
> > >
> > > Rob
> >
> > You need to save these mails that are falsely called spam
> > and use sa-learn to teach bays database that they are not
> > spam. Bayes filters need to be trained before they can
> > be trusted 100%
> >
> > man sa-learn has some info about this.
>
> sa-learn runs against all my mail at exactly 18:42 everyday. And it never
> seems to be getting it right. I ran it against 11,000 ham messages the
> night I reinstalled it, but still no help.
>
> Rob

Rob, you have to feed sa-learn two different sets of mail: one of known
spam and one of known ham. You also have to keep training it with
whatever spam it misses and whatever ham it wrongly tags.

I do this by having each user create two separate maildirs in their Mail
directory, one named NotSpam and the other named MissedSpam. Then I run
the following script nightly via cron. I had to hack it a bit because
the original was for mbox, not maildir.

#!/usr/bin/perl
###################################################################
# A script to automatically update SpamAssassin's Bayesian filter
# Michael Reynolds - [EMAIL PROTECTED]
# SpinWeb Net Designs - http://www.spinweb.net
###################################################################
use strict;
use warnings;

# set some variables
my $SA_LEARN          = "/usr/bin/sa-learn";
my $HOME              = "/home";
my $FOLDER_DIR        = "Mail";
my $MISSEDSPAM_FOLDER = "MissedSpam";
my $NOTSPAM_FOLDER    = "NotSpam";

# get a listing of users (one directory per user under $HOME)
opendir(my $home_dh, $HOME) or die "Cannot open $HOME: $!";
my @user = grep { !/^\./ && -d "$HOME/$_" } readdir($home_dh);
closedir($home_dh);

# learn every message in a maildir's cur/ directory as ham or spam,
# then delete the messages, but only if sa-learn exited successfully
sub learn_and_clear {
    my ($type, $folder) = @_;
    # if the folder does not exist, there is nothing to learn from
    opendir(my $dh, $folder) or return;
    my @msgs = grep { -f $_ } map { "$folder/$_" } readdir($dh);
    closedir($dh);
    return unless @msgs;
    if (system($SA_LEARN, "--$type", @msgs) == 0) {
        unlink @msgs;
    }
}

# loop and process
for my $u (@user) {
    # ham: clean mail that was wrongly marked as spam
    learn_and_clear("ham", "$HOME/$u/$FOLDER_DIR/$NOTSPAM_FOLDER/cur");
    # spam: spam that got through unmarked
    learn_and_clear("spam", "$HOME/$u/$FOLDER_DIR/$MISSEDSPAM_FOLDER/cur");
}

# rebuild the database
# (newer SpamAssassin releases call this option --sync)
system($SA_LEARN, "--rebuild");
--------------------------end

Note: wherever you see a trailing = sign, wrapping took place above.

-- 
_____________________________________
John Andersen
