When I first started using SA many months ago, I used Bayes for a while, but stopped using it because I seemed to be having "issues" with the values it was assigning. At the time, I didn't have time to properly "feed" it, so I just disabled it.

Recently, I've decided to enable it again. Today, I gave it a go. I am running MailScanner w/ SA v2.63 w/ sendmail. I have /etc/mail/spamassassin/ as my SA directory -- and local.cf in it is softlinked to /opt/MailScanner/etc/spam.assassin.prefs.conf -- so that SA and MS w/ SA always use the same config. I believe that this is the proper way to do it.

I went into local.cf and here are the relevant lines within:

---
auto_whitelist_path /var/spool/spamassassin/auto-whitelist
auto_whitelist_file_mode 0600
bayes_path /var/spool/spamassassin/bayes
bayes_file_mode 0600
bayes_auto_expire 0
use_bayes 0
---

I made sure that /var/spool/spamassassin/ was empty and changed use_bayes from 0 to 1 -- and restarted MailScanner. I expected that, at this point, nothing would happen differently -- since I hadn't fed Bayes yet.

I then decided to feed it some ham. I have an mbox format file of 2700 ham messages. I did the following:

sa-learn --ham --showdots --mbox my_ham_file

It chugged at it for a while and then seemed to complete successfully. I then went and looked in /var/spool/spamassassin/ and saw that it had properly created bayes_journal, bayes_seen, and bayes_toks. bayes_toks was about 700k.

A few minutes later, I noticed that incoming HAM was being marked as SPAM suddenly! The headers all show BAYES_99 tags! It was suddenly tagging ALL messages as BAYES_99 likely spam. I looked at the bayes dir and bayes_toks had grown to 1.3MB in only a few minutes -- almost double what it was after I had fed it the ham. I assume it is autolearning already? Even though I haven't fed it spam yet?

In any case, I then turned use_bayes back off -- and wrote this email trying to determine what is going on.

I did a "sa-learn --dump magic" and I get NO output at all. It works for a few seconds and then just goes back to a prompt. "sa-learn -D --dump magic" gives:

---
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/games', keeping.
debug: PATH included '/home/jgoggan/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/games', keeping.
debug: Final PATH set to: /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/sbin:/usr/games:/home/jgoggan/bin:/sbin:/usr/sbin:/usr/games
debug: using "/usr/local/share/spamassassin" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/home/johnroot/.spamassassin/user_prefs" for user prefs file
debug: Score set 0 chosen.
debug: Initialising learner
---

That user prefs file is basically empty -- nothing in it that isn't a comment. Do I have to specify /etc/mail/spamassassin/local.cf? I thought it would use that automatically since it has the site rules dir correct. I also tried doing the "--dump magic" while specifying the DBPATH and such just to be sure -- no difference.

Any thoughts/suggestions/corrections?

Also -- can I turn autolearn off somewhere? I only want Bayes to learn from emails that I specifically FEED with sa-learn --ham and --spam.

Thanks much!

- John...
