RE: Bayes Scores Skipped/Not Applied: HAPPY RESOLUTION
Hi Matt,

I resolved the issue. Thanks for pointing me in a different direction; the rubber has not been meeting the road for about a week on this issue!

After upgrading using CPAN I am getting BAYES scores (among others from the /usr/share/spamassassin dir). So apparently it was an installation issue.

    Dec 23 18:45:12 unixserv0 MailScanner[24894]: Message jBO2iqfX025133 from 68.76.136.35 ([EMAIL PROTECTED]) to tomsawyer.com is spam, SBL+XBL, spamcop.net, SpamAssassin (score=36.46, required 4, autolearn=spam, BAYES_50 0.00, FORGED_RCVD_HELO 0.14, PYZOR_CHECK 3.70, RATWARE_NAME_ID 4.10, RCVD_IN_BL_SPAMCOP_NET 1.56, RCVD_IN_DSBL 2.60, RCVD_IN_NJABL_DUL 1.95, RCVD_IN_XBL 3.90, SAVE_THOUSANDS 0.40, SUBJECT_EXCESS_BASE64 0.45, TO_CC_NONE 0.13, URIBL_AB_SURBL 3.81, URIBL_JP_SURBL 4.09, URIBL_OB_SURBL 3.01, URIBL_SC_SURBL 4.50, URIBL_WS_SURBL 2.14)

This leads me to a question about installation, since the INSTALL file says:

    [unzip/untar the archive]
    cd Mail-SpamAssassin-*
    perl Makefile.PL
    [option: add -DSPAMC_SSL to $CFLAGS to build an SSL-enabled spamc]
    make
    make install

This is how I have normally done it over the past few years: install to a prefix in /usr/local and then create a symbolic link to /usr/local/spamassassin, so that I can keep the previous version around in the event I need to recover from a bad upgrade. I don't recall ever having the difficulty that I had this time around.

    perl Makefile.PL --PREFIX=/usr/local/spamassassin-n.nn
    make
    make install
    ln -sf /usr/local/spamassassin-n.nn /usr/local/spamassassin

The problem is that the installer did not put the perl modules in the site directory (/usr/perl5/site_perl/../...

The SpamAssassin.pm module in the site dir now (after running CPAN) shows the updated version:

    #/usr/perl5# grep 3.00 /usr/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm
    $VERSION = "3.001000"; # update after release (same format as perl $])

The SpamAssassin.pm file from my original make, make install is here, but obviously was not being used:

    # ls -lsa /usr/local/spamassassin/lib/perl5/site_perl/5.8.0/Mail
    total 53
     1 drwxr-xr-x  3 root   512 Dec 12 16:08 ./
     1 drwxr-xr-x  4 root   512 Dec 12 16:08 ../
     1 drwxr-xr-x 10 root  1024 Dec 12 16:08 SpamAssassin/
    50 -r--r--r--  1 root 50780 Sep 13 19:07 SpamAssassin.pm

This is also version 3.001.000.

In the install directions, it does not say anything about building spamassassin and then moving perl modules manually. Am I missing something?

John
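The situation described above (two copies of SpamAssassin.pm on one machine, only one of which perl actually loads) can be checked for with a small script. What follows is only a sketch, not part of SpamAssassin or MailScanner: the function name find_sa_modules is made up for illustration, and it assumes the version line has the `$VERSION = "..."` form shown in the grep output above.

```python
import os
import re

# Matches an assignment line such as: $VERSION = "3.001000";
VERSION_RE = re.compile(r'^\s*\$VERSION\s*=\s*"([^"]+)"', re.MULTILINE)


def find_sa_modules(roots):
    """Return a sorted list of (path, version) for every Mail/SpamAssassin.pm under roots.

    version is None when no $VERSION assignment is found in the file.
    """
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            # Only accept SpamAssassin.pm that sits directly in a directory named Mail.
            if "SpamAssassin.pm" in filenames and os.path.basename(dirpath) == "Mail":
                path = os.path.join(dirpath, "SpamAssassin.pm")
                with open(path, encoding="latin-1") as handle:
                    match = VERSION_RE.search(handle.read())
                found.append((path, match.group(1) if match else None))
    return sorted(found)

# Example call; the roots are the directories quoted in this thread and are
# assumptions about the local layout, so adjust them before use:
# for path, version in find_sa_modules(["/usr/perl5/site_perl",
#                                       "/usr/local/lib/perl5",
#                                       "/usr/local/spamassassin/lib/perl5"]):
#     print(version, path)
```

If more than one path comes back, the copy that matters is the one on perl's module search path for the perl binary that runs MailScanner; the others are simply never loaded.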
RE: Bayes Scores Skipped/Not Applied
Here is some debugging info from MailScanner.

    Starting MailScanner...
    In Debugging mode, not forking...
    debug: SpamAssassin version 3.0.1
    debug: Score set 0 chosen.
    debug: running in taint mode? no
    debug: config: SpamAssassin failed to parse line, skipping: use_razor1 0
    debug: SpamAssassin version 3.0.1
    debug: Score set 0 chosen.
    Use of uninitialized value in concatenation (.) or string at /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 978.
    Use of uninitialized value in concatenation (.) or string at /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 980.
    debug: read_scoreonly_config: cannot open "": No such file or directory
    [SNIP]
    /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line 329.
    configuration file "/usr/share/spamassassin/20_dnsbl_tests.cf" requires version 3.001000 of SpamAssassin, but this is code version 3.01. Maybe you need to use the -C switch, or remove the old config files? Skipping this file at /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line 329.
    configuration file "/usr/share/spamassassin/20_drugs.cf" requires version 3.001000 of SpamAssassin, but this is code version 3.01. Maybe you need to use the -C switch, or remove the old config files? Skipping this file at /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line 329.
    [Snip]

The parser/version errors are numerous and indicate SpamAssassin 3.0 is being called. Odd. More debug...

    debug: bayes: 19802 tie-ing to DB file R/O /extra/system/spamassassin/autoDB/bayes_toks
    debug: bayes: 19802 tie-ing to DB file R/O /extra/system/spamassassin/autoDB/bayes_seen
    debug: bayes: found bayes db version 3
    debug: Score set 3 chosen.
    [Snip]

So it looks like it is finding bayes and scoring with it, but maybe it can't read the scores from the cf files in /usr/share/spamassassin because of a version conflict.

Yes, going through the log again I found that the bayes cf file could not be parsed because an older version of spamassassin is at work. Presumably this is why no scores from Bayes are being used:

    configuration file "/usr/share/spamassassin/23_bayes.cf" requires version 3.001000 of SpamAssassin, but this is code version 3.01. Maybe you need to use the -C switch, or remove the old config files? Skipping this file at /usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line 329.

I installed it with make, make install, so assuming that is part of the problem, I am going to try upgrading with CPAN and see if that resolves the issue.

John
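Since the skipped-file warnings are easy to miss in a long debug run, here is a rough sketch of a script that pulls them out of captured debug output. It relies only on the wording of the warning quoted above; the names SKIP_RE and skipped_configs are invented for this example and are not part of any SpamAssassin tool.

```python
import re

# Pattern built from the warning text quoted in this thread.
SKIP_RE = re.compile(
    r'configuration file "(?P<file>[^"]+)" requires version '
    r'(?P<required>\d+(?:\.\d+)*) of SpamAssassin, '
    r'but this is code version (?P<running>\d+(?:\.\d+)*)'
)


def skipped_configs(log_text):
    """Return (file, required_version, running_version) for each skipped config file."""
    # Collapse whitespace first so warnings that a mail client wrapped across
    # several lines are still matched.
    flat = " ".join(log_text.split())
    return [(m["file"], m["required"], m["running"]) for m in SKIP_RE.finditer(flat)]
```

Run against the debug output above, a hit for 23_bayes.cf would be the line to look for: if that file is skipped, the BAYES_xx rules and their scores are never loaded, even though the bayes database itself opens fine.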
RE: Bayes Scores Skipped/Not Applied
Hi Matt,

I stopped running spamd. The ALL_TRUSTED rule was letting a lot of junk get through, and I saw a post that recommended a score of 0 to prevent false negatives. I have restored it to its original value and added trusted networks (with a couple of subnets) as you suggest.

I am still not seeing any BAYES scores. Here are a few examples:

    Message jBO1kTfX011690 from 141.157.60.60 ([EMAIL PROTECTED]) to tomsawyer.com is spam, SBL+XBL, spamcop.net, SpamAssassin (score=8.364, required 4, RATWARE_RCVD_PF 3.60, SARE_GETFCK 0.68, URIBL_JP_SURBL 4.09)

    Message jBO1hXfX008995 from 24.175.86.36 ([EMAIL PROTECTED]) to tomsawyer.com is spam, SpamAssassin (score=6.711, required 4, SPF_HELO_SOFTFAIL 2.43, SUBJ_ILLEGAL_CHARS 4.28)

    Message jBO1fYfX008953 from 222.47.203.164 ([EMAIL PROTECTED]) to tomsawyer.com is spam, SBL+XBL, SpamAssassin (score=10.678, required 4, FORGED_MUA_OUTLOOK 4.06, MSGID_SPAM_CAPS 4.40, SARE_RECV_IP_222032 2.22)

    Message jBO1kTfX011690 from 141.157.60.60 ([EMAIL PROTECTED]) to tomsawyer.com is spam, SBL+XBL, spamcop.net, SpamAssassin (score=8.364, required 4, RATWARE_RCVD_PF 3.60, SARE_GETFCK 0.68, URIBL_JP_SURBL 4.09)

John
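Checking log lines by eye for a missing BAYES_xx rule gets tedious, so here is a small sketch that does it mechanically. It assumes the "SpamAssassin (score=..., required ..., RULE value, ...)" layout seen in the MailScanner log lines quoted in this thread; the function names parse_report and has_bayes_hit are hypothetical helpers, not MailScanner APIs.

```python
import re

# The parenthesised SpamAssassin summary inside a MailScanner log line.
REPORT_RE = re.compile(
    r"SpamAssassin \(score=(?P<score>-?[\d.]+), required (?P<required>-?[\d.]+), (?P<rest>[^)]*)\)"
)
# One "RULE_NAME 1.23" item; entries such as "autolearn=spam" do not match and are skipped.
RULE_RE = re.compile(r"^([A-Z][A-Z0-9_]*) (-?\d+(?:\.\d+)?)$")


def parse_report(line):
    """Return {'score', 'required', 'rules'} for a log line, or None if it has no report."""
    match = REPORT_RE.search(line)
    if match is None:
        return None
    rules = {}
    for item in match["rest"].split(", "):
        rule = RULE_RE.match(item.strip())
        if rule:
            rules[rule[1]] = float(rule[2])
    return {
        "score": float(match["score"]),
        "required": float(match["required"]),
        "rules": rules,
    }


def has_bayes_hit(line):
    """True when the report on this log line lists any BAYES_xx rule."""
    report = parse_report(line)
    return report is not None and any(name.startswith("BAYES_") for name in report["rules"])
```

Counting how many lines in a day's log return False from has_bayes_hit gives a quick measure of how often Bayes is being skipped.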
RE: Bayes Scores Skipped/Not Applied
Loren,

You are seriously paying attention. I did the debugs yesterday and completely rebuilt the bayes db today using a whitelist and a blacklist mailspool so it is now a lot smaller since it lost a couple of years of autolearning when I started over... So it is actually from a site bayes database that was recreated and not an issue of multiple databases.

John
Re: Bayes Scores Skipped/Not Applied
This seems strange:

> Here is sa-learn --dump magic:
> This shows that I have more than enough spam and ham
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0       3754          0  non-token data: nspam
> 0.000          0        220          0  non-token data: nham
> 0.000          0     312279          0  non-token data: ntokens
> 0.000          0 1051829432          0  non-token data: oldest atime
> 0.000          0 1135374012          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0 1135373049          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count

> [19915] dbg: bayes: DB journal sync: last sync: 1105811470
> [19915] dbg: bayes: corpus size: nspam = 153968, nham = 40588
> [19915] dbg: bayes: score = 4.91619356335349e-10
> [19915] dbg: bayes: DB expiry: tokens in DB: 1984629, Expiry max size: 15, Oldest atime: 1084312293, Newest atime: 0, Last expire: 1098687948, Current time: 1135280002
> [19915] dbg: bayes: DB journal sync: last sync: 1105811470

As I read that, the bayes db has 3754 spam and 220 ham. But later in processing it has 153968 spam and 40588 ham! This makes me think you have two different bayes databases under two different users, which would perhaps imply different user_prefs files, and one of them might not be enabling bayes.

Loren
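The comparison made above can be scripted. The sketch below reads the text of an sa-learn --dump magic listing and reports the nspam/nham counts, so that two listings (for instance, one per user) can be compared side by side. The function names are invented for illustration. The thresholds of 200 spam and 200 ham reflect, to my understanding, the defaults of the bayes_min_spam_num and bayes_min_ham_num settings; treat them as an assumption and check the local configuration.

```python
import re

# One "non-token data" line: probability, spam count, value, atime, then the label.
# The third numeric column carries the value for these magic entries.
MAGIC_RE = re.compile(r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$")


def parse_dump_magic(text):
    """Map each 'non-token data' label (e.g. 'nspam') to its integer value."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_RE.match(line)
        if match:
            values[match[2]] = int(match[1])
    return values


def bayes_is_trained(values, min_spam=200, min_ham=200):
    """True when both counts reach the minimums (200/200 assumed as defaults)."""
    return values.get("nspam", 0) >= min_spam and values.get("nham", 0) >= min_ham
```

With the figures quoted above (nspam 3754, nham 220) the check passes, though the ham count is only just over the assumed minimum. If a second database reports very different counts, that supports the two-databases theory.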
Re: Bayes Scores Skipped/Not Applied
John Urness wrote:

> /etc/mail/spamassassin/local.cf
> score ALL_TRUSTED 0 0 0 0

That is very concerning. Why'd you do that? 99.9% of the time the proper fix is to declare a trusted_networks. Disabling this rule merely covers up one symptom of a very pervasive problem (errant trust).

> use_bayes 1
> use_bayes_rules 1
> use_auto_whitelist 1
> bayes_auto_learn 1
> bayes_auto_expire 1
> bayes_expiry_max_db_size 20
> bayes_file_mode 0777
> auto_whitelist_path /extra/system/spamassassin/autoDB/auto-whitelist
> auto_whitelist_file_mode 0666
> bayes_path /extra/system/spamassassin/autoDB/bayes
> bayes_ignore_header X-MailScanner
> bayes_ignore_header X-MailScanner-SpamCheck
> bayes_ignore_header X-MailScanner-SpamScore
> bayes_ignore_header X-MailScanner-Information

Wait, why the MailScanner ignores? Are you using MailScanner? If so, stop running spamd. MailScanner uses the perl API, so you don't need spamd; it's just wasting memory to run it.
Bayes Scores Skipped/Not Applied
Hi,

I recently upgraded from spamassassin 3.0 to 3.1 and right away the number of false negatives increased. I thought at first that it was because of the loss of dcc and razor (which surely is a factor), but on further investigation it appears to be more related to the Bayes system. I have looked through the archives, but I am not finding what I need, although it seems like this question has been asked more than once in different ways.

I get almost no BAYES_xx or AWL scores on headers. I got one BAYES_00 tag between 18 December and 22 December, and it is my understanding that I should see them on almost all emails.

When I run spamassassin manually on a piece of spam, it applies a BAYES_xx score. Yet as far as I can tell the spamd daemon skips BAYES_xx scoring. It seems like this implies a simple *.cf file issue. spamassassin -D --lint shows no rules errors, and the site directory shows up as /etc/mail/spamassassin, which is where the local.cf file is. This is a sitewide installation.

Among other things, debug output is included for both spamd and spamassassin. This may be a bit overkill, but I am trying to anticipate what the group will ask for based on previous posts, so I have included debugging as well as sample headers of a false negative that was clearly spam, the local.cf file from /etc/mail/spamassassin, permissions on the bayes db files, etc.
Vital stats:

    Upgraded recently from spamassassin 3.0 to 3.1
    MailScanner 4.48
    Rules Du Jour *.cf files under /etc/mail/spamassassin
    Used sitewide in conjunction with MailScanner
    perl 5.8.0 running on Solaris 5.8

Here is sa-learn --dump magic. This shows that I have more than enough spam and ham:

    0.000          0          3          0  non-token data: bayes db version
    0.000          0       3754          0  non-token data: nspam
    0.000          0        220          0  non-token data: nham
    0.000          0     312279          0  non-token data: ntokens
    0.000          0 1051829432          0  non-token data: oldest atime
    0.000          0 1135374012          0  non-token data: newest atime
    0.000          0          0          0  non-token data: last journal sync atime
    0.000          0 1135373049          0  non-token data: last expiry atime
    0.000          0          0          0  non-token data: last expire atime delta
    0.000          0          0          0  non-token data: last expire reduction count

spamd runs as such:

    # ps -efa |grep spamd
    root 22145 21420 0 13:42:37 ? 0:00 /usr/local/bin/perl -T /usr/local/bin/spamd -d -x -l -c --syslog-socket=inet

/etc/mail/spamassassin/local.cf

    score ALL_TRUSTED 0 0 0 0
    use_bayes 1
    use_bayes_rules 1
    use_auto_whitelist 1
    bayes_auto_learn 1
    bayes_auto_expire 1
    bayes_expiry_max_db_size 20
    bayes_file_mode 0777
    auto_whitelist_path /extra/system/spamassassin/autoDB/auto-whitelist
    auto_whitelist_file_mode 0666
    bayes_path /extra/system/spamassassin/autoDB/bayes
    bayes_ignore_header X-MailScanner
    bayes_ignore_header X-MailScanner-SpamCheck
    bayes_ignore_header X-MailScanner-SpamScore
    bayes_ignore_header X-MailScanner-Information

Permissions on the bayes db files:

    # ls -lsag /extra/system/spamassassin/autoDB/
    total 65893
        3 drwxr-xr-x 3 root   root      3072 Dec 23 11:16 ./
        1 drwxr-xr-x 6 root   other      512 Nov  9  2004 ../
       16 -rw-rw-rw- 1 root   other    24576 Dec 23 13:42 auto-whitelist
        1 -rw---     1 root   other        6 Dec 23 13:42 auto-whitelist.mutex
       23 -rw---     1 root   other    22572 Dec 23 14:13 bayes.mutex
      312 -rw-rw-rw- 1 root   other   352256 Dec 23 14:13 bayes_seen
     8224 -rw-rw-rw- 1 root   other 10502144 Dec 23 14:13 bayes_toks
    27792 -rw-rw-rw- 1 daemon mail  28430177 Dec 14 19:05 blacklist_mailspool
        1 drwxrwxr-x 2 root   users      512 Dec 23 11:09 delete/
        0 -rw-rw-rw- 1 daemon mail         0 Dec 23 10:56 unwhitelist_mailspool
    29520 -rw-rw-rw- 1 daemon mail  30198420 Dec 20 14:24 whitelist_mailspool

Example e-mail header of false negative:

    Received: from i219-164-20-154.s02.a001.ap.plala.or.jp (i219-164-20-154.s02.a001.ap.plala.or.jp [219.164.20.154])
        by [SNIP] (8.12.9/8.12.9) with SMTP id jBMCSafX001397
        for <[SNIP] >; Thu, 22 Dec 2005 04:28:37 -0800 (PST)
    Received: from [192.168.178.219] (port=29858 helo=smncs)
        by i219-164-20-154.s02.a001.ap.plala.or.jp with esmtp id 1EpPYF-0002uW-U5
        for [EMAIL PROTECTED]; Thu, 22 Dec 2005 09:28:07 -0300
    Date: Thu, 22 Dec 2005 21:28:28 +0900
    From: <[EMAIL PROTECTED]>
    X-Mailer: The Bat! (v3.5) Professional
    Reply-To: <[EMAIL PROTECTED]>
    Organization: dyii
    X-Priority: 3 (Normal)
    Message-ID: <[EMAIL PROTECTED]>
    To: <[SNIP]>
    Subject: press release
    MIME-Version: 1.0
    Content-Type: multipart/related; boundary="=_869389e472dcb8f140f1bfe43211303b"
    X-Spam: Not detected
    X-TSS-MailScanner-Information: See www.mailscanner.info for information
    X-TSS-MailScanner: Appears