RE: Bayes Scores Skipped/Not Applied: HAPPY RESOLUTION

2005-12-23 Thread John Urness
Hi Matt,
I resolved the issue. Thanks for pointing me in a different direction; the
rubber has not been meeting the road for about a week on this issue!

After upgrading using CPAN I am getting BAYES scores (among others from the
/usr/share/spamassassin dir). So apparently it was an installation issue. 


Dec 23 18:45:12 unixserv0 MailScanner[24894]: Message jBO2iqfX025133 from
68.76.136.35 ([EMAIL PROTECTED]) to tomsawyer.com is spam,
SBL+XBL, spamcop.net, SpamAssassin (score=36.46, required 4, autolearn=spam,
BAYES_50 0.00, FORGED_RCVD_HELO 0.14, PYZOR_CHECK 3.70, RATWARE_NAME_ID
4.10, RCVD_IN_BL_SPAMCOP_NET 1.56, RCVD_IN_DSBL 2.60, RCVD_IN_NJABL_DUL
1.95, RCVD_IN_XBL 3.90, SAVE_THOUSANDS 0.40, SUBJECT_EXCESS_BASE64 0.45,
TO_CC_NONE 0.13, URIBL_AB_SURBL 3.81, URIBL_JP_SURBL 4.09, URIBL_OB_SURBL
3.01, URIBL_SC_SURBL 4.50, URIBL_WS_SURBL 2.14) 


This leads me to a question about installation, since the INSTALL file
says:
[unzip/untar the archive]
cd Mail-SpamAssassin-*
perl Makefile.PL
[option: add -DSPAMC_SSL to $CFLAGS to build an SSL-enabled spamc]
make
make install  

This is how I have normally done it over the past few years: install to a
versioned prefix under /usr/local and then create a symbolic link at
/usr/local/spamassassin, so that I can keep the previous version around in
case I need to recover from a bad upgrade. I don't recall ever having the
difficulty that I had this time around.

perl Makefile.PL --PREFIX=/usr/local/spamassassin-n.nn
make
make install 
  ln -sf /usr/local/spamassassin-n.nn /usr/local/spamassassin


The problem is that the installer did not put the Perl modules in the site
directory (/usr/perl5/site_perl/../...). The SpamAssassin.pm module in the
site dir now (after running CPAN) shows the updated version:

#/usr/perl5# grep 3.00 /usr/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm
$VERSION = "3.001000";  # update after release (same format as perl $])


The SpamAssassin.pm file from my original make, make install is here, but
obviously was not being used: 
# ls -lsa /usr/local/spamassassin/lib/perl5/site_perl/5.8.0/Mail
total 53
   1 drwxr-xr-x   3 root  512 Dec 12 16:08 ./
   1 drwxr-xr-x   4 root  512 Dec 12 16:08 ../
   1 drwxr-xr-x  10 root 1024 Dec 12 16:08 SpamAssassin/
  50 -r--r--r--   1 root 50780 Sep 13 19:07 SpamAssassin.pm

This is also version 3.001.000


In the install directions, it does not say anything about building
spamassassin and then moving perl modules manually. Am I missing something? 
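
One way to see which copies of the module exist, and which one Perl would
reach first, is to walk a list of library directories in order. The sketch
below is only an illustration: list_sa_copies is a hypothetical helper name,
and it assumes each copy declares its version on a "$VERSION = ..." line as
in the grep output above.

```shell
# List every Mail/SpamAssassin.pm found under the given library directories,
# together with the first $VERSION assignment each copy contains.
# list_sa_copies is a hypothetical helper name, not part of SpamAssassin.
list_sa_copies() {
    for dir in "$@"; do
        pm="$dir/Mail/SpamAssassin.pm"
        if [ -f "$pm" ]; then
            # Print "path: version-line" for this copy.
            printf '%s: %s\n' "$pm" "$(sed -n '/\$VERSION *=/{p;q;}' "$pm")"
        fi
    done
}
```

Calling it as list_sa_copies $(perl -e 'print join(" ", @INC)') checks the
directories Perl searches, in search order, so the first line printed should
be the copy that actually gets loaded. A PREFIX install such as
/usr/local/spamassassin/lib/perl5/site_perl/5.8.0 would only appear if that
directory is on Perl's search path.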



John

-Original Message-
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 23, 2005 3:08 PM
To: John Urness
Cc: users@spamassassin.apache.org
Subject: Re: Bayes Scores Skipped/Not Applied

John Urness wrote:

> 
> /etc/mail/spamassassin/local.cf
> score ALL_TRUSTED 0 0 0 0

That is very concerning. Why'd you do that? 99.9% of the time the proper fix
is to declare a trusted_networks. Disabling this rule merely covers up one
symptom of a very pervasive problem (errant trust).

> 
> use_bayes   1
> use_bayes_rules 1
> use_auto_whitelist  1
> bayes_auto_learn 1
> bayes_auto_expire   1
> bayes_expiry_max_db_size 20
> bayes_file_mode 0777
> auto_whitelist_path /extra/system/spamassassin/autoDB/auto-whitelist
> auto_whitelist_file_mode   0666
> bayes_path /extra/system/spamassassin/autoDB/bayes
> bayes_ignore_header X-MailScanner
> bayes_ignore_header X-MailScanner-SpamCheck
> bayes_ignore_header X-MailScanner-SpamScore
> bayes_ignore_header X-MailScanner-Information

Wait, why the mailscanner ignores? Are you using mailscanner? If so, stop
running spamd. MailScanner uses the perl API, so you don't need spamd, it's
just wasting memory to run it.

> 
> 



RE: Bayes Scores Skipped/Not Applied

2005-12-23 Thread John Urness
Here is some debugging info from MailScanner. 

Starting MailScanner...
In Debugging mode, not forking...
debug: SpamAssassin version 3.0.1
debug: Score set 0 chosen.
debug: running in taint mode? no
debug: config: SpamAssassin failed to parse line, skipping: use_razor1 0
debug: SpamAssassin version 3.0.1
debug: Score set 0 chosen.
Use of uninitialized value in concatenation (.) or string at
/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 978.
Use of uninitialized value in concatenation (.) or string at
/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 980.
debug: read_scoreonly_config: cannot open "": No such file or directory 

[SNIP]

/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line
329.
configuration file "/usr/share/spamassassin/20_dnsbl_tests.cf" requires
version 3.001000 of SpamAssassin, but this is code version 3.01. Maybe
you need to use the -C switch, or remove the old config files? Skipping this
file at
/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line
329.
configuration file "/usr/share/spamassassin/20_drugs.cf" requires version
3.001000 of SpamAssassin, but this is code version 3.01. Maybe you need
to use the -C switch, or remove the old config files? Skipping this file at
/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line
329.
[Snip]


The parser/version errors are numerous and indicate SpamAssassin 3.0 is
being called. Odd.


More debug...
debug: bayes: 19802 tie-ing to DB file R/O
/extra/system/spamassassin/autoDB/bayes_toks
debug: bayes: 19802 tie-ing to DB file R/O
/extra/system/spamassassin/autoDB/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 3 chosen.
[Snip]



So it looks like it is finding bayes and scoring with it, but maybe it
can't read the scoring from the cf files in /usr/share/spamassassin
because of a version conflict.

Yes, going through the log again I found that the 23_bayes.cf file could
not be parsed because an older version of SpamAssassin is at work.
Presumably this is why no scores from Bayes are being used:

configuration file "/usr/share/spamassassin/23_bayes.cf" requires version
3.001000 of SpamAssassin, but this is code version 3.01. Maybe you need
to use the -C switch, or remove the old config files? Skipping this file at
/usr/local/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf/Parser.pm line
329. 
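
To pull every skipped rules file out of a long debug log in one pass, a
small filter along these lines may help. It is a sketch under assumptions:
skipped_cf_files is a hypothetical helper name, and the pattern relies on the
message wording shown in the excerpts above ("requires version ... but this
is code version ...").

```shell
# Read SpamAssassin debug output on stdin and print one line per skipped
# rules file: "<file> needs <required> but code is <running>".
# skipped_cf_files is a hypothetical helper name.
skipped_cf_files() {
    # The message is wrapped across lines in the log, so join the input
    # into a single line before matching.
    tr '\n' ' ' |
    grep -o 'configuration file "[^"]*" requires  *version  *[0-9][0-9.]* of SpamAssassin, but this is code version  *[0-9][0-9.]*[0-9]' |
    sed 's/configuration file "\([^"]*\)" requires  *version  *\([0-9][0-9.]*\) of SpamAssassin, but this is code version  *\([0-9][0-9.]*[0-9]\)/\1 needs \2 but code is \3/'
}
```

For example, spamassassin -D --lint 2>&1 | skipped_cf_files would list the
files rejected by the command-line copy. That copy is not necessarily the one
MailScanner loads, so the same filter would need to be run over MailScanner's
own debug output to check that path.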



I installed it with make, make install, so assuming that is part of the
problem, I am going to try upgrading with CPAN and see if that resolves the
issue.



John 



RE: Bayes Scores Skipped/Not Applied

2005-12-23 Thread John Urness
Hi Matt,
I stopped running spamd. 

The ALL_TRUSTED rule was letting a lot of junk get through and I saw a post
that recommended 0 for the score to prevent false negatives. I have restored
it to its original score and added trusted_networks (with a couple of
subnets) as you suggest.

I am still not seeing any BAYES scores. Here are a few examples:

Message jBO1kTfX011690 from 141.157.60.60 ([EMAIL PROTECTED])
to tomsawyer.com is spam, SBL+XBL, spamcop.net, SpamAssassin (score=8.364,
required 4, RATWARE_RCVD_PF 3.60, SARE_GETFCK 0.68, URIBL_JP_SURBL 4.09)

Message jBO1hXfX008995 from 24.175.86.36 ([EMAIL PROTECTED]) to
tomsawyer.com is spam, SpamAssassin (score=6.711, required 4,
SPF_HELO_SOFTFAIL 2.43, SUBJ_ILLEGAL_CHARS 4.28)

 Message jBO1fYfX008953 from 222.47.203.164 ([EMAIL PROTECTED]) to
tomsawyer.com is spam, SBL+XBL, SpamAssassin (score=10.678, required 4,
FORGED_MUA_OUTLOOK 4.06, MSGID_SPAM_CAPS 4.40, SARE_RECV_IP_222032 2.22)

Message jBO1kTfX011690 from 141.157.60.60 ([EMAIL PROTECTED])
to tomsawyer.com is spam, SBL+XBL, spamcop.net, SpamAssassin (score=8.364,
required 4, RATWARE_RCVD_PF 3.60, SARE_GETFCK 0.68, URIBL_JP_SURBL 4.09) 


John 





RE: Bayes Scores Skipped/Not Applied

2005-12-23 Thread John Urness
Loren,
You are seriously paying attention. I did the debugs yesterday and
completely rebuilt the bayes db today using a whitelist and a blacklist
mailspool so it is now a lot smaller since it lost a couple of years of
autolearning when I started over...

So it is actually from a site bayes database that was recreated and not an
issue of multiple databases   


John 


-Original Message-
From: Loren Wilton [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 23, 2005 4:43 PM
To: users@spamassassin.apache.org
Subject: Re: Bayes Scores Skipped/Not Applied

This seems strange:


> Here is sa-learn --dump magic:
> This shows that I have more than enough spam and ham
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0       3754          0  non-token data: nspam
> 0.000          0        220          0  non-token data: nham
> 0.000          0     312279          0  non-token data: ntokens
> 0.000          0 1051829432          0  non-token data: oldest atime
> 0.000          0 1135374012          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0 1135373049          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count


> [19915] dbg: bayes: DB journal sync: last sync: 1105811470
> [19915] dbg: bayes: corpus size: nspam = 153968, nham = 40588
> [19915] dbg: bayes: score = 4.91619356335349e-10
> [19915] dbg: bayes: DB expiry: tokens in DB: 1984629, Expiry max size: 15, Oldest atime: 1084312293, Newest atime: 0, Last expire: 1098687948, Current time: 1135280002
> [19915] dbg: bayes: DB journal sync: last sync: 1105811470

As I read that, the bayes db has 3754 spam and 220 ham.
But later in processing it has 153968 spam and 40588 ham!

This makes me think you have two different bayes databases under two
different users.  Which would perhaps imply different user_prefs files, and
one of them might not be enabling bayes.

Loren



Re: Bayes Scores Skipped/Not Applied

2005-12-23 Thread Loren Wilton
This seems strange:


> Here is sa-learn --dump magic:
> This shows that I have more than enough spam and ham
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0       3754          0  non-token data: nspam
> 0.000          0        220          0  non-token data: nham
> 0.000          0     312279          0  non-token data: ntokens
> 0.000          0 1051829432          0  non-token data: oldest atime
> 0.000          0 1135374012          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0 1135373049          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count


> [19915] dbg: bayes: DB journal sync: last sync: 1105811470
> [19915] dbg: bayes: corpus size: nspam = 153968, nham = 40588
> [19915] dbg: bayes: score = 4.91619356335349e-10
> [19915] dbg: bayes: DB expiry: tokens in DB: 1984629, Expiry max size: 15, Oldest atime: 1084312293, Newest atime: 0, Last expire: 1098687948, Current time: 1135280002
> [19915] dbg: bayes: DB journal sync: last sync: 1105811470

As I read that, the bayes db has 3754 spam and 220 ham.
But later in processing it has 153968 spam and 40588 ham!

This makes me think you have two different bayes databases under two
different users.  Which would perhaps imply different user_prefs files, and
one of them might not be enabling bayes.

Loren
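
The comparison above can be repeated quickly by extracting just the two
counters from the dump. This is a minimal sketch: bayes_counts is a
hypothetical helper name, and it assumes the usual column layout of
sa-learn --dump magic, where the third field of each "non-token data" line
holds the value.

```shell
# Read `sa-learn --dump magic` output on stdin and print the nspam and nham
# counters as "nspam=<n> nham=<n>". bayes_counts is a hypothetical helper name.
bayes_counts() {
    awk '
        # Counter lines look like: 0.000  0  3754  0  non-token data: nspam
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END { printf "nspam=%s nham=%s\n", nspam, nham }
    '
}
```

Running sa-learn --dump magic | bayes_counts as each user that might own a
database, and comparing the result with the "corpus size: nspam = ..., nham
= ..." line in the debug log, shows whether both are reading the same
database.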



Re: Bayes Scores Skipped/Not Applied

2005-12-23 Thread Matt Kettler
John Urness wrote:

> 
> /etc/mail/spamassassin/local.cf
> score ALL_TRUSTED 0 0 0 0

That is very concerning. Why'd you do that? 99.9% of the time the proper fix is
to declare a trusted_networks. Disabling this rule merely covers up one symptom
of a very pervasive problem (errant trust).

> 
> use_bayes   1
> use_bayes_rules 1
> use_auto_whitelist  1
> bayes_auto_learn 1
> bayes_auto_expire   1
> bayes_expiry_max_db_size 20
> bayes_file_mode 0777
> auto_whitelist_path /extra/system/spamassassin/autoDB/auto-whitelist
> auto_whitelist_file_mode   0666
> bayes_path /extra/system/spamassassin/autoDB/bayes
> bayes_ignore_header X-MailScanner
> bayes_ignore_header X-MailScanner-SpamCheck
> bayes_ignore_header X-MailScanner-SpamScore
> bayes_ignore_header X-MailScanner-Information

Wait, why the mailscanner ignores? Are you using mailscanner? If so, stop
running spamd. MailScanner uses the perl API, so you don't need spamd, it's just
wasting memory to run it.

> 
> 


Bayes Scores Skipped/Not Applied

2005-12-23 Thread John Urness
Hi,
I recently upgraded from SpamAssassin 3.0 to 3.1 and right away the number
of false negatives increased. I thought at first that it was because of the
loss of DCC and Razor (which surely is a factor), but on further
investigation it appears to be more related to the Bayes system.

I have looked through the archives, but I am not finding what I need
although it seems like this question has been asked more than once in
different ways.
 
I get almost no BAYES_xx or AWL scores on headers. I got one BAYES_00 tag
between 18 December and 22 December, and it is my understanding that I should
see them on almost all emails. When I run spamassassin manually on a piece
of spam, it applies a BAYES_xx score. Yet as far as I can tell the spamd
daemon skips BAYES_xx scoring. It seems like this implies a simple *.cf file
issue. spamassassin -D --lint shows no rules errors and the site directory
shows up as /etc/mail/spamassassin, which is where the local.cf file is. This
is a sitewide installation.

Among other things, debug output is included for both spamd and
spamassassin. This may be a bit of overkill, but I am trying to anticipate
what the group will ask for based on previous posts, so I have included
debugging as well as sample headers of a false negative that was clearly
spam, the local.cf file from /etc/mail/spamassassin, permissions on the
bayes db files, etc.
 


Vital stats:
Upgraded recently from SpamAssassin 3.0 to 3.1
MailScanner 4.48
Rules Du Jour *.cf files under /etc/mail/spamassassin
Used sitewide in conjunction with MailScanner
perl 5.8.0
running on solaris 5.8


 
Here is sa-learn --dump magic:
This shows that I have more than enough spam and ham
0.000          0          3          0  non-token data: bayes db version
0.000          0       3754          0  non-token data: nspam
0.000          0        220          0  non-token data: nham
0.000          0     312279          0  non-token data: ntokens
0.000          0 1051829432          0  non-token data: oldest atime
0.000          0 1135374012          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1135373049          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

 
spamd runs as such:
# ps -efa |grep spamd
root 22145 21420  0 13:42:37 ?        0:00 /usr/local/bin/perl -T /usr/local/bin/spamd -d -x -l -c --syslog-socket=inet

/etc/mail/spamassassin/local.cf
score ALL_TRUSTED 0 0 0 0

use_bayes   1
use_bayes_rules 1
use_auto_whitelist  1
bayes_auto_learn 1
bayes_auto_expire   1
bayes_expiry_max_db_size 20
bayes_file_mode 0777
auto_whitelist_path /extra/system/spamassassin/autoDB/auto-whitelist
auto_whitelist_file_mode   0666
bayes_path /extra/system/spamassassin/autoDB/bayes
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information




# ls -lsag  /extra/system/spamassassin/autoDB/
total 65893
    3 drwxr-xr-x   3 root     root         3072 Dec 23 11:16 ./
    1 drwxr-xr-x   6 root     other         512 Nov  9  2004 ../
   16 -rw-rw-rw-   1 root     other       24576 Dec 23 13:42 auto-whitelist
    1 -rw-------   1 root     other           6 Dec 23 13:42 auto-whitelist.mutex
   23 -rw-------   1 root     other       22572 Dec 23 14:13 bayes.mutex
  312 -rw-rw-rw-   1 root     other      352256 Dec 23 14:13 bayes_seen
 8224 -rw-rw-rw-   1 root     other    10502144 Dec 23 14:13 bayes_toks
27792 -rw-rw-rw-   1 daemon   mail     28430177 Dec 14 19:05 blacklist_mailspool
    1 drwxrwxr-x   2 root     users         512 Dec 23 11:09 delete/
    0 -rw-rw-rw-   1 daemon   mail            0 Dec 23 10:56 unwhitelist_mailspool
29520 -rw-rw-rw-   1 daemon   mail     30198420 Dec 20 14:24 whitelist_mailspool





 
 
Example e-mail header of false negative: 
Received: from i219-164-20-154.s02.a001.ap.plala.or.jp
(i219-164-20-154.s02.a001.ap.plala.or.jp [219.164.20.154])
by [SNIP] (8.12.9/8.12.9) with SMTP id jBMCSafX001397
for <[SNIP] >; Thu, 22 Dec 2005 04:28:37 -0800 (PST)
Received: from [192.168.178.219] (port=29858 helo=smncs)
by i219-164-20-154.s02.a001.ap.plala.or.jp with esmtp
id 1EpPYF-0002uW-U5
for [EMAIL PROTECTED]; Thu, 22 Dec 2005 09:28:07 -0300
Date: Thu, 22 Dec 2005 21:28:28 +0900
From: <[EMAIL PROTECTED]>
X-Mailer: The Bat! (v3.5) Professional
Reply-To: <[EMAIL PROTECTED]>
Organization: dyii
X-Priority: 3 (Normal)
Message-ID: <[EMAIL PROTECTED]>
To: <[SNIP]>
Subject: press release
MIME-Version: 1.0
Content-Type: multipart/related;
boundary="=_869389e472dcb8f140f1bfe43211303b"
X-Spam: Not detected
X-TSS-MailScanner-Information: See www.mailscanner.info for information
X-TSS-MailScanner: Appears