Re: How much does whitelist_from really do?

2004-12-14 Thread Matt Barton
Peter Guhl wrote:
We have a really huge whitelist - all inserted in user_prefs using
"whitelist_from". But I constantly get told that mails from people at
this list got flagged as spam. That makes me wonder... do I have to do
something specific to make sure sa honors "whitelist_from"? Does it only
shift the score or bypass scanning?
You may want to do a whitelist_to for the spamassassin-users@ address.
--
Matt Barton
Webexcellence
PH: 317.423.3548 x22
TF: 800.808.6332 x22
FX: 317.423.8735
[EMAIL PROTECTED]
www.webexc.com


Re: URIDNSBL on freebsd?

2004-12-09 Thread Matt Barton
Jeff Chan wrote:
Try removing from your resolv.conf:
nameserver  127.0.0.1
and adding some external nameservers.  This may be a bug
in the FreeBSD version of SpamAssassin.
If the DNS server on the system does listen on a regular interface, you 
may be able to set the entry in resolv.conf to the IP address of that 
interface (i.e. a public IP address that is local to the server).



Re: SA vs. postfix main.cf

2004-12-06 Thread Matt Barton
Per Jessen wrote:
Not that I can think of.  Essentially you need to decide who makes
the decision for you - SA or Postfix. By the time postfix delivers
the mail to SA via the content_filter, all the Postfix checks are
complete - smtpd_*_restrictions - so if postfix has decided to
reject an email, SA can't really override that later.  Therefore, if 
your users disagree with your blockinglist, don't use those
blockinglist(s) in postfix and leave it to SA.
In order to do the same kind of whitelisting in Postfix, you'd basically 
need to set up some check_*_access checks ahead of your RBL checks, so 
that whitelisted senders are allowed to pass before the RBLs are consulted.
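
The ordering point can be illustrated with a toy model in Python. This is not Postfix code: it only assumes that a single restriction list is walked in order and that the first check returning a definite verdict ends the walk. The function names borrow the Postfix restriction names check_sender_access and reject_rbl_client as labels only; the sender address and client IP are made-up placeholders.

```python
# Toy model (assumption): one restriction list, evaluated in order,
# first definite verdict ("OK" or "REJECT") wins; "DUNNO" means "no opinion".

WHITELISTED_SENDERS = {"friend@example.com"}   # hypothetical whitelist entry
RBL_LISTED_CLIENTS = {"192.0.2.10"}            # hypothetical RBL-listed IP


def check_sender_access(sender, client_ip):
    """Stand-in for an access-table lookup on the sender address."""
    return "OK" if sender in WHITELISTED_SENDERS else "DUNNO"


def reject_rbl_client(sender, client_ip):
    """Stand-in for an RBL lookup on the connecting client."""
    return "REJECT" if client_ip in RBL_LISTED_CLIENTS else "DUNNO"


def evaluate(restrictions, sender, client_ip):
    """Walk the list in order and stop at the first definite verdict."""
    for check in restrictions:
        verdict = check(sender, client_ip)
        if verdict != "DUNNO":
            return verdict
    return "OK"  # nothing objected


# Whitelist first: the whitelisted sender passes despite the RBL listing.
whitelist_first = evaluate(
    [check_sender_access, reject_rbl_client], "friend@example.com", "192.0.2.10")

# RBL first: the same mail is rejected before the whitelist is consulted.
rbl_first = evaluate(
    [reject_rbl_client, check_sender_access], "friend@example.com", "192.0.2.10")
```

In this model the only difference between the two calls is the order of the checks, which is why the access checks have to come before the RBLs.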



Re: sa-learn ham

2004-11-24 Thread Matt Barton
Gustafson, Tim wrote:
How do you keep your ntokens so low?
Mine averages ((nspam + nham) * 10).  Yours is basically (nspam + nham).
Do you run some job that expires tokens or something?  I'm running
sa-learn --force-expire once a day (and it takes about 2-3 minutes to
run) but the ntokens never seems to go down.  :\
I don't run --force-expire at all.  I think it will automatically expire 
tokens when certain criteria are met -- none of which I can recall as I 
write this e-mail, though I know you can find it online.

I have a cron job that runs a script every half-hour that checks for 
e-mails that need to be manually fed in to sa-learn.  It pulls them out 
of designated ham and spam IMAP folders, runs them through sa-learn, and 
then runs sa-learn again with --sync.  I think the --sync may be what 
does it, but I don't know for sure.
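
A minimal sketch of what such a script might look like, in Python. The actual script is not shown in this thread, so everything here is an assumption: that the ham and spam IMAP folders are reachable as directories on disk, the placeholder paths, and the helper names build_commands and run_training. Only the sa-learn options --spam, --ham and --sync are taken as given.

```python
import subprocess


def build_commands(spam_dir, ham_dir):
    """Return the sa-learn invocations for one training run, in order."""
    return [
        ["sa-learn", "--spam", spam_dir],  # learn everything in the spam folder
        ["sa-learn", "--ham", ham_dir],    # learn everything in the ham folder
        ["sa-learn", "--sync"],            # then sync the journal into the database
    ]


def run_training(spam_dir, ham_dir, runner=subprocess.run):
    """Run each command in turn; 'runner' is injectable so the order can be tested."""
    for argv in build_commands(spam_dir, ham_dir):
        runner(argv, check=True)  # check=True makes a failing sa-learn raise


# Example call from a half-hourly cron job (paths are placeholders):
# run_training("/path/to/spam-folder", "/path/to/ham-folder")
```

Moving or deleting the messages after they have been learned is left out, since the thread does not say how that is handled.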

-----Original Message-----
From: Matt Barton [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 24, 2004 11:27 AM
To: SA Users List
Subject: Re: sa-learn ham
Since we're all playing show-and-tell, here is a dump of the magic on my
company's mail server.
0.000  0  3  0  non-token data: bayes db version
0.000  0 101024  0  non-token data: nspam
0.000  0 164343  0  non-token data: nham
0.000  0 240026  0  non-token data: ntokens


Re: sa-learn ham

2004-11-24 Thread Matt Barton
Gustafson, Tim wrote:
0.000  0  2  0  non-token data: bayes db version
0.000  0  88033  0  non-token data: nspam
0.000  0  15592  0  non-token data: nham
0.000  0  1729756  0  non-token data: ntokens
0.000  0 1010964573  0  non-token data: oldest atime
0.000  0 1762110386  0  non-token data: newest atime
0.000  0 1101309901  0  non-token data: last journal sync atime
0.000  0 1101301792  0  non-token data: last expiry atime
0.000  0  0  0  non-token data: last expire atime delta
0.000  0  0  0  non-token data: last expire reduction count
I agree with Jim that having your SPAM/HAM numbers match doesn't really
matter, as long as you have sufficient amounts of each.  I think the
"threshold" where my users started to experience the best filtering
accuracy was when I topped 1000 SPAMs and HAMs.  But, as Jim said
before, your mileage may vary.
Since we're all playing show-and-tell, here is a dump of the magic on my 
company's mail server.

0.000  0  3  0  non-token data: bayes db version
0.000  0 101024  0  non-token data: nspam
0.000  0 164343  0  non-token data: nham
0.000  0 240026  0  non-token data: ntokens
0.000  0 1101226944  0  non-token data: oldest atime
0.000  0 1101313137  0  non-token data: newest atime
0.000  0 1101313136  0  non-token data: last journal sync atime
0.000  0 1101270336  0  non-token data: last expiry atime
0.000  0  43200  0  non-token data: last expire atime delta
0.000  0 196502  0  non-token data: last expire reduction count
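
For anyone comparing their own numbers, the ntokens versus (nspam + nham) ratio discussed in this thread can be worked out from the dump lines with a few lines of Python. This is only a sketch: the helper names parse_magic and tokens_per_message are made up, and the input is just the nspam, nham and ntokens lines copied from the two dumps in this message.

```python
TIM_DUMP = """\
0.000  0  88033  0  non-token data: nspam
0.000  0  15592  0  non-token data: nham
0.000  0  1729756  0  non-token data: ntokens
"""

MATT_DUMP = """\
0.000  0 101024  0  non-token data: nspam
0.000  0 164343  0  non-token data: nham
0.000  0 240026  0  non-token data: ntokens
"""


def parse_magic(text):
    """Map each 'non-token data' name to the integer in the third column."""
    counts = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:")
        counts[name.strip()] = int(fields.split()[2])
    return counts


def tokens_per_message(counts):
    """ntokens divided by the total number of learned messages."""
    return counts["ntokens"] / (counts["nspam"] + counts["nham"])


tim_ratio = tokens_per_message(parse_magic(TIM_DUMP))    # about 16.7
matt_ratio = tokens_per_message(parse_magic(MATT_DUMP))  # about 0.9
```

So the first database holds roughly 17 tokens per learned message, while mine holds a little under one.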

My auto-learn thresholds are set as follows in the global local.cf.
bayes_auto_learn_threshold_nonspam 0.8
bayes_auto_learn_threshold_spam 10.0
It is very important that you keep your bayes_min_ham_num and 
bayes_min_spam_num settings at 1000 or higher.
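
As a rough illustration of how these settings interact, here is a simplified Python model. It is not SpamAssassin's code: the real auto-learn logic applies additional conditions, and the handling of scores exactly on a threshold is an assumption here. The function names are made up; the numbers are the ones quoted above.

```python
NONSPAM_THRESHOLD = 0.8   # bayes_auto_learn_threshold_nonspam
SPAM_THRESHOLD = 10.0     # bayes_auto_learn_threshold_spam
MIN_HAM = 1000            # bayes_min_ham_num
MIN_SPAM = 1000           # bayes_min_spam_num


def auto_learn_decision(score, nonspam=NONSPAM_THRESHOLD, spam=SPAM_THRESHOLD):
    """Return 'ham', 'spam', or None (not auto-learned) for a message score.

    Simplified: boundary handling is an assumption, and the extra
    conditions SpamAssassin applies before auto-learning are ignored.
    """
    if score < nonspam:
        return "ham"
    if score > spam:
        return "spam"
    return None


def bayes_ready(nspam, nham, min_spam=MIN_SPAM, min_ham=MIN_HAM):
    """Simplified gate: Bayes is only used once both counts reach the minimum."""
    return nspam >= min_spam and nham >= min_ham
```

With these values a message scoring 0.2 would be learned as ham, one scoring 12.5 as spam, and one scoring 4.0 would not be auto-learned at all.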
