spam score question

2015-04-18 Thread Michael Williamson
Hi,

I have another question.

It appears to me that spamassassin can produce different spam scores
for the same email.
In particular, I have noticed that points are omitted for
RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
Is the difference due to a difference in how spamassassin is invoked
(for example, due to an environment variable)?
One way that I invoke spamassassin to get spam scores is from a
program that is started as a cronjob for a user. This way sometimes
omits the points for the test mentioned above. Then, when I invoke
spamassassin from the command line as the same user, for the same
email, I get a higher score because it includes points for
RCVD_IN_SBL_CSS.

I am using a fairly old version, SpamAssassin version 3.3.1 running on
Perl version 5.10.1. The OS is CentOS 6.0.

Thanks,
-Mike


Re: spam score question

2015-04-18 Thread Michael Williamson
On 4/18/15, Antony Stone antony.st...@spamassassin.open.source.it wrote:
 On Saturday 18 April 2015 at 17:16:40 (EU time), Michael Williamson wrote:

 Hi,

 I have another question.

 It appears to me that spamassassin can produce different spam scores
 for the same email.

 Do you mean *exactly* the same email - totally identical headers and body,
 with no changes between the two invocations?

Yes, I believe so, exactly identical.


 In particular, I have noticed that points are omitted for
 RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?

 Well, there's a chance that the machine you received the email from wasn't
 in the Spamhaus blacklist on one occasion, and was on the other...


Something like this is possible, although I think it would be more likely that
the failure is due to a timeout or communication problem with the
Spamhaus server.
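
One way to test the timeout theory is to query the blocklist directly from the same host and account that runs the cron job. The following is only a sketch: it assumes `dig` is installed, and `reverse_ipv4` and `dnsbl_lookup` are hypothetical helper names. The IP address and zone in the example are the ones reported elsewhere in this thread.

```shell
#!/bin/sh
# Reverse the octets of an IPv4 address, as DNSBL queries require.
reverse_ipv4() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
}

# Query a DNSBL zone for an address and print any A records returned.
# No output generally means "not listed"; dig typically reports a
# timeout as an error message instead.
# +time and +tries keep a slow resolver from stalling the check.
dnsbl_lookup() {
    ip=$1
    zone=$2
    dig +short +time=5 +tries=1 "$(reverse_ipv4 "$ip").$zone" A
}

# Example (performs a real DNS query, so it is left commented out):
# dnsbl_lookup 5.178.109.37 zen.spamhaus.org
```

Running the same lookup once from cron and once interactively may show whether the two environments reach the resolver equally well.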

 Is the difference due to a difference in how spamassassin is invoked
 (for example, due to an environment variable)?
 One way that I invoke spamassassin to get spam scores is from a
 program that is started as a cronjob for a user.

 Does that job run as the user, or as another ID on the system?


It is run from the user's cron table. To compare scores from the
command line, I do

  # su username
  # spamassassin -t  email_filename

I know that there might actually be some different environment
variables doing it this way (like PATH).
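
To rule the environment in or out, the command-line test can be repeated with a stripped-down environment that resembles what cron provides. This is a minimal sketch, not a faithful copy of cron: the variable list and the PATH value are assumptions (crontab(5) on the system documents the real ones), and `run_like_cron` is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command with a minimal, cron-like environment.
# Assumption: cron sets roughly HOME, LOGNAME, PATH and SHELL; verify the
# exact values against crontab(5) on the system in question.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        "$@"
}

# Example (email_filename is the placeholder used in this thread):
# run_like_cron spamassassin -t email_filename
```

If the score from `run_like_cron` matches the interactive score, the environment is probably not the cause.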


 What exactly are you passing to SpamAssassin from the cron job (where are you
 getting the email from in the standard delivery path), and how else do you
 pass emails to SpamAssassin in the normal course of email delivery (you don't
 mention what your MTA is, or how SpamAssassin is plugged in to it)?


The email server runs postfix, amavis, and dovecot (and roundcube).
I elaborate on this below.


 This way sometimes omits the points for the test mentioned above. Then, when
 I invoke spamassassin from the command line as the same user, for the same
 email, I get a higher score because it includes points for RCVD_IN_SBL_CSS.

 Since you say you are running both checks as the same user, and also you're
 focusing on the score for one specific test, I'll omit any possibility that
 you've got different Bayes databases on the machine, each being used by the
 different ways you're passing the email to SpamAssassin.

 When you've plucked an email out of the delivery path and sent it (via the
 cron job) to SpamAssassin, do you then re-insert it back into the same place
 in the delivery path, and is that place immediately before it would get passed
 to SpamAssassin by some milter or similar feature?  If not, please describe
 your email delivery path, paying particular attention to where you're taking
 the emails out (for cron job processing), where you're reinserting them, and
 where SpamAssassin otherwise gets invoked.

 I am using a fairly old version, SpamAssassin version 3.3.1 running on
 Perl version 5.10.1. The OS is CentOS 6.0.

 Out of interest, why are you passing emails to SpamAssassin from a cron job,
 and then apparently later getting them scored in the normal course of email
 delivery?  What's the purpose of the cron job?


The reason I am using a cron job for users is that I could never get the
dovecot 'sieve' plugin to work. So instead, I wrote a program that uses
inotify to watch for new email files appearing in the directory
'Maildir/new/' and moves them immediately to 'Maildir/tmp/' before
dovecot gets them. The program then moves each file either back to
'Maildir/new/' or into a spam folder.

To decide where to move it, the program runs spamassassin again (the mail
has already been scored once at this point) using fork/exec. The reason
for the second run is that some users train their individual Bayes
databases with sa-learn, but as far as I can tell those databases are not
used when amavis first invokes spamassassin, before the mail is put into
'Maildir/new/', so the scores are too low.

When I re-run spamassassin (in both of the ways mentioned), it uses the -t
option and the email content is piped in on standard input. This does not
modify the original email content, including the originally inserted spam
scores, but it does generate a new score using the user's database.
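
The decision step described above could be sketched in shell as follows. This is not the poster's program: the helper names are hypothetical, the folder names returned are illustrative (".Spam" is borrowed from the sa-learn command elsewhere in this thread), and the input is assumed to be a single unfolded X-Spam-Status header line.

```shell
#!/bin/sh
# Pull the numeric score out of an X-Spam-Status header value.
extract_score() {
    printf '%s\n' "$1" |
        sed -n 's/.*score=\(-\{0,1\}[0-9][0-9.]*\).*/\1/p' |
        head -n 1
}

# Succeed (exit 0) when score >= threshold; awk handles the decimals.
score_at_least() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s + 0 >= t + 0) }'
}

# Print the Maildir subfolder a message should go to.
# ".Spam/new" is an assumed spam folder; "new" is the normal inbox.
choose_folder() {
    status_header=$1
    threshold=$2
    score=$(extract_score "$status_header")
    if [ -n "$score" ] && score_at_least "$score" "$threshold"; then
        echo ".Spam/new"
    else
        echo "new"
    fi
}

# Example:
# choose_folder 'X-Spam-Status: Yes, score=113.3 required=4.0' 4.0
```

The spamassassin script also has an --exit-code option, which may make it possible to skip the header parsing and branch on the exit status instead.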

This method has been working pretty well for about a week, until
this Spamhaus issue.
An alternative that I have considered is to simply set up a new email
server, but without amavis.

Thanks,
-Mike


Re: configure question

2015-01-18 Thread Michael Williamson
Yes, amavisd is running and modifying the file
/etc/amavisd/amavisd.conf has an effect on the spamassassin header
messages added to emails. Thanks, that answers that question.

Now, the next question is, if I manually run

 # spamassassin -t  spam_filename

I get a different, much higher spam score than is automatically
inserted in the X-Spam-Score
field. Note that, for this user, this has been done:

 # sa-learn --spam .Spam/cur/*



On 1/18/15, Marieke Janssen mjans...@myguard.nl wrote:
Spamassassin seems not to be getting the configuration changes that I
 make.

 Is it possible you run something like Amavis that controls SpamAssassin?
 Headers and spam score are usually controlled there and will override
 local.cf.

 /MJ




Re: configure question

2015-01-18 Thread Michael Williamson
I think you are right.
Running spamassassin manually appears to use the user's user_prefs
configuration file, and bayes database. I need to get amavisd to do it
that way too, if possible.
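
A quick way to check whether two accounts really use different Bayes databases is to compare the message counts reported by `sa-learn --dump magic`. The sketch below only parses that output; it assumes the usual layout in which the count is the third column and the label is the last word on the line. `bayes_counts` is a hypothetical helper name, and the `amavis` account name in the example is an assumption.

```shell
#!/bin/sh
# Read `sa-learn --dump magic` output on stdin and print "nspam nham".
# Assumes the count is in column 3 and the label ends the line.
bayes_counts() {
    awk '$NF == "nspam" { spam = $3 }
         $NF == "nham"  { ham = $3 }
         END { print spam + 0, ham + 0 }'
}

# Example comparisons (account names are assumptions):
#   sa-learn --dump magic | bayes_counts
#   su -s /bin/sh amavis -c 'sa-learn --dump magic' | bayes_counts
```

Different counts for the mail user and for the account amavisd runs under would support the "two different databases" explanation.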


On 1/18/15, John Hardin jhar...@impsec.org wrote:
 On Sun, 18 Jan 2015, Michael Williamson wrote:

 Here is an example of the automatically inserted spam headers:

 Return-Path: discount---coup...@acant.firm.in
 X-Spam-Status:
tests=[BAYES_00=-1.9


 And here, I ran the same email through spamassassin manually from the
 command line:

 X-Spam-Report:
  *  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%

 That is either due to training that message reclassified it as spam, or
 you are manually training a different bayes database than amavis is using.

 --
   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
   jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
 ---
Rights can only ever be individual, which means that you cannot
gain a right by joining a mob, no matter how shiny the issued
badges are, or how many of your neighbors are part of it.  -- Marko
 ---
   5 days until John Moses Browning's 160th Birthday



Re: configure question

2015-01-18 Thread Michael Williamson
OK, thanks. I will read the amavisd FAQ.

However, I am skeptical about your explanation for the spam score difference.

Here is an example of the automatically inserted spam headers:

Return-Path: discount---coup...@acant.firm.in
...
X-Spam-Flag: NO
X-Spam-Score: -1.106
X-Spam-Level:
X-Spam-Status: No, score=-1.106 tagged_above=-999 required=5
tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RDNS_NONE=0.793,
SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001] autolearn=no

And here, I ran the same email through spamassassin manually from the
command line:

 # spamassassin -t  spam_filename

Return-Path: discount---coup...@acant.firm.in
X-Spam-Report:
*  100 USER_IN_BLACKLIST From: address is in the user's black-list
*  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
*  [score: 1.]
*  3.3 RCVD_IN_SBL_CSS RBL: Received via a relay in Spamhaus SBL-CSS
*  [5.178.109.37 listed in zen.spamhaus.org]
*  1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  [URIs: acant.firm.in]
*  2.5 URIBL_DBL_SPAM Contains a spam URL listed in the DBL blocklist
*  [URIs: acant.firm.in]
*  1.2 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
*  [URIs: acant.firm.in]
*  0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
*  [score: 1.]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  0.8 RDNS_NONE Delivered to internal network by a host with no rDNS
*  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
X-Spam-Flag: YES
X-Spam-Status: Yes, score=113.3 required=4.0 tests=BAYES_99,BAYES_999,
HTML_MESSAGE,RCVD_IN_SBL_CSS,RDNS_NONE,UNPARSEABLE_RELAY,URIBL_BLACK,
URIBL_DBL_SPAM,URIBL_JP_SURBL,USER_IN_BLACKLIST autolearn=no version=3.3.1
X-Spam-Level: **
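
The two results above differ in which rules fired, and that is easier to see when the rule lists are compared mechanically. The following is a sketch under stated assumptions: each X-Spam-Status value has been unfolded onto one line, and `rule_names` and `only_in_second` are hypothetical helper names.

```shell
#!/bin/sh
# List the rule names in an X-Spam-Status value, one per line, sorted.
# Handles both forms seen in this thread:
#   tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001] autolearn=no
#   tests=BAYES_99,BAYES_999 autolearn=no version=3.3.1
rule_names() {
    printf '%s\n' "$1" |
        sed -e 's/.*tests=//' -e 's/autolearn=.*//' -e 's/[][ ]//g' |
        tr ',' '\n' |
        sed -e 's/=.*//' -e '/^$/d' |
        LC_ALL=C sort
}

# Print the rules that fired in the second result but not in the first.
only_in_second() {
    a=$(mktemp) && b=$(mktemp) || return 1
    rule_names "$1" > "$a"
    rule_names "$2" > "$b"
    LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Example:
# only_in_second "$amavis_status" "$manual_status"
```

Rules that appear only in the manual run and depend on DNS lookups (the RCVD_IN_* and URIBL_* names here) point toward network tests rather than Bayes training as the source of the difference.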




On 1/18/15, RW rwmailli...@googlemail.com wrote:
 On Sun, 18 Jan 2015 09:06:00 -0700
 Michael Williamson wrote:

 Yes, amavisd is running and modifying the file
 /etc/amavisd/amavisd.conf has an effect on the spamassassin header
 messages added to emails. Thanks, that answers that question.

 Amavisd uses SA as a library, you don't need to be running spamd.

 "service spamassassin restart" affects neither amavisd nor the
 spamassassin script, only tests done through spamc/spamd.

 You should read the Amavisd FAQ.

 Now, the next question is, if I manually run

  # spamassassin -t  spam_filename

 I get a different, much higher spam score than is automatically
 inserted in the X-Spam-Score
 field. Note that, for this user, this has been done:

 You should expect a higher score the second time: you've trained it as
 spam, and the delay causes it to hit more network tests.



configure question

2015-01-17 Thread Michael Williamson
Hi, I have a question.

Spamassassin seems not to be getting the configuration changes that I make.
I add (or change) lines like this

 add_header all Flag _YESNOCAPS_
 required_score 4.0

to both /etc/mail/spamassassin/local.cf and
/home/username/.spamassassin/user_prefs.
I check that both files are readable by all, then restart the service:

 # service spamassassin restart

It reports OK, and spamd processes are shown as running. When spam
email arrives, it is only sometimes tagged (tagged_above=2), and the
threshold is unchanged (required=6.2) despite my changing
required_score in the configuration files. If I then run

 # spamassassin -t  spam_filename

the output shows a much higher spam score than the value inserted in
the email (X-Spam-Score).
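
The tagged_above and required values in the inserted headers are set by whatever program calls SpamAssassin during delivery. If that program is amavisd, as suggested elsewhere in this thread, the thresholds would live in its own configuration file rather than in local.cf. A minimal sketch for inspecting them follows; the variable names are the ones amavisd-new commonly uses and should be verified against the installed version, and `show_amavis_thresholds` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the score-threshold settings from an amavisd configuration file.
# The default path is the one mentioned in this thread.
# Assumption: the settings are named $sa_tag_level_deflt,
# $sa_tag2_level_deflt and $sa_kill_level_deflt.
show_amavis_thresholds() {
    conf=${1:-/etc/amavisd/amavisd.conf}
    grep -E '^[[:space:]]*\$sa_(tag|tag2|kill)_level_deflt' "$conf"
}

# Example:
# show_amavis_thresholds
```

If the values printed match the 2 and 6.2 seen in the headers, that would suggest amavisd rather than spamd is doing the tagging.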

Running

 # spamassassin -V

returns the message

SpamAssassin version 3.3.1
  running on Perl version 5.10.1

I am running on CentOS 6.6

I am pretty naive about email. At least smtpd and dovecot are running
on this server.

Thanks for any help,
-Mike


Re: configure question

2015-01-17 Thread Michael Williamson
OK. Here it is:

#!/bin/sh
#
# spamassassin This script starts and stops the spamd daemon
#
# chkconfig: - 78 30
# processname: spamd
# description: spamd is a daemon process which uses SpamAssassin to check \
#              email messages for SPAM.  It is normally called by spamc \
#              from a MDA.

# Source function library.
. /etc/rc.d/init.d/functions

prog="spamd"

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0

# Set default spamd configuration.
SPAMDOPTIONS="-d -c -m5 -H"
SPAMD_PID=/var/run/spamd.pid

# Source spamd configuration (may override SPAMDOPTIONS).
if [ -f /etc/sysconfig/spamassassin ] ; then
    . /etc/sysconfig/spamassassin
fi

[ -f /usr/bin/spamd -o -f /usr/local/bin/spamd ] || exit 0
PATH=$PATH:/usr/bin:/usr/local/bin

# By default it's all good
RETVAL=0

# See how we were called.
case "$1" in
  start)
    # tell portreserve to release the port
    [ -x /sbin/portrelease ] && /sbin/portrelease spamd >/dev/null 2>&1 || :
    # Start daemon.
    echo -n $"Starting $prog: "
    daemon $NICELEVEL spamd $SPAMDOPTIONS -r $SPAMD_PID
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        touch /var/lock/subsys/spamd
    fi
    ;;
  stop)
    # Stop daemons.
    echo -n $"Stopping $prog: "
    killproc spamd
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        rm -f /var/lock/subsys/spamd
        rm -f $SPAMD_PID
    fi
    ;;
  restart)
    $0 stop
    sleep 3
    $0 start
    ;;
  condrestart)
    [ -e /var/lock/subsys/spamd ] && $0 restart
    ;;
  status)
    status spamd
    RETVAL=$?
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|condrestart}"
    RETVAL=1
    ;;
esac

exit $RETVAL
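
The init script sets a default SPAMDOPTIONS and then lets /etc/sysconfig/spamassassin override it, so the options spamd actually receives depend on that second file. A small sketch for printing the effective value follows. `effective_spamd_options` is a hypothetical helper name, and because it sources the file it should only be pointed at trusted configuration.

```shell
#!/bin/sh
# Print the spamd options the init script would end up using:
# the built-in default unless the sysconfig file overrides SPAMDOPTIONS.
# Runs in a subshell so the caller's variables are left untouched.
effective_spamd_options() {
    sysconfig=${1:-/etc/sysconfig/spamassassin}
    (
        SPAMDOPTIONS="-d -c -m5 -H"    # default taken from the init script
        if [ -f "$sysconfig" ]; then
            . "$sysconfig"
        fi
        echo "$SPAMDOPTIONS"
    )
}

# Example:
# effective_spamd_options
```

Note that this only describes spamd; per the replies elsewhere in this thread, amavisd does not go through spamd at all.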


On 1/17/15, Daniel Staal dst...@usa.net wrote:
 --As of January 17, 2015 4:20:36 PM -0700, Michael Williamson is alleged to
 have said:

 to both /etc/mail/spamassassin/local.cf and
 /home/username/.spamassassin/user_prefs,
 I check the file permissions to be readable by all. I restart it

  # service spamassassin restart

 --As for the rest, it is mine.

 That's calling some script from /etc/rc.d/init.d, if I remember CentOS
 correctly.  Would you be able to look at/post that script?  I suspect that
 it's probably setting the location of the config files via options, so if
 we can figure out what it's doing then we can figure out what needs to be
 changed.

 Daniel T. Staal

 ---
 This email copyright the author.  Unless otherwise noted, you
 are expressly allowed to retransmit, quote, or otherwise use
 the contents for non-commercial purposes.  This copyright will
 expire 5 years after the author's death, or in 30 years,
 whichever is longer, unless such a period is in excess of
 local copyright law.
 ---