Re: spam score question

2015-04-18 Thread Michael Williamson
On 4/18/15, Antony Stone  wrote:
> On Saturday 18 April 2015 at 17:16:40 (EU time), Michael Williamson wrote:
>
>> Hi,
>>
>> I have another question.
>>
>> It appears to me that spamassassin can produce different spam scores
>> for the same email.
>
> Do you mean *exactly* the same email - totally identical headers and body,
> with no changes between the two invocations?

Yes, I believe so, exactly identical.


>> In particular, I have noticed that points are omitted for
>> RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
>
> Well, there's a chance that the machine you received the email from wasn't in
> the Spamhaus blacklist on one occasion, and was on the other...
>

Something like this is possible, although I think it would be more likely that
the failure is due to a timeout or communication problem with the
Spamhaus server.
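
One way to tell "not listed" apart from "lookup failed" is to query the
list by hand at the moment the cron-started run misses the test. A minimal
sketch, assuming dig is installed; the helper names are hypothetical, and
the zone default is the one shown in the report quoted later in this thread.

```shell
#!/bin/sh
# Build the DNSBL query name for an IPv4 address: reverse the octets and
# append the list zone (default: zen.spamhaus.org).
dnsbl_query_name() {
    ip=$1
    zone=${2:-zen.spamhaus.org}
    echo "$ip" | awk -F. -v zone="$zone" \
        'NF == 4 { printf "%s.%s.%s.%s.%s\n", $4, $3, $2, $1, zone }'
}

# Hypothetical helper: run one lookup with a short timeout and show dig's
# exit status. An empty answer with status 0 suggests "not listed"; a
# non-zero status suggests the query itself failed.
check_dnsbl() {
    name=$(dnsbl_query_name "$1" "$2")
    dig +short +time=5 +tries=1 "$name"
    echo "dig exit status: $?"
}
```

For the relay in the report, "check_dnsbl 5.178.109.37" would query
37.109.178.5.zen.spamhaus.org. Running it both from cron and from a login
shell would show whether the two environments get different answers.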

>> Is the difference due to a difference in how spamassassin is invoked?
>> (for example, due to an environment variable).
>> One way that I invoke spamassassin to get spam scores is from a
>> program that is started as a cronjob for a user.
>
> Does that job run as the user, or as another ID on the system?
>

It is run from the user's cron table. For score comparison, from the
command line, I do

  # su 
  # spamassassin -t < 

I know that there might actually be some different environment
variables doing it this way (like PATH).
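
To see exactly which variables differ, the environment can be dumped in
both contexts and compared. A rough sketch; the function name and the
file paths in the usage note are hypothetical.

```shell
#!/bin/sh
# Print the lines that differ between two environment dumps, where each
# dump is the output of `env | sort` saved to a file.
# "<" marks lines only in the first file, ">" lines only in the second.
env_diff() {
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$1" "$2" | grep '^[<>]' || true
}
```

Usage would be something like running "env | sort > /tmp/env.cron" from a
temporary crontab entry, "env | sort > /tmp/env.shell" after su, and then
"env_diff /tmp/env.cron /tmp/env.shell".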


> What exactly are you passing to SpamAssassin from the cron job (where are you
> getting the email from in the standard delivery path), and how else do you
> pass emails to SpamAssassin in the normal course of email delivery (you don't
> mention what your MTA is, or how SpamAssassin is plugged in to it)?
>

The email server runs postfix, amavis, and dovecot (and roundcube).
I elaborate on this below.


>> This way sometimes omits the points for the test mentioned above. Then, when
>> I invoke spamassassin from the command line as the same user, for the same
>> email, I get a higher score because it includes points for
>> RCVD_IN_SBL_CSS.
>
> Since you say you are running both checks as the same user, and also you're
> focusing on the score for one specific test, I'll omit any possibility that
> you've got different Bayes databases on the machine, each being used by the
> different ways you're passing the email to SpamAssassin.
>
> When you've plucked an email out of the delivery path and sent it (via the
> cron job) to SpamAssassin, do you then re-insert it back into the same place
> in the delivery path, and is that place immediately before it would get passed
> to SpamAssassin by some milter or similar feature?  If not, please describe
> your email delivery path, paying particular attention to where you're taking
> the emails out (for cron job processing), where you're reinserting them, and
> where SpamAssassin otherwise gets invoked.
>
>> I am using a fairly old version, SpamAssassin version 3.3.1 running on
>> Perl version 5.10.1. The OS is CentOS 6.0.
>
> Out of interest, why are you passing emails to SpamAssassin from a cron job,
> and then apparently later getting them scored in the normal course of email
> delivery?  What's the purpose of the cron job?
>

The reason that I am using a cronjob for users is that I could never
get the dovecot 'sieve' plugin to work. So instead, I wrote a program
that uses inotify to watch for new email files appearing in the
directory 'Maildir/new/' and moves them immediately to 'Maildir/tmp/'
before dovecot gets them. The program then moves each file either back
to 'Maildir/new/' or into a spam folder. To decide where to move it,
the program runs spamassassin again (the mail has already been scored
once at this point) using fork/exec.

The reason that spamassassin is run again is that some users train
per-account bayes databases with sa-learn, but those databases are not
used, as far as I can tell, when amavis first invokes spamassassin,
before mail is put into 'Maildir/new/', so the scores are too low. When
I re-run spamassassin (in either of the two ways mentioned), it is
invoked with the "-t" option and the email content is piped in on
standard input. This does not modify the original email content,
including the originally inserted spam scores, but it does generate a
new score using the user's database.
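
The decision step described above could look roughly like the following.
This is only a sketch of the idea, not the actual program: the function
name is made up, and it assumes the only X-Spam-Flag header in the output
is the one written by this run.

```shell
#!/bin/sh
# Decide where a message should go from the output of `spamassassin -t`,
# read on standard input. Prints "spam" when the X-Spam-Flag header says
# YES, otherwise "ham".
classify_report() {
    # Only inspect the header block (everything up to the first blank line),
    # so that text in the body cannot influence the result.
    if sed '/^$/q' | grep -qi '^X-Spam-Flag: *YES'; then
        echo spam
    else
        echo ham
    fi
}
```

A caller would pipe the rescored message in, for example
"spamassassin -t < file | classify_report", and move the file according
to the word printed. The spamassassin man page also documents an
-e (--exit-code) option that reports spam through the exit status, which
may be simpler to check from fork/exec.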

This method has been working pretty well for about a week, until
this Spamhaus issue.
An alternative that I have considered is to simply set up a new email
server, but without amavis.

Thanks,
-Mike


spam score question

2015-04-18 Thread Michael Williamson
Hi,

I have another question.

It appears to me that spamassassin can produce different spam scores
for the same email.
In particular, I have noticed that points are omitted for
RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
Is the difference due to a difference in how spamassassin is invoked?
(for example, due to an environment variable).
One way that I invoke spamassassin to get spam scores is from a
program that is started as a cronjob for a user. This way sometimes
omits the points for the test mentioned above. Then, when I invoke
spamassassin from the command line as the same user, for the same
email, I get a higher score because it includes points for
RCVD_IN_SBL_CSS.

I am using a fairly old version, SpamAssassin version 3.3.1 running on
Perl version 5.10.1. The OS is CentOS 6.0.

Thanks,
-Mike


Re: configure question

2015-01-18 Thread Michael Williamson
I think you are right.
Running spamassassin manually appears to use the user's "user_prefs"
configuration file, and bayes database. I need to get amavisd to do it
that way too, if possible.
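
One way to check whether two accounts are really using different bayes
databases is to compare the learned-message counters that
"sa-learn --dump magic" prints. A small sketch; the function name is
hypothetical, and the column layout it relies on (counter value in the
third column) is an assumption based on output I have seen.

```shell
#!/bin/sh
# Extract the learned-message counters from `sa-learn --dump magic` output
# read on standard input. Prints "nspam=<n> nham=<n>".
bayes_counts() {
    awk '/non-token data: nspam/ { s = $3 }
         /non-token data: nham/  { h = $3 }
         END { printf "nspam=%s nham=%s\n", s, h }'
}
```

Running "sa-learn --dump magic | bayes_counts" once as the mailbox user
and once as the account amavisd runs under (that account name depends on
the installation) should give different numbers if the databases are
separate.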


On 1/18/15, John Hardin  wrote:
> On Sun, 18 Jan 2015, Michael Williamson wrote:
>
>> Here is an example of the automatically inserted spam headers:
>>
>> Return-Path: 
>> X-Spam-Status:
>>tests=[BAYES_00=-1.9
>
>>
>> And here, I ran the same email through spamassassin manually from the
>> command line:
>>
>> X-Spam-Report:
>>  *  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
>
> That is either due to training that message reclassified it as spam, or
> you are manually training a different bayes database than amavis is using.
>
> --
>   John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
>   jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> ---
>   Rights can only ever be individual, which means that you cannot
>   gain a right by joining a mob, no matter how shiny the issued
>   badges are, or how many of your neighbors are part of it.  -- Marko
> ---
>   5 days until John Moses Browning's 160th Birthday
>


Re: configure question

2015-01-18 Thread Michael Williamson
OK, thanks. I will read the amavisd FAQ.

However, I am skeptical about your explanation for the spam score difference.

Here is an example of the automatically inserted spam headers:

Return-Path: 
...
X-Spam-Flag: NO
X-Spam-Score: -1.106
X-Spam-Level:
X-Spam-Status: No, score=-1.106 tagged_above=-999 required=5
tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RDNS_NONE=0.793,
SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001] autolearn=no
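
As a sanity check, the per-test values in that tests list add up to the
reported score, so nothing else is hidden in the total. A quick sketch
(the function name is made up for illustration):

```shell
#!/bin/sh
# Sum the per-test scores in a tests list of the form
# "BAYES_00=-1.9, HTML_MESSAGE=0.001". Prints the total to three decimals.
sum_tests() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F= 'NF == 2 { total += $2 } END { printf "%.3f\n", total }'
}

sum_tests 'BAYES_00=-1.9, HTML_MESSAGE=0.001, RDNS_NONE=0.793, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001'
# prints -1.106
```

So the whole difference from the manual run comes from which tests fired,
not from how the total is computed.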

And here, I ran the same email through spamassassin manually from the
command line:

 # spamassassin -t < spam_filename

Return-Path: 
X-Spam-Report:
*  100 USER_IN_BLACKLIST From: address is in the user's black-list
*  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
*  [score: 1.]
*  3.3 RCVD_IN_SBL_CSS RBL: Received via a relay in Spamhaus SBL-CSS
*  [5.178.109.37 listed in zen.spamhaus.org]
*  1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  [URIs: acant.firm.in]
*  2.5 URIBL_DBL_SPAM Contains a spam URL listed in the DBL blocklist
*  [URIs: acant.firm.in]
*  1.2 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
*  [URIs: acant.firm.in]
*  0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
*  [score: 1.]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  0.8 RDNS_NONE Delivered to internal network by a host with no rDNS
*  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
X-Spam-Flag: YES
X-Spam-Status: Yes, score=113.3 required=4.0 tests=BAYES_99,BAYES_999,
HTML_MESSAGE,RCVD_IN_SBL_CSS,RDNS_NONE,UNPARSEABLE_RELAY,URIBL_BLACK,
URIBL_DBL_SPAM,URIBL_JP_SURBL,USER_IN_BLACKLIST autolearn=no 
version=3.3.1
X-Spam-Level: **




On 1/18/15, RW  wrote:
> On Sun, 18 Jan 2015 09:06:00 -0700
> Michael Williamson wrote:
>
>> Yes, amavisd is running and modifying the file
>> "/etc/amavisd/amavisd.conf" has an effect on the spamassassin header
>> messages added to emails. Thanks, that answers that question.
>
> Amavisd uses SA as a library, you don't need to be running spamd.
>
> "service spamassassin restart" affects neither amavisd nor the
> spamassassin script, only tests done through spamc/spamd.
>
> You should read the Amavisd FAQ.
>
>> Now, the next question is, if I manually run
>>
>>  # spamassassin -t < spam_filename
>>
>> I get a different, much higher spam score than is automatically
>> inserted in the X-Spam-Score
>> field. Note that, for this user, this has been done:
>
> You expect to get a higher score the second time. You've trained it as
> spam, and the delay causes it to hit more network tests.
>
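
The two X-Spam-Status headers quoted in this thread differ in shape: the
one inserted at delivery time carries "tagged_above=", while the one from
the spamassassin script carries "version=". That difference can serve as
a rough hint about which component scored a message. This is only a
heuristic drawn from those two samples, and the function name is
hypothetical.

```shell
#!/bin/sh
# Guess which component wrote an X-Spam-Status header value.
# Prints "amavisd", "spamassassin", or "unknown".
status_origin() {
    case $1 in
        *tagged_above=*) echo amavisd ;;
        *version=*)      echo spamassassin ;;
        *)               echo unknown ;;
    esac
}
```

For example, status_origin 'No, score=-1.106 tagged_above=-999 required=5'
prints amavisd, which would mean changes to local.cf or a spamd restart
are not expected to alter that header.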


Re: configure question

2015-01-18 Thread Michael Williamson
Yes, amavisd is running, and modifying the file
"/etc/amavisd/amavisd.conf" has an effect on the spamassassin header
messages added to emails. Thanks, that answers that question.

Now, the next question is, if I manually run

 # spamassassin -t < spam_filename

I get a different, much higher spam score than is automatically
inserted in the X-Spam-Score
field. Note that, for this user, this has been done:

 # sa-learn --spam .Spam/cur/*



On 1/18/15, Marieke Janssen  wrote:
>>Spamassassin seems not to be getting the configuration changes that I
>> make.
>
> Is it possible you run something like Amavis that controls SpamAssassin?
> Headers and spamscore are usually controlled there and will override
> local.cf.
>
> /MJ
>
>


Re: configure question

2015-01-17 Thread Michael Williamson
OK. Here it is:

#!/bin/sh
#
# spamassassin This script starts and stops the spamd daemon
#
# chkconfig: - 78 30
# processname: spamd
# description: spamd is a daemon process which uses SpamAssassin to check \
#  email messages for SPAM.  It is normally called by spamc \
#  from a MDA.

# Source function library.
. /etc/rc.d/init.d/functions

prog="spamd"

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ ${NETWORKING} = "no" ] && exit 0

# Set default spamd configuration.
SPAMDOPTIONS="-d -c -m5 -H"
SPAMD_PID=/var/run/spamd.pid

# Source spamd configuration.
if [ -f /etc/sysconfig/spamassassin ] ; then
    . /etc/sysconfig/spamassassin
fi

[ -f /usr/bin/spamd -o -f /usr/local/bin/spamd ] || exit 0
PATH=$PATH:/usr/bin:/usr/local/bin

# By default it's all good
RETVAL=0

# See how we were called.
case "$1" in
  start)
    # tell portreserve to release the port
    [ -x /sbin/portrelease ] && /sbin/portrelease spamd &>/dev/null || :
    # Start daemon.
    echo -n $"Starting $prog: "
    daemon $NICELEVEL spamd $SPAMDOPTIONS -r $SPAMD_PID
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        touch /var/lock/subsys/spamd
    fi
    ;;
  stop)
    # Stop daemons.
    echo -n $"Stopping $prog: "
    killproc spamd
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        rm -f /var/lock/subsys/spamd
        rm -f $SPAMD_PID
    fi
    ;;
  restart)
    $0 stop
    sleep 3
    $0 start
    ;;
  condrestart)
    [ -e /var/lock/subsys/spamd ] && $0 restart
    ;;
  status)
    status spamd
    RETVAL=$?
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|condrestart}"
    RETVAL=1
    ;;
esac

exit $RETVAL
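
Note that the script only sets a default for SPAMDOPTIONS and then
sources /etc/sysconfig/spamassassin, so the options spamd really gets
depend on that file. A small sketch to print the effective value; the
function name is hypothetical, and it mirrors only the two relevant lines
of the script above.

```shell
#!/bin/sh
# Print the options spamd would be started with: the init script's
# built-in default, overridden by the sysconfig file when it exists.
# The body runs in a subshell so the caller's variables are untouched.
effective_spamd_options() (
    conf=${1:-/etc/sysconfig/spamassassin}
    SPAMDOPTIONS="-d -c -m5 -H"
    if [ -f "$conf" ]; then
        . "$conf"
    fi
    echo "$SPAMDOPTIONS"
)
```

Calling effective_spamd_options with no argument reads the real sysconfig
file; passing a path allows trying out a modified copy first.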


On 1/17/15, Daniel Staal  wrote:
> --As of January 17, 2015 4:20:36 PM -0700, Michael Williamson is alleged to
> have said:
>
>> to both /etc/mail/spamassassin/local.cf and
>> /home//.spamassassin/user_prefs,
>> I check the file permissions to be readable by all. I restart it
>>
>>  # service spamassassin restart
>
> --As for the rest, it is mine.
>
> That's calling some script from /etc/rc.d/init.d, if I remember CentOS
> correctly.  Would you be able to look at/post that script?  I suspect that
> it's probably setting the location of the config files via options, so if
> we can figure out what it's doing then we can figure out what needs to be
> changed.
>
> Daniel T. Staal
>
> ---
> This email copyright the author.  Unless otherwise noted, you
> are expressly allowed to retransmit, quote, or otherwise use
> the contents for non-commercial purposes.  This copyright will
> expire 5 years after the author's death, or in 30 years,
> whichever is longer, unless such a period is in excess of
> local copyright law.
> ---
>


configure question

2015-01-17 Thread Michael Williamson
Hi, I have a question.

Spamassassin seems not to be getting the configuration changes that I make.
I add (or change) lines like this

 add_header all Flag _YESNOCAPS_
 required_score 4.0

to both /etc/mail/spamassassin/local.cf and
/home//.spamassassin/user_prefs.
I check that the files are readable by all, then restart it

 # service spamassassin restart

It indicates OK, and spamd processes are shown as running. When spam
email arrives, it is only sometimes tagged (tagged_above=2), and the
threshold shown (required=6.2) is unchanged despite my changing
required_score in the configuration files. If I then run

 # spamassassin -t < spam_filename

the output shows a much higher spam score than the value inserted in
the email (X-Spam-Score).
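
To confirm which configuration files the spamassassin script itself
reads, its debug output can be filtered. A rough sketch; the function
name is made up, and the "config: read file" wording of the debug lines
is an assumption from memory, so the pattern may need adjusting.

```shell
#!/bin/sh
# Read spamassassin debug output on standard input and print the
# configuration files it reports reading, one per line.
# Assumption: debug lines look like "... dbg: config: read file <path>".
config_files_read() {
    sed -n 's/.*config: read file //p'
}
```

Usage would be "spamassassin -D --lint 2>&1 | config_files_read". This
only covers the command-line script; whatever inserts the headers at
delivery time may read its settings from somewhere else.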

Running

 # spamassassin -V

returns the message

SpamAssassin version 3.3.1
  running on Perl version 5.10.1

I am running on CentOS 6.6

I am pretty naive about email. At least smtpd and dovecot are running
on this server.

Thanks for any help,
-Mike