spam score question
Hi, I have another question.

It appears to me that spamassassin can produce different spam scores for the same email. In particular, I have noticed that points are sometimes omitted for RCVD_IN_SBL_CSS (Spamhaus blacklist). Why?

Is the difference due to a difference in how spamassassin is invoked (for example, due to an environment variable)? One way that I invoke spamassassin to get spam scores is from a program that is started as a cronjob for a user. This way sometimes omits the points for the test mentioned above. Then, when I invoke spamassassin from the command line as the same user, for the same email, I get a higher score because it includes points for RCVD_IN_SBL_CSS.

I am using a fairly old version, SpamAssassin version 3.3.1 running on Perl version 5.10.1. The OS is CentOS 6.0.

Thanks,
-Mike
Re: spam score question
On 4/18/15, Antony Stone antony.st...@spamassassin.open.source.it wrote:
> On Saturday 18 April 2015 at 17:16:40 (EU time), Michael Williamson wrote:
>
>> Hi, I have another question. It appears to me that spamassassin can produce different spam scores for the same email.
>
> Do you mean *exactly* the same email - totally identical headers and body, with no changes between the two invocations?

Yes, I believe so, exactly identical.

>> In particular, I have noticed that points are omitted for RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
>
> Well, there's a chance that the machine you received the email from wasn't in the Spamhaus blacklist on one occasion, and was on the other...

Something like this is possible, although I think it is more likely that the failure is due to a timeout or communication problem with the Spamhaus server.

>> Is the difference due to a difference in how spamassassin is invoked? (for example, due to an environment variable). One way that I invoke spamassassin to get spam scores is from a program that is started as a cronjob for a user.
>
> Does that job run as the user, or as another ID on the system?

It is run from the user's cron table. For score comparison, from the command line, I do

# su username
# spamassassin -t email_filename

I know that there might actually be some different environment variables doing it this way (like PATH).

> What exactly are you passing to SpamAssassin from the cron job (where are you getting the email from in the standard delivery path), and how else do you pass emails to SpamAssassin in the normal course of email delivery (you don't mention what your MTA is, or how SpamAssassin is plugged in to it)?

The email server runs postfix, amavis, and dovecot (and roundcube). I elaborate on this below.

>> This way sometimes omits the points for the test mentioned above. Then, when I invoke spamassassin from the command line as the same user, for the same email, I get a higher score because it includes points for RCVD_IN_SBL_CSS.
>
> Since you say you are running both checks as the same user, and also you're focusing on the score for one specific test, I'll omit any possibility that you've got different Bayes databases on the machine, each being used by the different ways you're passing the email to SpamAssassin.
>
> When you've plucked an email out of the delivery path and sent it (via the cron job) to SpamAssassin, do you then re-insert it back into the same place in the delivery path, and is that place immediately before it would get passed to SpamAssassin by some milter or similar feature?
>
> If not, please describe your email delivery path, paying particular attention to where you're taking the emails out (for cron job processing), where you're reinserting them, and where SpamAssassin otherwise gets invoked.
>
>> I am using a fairly old version, SpamAssassin version 3.3.1 running on Perl version 5.10.1. The OS is CentOS 6.0.
>
> Out of interest, why are you passing emails to SpamAssassin from a cron job, and then apparently later getting them scored in the normal course of email delivery? What's the purpose of the cron job?

The reason that I am using a cronjob for users is that I could never get the dovecot 'sieve' plugin to work. So instead, I wrote a program using inotify to watch for new email files to appear in the directory 'Maildir/new/', and move them immediately to 'Maildir/tmp/' before dovecot gets them. Then the program either moves the file back to 'Maildir/new/' or into a spam folder. In order to decide where to move it, the program runs spamassassin again (since the mail has already been scored at this point) using fork/exec.

The reason that spamassassin is run again is that some users use the spamassassin bayes database training program sa-learn for their individual accounts, but those per-user bayes databases are not used, as far as I can tell, when amavis first invokes spamassassin, before mail is put into 'Maildir/new', so the scores are too low.
When I re-run spamassassin (in both of the two ways mentioned), it is using the -t option and the email content is piped in on standard input. This does not modify the original email content, including the originally inserted spam scores, but it does generate a new score using the user's database. This method has been working pretty well for about a week, until this Spamhaus issue.

An alternative that I have considered is to simply set up a new email server, but without amavis.

Thanks,
-Mike
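The suspicion that cron and an interactive `su` shell see different environment variables (such as PATH) can be checked directly. A minimal sketch, assuming you have saved the output of `env` once from the cron job and once from the shell; `env_diff` and the file names are hypothetical and not part of SpamAssassin:

```shell
#!/bin/sh
# env_diff FILE1 FILE2
# Print the variables that differ between two saved `env` dumps.
# Lines found only in FILE1 start at column 1; lines found only in
# FILE2 are indented with a tab (this is how comm lays out its columns).
env_diff() {
    _ed_a=$(mktemp) || return 1
    _ed_b=$(mktemp) || { rm -f "$_ed_a"; return 1; }
    # comm needs sorted input; force the C locale so sort and comm agree.
    LC_ALL=C sort "$1" > "$_ed_a"
    LC_ALL=C sort "$2" > "$_ed_b"
    # -3 suppresses the lines common to both files.
    LC_ALL=C comm -3 "$_ed_a" "$_ed_b"
    rm -f "$_ed_a" "$_ed_b"
}
```

For example, run `env > /tmp/cron_env.txt` from a temporary crontab entry and `env > /tmp/shell_env.txt` after `su username`, then call `env_diff /tmp/cron_env.txt /tmp/shell_env.txt`. An empty result would suggest the environment is not the cause, which leaves the DNS timeout explanation as the more likely one.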
Re: configure question
Yes, amavisd is running, and modifying the file /etc/amavisd/amavisd.conf has an effect on the spamassassin header messages added to emails. Thanks, that answers that question.

Now, the next question is, if I manually run

# spamassassin -t spam_filename

I get a different, much higher spam score than is automatically inserted in the X-Spam-Score field. Note that, for this user, this has been done:

# sa-learn --spam .Spam/cur/*

On 1/18/15, Marieke Janssen mjans...@myguard.nl wrote:
>> Spamassassin seems not to be getting the configuration changes that I make.
>
> Is it possible you run something like Amavis that controls SpamAssassin? Headers and spam score are usually controlled there and will override local.cf.
>
> /MJ
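Since amavisd applies its own thresholds, it can help to list them next to whatever is set in local.cf. A minimal sketch, assuming the stock amavisd-new variable names `$sa_tag_level_deflt`, `$sa_tag2_level_deflt` and `$sa_kill_level_deflt`; `amavis_sa_levels` is a hypothetical helper name, and the default path is the one mentioned in this thread:

```shell
#!/bin/sh
# amavis_sa_levels [CONF]
# Print the uncommented spam-level assignments from an amavisd config file.
# Assumption: the config uses the stock variable names shown in the pattern.
amavis_sa_levels() {
    _al_conf=${1:-/etc/amavisd/amavisd.conf}
    # Lines starting with '#' do not match because the pattern is anchored
    # to optional whitespace followed by a literal '$'.
    grep -E '^[[:space:]]*\$sa_(tag|tag2|kill)_level_deflt[[:space:]]*=' "$_al_conf"
}
```

If the values printed here match the `tagged_above=` and `required=` numbers in the inserted X-Spam-Status header, that would be consistent with amavisd, not local.cf, deciding them.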
Re: configure question
I think you are right. Running spamassassin manually appears to use the user's user_prefs configuration file, and bayes database. I need to get amavisd to do it that way too, if possible.

On 1/18/15, John Hardin jhar...@impsec.org wrote:
> On Sun, 18 Jan 2015, Michael Williamson wrote:
>
>> Here is an example of the automatically inserted spam headers:
>>
>> Return-Path: discount---coup...@acant.firm.in
>> X-Spam-Status: tests=[BAYES_00=-1.9
>>
>> And here, I ran the same email through spamassassin manually from the command line:
>>
>> X-Spam-Report:
>> * 3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
>
> That is either due to training that message reclassified it as spam, or you are manually training a different bayes database than amavis is using.
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> ---
> Rights can only ever be individual, which means that you cannot gain a right by joining a mob, no matter how shiny the issued badges are, or how many of your neighbors are part of it. -- Marko
> ---
> 5 days until John Moses Browning's 160th Birthday
Re: configure question
OK, thanks. I will read the amavisd FAQ. However, I am skeptical about your explanation for the spam score difference.

Here is an example of the automatically inserted spam headers:

Return-Path: discount---coup...@acant.firm.in
...
X-Spam-Flag: NO
X-Spam-Score: -1.106
X-Spam-Level:
X-Spam-Status: No, score=-1.106 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RDNS_NONE=0.793, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001] autolearn=no

And here, I ran the same email through spamassassin manually from the command line:

# spamassassin -t spam_filename

Return-Path: discount---coup...@acant.firm.in
X-Spam-Report:
* 100 USER_IN_BLACKLIST From: address is in the user's black-list
* 3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 3.3 RCVD_IN_SBL_CSS RBL: Received via a relay in Spamhaus SBL-CSS
* [5.178.109.37 listed in zen.spamhaus.org]
* 1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist
* [URIs: acant.firm.in]
* 2.5 URIBL_DBL_SPAM Contains a spam URL listed in the DBL blocklist
* [URIs: acant.firm.in]
* 1.2 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
* [URIs: acant.firm.in]
* 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.]
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 0.8 RDNS_NONE Delivered to internal network by a host with no rDNS
* 0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
X-Spam-Flag: YES
X-Spam-Status: Yes, score=113.3 required=4.0 tests=BAYES_99,BAYES_999, HTML_MESSAGE,RCVD_IN_SBL_CSS,RDNS_NONE,UNPARSEABLE_RELAY,URIBL_BLACK, URIBL_DBL_SPAM,URIBL_JP_SURBL,USER_IN_BLACKLIST autolearn=no version=3.3.1
X-Spam-Level: **

On 1/18/15, RW rwmailli...@googlemail.com wrote:
> On Sun, 18 Jan 2015 09:06:00 -0700 Michael Williamson wrote:
>
>> Yes, amavisd is running and modifying the file /etc/amavisd/amavisd.conf has an effect on the spamassassin header messages added to emails. Thanks, that answers that question.
>
> Amavisd uses SA as a library; you don't need to be running spamd. "service spamassassin restart" affects neither amavisd nor the spamassassin script, only tests done through spamc/spamd. You should read the Amavisd FAQ.
>
>> Now, the next question is, if I manually run # spamassassin -t spam_filename I get a different, much higher spam score than is automatically inserted in the X-Spam-Score field. Note that, for this user, this has been done:
>
> You should expect to get a higher score the second time. You've trained it as spam, and the delay causes it to hit more network tests.
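To see exactly which rules account for a score difference, the two X-Spam-Status headers can be compared mechanically. A minimal sketch, assuming each header has been saved to its own file and unfolded onto a single line; `tests_of` and `new_hits` are hypothetical helper names. It handles both the amavisd style (`tests=[NAME=score, ...]`) and the spamassassin style (`tests=NAME,NAME`) shown in this thread:

```shell
#!/bin/sh
# tests_of: read an unfolded X-Spam-Status header on stdin and print the
# rule names it lists, one per line, sorted.
tests_of() {
    grep 'tests=' |
        sed -e 's/.*tests=//' -e 's/ *autolearn=.*//' -e 's/[][]//g' |
        tr ', ' '\n\n' |
        sed -e 's/=.*//' -e '/^$/d' |
        LC_ALL=C sort -u
}

# new_hits FIRST SECOND
# Print the rules that hit in SECOND but not in FIRST.
new_hits() {
    _nh_a=$(mktemp) || return 1
    _nh_b=$(mktemp) || { rm -f "$_nh_a"; return 1; }
    tests_of < "$1" > "$_nh_a"
    tests_of < "$2" > "$_nh_b"
    # -13 suppresses lines unique to FIRST and lines common to both.
    LC_ALL=C comm -13 "$_nh_a" "$_nh_b"
    rm -f "$_nh_a" "$_nh_b"
}
```

Applied to the two headers quoted in this message, the rules reported as new in the command-line run split into Bayes and blacklist hits that depend on the user's own data (BAYES_99, BAYES_999, USER_IN_BLACKLIST) and network lookups (RCVD_IN_SBL_CSS and the three URIBL rules), which fits both explanations offered in the thread.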
configure question
Hi, I have a question.

Spamassassin seems not to be getting the configuration changes that I make. I add (or change) lines like this

add_header all Flag _YESNOCAPS_
required_score 4.0

to both /etc/mail/spamassassin/local.cf and /home/username/.spamassassin/user_prefs. I check that the files are readable by all. I restart it

# service spamassassin restart

It indicates OK, and spamd processes are shown as running. When spam email arrives, it is only sometimes tagged (tagged_above=2), and the threshold (required=6.2) is also unchanged despite changing required_score in the configuration files. If I then run

# spamassassin -t spam_filename

the output shows a much higher spam score than the value inserted in the email (X-Spam-Score). Running

# spamassassin -V

returns the message

SpamAssassin version 3.3.1 running on Perl version 5.10.1

I am running on CentOS 6.6. I am pretty naive about email. At least smtpd and dovecot are running on this server.

Thanks for any help,
-Mike
Re: configure question
OK. Here it is:

#!/bin/sh
#
# spamassassin This script starts and stops the spamd daemon
#
# chkconfig: - 78 30
# processname: spamd
# description: spamd is a daemon process which uses SpamAssassin to check \
#              email messages for SPAM. It is normally called by spamc \
#              from a MDA.

# Source function library.
. /etc/rc.d/init.d/functions

prog="spamd"

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ ${NETWORKING} = "no" ] && exit 0

# Set default spamd configuration.
SPAMDOPTIONS="-d -c -m5 -H"
SPAMD_PID=/var/run/spamd.pid

# Source spamd configuration.
if [ -f /etc/sysconfig/spamassassin ] ; then
    . /etc/sysconfig/spamassassin
fi

[ -f /usr/bin/spamd -o -f /usr/local/bin/spamd ] || exit 0
PATH=$PATH:/usr/bin:/usr/local/bin

# By default it's all good
RETVAL=0

# See how we were called.
case "$1" in
  start)
    # tell portreserve to release the port
    [ -x /sbin/portrelease ] && /sbin/portrelease spamd &>/dev/null || :
    # Start daemon.
    echo -n $"Starting $prog: "
    daemon $NICELEVEL spamd $SPAMDOPTIONS -r $SPAMD_PID
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        touch /var/lock/subsys/spamd
    fi
    ;;
  stop)
    # Stop daemons.
    echo -n $"Stopping $prog: "
    killproc spamd
    RETVAL=$?
    echo
    if [ $RETVAL = 0 ]; then
        rm -f /var/lock/subsys/spamd
        rm -f $SPAMD_PID
    fi
    ;;
  restart)
    $0 stop
    sleep 3
    $0 start
    ;;
  condrestart)
    [ -e /var/lock/subsys/spamd ] && $0 restart
    ;;
  status)
    status spamd
    RETVAL=$?
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|condrestart}"
    RETVAL=1
    ;;
esac

exit $RETVAL

On 1/17/15, Daniel Staal dst...@usa.net wrote:
> --As of January 17, 2015 4:20:36 PM -0700, Michael Williamson is alleged to have said:
>
>> to both /etc/mail/spamassassin/local.cf and /home/username/.spamassassin/user_prefs, I check the file permissions to be readable by all. I restart it
>> # service spamassassin restart
>
> --As for the rest, it is mine.
>
> That's calling some script from /etc/rc.d/init.d, if I remember CentOS correctly. Would you be able to look at/post that script?
>
> I suspect that it's probably setting the location of the config files via options, so if we can figure out what it's doing then we can figure out what needs to be changed.
>
> Daniel T. Staal
>
> ---
> This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law.
> ---
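The init script sets a default SPAMDOPTIONS and then sources /etc/sysconfig/spamassassin, which may replace it. A minimal sketch for printing the options spamd would actually be started with; `effective_spamd_options` is a hypothetical helper name, and the default string is copied from the script posted above:

```shell
#!/bin/sh
# effective_spamd_options [SYSCONFIG]
# Print the SPAMDOPTIONS value the init script would end up using.
# Runs in a subshell so that sourcing the file cannot change the caller's
# variables. Note that sourcing executes the file, as the init script does.
effective_spamd_options() {
    (
        sysconfig=${1:-/etc/sysconfig/spamassassin}
        # Default copied from the init script in this thread.
        SPAMDOPTIONS="-d -c -m5 -H"
        if [ -f "$sysconfig" ]; then
            . "$sysconfig"
        fi
        printf '%s\n' "$SPAMDOPTIONS"
    )
}
```

Pass an absolute path when testing with another file, because `.` searches PATH for names without a slash. As noted elsewhere in the thread, these options only affect mail scanned through spamc/spamd, not mail scanned by amavisd.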