On 4/18/15, Antony Stone <antony.st...@spamassassin.open.source.it> wrote:
> On Saturday 18 April 2015 at 17:16:40 (EU time), Michael Williamson wrote:
>
>> Hi,
>>
>> I have another question.
>>
>> It appears to me that spamassassin can produce different spam scores
>> for the same email.
>
> Do you mean *exactly* the same email - totally identical headers and body,
> with no changes between the two invocations?

Yes, I believe so, exactly identical.

>> In particular, I have noticed that points are omitted for
>> RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
>
> Well, there's a chance that the machine you received the email from wasn't
> in the Spamhaus blacklist on one occasion, and was on the other...
>

Something like this is possible, although I think it is more likely
that the failure is due to a timeout or communication problem with the
Spamhaus server.

>> Is the difference due to a difference in how spamassassin is invoked?
>> (for example, due to an environment variable).
>> One way that I invoke spamassassin to get spam scores is from a
>> program that is started as a cronjob for a user.
>
> Does that job run as the user, or as another ID on the system?
>

It is run from the user's cron table. For score comparison, from the
command line, I do

# su <username>
# spamassassin -t < <email_filename>

I know that some environment variables (like PATH) might actually be
different when doing it this way.

> What exactly are you passing to SpamAssassin from the cron job (where are
> you getting the email from in the standard delivery path), and how else do
> you pass emails to SpamAssassin in the normal course of email delivery (you
> don't mention what your MTA is, or how SpamAssassin is plugged in to it)?
>

The email server runs postfix, amavis, and dovecot (and roundcube).
I elaborate on this below.

>> This way sometimes omits the points for the test mentioned above. Then,
>> when I invoke spamassassin from the command line as the same user, for the
>> same email, I get a higher score because it includes points for
>> RCVD_IN_SBL_CSS.
>
> Since you say you are running both checks as the same user, and also you're
> focusing on the score for one specific test, I'll omit any possibility that
> you've got different Bayes databases on the machine, each being used by the
> different ways you're passing the email to SpamAssassin.
>
> When you've plucked an email out of the delivery path and sent it (via the
> cron job) to SpamAssassin, do you then re-insert it back into the same place
> in the delivery path, and is that place immediately before it would get
> passed to SpamAssassin by some milter or similar feature? If not, please
> describe your email delivery path, paying particular attention to where
> you're taking the emails out (for cron job processing), where you're
> reinserting them, and where SpamAssassin otherwise gets invoked.
>
>> I am using a fairly old version, SpamAssassin version 3.3.1 running on
>> Perl version 5.10.1. The OS is CentOS 6.0.
>
> Out of interest, why are you passing emails to SpamAssassin from a cron job,
> and then apparently later getting them scored in the normal course of email
> delivery? What's the purpose of the cron job?
>

The reason that I am using a cronjob for users, is that I could never
get the dovecot 'sieve' plugin to work. So instead, I wrote a program
using inotify to watch for new email files to appear in the directory
'Maildir/new/', and move them immediately to 'Maildir/tmp/' before
dovecot gets them. Then the program either moves the file back to
'Maildir/new/' or into a spam folder. In order to decide where to move
it, the program runs spamassassin again (since the mail has already been
scored at this point) using fork/exec.

The reason that spamassassin is run again, is that some users use the
spamassassin bayes database training program sa-learn for their
individual accounts, but that bayes database(s) is not used, as far as I
can tell, when amavis first invokes spamassassin, before mail is put
into 'Maildir/new', so the scores are too low.

When I re-run spamassassin (both of the two different ways mentioned),
it is using the "-t" option and the email content is piped in from the
standard input.
This does not modify the original email content, including the
originally inserted spam scores, but it does generate a new score, using
the user database. This method has been working pretty well for about a
week, until this Spamhaus issue.

An alternative that I have considered is to simply set up a new email
server, but without amavis.

Thanks,
-Mike
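
P.S. One way to check whether the RCVD_IN_SBL_CSS hit really varies
between runs would be to score the same file several times with
"spamassassin -t" and compare the rule lists. The following is only a
rough Python sketch, not the program described above. It assumes that
the spamassassin output carries an X-Spam-Status header with a
"tests=" list and that the first such header is the one just added. The
function names (score_file, rule_hits, unstable_rules, compare_runs) are
made up for illustration.

```python
import re
import subprocess
from email import message_from_string


def score_file(path):
    """Pipe one mail file into `spamassassin -t` and return its output as text."""
    with open(path, "rb") as fh:
        proc = subprocess.run(
            ["spamassassin", "-t"],
            stdin=fh,
            stdout=subprocess.PIPE,
            check=True,
        )
    # Mail is not guaranteed to be valid UTF-8, so decode leniently.
    return proc.stdout.decode("utf-8", errors="replace")


def rule_hits(sa_output):
    """Return the set of rule names in the first X-Spam-Status header.

    Assumption: the header looks like
    'Yes, score=7.2 required=5.0 tests=A,B,C autolearn=no version=3.3.1'
    and may be folded across lines after a comma.
    """
    msg = message_from_string(sa_output)
    status = msg.get("X-Spam-Status", "")
    # Undo header folding inside the comma-separated tests list.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S*)", status)
    if not match:
        return set()
    names = {name for name in match.group(1).split(",") if name}
    names.discard("none")  # 'tests=none' means no rule fired
    return names


def unstable_rules(hit_sets):
    """Return the rules that fired in some runs but not in all of them."""
    hit_sets = list(hit_sets)
    if not hit_sets:
        return set()
    return set.union(*hit_sets) - set.intersection(*hit_sets)


def compare_runs(path, runs=5):
    """Score the same file `runs` times and report the rules that vary."""
    return unstable_rules(rule_hits(score_file(path)) for _ in range(runs))
```

Calling compare_runs("<email_filename>") once from the cron environment
and once from an interactive shell should show whether RCVD_IN_SBL_CSS
comes and goes within a single environment (which would point to a
lookup timeout) or only differs between the two.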