On 4/18/15, Antony Stone <antony.st...@spamassassin.open.source.it> wrote:
> On Saturday 18 April 2015 at 17:16:40 (EU time), Michael Williamson wrote:
>
>> Hi,
>>
>> I have another question.
>>
>> It appears to me that spamassassin can produce different spam scores
>> for the same email.
>
> Do you mean *exactly* the same email - totally identical headers and body,
> with no changes between the two invocations?

Yes, I believe so, exactly identical.


>> In particular, I have noticed that points are omitted for
>> RCVD_IN_SBL_CSS (Spamhaus blacklist) sometimes. Why?
>
> Well, there's a chance that the machine you received the email from wasn't
> in the Spamhaus blacklist on one occasion, and was on the other...
>

Something like this is possible, although I think it is more likely that
the missing points are due to a timeout or communication problem with the
Spamhaus server.
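
To illustrate why a failed lookup and a clean result look the same in the
score, here is a minimal Python sketch of how a DNSBL query is formed and
interpreted. It assumes that RCVD_IN_SBL_CSS corresponds to a 127.0.0.3
answer from the zen.spamhaus.org zone; the function names are made up for
illustration and this is not SpamAssassin's own code.

```python
import ipaddress

# Assumptions for illustration: zone name and the return code taken to
# mean "listed in SBL CSS".
ZONE = "zen.spamhaus.org"
CSS_CODE = "127.0.0.3"


def dnsbl_query_name(ip, zone=ZONE):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def classify_answer(answers):
    """Interpret a lookup result.

    answers: list of A-record strings, [] for "no such name",
    or None when the lookup timed out or failed.
    """
    if answers is None:
        return "lookup-failed"   # no points added, same as not listed
    if CSS_CODE in answers:
        return "listed-css"      # the rule would hit
    return "not-listed-css"      # no points added
```

Both "lookup-failed" and "not-listed-css" end up contributing no points, so
the score alone cannot tell a timeout from a delisting.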

>> Is the difference due to a difference in how spamassassin is invoked?
>> (for example, due to an environment variable).
>> One way that I invoke spamassassin to get spam scores is from a
>> program that is started as a cronjob for a user.
>
> Does that job run as the user, or as another ID on the system?
>

It is run from the user's cron table. To compare scores, I run the
following from the command line:

  # su <username>
  # spamassassin -t < <email_filename>

I know that there might actually be some different environment
variables doing it this way (like PATH).
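
One way to see exactly which tests differ between the cron run and the
command-line run is to compare the rule lists. A rough Python sketch,
assuming the output of each run has been saved as text and carries an
X-Spam-Status header with a "tests=" field (the function names are made up
for illustration):

```python
import re


def rules_hit(sa_output):
    """Return the set of rule names listed in X-Spam-Status 'tests='.

    If several X-Spam-Status headers are present, the first one is used.
    """
    # Unfold headers: continuation lines start with a space or tab.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", sa_output)
    match = re.search(
        r"^X-Spam-Status:.*?\btests=(.*?)(?:\s+\w+=|$)", unfolded, re.M
    )
    if not match:
        return set()
    names = [n for n in re.split(r"[,\s]+", match.group(1)) if n]
    return set() if names == ["none"] else set(names)


def compare_runs(cron_output, shell_output):
    """Report which rules appear in only one of the two runs."""
    cron_rules = rules_hit(cron_output)
    shell_rules = rules_hit(shell_output)
    return {
        "only_in_cron": sorted(cron_rules - shell_rules),
        "only_in_shell": sorted(shell_rules - cron_rules),
    }
```

Feeding it the saved output of both runs would show whether RCVD_IN_SBL_CSS
is the only difference or whether other network tests drop out as well.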


> What exactly are you passing to SpamAssassin from the cron job (where are
> you getting the email from in the standard delivery path), and how else do
> you pass emails to SpamAssassin in the normal course of email delivery (you
> don't mention what your MTA is, or how SpamAssassin is plugged in to it)?
>

The email server runs postfix, amavis, and dovecot (and roundcube).
I elaborate on this below.


>> This way sometimes omits the points for the test mentioned above. Then,
>> when I invoke spamassassin from the command line as the same user, for the
>> same email, I get a higher score because it includes points for
>> RCVD_IN_SBL_CSS.
>
> Since you say you are running both checks as the same user, and also you're
> focusing on the score for one specific test, I'll omit any possibility that
> you've got different Bayes databases on the machine, each being used by the
> different ways you're passing the email to SpamAssassin.
>
> When you've plucked an email out of the delivery path and sent it (via the
> cron job) to SpamAssassin, do you then re-insert it back into the same place
> in the delivery path, and is that place immediately before it would get
> passed to SpamAssassin by some milter or similar feature?  If not, please
> describe your email delivery path, paying particular attention to where
> you're taking the emails out (for cron job processing), where you're
> reinserting them, and where SpamAssassin otherwise gets invoked.
>
>> I am using a fairly old version, SpamAssassin version 3.3.1 running on
>> Perl version 5.10.1. The OS is CentOS 6.0.
>
> Out of interest, why are you passing emails to SpamAssassin from a cron job,
> and then apparently later getting them scored in the normal course of email
> delivery?  What's the purpose of the cron job?
>

The reason I am using a cron job for users is that I could never get
the dovecot 'sieve' plugin to work. So instead, I wrote a program that
uses inotify to watch for new email files appearing in the directory
'Maildir/new/' and moves them immediately to 'Maildir/tmp/' before
dovecot gets them. The program then moves each file either back to
'Maildir/new/' or into a spam folder.

To decide where to move a file, the program runs spamassassin again
(the mail has already been scored once at this point) using fork/exec.
It is run again because some users train the bayes database for their
individual accounts with sa-learn, but those per-user bayes databases
are not used, as far as I can tell, when amavis first invokes
spamassassin, before mail is put into 'Maildir/new/', so the scores are
too low.

When I re-run spamassassin (in either of the two ways mentioned), it
uses the "-t" option and the email content is piped in on standard
input. This does not modify the original email content, including the
originally inserted spam scores, but it does generate a new score using
the user's database.
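
The decide-and-move step described above could be sketched roughly as
follows in Python 3. This is not the actual program: the inotify watching
is left out, and the '.Spam' folder name, the threshold of 5.0, the
timeout, and all function names are assumptions made for illustration. It
also assumes the "-t" output carries an X-Spam-Status header with a
"score=" field.

```python
import os
import re
import subprocess


def score_with_spamassassin(path, timeout=120):
    """Pipe the message into `spamassassin -t` and return its output as text."""
    with open(path, "rb") as msg:
        result = subprocess.run(
            ["spamassassin", "-t"],
            stdin=msg,
            stdout=subprocess.PIPE,
            timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
            check=False,
        )
    return result.stdout.decode("utf-8", errors="replace")


def parse_score(sa_output):
    """Extract the numeric score from the first X-Spam-Status header, or None."""
    match = re.search(
        r"^X-Spam-Status:.*?\bscore=(-?\d+(?:\.\d+)?)", sa_output, re.M
    )
    return float(match.group(1)) if match else None


def route(tmp_path, maildir, threshold=5.0, scorer=score_with_spamassassin):
    """Move a message from Maildir/tmp/ to the inbox or the spam folder.

    A message whose score cannot be parsed is delivered to the inbox.
    """
    score = parse_score(scorer(tmp_path))
    is_spam = score is not None and score >= threshold
    if is_spam:
        dest_dir = os.path.join(maildir, ".Spam", "new")  # hypothetical folder
    else:
        dest_dir = os.path.join(maildir, "new")
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(tmp_path))
    os.replace(tmp_path, dest)  # rename within the same filesystem
    return dest
```

Passing the scorer in as a parameter keeps the routing logic testable
without spamassassin installed, and makes it easy to log the raw output
whenever a score looks suspiciously low.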

This method has been working pretty well for about a week, until
this Spamhaus issue.
An alternative that I have considered is to simply set up a new email
server, but without amavis.

Thanks,
-Mike
