Bob Proulx wrote:
>    0.0 RCVD_FAKE_HELO_DOTCOM  Received contains a faked HELO hostname
>    2.2 FORGED_YAHOO_RCVD      'From' yahoo.com does not match 'Received' headers
>   score RCVD_FAKE_HELO_DOTCOM 0.899 0.034 0.969 0.424

Thanks to jdow and also to Bob Menschel, who mailed me off-list, I now
have the answer.  My bayes database was (temporarily) broken when that
message was scored.

  grep -r 'score.*FORGED_YAHOO_RCVD' /usr/share/spamassassin
  score FORGED_YAHOO_RCVD 1.668 2.174 2.095 2.700

Since FORGED_YAHOO_RCVD was reported as 2.2, which matches its second
score (2.174), I was operating on column two.  I was expecting to be
operating on column four.  For RCVD_FAKE_HELO_DOTCOM, column four
(0.424) would print as 0.4, but column two (0.034) prints as 0.0
within the rounding and truncation of the formatting.
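
The column inference can be checked mechanically.  Here is a minimal
sketch in Python, assuming the report prints each score rounded to one
decimal place (that is my assumption about the formatting, not
something taken from the SpamAssassin source).  The helper name
matching_columns is hypothetical.

```python
# Score lines as quoted in this message: four columns per rule.
SCORES = {
    "RCVD_FAKE_HELO_DOTCOM": [0.899, 0.034, 0.969, 0.424],
    "FORGED_YAHOO_RCVD": [1.668, 2.174, 2.095, 2.700],
}


def matching_columns(rule, reported):
    """Return the 1-based columns whose score would print as `reported`.

    Assumes the report shows one decimal place (hypothetical formatting).
    """
    return [
        column
        for column, score in enumerate(SCORES[rule], start=1)
        if "%.1f" % score == reported
    ]


# Compare the values seen in the report against every column.
for rule, reported in [
    ("FORGED_YAHOO_RCVD", "2.2"),
    ("RCVD_FAKE_HELO_DOTCOM", "0.0"),
]:
    print(rule, reported, matching_columns(rule, reported))
```

Under that assumption both reported values match column two only,
while 0.4 for RCVD_FAKE_HELO_DOTCOM would match column four only.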

Quoting the docs:

           If four valid scores are listed, then the score that
           is used depends on how SpamAssassin is being used. The
           first score is used when both Bayes and network tests
           are disabled. The second score is used when Bayes is
           disabled, but network tests are enabled. The third
           score is used when Bayes is enabled and network tests
           are disabled. The fourth score is used when Bayes is
           enabled and network tests are enabled.

Basically: score tag -bayes-net -bayes+net +bayes-net +bayes+net
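
To make the column rule concrete, here is a small Python sketch of the
selection described in the quoted docs.  It is only an illustration of
that rule, not SpamAssassin's own code, and the function name
select_score and its parameters are hypothetical.

```python
def select_score(scores, bayes_enabled, net_enabled):
    """Pick one of four scores following the rule quoted from the docs."""
    if len(scores) != 4:
        raise ValueError("this sketch only handles the four-score case")
    # Column order: -bayes-net, -bayes+net, +bayes-net, +bayes+net
    index = (2 if bayes_enabled else 0) + (1 if net_enabled else 0)
    return scores[index]


# Scores taken from the FORGED_YAHOO_RCVD line quoted in this message.
forged_yahoo_rcvd = [1.668, 2.174, 2.095, 2.700]

# Bayes unavailable, network tests on: the second column is used.
print(select_score(forged_yahoo_rcvd, bayes_enabled=False, net_enabled=True))

# Bayes working, network tests on: the fourth column is used.
print(select_score(forged_yahoo_rcvd, bayes_enabled=True, net_enabled=True))
```

With bayes off and net on this picks 2.174, which is the 2.2 seen in
the report.  With both on it picks 2.700.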

Why was I operating in "-bayes+net" mode?  Not sure.  I normally use
bayes and get column four scores.  I had just done the upgrade to
sa-3.0, and that changes the database from version 2 to version 3.  I
also did some more training on spam.  But my sa-learn --dump numbers
were very low, in the 250 range, even though I should have had a
thousand messages trained.

I think the bayes db was lost during my upgrade.  Then I trained on
top of it.  In the middle of that this message was scored without
bayes but with net and so got the column two scores.  That explains
the strange behavior.

To double check this I recovered my previous bayes db from backup.  I
downgraded back to the previous sa-2.64.  Then upgraded again to
sa-3.0 to try to recreate my problem with the bayes database.  But I
was unable to recreate the problem.  Everything worked fine this time
and my bayes database lists 1137/1272 ham/spam now.  So it must have
been some strange problem between the chair and keyboard when I did
the first upgrade.

Bob
