https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7933

--- Comment #3 from Bill Cole <[email protected]> ---
(In reply to jidanni from comment #2)
> Created attachment 5755 [details]
> Old mail not detected
> 
> Why doesn't this trigger
> 
> header DATE_IN_PAST_96_XX     eval:check_for_shifted_date('undef', '-96')
> describe DATE_IN_PAST_96_XX   Date: is 96 hours or more before Received: date

Good question... 

If I'm reading the code correctly, the reason for this is that there are
plausible and parseable Received headers which have times close to the Date
header. If I strip out the Received headers from 2020, it triggers that rule.
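
As a rough illustration of that reading (a minimal Python sketch, not the
actual Perl in Plugin/HeaderEval.pm), the function names and the sample
timestamps below are hypothetical, and the threshold simply mirrors the
'-96' argument of the rule quoted above:

```python
from email.utils import parsedate_to_datetime

def smallest_delta_hours(date_hdr, received_stamps):
    """Return the Date-minus-Received difference (in hours) that is
    closest to zero, or None if there are no Received timestamps."""
    date = parsedate_to_datetime(date_hdr)
    deltas = [
        (date - parsedate_to_datetime(stamp)).total_seconds() / 3600
        for stamp in received_stamps
    ]
    return min(deltas, key=abs) if deltas else None

def date_in_past_96(date_hdr, received_stamps):
    """Assumed rule semantics: fire when the chosen delta is below -96 hours."""
    delta = smallest_delta_hours(date_hdr, received_stamps)
    return delta is not None and delta < -96

# Hypothetical timestamps shaped like the sample: one hop close to the
# Date header, one much later hop from a re-injection.
date_hdr = "Mon, 06 Jan 2020 10:00:00 +0000"
near_hop = "Mon, 06 Jan 2020 10:05:00 +0000"
late_hop = "Fri, 07 May 2021 08:00:00 +0000"

print(date_in_past_96(date_hdr, [late_hop, near_hop]))  # near hop wins
print(date_in_past_96(date_hdr, [late_hop]))            # only the late hop
```

With both hops present the near one yields the smallest difference, so the
rule stays quiet; with only the late hop left, the difference is far below
-96 hours.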

The comments in the code imply that not using the smallest Date/Received
difference resulted in false positives. 

Since DATE_IN_PAST_96_XX and its siblings are fairly strong rules with scores
set by the RuleQA process (current scores for DATE_IN_PAST_96_XX: 2.600 2.070
1.233 3.405), I do not believe it would be polite to users to modify the
behavior of the underlying eval function at this point. It currently measures
the apparent delay between message composition and initial submission, not
total transit time. RuleQA shows that metric correlating rather well with
spamminess.

It may be useful to add a different test that uses a more strictly specified
date comparison, such as the last Received header or the last "trusted"
Received header, instead of the current practice of using the smallest time
delta between any parseable Received header and the Date header. That would
require a new eval in Plugin/HeaderEval.pm. Whether a measurement of putative
total transit time actually correlates with either ham or spam is anyone's
guess. In the sample case, it seems likely to me that the message is not spam,
but rather some sort of re-injected mail originally sent to a discussion list.
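
To make the idea of a stricter comparison concrete, here is a hedged Python
sketch of a "total transit time" measurement. It is not an existing
SpamAssassin eval: the function name is hypothetical, "last" is assumed to
mean the most recently added (topmost) Received header, the timestamp is
assumed to follow the final ';' in that header, and no notion of "trusted"
relays is modelled.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

def transit_hours(raw_message):
    """Hours between the Date header and the topmost Received header,
    or None when either timestamp is missing or unparseable."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received or msg["Date"] is None:
        return None
    # Assumption: the hop timestamp is whatever follows the last ';'.
    stamp = received[0].rsplit(";", 1)[-1].strip()
    try:
        newest_hop = parsedate_to_datetime(stamp)
        composed = parsedate_to_datetime(msg["Date"])
        return (newest_hop - composed).total_seconds() / 3600
    except (TypeError, ValueError):
        return None

# Hypothetical message: submitted promptly in 2020, re-injected in 2021.
sample = (
    "Received: from b.example by c.example; Fri, 07 May 2021 08:00:00 +0000\n"
    "Received: from a.example by b.example; Mon, 06 Jan 2020 10:05:00 +0000\n"
    "Date: Mon, 06 Jan 2020 10:00:00 +0000\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

hours = transit_hours(sample)
print(hours is not None and hours > 96)
```

A measurement like this would flag the re-injected sample as old, which is
exactly why its correlation with ham or spam would need to be checked before
assigning it a score.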

-- 
You are receiving this mail because:
You are the assignee for the bug.
