On 9 Dec 2018, at 18:23, Chris Pollock wrote:

> On Sun, 2018-12-09 at 13:06 -0500, Bill Cole wrote:
>> On 9 Dec 2018, at 12:04, Chris Pollock wrote:
>>
>>> This is probably very trivial and doesn't affect anything except
>>> maybe the size of the headers but I have to ask. When looking at the
>>> headers of some ham I noticed - https://pastebin.com/H7euxqVX the two
>>> rules I mention above are in 72_active.cf. Is there a reason for the
>>> number of times it's listed? Couldn't each subtest be listed just
>>> once instead of multiple times?
>>
>> Not with the current documented behavior of the code, given the way
>> those sub-rules are designed to work together. The goal is to identify
>> messages which use Latin-script 'e' characters but also use many
>> non-Latin-script characters which look like 'e' but are not. To make
>> this determination, the rules require the 'multiple' flag without a
>> cap on the number of matches which a 'maxhits' parameter would set.
>
> Got it, thanks Bill. I've never noticed this before. I also noticed
> that according to my daily sa-update output this subtest is apparently
> new or at least it didn't appear in the output until this past Fri.

Correct. See the thread with the subject "No longer just embedded =9D
characters in blackmail emails" here last week for the background.

>> It is not recommended to routinely add the list of matched sub-rules
>> to scanned messages.
>
> Any specific reason why? This is just on my home system.

It's got the potential to be VERY noisy (as you've discovered) while not
really providing much useful info. Not a big deal on a small system.

Anyway, as of today I've capped those 2 subrules at levels which leave
ample space to still match the target spam. Should show up in tomorrow's
update.
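
For anyone wanting to see the mechanism, here is a rough sketch of how
counting sub-rules with a cap can fit together. It is not the actual
content of 72_active.cf: the rule names, regexes, the cap of 20, and the
meta thresholds are all made up for illustration, and the second pattern
assumes the body is matched as UTF-8 bytes (Cyrillic U+0435). Only the
'tflags ... multiple maxhits=N' syntax and the use of hit counts in a
meta expression are standard SpamAssassin configuration.

```
# Hypothetical sub-rule: counts Latin 'e', stops counting at 20 hits
body    __EXAMPLE_LATIN_E       /e/
tflags  __EXAMPLE_LATIN_E       multiple maxhits=20

# Hypothetical sub-rule: counts a Cyrillic lookalike, same cap
body    __EXAMPLE_LOOKALIKE_E   /\xd0\xb5/
tflags  __EXAMPLE_LOOKALIKE_E   multiple maxhits=20

# With 'multiple', each sub-rule's value in a meta is its hit count
meta    EXAMPLE_MIXED_E         (__EXAMPLE_LATIN_E >= 1) && (__EXAMPLE_LOOKALIKE_E >= 5)
```

Without maxhits, a sub-rule is recorded once per match, which is why a
long message can list it dozens of times. The cap only needs to be
comfortably above whatever count the meta rule compares against.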