Re: Really hard-to-filter spam

Bill Cole Thu, 27 Jul 2023 22:24:01 -0700

On 2023-07-28 at 00:26:51 UTC-0400 (Thu, 27 Jul 2023 23:26:51 -0500(CDT))

David B Funk <users@spamassassin.apache.org>
is rumored to have said:

On Fri, 28 Jul 2023, Jared Hall wrote:
On 7/27/2023 12:08 PM, Ken D'Ambrosio wrote:
Hey, all. I've recently started getting spam that's really hard to deal with, and I'm open to suggestions as to how to approach it. Superficially,
[snip..]
The damn body's been encoded! And there's so little in there that it's not triggering on many rules (e.g., Bayesian doesn't go over 20%). If anyone has a bright idea -- maybe a way to decode the attachments and run a regex against _that_? -- I'm all ears.
1. There are milters/content-filters that decode Base64 message parts (amavisd-new, mimedefang, etc) for processing by SA.
2. There are still sufficiently unique items: First-Name-Only, Mixed-Case word in the Subject (NLP modeling), and a Base-64 encoded HTML attachment (w/ UTF-8 encoding no less). Combined in a Meta rule, these innocuous items will likely hit with good accuracy even without Base64 decoding.
Umm, unless I'm really missing something here, the usual SA processing decodes such body stuff (QP, Base64, etc) and feeds the "cleaned" text to the rule processing engine.


Correct. It has nothing to do with the calling glue.

You have to work hard to get matches done on the raw stuff if you want to do special rule matching on the un-decoded body.

Correct. That should only be needed in rare cases where you're looking for a pattern in a non-text part.
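
To illustrate the difference between matching on the decoded text and matching on the raw encoded body, here is a minimal sketch using Python's standard `email` package. This is only an analogy: it is not SpamAssassin's code, and the sample message, subject, and pattern are made up for the example.

```python
import base64
import re
from email import message_from_string
from email.policy import default

# Hypothetical message: a single text/html part, Base64 transfer-encoded.
RAW = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode("<p>Claim your prize now</p>".encode("utf-8")).decode("ascii")
    + "\n"
)

msg = message_from_string(RAW, policy=default)

raw_payload = msg.get_payload()   # the Base64 text exactly as sent
decoded_text = msg.get_content()  # transfer-decoded, then charset-decoded

# Hypothetical pattern a body rule might look for.
pattern = re.compile(r"claim your prize", re.IGNORECASE)

matches_raw = bool(pattern.search(raw_payload))
matches_decoded = bool(pattern.search(decoded_text))

print("raw match:", matches_raw)
print("decoded match:", matches_decoded)
```

The pattern can only be found in the decoded text, which is the kind of input ordinary body rules see for parts with a 'text' primary MIME type.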

I'm not sure why the OP's rule didn't match the target message, but it is NOT because of the Base64 encoding of parts with the 'text' primary MIME type. If I had to guess, I'd look for invisible characters hidden in the text (e.g. Unicode "zero width non-joiner" marks and the like) that break the pattern and for lookalike non-ASCII characters (often Cyrillic or Greek) in the target string.
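
One way to check a sample for those two problems is to dump every suspicious code point in the decoded text. The sketch below is a rough diagnostic aid in Python, not anything SpamAssassin does itself. The `INVISIBLE` set, the helper names, and the sample string are assumptions chosen for illustration, and the list of invisible characters is not exhaustive.

```python
import unicodedata

# Illustrative, non-exhaustive set of invisible code points.
INVISIBLE = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE
    "\u00ad",  # SOFT HYPHEN
}


def suspicious_chars(text):
    """Return (index, code point, Unicode name, kind) for each suspect character."""
    findings = []
    for index, ch in enumerate(text):
        name = unicodedata.name(ch, "UNKNOWN")
        code = f"U+{ord(ch):04X}"
        if ch in INVISIBLE:
            findings.append((index, code, name, "invisible"))
        elif ord(ch) > 127 and name.startswith(("CYRILLIC", "GREEK")):
            findings.append((index, code, name, "lookalike"))
    return findings


def strip_invisible(text):
    """Drop the invisible characters so the remaining text can be compared."""
    return "".join(ch for ch in text if ch not in INVISIBLE)


# Hypothetical target string: Cyrillic "a" plus a zero width non-joiner.
sample = "P\u0430y\u200cPal"
findings = suspicious_chars(sample)
for finding in findings:
    print(finding)
```

If this reports anything inside the string a rule is supposed to match, a plain ASCII pattern will not hit it, no matter how the part was transfer-encoded.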


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
