Re: Really hard-to-filter spam

Thomas Cameron via users Wed, 02 Aug 2023 11:25:44 -0700

On 7/28/23 00:23, Bill Cole wrote:

1. There are milters/content-filters that decode Base64 message parts (amavisd-new, mimedefang, etc) for processing by SA.
2. There are still sufficiently unique items: First-Name-Only, Mixed-Case word in the Subject (NLP modeling), and a Base-64 encoded HTML attachment (w/ UTF-8 encoding no less). Combined in a Metarule, these innocuous items will likely hit with good accuracy even without Base64 decoding.
Umm, unless I'm really missing something here the usual SA processing decodes such body stuff (QP, Base64, etc) and feeds the "cleaned" text to the rule processing engine.
Correct. It has nothing to do with the calling glue.
You have to work hard to get matches done on the raw stuff if you want to do special rule matching on the un-decoded body.
Correct. That should only be needed in rare cases where you're looking for a pattern in a non-text part.
I'm not sure why the OP's rule didn't match the target message, but it is NOT because of the Base64 encoding of parts with the 'text' primary MIME type. If I had to guess, I'd look for invisible characters hidden in the text (e.g. Unicode "zero width non-joiner" marks and the like) that break the pattern and for lookalike non-ASCII characters (often Cyrillic or Greek) in the target string.
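
For anyone who wants to test that guess outside of SA, here is a rough Python sketch. It is not part of SpamAssassin and does not reproduce how SA decodes or normalizes bodies; it only uses the standard library's email package to decode the text parts (Base64 or QP) and then lists characters that could break a plain-ASCII pattern. The function names decoded_text_parts and suspicious_chars are made up for this example, and treating "category Cf or non-ASCII letter" as suspicious is an assumption that will also flag legitimate non-English text.

```python
import unicodedata
from email import message_from_string, policy


def decoded_text_parts(raw_message):
    """Yield the decoded text of every text/* part in a raw RFC 5322 message.

    policy.default makes get_content() undo the Content-Transfer-Encoding
    (Base64, quoted-printable) and decode the declared charset.
    """
    msg = message_from_string(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            yield part.get_content()


def suspicious_chars(text):
    """Return (index, codepoint, name) for characters that may defeat a regex.

    Assumption for this sketch: "suspicious" means either a Unicode format
    character (category Cf, which covers zero-width joiners and non-joiners)
    or a non-ASCII letter (possible Cyrillic or Greek lookalike).
    """
    hits = []
    for i, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category == "Cf" or (ord(ch) > 127 and category.startswith("L")):
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits
```

Feed it the raw message as a string, loop over decoded_text_parts(), and print suspicious_chars() for each part. If the target string shows hits in the middle of it, that would explain a rule pattern that looks right but never matches.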

I am seeing the same issue. I get those same emails, with that 132.1532.1334 string or similar. SA is definitely not catching them, even though I dump them into my spam folder and run sa-learn --spam against them day after day. How can I check to see if it's actually decoding the base64? Or is that just a fact? It seems incredibly weird that I get these things every day, I mark them as spam every day, and they never hit more than a couple of points on the spam scale.


Thomas
