I've had to deal with quite a bit of obfuscated spam over the years.
I started out with every possible obfuscation in every rule, and whenever I discovered a new one, I had to go back and update every single rule. The rules were massive and completely unreadable. Then I discovered replace_tags, which I can highly recommend looking into if you haven't already:
https://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin_ReplaceTags.html
https://github.com/apache/spamassassin/blob/trunk/rules/25_replace.cf
Using this makes the rules much easier to read when you come back to them six months later, and it's much easier to reuse the same obfuscations: update a tag in one place and the change applies to every rule that uses it. (Sorry, that sounded like a horrible sales pitch from a TV advertisement or something..)

I've found the built-in tags occasionally miss some special characters, so I made a replace_tag for every letter, each of which includes the built-in one. Here are a couple of examples:
replace_tag        CUSTOM_C            (<C>|\xe1\xb4\x84)
replace_tag        CUSTOM_N            (<N>|\xe2\x93\x9d|\xc6[\x9e\x9d]|\xef\xbd\x8e)
replace_tag        CUSTOM_V            (<V>)

Then I can add any other characters I find to the tag for that letter whenever the built-in tags are not catching an obfuscation.
I've found the easiest way to get the bytes for those characters is a quick Python for-loop:
>>> for c in "ṣҿṽҿral":
...     print(f"{c}: {c.encode('utf8')}")
...
ṣ: b'\xe1\xb9\xa3'
ҿ: b'\xd2\xbf'
ṽ: b'\xe1\xb9\xbd'
ҿ: b'\xd2\xbf'
r: b'r'
a: b'a'
l: b'l'
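
If you'd rather get output you can paste straight into a replace_tag, the same idea can be wrapped in a small helper. This is only a sketch: the function names to_regex_bytes and alternation are made up for illustration, and CUSTOM_E in the comment is a hypothetical tag name following the per-letter scheme above.

```python
def to_regex_bytes(ch: str) -> str:
    """Return the UTF-8 bytes of ch written as \\xNN escapes."""
    return "".join(f"\\x{b:02x}" for b in ch.encode("utf-8"))


def alternation(chars: str) -> str:
    """Join the escaped forms of the non-ASCII characters in chars with '|'.

    ASCII characters and duplicates are skipped, so a whole word copied
    from a spam sample can be passed in as-is.
    """
    seen = []
    for ch in chars:
        if ch.isascii():
            continue
        escaped = to_regex_bytes(ch)
        if escaped not in seen:
            seen.append(escaped)
    return "|".join(seen)


# Example: the look-alike "e" from the sample, ready to paste into a
# hypothetical tag such as: replace_tag CUSTOM_E (<E>|...)
print(alternation("ҿ"))
```

Pass it only the look-alikes for one letter at a time, since each replace_tag covers a single letter.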

In the end, you can make either one rule that catches both the normal and obfuscated versions, or separate them so you can punish obfuscated versions even harder:
body        __BODY_VIAGRA /(^|[^a-zA-Z0-9\.]|<CUSTOM_WORD_SEP>)viagra([^a-zA-Z0-9]|$)/i
body        __BODY_VIAGRA_OBF /(^|[^a-zA-Z0-9]|<CUSTOM_WORD_SEP>)(?!\bviagra\b)<CUSTOM_V><CUSTOM_I><CUSTOM_A><CUSTOM_G><CUSTOM_R><CUSTOM_A>([^a-zA-Z0-9]|$)/i
replace_rules    __BODY_VIAGRA __BODY_VIAGRA_OBF
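
To see why the (?!\bviagra\b) lookahead keeps the obfuscated rule from firing on the plain word, here is a rough Python approximation. It is not how SpamAssassin does it (that's Perl, and the plugin expands the tags itself); the TAGS table is a stand-in I made up where only V has an extra look-alike, the word-separator tag is left out, and expand and is_obfuscated are illustrative names.

```python
import re

# Stand-in tag table (assumption): only V carries an extra look-alike here,
# the UTF-8 bytes of "ṽ"; real tags would list many more alternatives.
TAGS = {
    b"V": rb"(?:v|\xe1\xb9\xbd)",
    b"I": rb"i",
    b"A": rb"a",
    b"G": rb"g",
    b"R": rb"r",
}


def expand(template: bytes) -> bytes:
    """Replace each <NAME> placeholder with its entry from TAGS."""
    return re.sub(rb"<(\w+)>", lambda m: TAGS[m.group(1)], template)


# Same shape as the obfuscated rule above, minus the word-separator tag.
OBF = re.compile(
    expand(rb"(^|[^a-zA-Z0-9])(?!\bviagra\b)<V><I><A><G><R><A>([^a-zA-Z0-9]|$)"),
    re.IGNORECASE,
)


def is_obfuscated(text: str) -> bool:
    """True if text contains a look-alike spelling but not just the plain word."""
    return OBF.search(text.encode("utf-8")) is not None


for sample in ("buy viagra now", "buy ṽiagra now"):
    print(sample, "->", is_obfuscated(sample))
```

The plain spelling is rejected by the lookahead, so it is left to the non-obfuscated rule, while the look-alike spelling gets past the lookahead and matches through the expanded tag.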

I would say start out with the built-in ones from the 25_replace.cf file, and if you see they're not catching certain characters, start creating your own versions and add those characters.

As others have pointed out, it might cause issues if you actually have people writing in languages that use those special characters, but that's the eternal joy of managing a spam-filter..


On 12/15/25 2:04 AM, Mark London wrote:
Hi - One of our users got a bitcoin blackmail email that uses special characters to avoid the bitcoin spam rules.   Does anybody have rules that detect this type of obfuscation?  Thanks. - Mark

Begin forwarded message:

*From:* Ashley Adkins <[email protected]>
*Date:* December 12, 2025 at 3:51:30 PM EST
*Subject:* *Reminder! Check this message now*


Greetings!

I nҿҿd to inform bad nĕwṣ with you.

Approximately ṣҿṽҿral monthṡ ago I obtainễd accȩṡṡ to your gadgễtŝ, which you uṩẽ for wҿb _(krxvtgqb) _ṣurfing. Aftҿr that, I _(qofyata) _haⱱê ṥtartȅd tracking your intẹrnẹt activities.

Here iṩ thḗ ṣȇquȇncȇ of events:


--
Martin Flygenring (maf)
Systems Engineer, group.one / one.com
