On 23/10/2010 2:28 PM, Lawrence @ Rogers wrote:
Hello all,

I noticed recently that our users are getting spam with a subject similar to the following:


SpamAssassin seems to be having a hard time determining whether it is spam or not because it appears as one long word.

In all cases, the subject contains no spaces (to prevent detection I would think) and is longer than 62 characters (not sure why they do this, but it is true in every sample I've seen so far).

I would like to create a rule to pick up on this, but I'm having a bit of difficulty with the regex for the rule. This is what I've come up with so far:

header CR_SUBJECT_SPAMMY    Subject =~ /.{62}/
describe CR_SUBJECT_SPAMMY Subject looks spammy (contains a lot of characters, and no spaces)
score CR_SUBJECT_SPAMMY     2.5

I just need to modify the regex to check that the Subject contains no spaces.

I've done some research, and the longest non-coined word in a major dictionary is 30 characters long, meaning that even if it were used twice in a subject, the total length would still be only 60 characters. There may be some FPs if the sender used formatting such as commas, but the chance of someone using that word twice, with punctuation and no spacing, is probably extremely remote.

Any assistance or advice would be greatly appreciated.


Lawrence Williams

This is the rule I've come up with now:

# Matches a new technique used by spammers in the Subject line:
# running a bunch of pornographic words together (with no spaces) to evade
# spam filters.
# This rule fires when the entire Subject consists only of letters, digits,
# and common formatting characters (comma, period, plus sign).
# The string must be at least 42 characters long and contain no spaces.

header CR_SUBJECT_SPAMMY    Subject =~ /^[0-9a-zA-Z,.+]{42,}$/
describe CR_SUBJECT_SPAMMY Subject looks spammy (contains a lot of characters, and no spaces)
score CR_SUBJECT_SPAMMY     3.5
tflags CR_SUBJECT_SPAMMY noautolearn
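
If the spammers start mixing in punctuation that the character class above does not list (hyphens, underscores, exclamation marks), the rule stops matching. A possible variant, as a rough sketch only: it tests for "no whitespace at all" using \S instead of listing the allowed characters. The rule name CR_SUBJECT_NOSPACE and the 0.1 score are made up for illustration and are not part of the rule above; the low score is there so the rule can be observed before it is trusted.

```
# Hypothetical variant; rule name and score are illustrative assumptions.
# \S matches any non-whitespace character, so the whole Subject must be
# one unbroken run of at least 42 characters.
header   CR_SUBJECT_NOSPACE  Subject =~ /^\S{42,}$/
describe CR_SUBJECT_NOSPACE  Subject is 42+ characters with no whitespace
score    CR_SUBJECT_NOSPACE  0.1
tflags   CR_SUBJECT_NOSPACE  noautolearn
```

This is broader than the character-class version, so it will probably produce more FPs (a Subject that is nothing but a long URL or a file name would match, for example). Running "spamassassin --lint" after adding it should catch any syntax mistakes.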
