Looks like the first try to send was also blocked, so this time the offensive content is going in a zip.  Here's the repost:

I collected a list of examples of this obfuscation technique.  The first set is all from one spammer (the pill guy who has a huge volume of crud spam hitting everyone, and sometimes it does get through, depending on how clean his IP is):

    (see zip file)

The second set is from other randomized crap spam.  There are various techniques used here:

    (see zip file)

I would hate to target just one spammer with a heavy filter (necessary in order to help protect from FPs), and you certainly can't tag all of this stuff.  One of my thoughts would be to just look for non-English characters, and for strings made of a letter, then one of a few special characters, then another letter, and score those low.  The only problem is that 26 x 26 = 676 combinations for just one special character in a three-character combo.  So some system of limiting the letter choices would be wise.  For instance, you could limit the strings to just the 15 most popular letters, which would be only 15 x 15 = 225 combinations per special character (210 if you also eliminate doubles), and then choose just 5 or so special characters.  On a subject search, that should be doable.  Any volunteers for finding the 15 most popular letters?  I'll be happy to code it up with a little help.
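
To make the counting concrete, here is a rough Python sketch of the letter / special character / letter idea. The 15 letters are an assumption (a commonly cited English frequency ordering, not a list anyone in this thread has agreed on), the five special characters are picked only for illustration, and the names `count_triples` and `has_obfuscated_triple` are hypothetical. It uses a single regular expression rather than one filter line per combination, which is a different mechanism from a Declude filter file.

```python
import re
from itertools import permutations, product

# Assumption: a commonly cited English letter-frequency ordering, cut to 15.
LETTERS = "etaoinshrdlcumw"
# Assumption: five special characters chosen only for illustration.
SPECIALS = "~^_|\\"


def count_triples(letters, allow_doubles):
    """Count letter/special/letter combinations for ONE special character."""
    if allow_doubles:
        pairs = product(letters, repeat=2)   # e.g. "e~e" is counted
    else:
        pairs = permutations(letters, 2)     # first and last letter differ
    return sum(1 for _ in pairs)


# One pattern instead of a list of triples:
# a listed letter, one special character, then a different listed letter.
PATTERN = re.compile(
    r"([{l}])[{s}](?!\1)[{l}]".format(l=LETTERS, s=re.escape(SPECIALS)),
    re.IGNORECASE,
)


def has_obfuscated_triple(subject):
    """Return True if the subject contains a letter/special/letter triple."""
    return PATTERN.search(subject) is not None


if __name__ == "__main__":
    print(count_triples("abcdefghijklmnopqrstuvwxyz", True))
    print(count_triples(LETTERS, True))
    print(count_triples(LETTERS, False))
    print(has_obfuscated_triple("che^ap pi_lls"))
```

With five special characters, the per-character counts above would be multiplied by five if each combination were written out as its own filter line.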

BTW, spammers using the first type of word obfuscation are also quite likely to use other types, and fail tests like GIBBERISH, GIBBERISHSUB, OBFUSCATION, DYNAMIC, FOREIGN, Y!DIRECTED, etc.  Very little of this stuff gets through our filters because these filters do such a good job at crud detection.

Matt



Kami Razvan wrote:

Hi;

We tried this idea with words but it became way too long..

Perhaps this can be used with an ANTI approach like Matt's filters.

Something like:

SUBJECT -2 ENDSWITH !

SUBJECT -4 CONTAINS 's

So the filters could be cancelled for correct usage.  But again, this will have a high FP rate.
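
As a minimal sketch of how those two counter-weight lines would behave, assuming each matching line simply adds its (negative) weight once: the function name `anti_adjustment` is hypothetical, and this comparison is case-sensitive, which may not match how Declude evaluates CONTAINS and ENDSWITH.

```python
def anti_adjustment(subject):
    """Return the total negative offset from the two example ANTI lines."""
    adjustment = 0
    if subject.endswith("!"):
        adjustment -= 2   # SUBJECT -2 ENDSWITH !
    if "'s" in subject:
        adjustment -= 4   # SUBJECT -4 CONTAINS 's
    return adjustment


if __name__ == "__main__":
    print(anti_adjustment("Today's special!"))
    print(anti_adjustment("V!agra"))
```

The -2 and -4 mirror the weights the original filter gives to ! and ', so a normal trailing exclamation mark or possessive nets out to zero for that character, while an exclamation mark in the middle of a word keeps its weight.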

Regards,

Kami



From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of John Tolmachoff (Lists)
Sent: Thursday, November 06, 2003 5:40 PM
To: [EMAIL PROTECTED]
Subject: RE: [Declude.JunkMail] Non-alpha-numeric subject filter

Some of those are going to have a large FP rate.

 

John Tolmachoff

Engineer/Consultant/Owner

eServices For You

 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mike Gable
Sent: Thursday, November 06, 2003 1:53 PM
To: Declude (E-mail 2)
Subject: [Declude.JunkMail] Non-alpha-numeric subject filter

 

Hi. I've composed a simple but effective subject filter for non-alpha-num characters that are intended to obfuscate words and phrases. It is catching a lot more junk than before. My hold weight is 25 and delete is 35. Forgive me if this is an old idea. Here it is:

 

SUBJECT 6 CONTAINS ~
SUBJECT 4 CONTAINS `
SUBJECT 2 CONTAINS !
SUBJECT 4 CONTAINS @
SUBJECT 4 CONTAINS #
SUBJECT 6 CONTAINS $
SUBJECT 6 CONTAINS %
SUBJECT 6 CONTAINS ^
SUBJECT 2 CONTAINS &
SUBJECT 4 CONTAINS *
SUBJECT 2 CONTAINS (
SUBJECT 2 CONTAINS )
SUBJECT 2 CONTAINS -
SUBJECT 6 CONTAINS _
SUBJECT 2 CONTAINS +
SUBJECT 2 CONTAINS =
SUBJECT 6 CONTAINS |
SUBJECT 6 CONTAINS \
SUBJECT 2 CONTAINS {
SUBJECT 2 CONTAINS }
SUBJECT 2 CONTAINS [
SUBJECT 2 CONTAINS ]
SUBJECT 2 CONTAINS :
SUBJECT 4 CONTAINS ;
SUBJECT 2 CONTAINS "
SUBJECT 4 CONTAINS '
SUBJECT 6 CONTAINS <
SUBJECT 6 CONTAINS >
SUBJECT 2 CONTAINS ,
SUBJECT 2 CONTAINS .
SUBJECT 2 CONTAINS ?
SUBJECT 4 CONTAINS /
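
For anyone who wants to see what a given subject would score under the list above, here is a small Python sketch. It assumes each line contributes its weight once when the character appears anywhere in the subject, and that the weights are summed; that is my reading of the filter, not something confirmed in this thread. The names `WEIGHTS` and `subject_score` are hypothetical.

```python
# Weights copied from the filter lines above, one entry per line.
WEIGHTS = {
    "~": 6, "`": 4, "!": 2, "@": 4, "#": 4, "$": 6, "%": 6, "^": 6,
    "&": 2, "*": 4, "(": 2, ")": 2, "-": 2, "_": 6, "+": 2, "=": 2,
    "|": 6, "\\": 6, "{": 2, "}": 2, "[": 2, "]": 2, ":": 2, ";": 4,
    '"': 2, "'": 4, "<": 6, ">": 6, ",": 2, ".": 2, "?": 2, "/": 4,
}

HOLD_WEIGHT = 25    # hold threshold quoted in the message
DELETE_WEIGHT = 35  # delete threshold quoted in the message


def subject_score(subject):
    """Sum the weight of every listed character that occurs in the subject."""
    return sum(weight for char, weight in WEIGHTS.items() if char in subject)


if __name__ == "__main__":
    for subject in (
        "RE: [Declude.JunkMail] Non-alpha-numeric subject filter",
        "V|@gr@ ~ ch_eap!",
    ):
        print(subject_score(subject), subject)
```

Under that assumption, this thread's own subject line matches five of the lines (: [ ] . and -), which is the kind of FP exposure a run like this makes easy to check before deploying.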


Attachment: Subject_Randomization.zip
Description: Zip compressed data
