Hello,

I have created some rules which have so far been very effective at
identifying a certain type of spam that SpamAssassin otherwise cannot
detect.

Here are the rules:

# highly suspicious practices
rawbody LOCAL_UNNECESSARY_UNESCAPE /[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])/
score LOCAL_UNNECESSARY_UNESCAPE 1.7
rawbody LOCAL_UNNECESSARY_STRCONCAT /[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"/
score LOCAL_UNNECESSARY_STRCONCAT 0.5
rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
score LOCAL_HIDE_FROMCHARCODE 0.7
rawbody LOCAL_HIDE_URL /"h"\+"tt"\+"p:"\+"\/"/
score LOCAL_HIDE_URL 0.7

I have noticed a common trend of spam which has base64-encoded HTML
attachments, highly obfuscated with Javascript generating and concatenating
links.  The above four rules detect patterns which should only be present in
Javascript whose intention is to hide its true function (obfuscate itself).

The first rule checks for use of unescape() on constants where the
characters are just lowercase letters which wouldn't need escaping anyway.
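
As a rough illustration of the first rule, here is a minimal Python sketch that runs the same pattern over two made-up lines of Javascript. The sample strings are hypothetical, and Python's re module stands in for SpamAssassin's Perl regex engine, which I am assuming behaves the same for this expression.

```python
import re
from urllib.parse import unquote

# Pattern copied from LOCAL_UNNECESSARY_UNESCAPE (without the enclosing slashes).
UNNECESSARY_UNESCAPE = re.compile(
    r"""[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])"""
)

# Hypothetical obfuscated line: every escape decodes to a plain lowercase letter.
suspicious = 'var u = unescape("%68%74%74%70");'
# Hypothetical legitimate line: %20 is a space, which does need escaping.
legitimate = 'var s = unescape("%20hello");'
# Same letters, but written with a lowercase hex digit.
lowercase_hex = 'var u = unescape("%6a%6b");'

print(bool(UNNECESSARY_UNESCAPE.search(suspicious)))     # True
print(bool(UNNECESSARY_UNESCAPE.search(legitimate)))     # False
print(bool(UNNECESSARY_UNESCAPE.search(lowercase_hex)))  # False
print(unquote("%68%74%74%70"))                           # http
```

Two things the sketch makes visible: only the first escape after the opening quote is inspected, and the hex digits A-F are matched in uppercase only, so "%6a" is not caught by the pattern as written.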

The second checks for unnecessary string concatenation with constant strings
consisting entirely of letters and digits.  It would match ="asdf"+"jkl" or
+"asdf"+"jkl".
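
To show where the second rule does and does not fire, here is a small Python sketch. The sample lines are made up, and Python's re module is only assumed to be a fair stand-in for the regex engine SpamAssassin uses.

```python
import re

# Pattern copied from LOCAL_UNNECESSARY_STRCONCAT (without the enclosing slashes).
STRCONCAT = re.compile(r'[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"')

# Hypothetical sample lines mapped to whether the pattern should match.
samples = {
    'x ="asdf"+"jkl";': True,        # the first example from the text
    'y = z +"asdf"+"jkl";': True,    # the second example from the text
    'x = "asdf" + "jkl";': False,    # spaces around the inner + are not matched
    'x = "hello world";': False,     # a single constant, nothing concatenated
}

for line, expected in samples.items():
    matched = STRCONCAT.search(line) is not None
    print(f"{line!r}: {matched}")
    assert matched == expected
```

Note that the pattern requires the two quoted constants to be joined by a bare +, so hand-written code with spaces around the operator does not match.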

The third test checks for String.fromCharCode being assigned to another
name (aliased), which would be common when trying to obfuscate strings
in Javascript.
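
Here is a minimal Python sketch of what the third pattern matches. The Javascript lines are hypothetical, and Python's re module is assumed to treat \b the same way as the engine SpamAssassin uses.

```python
import re

# Pattern copied from LOCAL_HIDE_FROMCHARCODE (without the enclosing slashes).
FROMCHARCODE = re.compile(r"=\s*String\.fromCharCode\b")

# Hypothetical alias: the function is stored under a short name and reused.
alias = "var f = String.fromCharCode; var s = f(104) + f(116);"
# Hypothetical direct call inside an expression, with no assignment before it.
direct_call = "document.write(String.fromCharCode(104));"
# Hypothetical direct call whose result is assigned to a variable.
assigned_call = "var c = String.fromCharCode(104);"

print(bool(FROMCHARCODE.search(alias)))          # True
print(bool(FROMCHARCODE.search(direct_call)))    # False
print(bool(FROMCHARCODE.search(assigned_call)))  # True
```

The last case is worth knowing about: \b also matches between "fromCharCode" and an opening parenthesis, so the pattern as written hits any assignment that starts with String.fromCharCode, not only an alias.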

The fourth test was just a specific pattern I saw in a lot of spam, but is
less generic.  It looks for the string "h"+"tt"+"p:"+"/".  This would
probably need more alteration to be useful in a more general context.
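
The sketch below, in Python, shows the fourth pattern matching its one specific split, and then one possible direction for a more general check. The helper joins_to_http is hypothetical and is not one of the rules above: it collapses "+" joins between double-quoted constants and reports whether "http:/" only appears after collapsing. It cannot be expressed as a single rawbody regex as it stands, so treat it purely as an illustration of the idea.

```python
import re

# Pattern copied from LOCAL_HIDE_URL (without the enclosing slashes).
HIDE_URL = re.compile(r'"h"\+"tt"\+"p:"\+"\/"')

def joins_to_http(js):
    """Hypothetical helper: True if "http:/" appears only once "+" joins
    between double-quoted constants have been collapsed."""
    collapsed = re.sub(r'"\s*\+\s*"', "", js)
    return "http:/" in collapsed and "http:/" not in js

# Hypothetical sample lines; example.invalid is a placeholder host.
exact_split = 'var l = "h"+"tt"+"p:"+"/"+"/example.invalid";'
other_split = 'var l = "ht"+"tp:"+"//"+"example.invalid";'
plain_url = 'var l = "http://example.invalid";'

print(bool(HIDE_URL.search(exact_split)))  # True
print(bool(HIDE_URL.search(other_split)))  # False
print(joins_to_http(exact_split))          # True
print(joins_to_http(other_split))          # True
print(joins_to_http(plain_url))            # False
```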

These are unlikely to hit non-spam, even if it contains Javascript, and even
if it contains minified Javascript.  It is plausible, however, that they may
generate hits on discussions that are specifically about how to get through
spam filters, such as a discussion between spammers or between makers of spam
filters, since these patterns will occur in the context of "how to get
through spam filters".
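
A quick way to try this out is to run all four patterns over sample text. The Python sketch below does that for one made-up line of minified Javascript and one made-up sentence from a filter discussion. Both samples are hypothetical, and a single sample says nothing about real-world false-positive rates.

```python
import re

# The four patterns from the rules above, transcribed for Python's re module.
RULES = {
    "LOCAL_UNNECESSARY_UNESCAPE": r"""[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])""",
    "LOCAL_UNNECESSARY_STRCONCAT": r'[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"',
    "LOCAL_HIDE_FROMCHARCODE": r"=\s*String\.fromCharCode\b",
    "LOCAL_HIDE_URL": r'"h"\+"tt"\+"p:"\+"\/"',
}

def hits(text):
    """Return the names of the rules whose pattern matches the text."""
    return [name for name, pattern in RULES.items() if re.search(pattern, text)]

# Hypothetical minified but ordinary Javascript.
ordinary_js = 'function f(a,b){var w=a+"px";return b+"/"+w}var q=encodeURIComponent(s);'
# Hypothetical sentence from a discussion about evading filters.
filter_discussion = 'Spammers write x ="asdf"+"jkl" to split strings.'

print(hits(ordinary_js))        # []
print(hits(filter_discussion))  # ['LOCAL_UNNECESSARY_STRCONCAT']
```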

Use these as you wish!  I hereby license them under the WTFPL which is GPL
and Apache license compatible.

Thomas Rutter
-- 
View this message in context: 
http://old.nabble.com/Some-rules-I-created-for-suspicious-Javascript-practices-tp33333130p33333130.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
