Hello, I have created some rules which I have found to be very effective so far at identifying a certain type of spam that SpamAssassin otherwise cannot detect.
Here are the rules:

# highly suspicious practices
rawbody LOCAL_UNNECESSARY_UNESCAPE /[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])/
score LOCAL_UNNECESSARY_UNESCAPE 1.7
rawbody LOCAL_UNNECESSARY_STRCONCAT /[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"/
score LOCAL_UNNECESSARY_STRCONCAT 0.5
rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
score LOCAL_HIDE_FROMCHARCODE 0.7
rawbody LOCAL_HIDE_URL /"h"\+"tt"\+"p:"\+"\/"/
score LOCAL_HIDE_URL 0.7

I have noticed a common trend of spam which has base64-encoded HTML attachments, highly obfuscated with Javascript that generates and concatenates links. The above four rules detect patterns which should only be present in Javascript whose intention is to hide its true function (obfuscate itself).

The first rule checks for use of unescape() on constants whose first escaped character is a lowercase letter (%61 to %7A), which wouldn't need escaping anyway.

The second checks for unnecessary string concatenation of constant strings consisting entirely of letters and digits. It would match ="asdf"+"jkl" or +"asdf"+"jkl".

The third checks for substituting another name for the function String.fromCharCode (for example, assigning it to a variable), which would be common when trying to obfuscate strings in Javascript.

The fourth is just a specific pattern I saw in a lot of spam, and is less generic. It looks for the string "h"+"tt"+"p:"+"/". It would probably need more alteration to be useful in a more general context.

These rules are unlikely to hit non-spam, even if it contains Javascript, and even if it contains minified Javascript. It is plausible, however, that they may generate hits on discussions that are specifically about how to get through spam filters, such as a discussion between spammers or between makers of spam filters, since these patterns will occur in that context.

Use these as you wish! I hereby license them under the WTFPL, which is GPL and Apache license compatible.
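For anyone who wants to sanity-check the four patterns above outside SpamAssassin, a small Python script is one way to do it. This is only a rough sketch: the sample strings are invented for illustration (not taken from real spam), the helper name hits() is hypothetical, and it assumes Python's re module treats these particular patterns the same way Perl does. It also ignores how SpamAssassin decodes and splits the raw body before matching.

```python
import re

# The four patterns from the rules above, copied unchanged.
RULES = {
    "LOCAL_UNNECESSARY_UNESCAPE":
        re.compile(r"""[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])"""),
    "LOCAL_UNNECESSARY_STRCONCAT":
        re.compile(r'[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"'),
    "LOCAL_HIDE_FROMCHARCODE":
        re.compile(r'=\s*String\.fromCharCode\b'),
    "LOCAL_HIDE_URL":
        re.compile(r'"h"\+"tt"\+"p:"\+"\/"'),
}


def hits(text):
    """Return the sorted names of all rules whose pattern occurs in text."""
    return sorted(name for name, rx in RULES.items() if rx.search(text))


# Made-up sample lines (assumptions, for illustration only).
SAMPLES = [
    'var a = unescape("%68%74%74%70");',
    'var u = "click"+"here";',
    'var f = String.fromCharCode;',
    'document.write("h"+"tt"+"p:"+"/"+"/example.invalid");',
    'var total = price + tax;',
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(line, "->", hits(line) or "no hits")
```

Note that the third pattern only looks for "=" followed by String.fromCharCode, so a line such as c = String.fromCharCode(65) is matched as well as a plain alias assignment.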
Thomas Rutter