Re: Some rules I created for suspicious Javascript practices

2012-03-06 Thread Simon Loewenthal
Hi,

Were these rules, or an improved variant, added to the rules?


Regards, Simon.

On 16/02/12 01:43, neon_overload wrote:
 Hello,

 I have created some rules which I have found to be very effective so far at
 identifying a certain type of spam that SpamAssassin otherwise cannot
 detect.

 Here are the rules:

 # highly suspicious practices
 rawbody LOCAL_UNNECESSARY_UNESCAPE /[+=]\s*unescape\s*\(\s*[']%(6[1-9A-F]|7[0-9A])/
 score LOCAL_UNNECESSARY_UNESCAPE 1.7
 rawbody LOCAL_UNNECESSARY_STRCONCAT /[+=]\s*[a-zA-Z0-9]+\+[a-zA-Z0-9]+/
 score LOCAL_UNNECESSARY_STRCONCAT 0.5
 rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
 score LOCAL_HIDE_FROMCHARCODE 0.7
 rawbody LOCAL_HIDE_URL /h\+tt\+p:\+\//
 score LOCAL_HIDE_URL 0.7

 I have noticed a common trend of spam which has base64-encoded HTML
 attachments, highly obfuscated with Javascript generating and concatenating
 links.  The above four rules detect patterns which should only be present in
 Javascript whose intention is to hide its true function (obfuscate itself).

 The first rule checks for use of unescape() on constants where the
 characters are just lowercase letters which wouldn't need escaping anyway.

 The second checks for unnecessary string concatenation with constant strings
 consisting entirely of letters.  It would match =asdf+jkl or
 +asdf+jkl.

 The third test checks for substituting another name for the function
 String.fromCharCode, which would be common when trying to obfuscate strings
 in Javascript.

 The fourth test was just a specific pattern I saw in a lot of spam, but is
 less generic.  It looks for the string h+tt+p:+/.  This would
 probably need more alteration to be useful in a more general context.

 These rules are unlikely to hit non-spam, even if it contains Javascript, and
 even if that Javascript is minified.  It is plausible, however, that they may
 generate hits on discussions that are specifically about how to get through
 spam filters, such as a discussion between spammers or between makers of
 spam filters, since these patterns will come up in that context.

 Use these as you wish!  I hereby license them under the WTFPL which is GPL
 and Apache license compatible.

 Thomas Rutter





Re: Some rules I created for suspicious Javascript practices

2012-03-04 Thread LuKreme
On 16 Feb 2012, at 18:11 , neon_overload wrote:
 I have been hard at work on tweaking these rules and have come up with new
 versions which appear more effective.  Have not spent much time on
 performance though.

Curious how you arrived at the scoring. For example, I would think that
LOCAL_U_UNESCAPE would be scored much higher since, at least as it looks to me,
no one would ever do that legitimately.




Re: Some rules I created for suspicious Javascript practices

2012-02-16 Thread Adam Katz
On 02/15/2012 04:43 PM, Thomas Rutter wrote (as neon_overload):
 I have created some rules which I have found to be very effective so 
 far at identifying a certain type of spam that spamassassin 
 otherwise cannot detect.

 I hereby license them under the WTFPL which is GPL and Apache license
 compatible.

I am interpreting that license as meaning: rename things, and they're
essentially public domain.  Rules have been renamed, tweaked, and added to
subversion for testing.  After the next ruleqa run (probably tomorrow),
you can see how they perform on the SpamAssassin corpus at
http://ruleqa.spamassassin.org/?srcpath=neon_overload.cf

The new versions, which are Apache License 2.0, are attached.  Note that
attribution, though not requested, is present.

Thomas Rutter:  If you have any objections to what I did, complain now.
# I hereby license them under the WTFPL which is GPL and Apache license
# compatible. -- Thomas Rutter/neon_overload to SA-users, 2012-02-16 00:43 UTC
# http://old.nabble.com/Some-rules-I-created-for-suspicious-Javascript-practices-tt3130.html
# 
# WTFPL 2.0 basically says rename things and they're essentially public domain
# Rules have been renamed and slightly tweaked

rawbody  JS_EXTRA_UNESCAPE  /[+=]\s{0,9}unescape\s{0,9}\(\s{0,9}[']%(?i:6[1-9A-F]|7[0-9A])/
describe JS_EXTRA_UNESCAPE  JavaScript: Unnecessary URI escaping
#score LOCAL_UNNECESSARY_UNESCAPE 1.7

rawbody  JS_EXTRA_CONCAT  /[+=]\s{0,9}['][a-z0-9]{1,64}[']\+['][a-z0-9]{1,64}[']/i
describe JS_EXTRA_CONCAT  JavaScript: Unnecessary string concatenation
#score LOCAL_UNNECESSARY_STRCONCAT 0.5

rawbody  JS_FROMCHARCODE  /=\s{0,9}String\.fromCharCode\b/
describe JS_FROMCHARCODE  JavaScript: function String.fromCharCode
#score LOCAL_HIDE_FROMCHARCODE 0.7

#rawbody  LOCAL_HIDE_URL  /h\+tt\+p:\+\//
rawbody  JS_CONCATINATED_HTTP  m@(?!http:/)h['+]{0,3}(?:t['+]{0,3}){2}p['+]{0,3}:['+]{0,3}/@
describe JS_CONCATINATED_HTTP  Contains concatenated URI like htt+p://...
#score LOCAL_HIDE_URL 0.7
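
A minimal sketch of how the JS_CONCATINATED_HTTP pattern above behaves, assuming Python's re module handles this expression the same way SpamAssassin's Perl regex engine does. The helper name hits_concat_http and the sample strings are illustrative only. The pattern allows up to three quote or plus characters between the letters, and the negative lookahead skips a plain, unobfuscated http:/ prefix.

```python
import re

# Pattern body copied from the JS_CONCATINATED_HTTP rule (without the m@...@ delimiters).
CONCAT_HTTP_RE = re.compile(
    r"(?!http:/)h['+]{0,3}(?:t['+]{0,3}){2}p['+]{0,3}:['+]{0,3}/"
)

def hits_concat_http(raw: str) -> bool:
    """Return True if the text contains a split-up 'http:/' sequence."""
    return CONCAT_HTTP_RE.search(raw) is not None

# Quoted pieces joined with plus signs.
print(hits_concat_http("var u = 'h'+'tt'+'p:'+'//example.com';"))
# The literal string targeted by the older LOCAL_HIDE_URL rule.
print(hits_concat_http("h+tt+p:+/"))
# A plain URL is excluded by the (?!http:/) lookahead.
print(hits_concat_http("http://example.com/"))
```

The first two calls should report a hit and the third should not.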



Re: Some rules I created for suspicious Javascript practices

2012-02-16 Thread neon_overload


Adam Katz-10 wrote:
 
 Thomas Rutter:  If you have any objections to what I did, complain now.
 

That's fine.

I have been hard at work on tweaking these rules and have come up with new
versions which appear more effective.  Have not spent much time on
performance though.

New version follows:


# highly suspicious practices
rawbody LOCAL_U_UNESCAPE /[+=(]\s*unescape\s*\(\s*[']%(6[1-9A-F]|7[0-9A])/
describe LOCAL_U_UNESCAPE Suspicious use of JS unescape function
score LOCAL_U_UNESCAPE 1.8

rawbody LOCAL_U_STRCONCAT /[+=(]\s*(['])[a-zA-Z0-9\.]{1,16}\1 ?\+ ?\1[a-zA-Z0-9\.]{0,16}\1/
describe LOCAL_U_STRCONCAT Suspicious unnecessary string concatenation
score LOCAL_U_STRCONCAT 0.7

rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
describe LOCAL_HIDE_FROMCHARCODE Obfuscated use of JS fromCharCode function
score LOCAL_HIDE_FROMCHARCODE 0.6

rawbody LOCAL_HIDE_URL /[+=(]\s*(['])(?!http)h(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?p(\1 ?\+ ?\1)?(?!:\/\/):(\1 ?\+ ?\1)?\/(\1 ?\+ ?\1)?\//
describe LOCAL_HIDE_URL Obfuscated HTTP link eg. 'ht'+'tp:'+'//'
score LOCAL_HIDE_URL 0.9

rawbody LOCAL_JS_REDIR1 /[Ss][Cc][Rr][Ii][Pp][Tt]\s*(type=[^]+\s*)?\s*(window|self|(var\s+)?([a-z]+)\s*=\s*window\s*;?\s*\4)?\.?(location|\[[']location[']\])(\.href)?\s*[=(]/
describe LOCAL_JS_REDIR1 Code for a JS redirect
score LOCAL_JS_REDIR1 0.5

body LOCAL_FILLER_TEXT /([A-Z][a-z]*(\s[a-z]+){4,6}\.?\s?){18}/
describe LOCAL_FILLER_TEXT Long sequence of 5-7 word sentences with capital only at start
score LOCAL_FILLER_TEXT 0.4
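
A minimal sketch of what LOCAL_FILLER_TEXT requires, assuming Python's re module treats the pattern like SpamAssassin's Perl regexes do. The helper name hits_filler and the sample sentence are illustrative. Each repetition is one capitalised word followed by four to six lowercase words, and the rule needs 18 such sentences in a row.

```python
import re

# Pattern copied from the LOCAL_FILLER_TEXT rule.
FILLER_RE = re.compile(r"([A-Z][a-z]*(\s[a-z]+){4,6}\.?\s?){18}")

def hits_filler(text: str) -> bool:
    """Return True if the text contains 18 consecutive filler-style sentences."""
    return FILLER_RE.search(text) is not None

# One five-word sentence: a capitalised word plus four lowercase words.
sentence = "Lorem ipsum dolor sit amet. "

print(hits_filler(sentence * 18))  # exactly 18 sentences
print(hits_filler(sentence * 17))  # one sentence short
```

Eighteen repetitions should trigger the rule, while seventeen should not, because every repetition has to start with a capital letter.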

-- 
View this message in context: 
http://old.nabble.com/Some-rules-I-created-for-suspicious-Javascript-practices-tp3130p33340124.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Some rules I created for suspicious Javascript practices

2012-02-15 Thread neon_overload

Hello,

I have created some rules which I have found to be very effective so far at
identifying a certain type of spam that SpamAssassin otherwise cannot
detect.

Here are the rules:

# highly suspicious practices
rawbody LOCAL_UNNECESSARY_UNESCAPE /[+=]\s*unescape\s*\(\s*[']%(6[1-9A-F]|7[0-9A])/
score LOCAL_UNNECESSARY_UNESCAPE 1.7
rawbody LOCAL_UNNECESSARY_STRCONCAT /[+=]\s*[a-zA-Z0-9]+\+[a-zA-Z0-9]+/
score LOCAL_UNNECESSARY_STRCONCAT 0.5
rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
score LOCAL_HIDE_FROMCHARCODE 0.7
rawbody LOCAL_HIDE_URL /h\+tt\+p:\+\//
score LOCAL_HIDE_URL 0.7

I have noticed a common trend of spam which has base64-encoded HTML
attachments, highly obfuscated with Javascript generating and concatenating
links.  The above four rules detect patterns which should only be present in
Javascript whose intention is to hide its true function (obfuscate itself).

The first rule checks for use of unescape() on constants where the
characters are just lowercase letters which wouldn't need escaping anyway.
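
A minimal sketch of the first rule's behaviour, assuming Python's re module matches this pattern the same way SpamAssassin's Perl regex engine does. The helper name hits_unescape and the sample strings are illustrative. As the pattern appears here, the hex digits A-F are matched in upper case only.

```python
import re

# Pattern copied from the LOCAL_UNNECESSARY_UNESCAPE rule as it appears in this thread.
UNESCAPE_RE = re.compile(r"[+=]\s*unescape\s*\(\s*[']%(6[1-9A-F]|7[0-9A])")

def hits_unescape(raw: str) -> bool:
    """Return True if the raw text would trigger the rule."""
    return UNESCAPE_RE.search(raw) is not None

samples = {
    # %68 is the letter 'h', which never needs escaping.
    "escaped_letters": "var u = unescape('%68%74%74%70');",
    # %20 is a space, a legitimate thing to escape.
    "escaped_space": "var s = unescape('%20');",
    # Lower-case hex digits fall outside the [1-9A-F] class.
    "lowercase_hex": "var u = unescape('%6a%6b');",
}

for name, text in samples.items():
    print(name, hits_unescape(text))
```

Only the first sample should hit. The third is the case that a case-insensitive variant of the hex-digit group would additionally catch.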

The second checks for unnecessary string concatenation with constant strings
consisting entirely of letters.  It would match =asdf+jkl or
+asdf+jkl.
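
A minimal sketch of what the second pattern accepts, assuming Python's re module behaves like SpamAssassin's Perl regexes for this simple expression. The helper name hits_strconcat and the sample strings are illustrative. Note that, as the pattern appears here, it also matches a+b between two plain variables when no spaces surround the plus sign.

```python
import re

# Pattern copied from the LOCAL_UNNECESSARY_STRCONCAT rule as it appears in this thread.
STRCONCAT_RE = re.compile(r"[+=]\s*[a-zA-Z0-9]+\+[a-zA-Z0-9]+")

def hits_strconcat(raw: str) -> bool:
    """Return True if the raw text would trigger the rule."""
    return STRCONCAT_RE.search(raw) is not None

print(hits_strconcat("=asdf+jkl"))       # first example from the post
print(hits_strconcat("+asdf+jkl"))       # second example from the post
print(hits_strconcat("total = a+b;"))    # ordinary variable addition, no spaces
print(hits_strconcat("total = a + b;"))  # same addition with spaces around '+'
```

The first three calls should hit and the last should not, which suggests that requiring quotes around the operands would make the rule more specific.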

The third test checks for substituting another name for the function
String.fromCharCode, which would be common when trying to obfuscate strings
in Javascript.
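
A minimal sketch of the third rule, assuming Python's re module treats the pattern like SpamAssassin's Perl regexes do. The helper name hits_fromcharcode and the JavaScript snippets are illustrative. Because the pattern only requires an equals sign before String.fromCharCode, it fires on the aliasing case and also on a direct call whose result is assigned.

```python
import re

# Pattern copied from the LOCAL_HIDE_FROMCHARCODE rule.
FROMCHARCODE_RE = re.compile(r"=\s*String\.fromCharCode\b")

def hits_fromcharcode(raw: str) -> bool:
    """Return True if the raw text would trigger the rule."""
    return FROMCHARCODE_RE.search(raw) is not None

# The function is given another name, then called through that name.
aliased = "var f = String.fromCharCode; document.write(f(104, 116));"
# The function is called directly and its result assigned.
direct_assign = "var c = String.fromCharCode(104);"
# The function is called directly with no assignment in front of it.
bare_call = "document.write(String.fromCharCode(104));"

for label, js in [("aliased", aliased), ("direct_assign", direct_assign), ("bare_call", bare_call)]:
    print(label, hits_fromcharcode(js))
```

The first two snippets should hit and the last should not.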

The fourth test was just a specific pattern I saw in a lot of spam, but is
less generic.  It looks for the string h+tt+p:+/.  This would
probably need more alteration to be useful in a more general context.
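
A minimal sketch showing how narrow the fourth pattern is, assuming Python's re module matches it the same way SpamAssassin's Perl regex engine does. The helper name hits_hide_url and the sample strings are illustrative. Splitting the same URL prefix at different points is enough to avoid a hit.

```python
import re

# Pattern copied from the LOCAL_HIDE_URL rule as it appears in this thread.
HIDE_URL_RE = re.compile(r"h\+tt\+p:\+\/")

def hits_hide_url(raw: str) -> bool:
    """Return True if the raw text would trigger the rule."""
    return HIDE_URL_RE.search(raw) is not None

print(hits_hide_url("h+tt+p:+/"))   # the exact split the rule targets
print(hits_hide_url("ht+tp:+//"))   # same prefix, split at other points
print(hits_hide_url("http://"))     # unobfuscated prefix
```

Only the first string should hit.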

These rules are unlikely to hit non-spam, even if it contains Javascript, and
even if that Javascript is minified.  It is plausible, however, that they may
generate hits on discussions that are specifically about how to get through
spam filters, such as a discussion between spammers or between makers of
spam filters, since these patterns will come up in that context.

Use these as you wish!  I hereby license them under the WTFPL which is GPL
and Apache license compatible.

Thomas Rutter