Re: Some rules I created for suspicious Javascript practices

2012-03-06 Thread Simon Loewenthal
Hi,

Were these rules, or an improved variant, added to the rules?


Regards, Simon.

On 16/02/12 01:43, neon_overload wrote:
> Hello,
>
> I have created some rules which I have found to be very effective so far at
> identifying a certain type of spam that SpamAssassin otherwise cannot
> detect.
>
> Here are the rules:
>
> # highly suspicious practices
> rawbody LOCAL_UNNECESSARY_UNESCAPE /[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])/
> score LOCAL_UNNECESSARY_UNESCAPE 1.7
> rawbody LOCAL_UNNECESSARY_STRCONCAT /[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"/
> score LOCAL_UNNECESSARY_STRCONCAT 0.5
> rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
> score LOCAL_HIDE_FROMCHARCODE 0.7
> rawbody LOCAL_HIDE_URL /"h"\+"tt"\+"p:"\+"\/"/
> score LOCAL_HIDE_URL 0.7
>
> I have noticed a common trend of spam which has base64-encoded HTML
> attachments, highly obfuscated with Javascript generating and concatenating
> links.  The above four rules detect patterns which should only be present in
> Javascript whose intention is to hide its true function (obfuscate itself).
>
> The first rule checks for use of unescape() on constants where the
> characters are just lowercase letters which wouldn't need escaping anyway.
>
> The second checks for unnecessary string concatenation with constant strings
> consisting entirely of letters and digits.  It would match ="asdf"+"jkl" or
> +"asdf"+"jkl".
>
> The third test checks for substituting another name for the function
> String.fromCharCode, which would be common when trying to obfuscate strings
> in Javascript.
>
> The fourth test was just a specific pattern I saw in a lot of spam, but is
> less generic.  It looks for the string "h"+"tt"+"p:"+"/".  This would
> probably need more alteration to be useful in a more general context.
>
> These are unlikely to hit non-spam, even if it contains Javascript, and even
> if it contains minified Javascript.  It is plausible, however, that it may
> generate hits on discussions that are specifically about how to get through
> spam filters, such as a discussion between spammers, or makers of spam
> filters - since these patterns will occur in the context of "how to get
> through spam filters".
>
> Use these as you wish!  I hereby license them under the WTFPL which is GPL
> and Apache license compatible.
>
> Thomas Rutter
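
The four rules quoted above can be tried against a few sample strings outside SpamAssassin. What follows is a rough sketch in Python, assuming that Python's re module handles these particular patterns the same way Perl does. The helper name hits and all of the sample strings are hypothetical, made up for illustration rather than taken from real spam.

```python
import re

# The four original rules, transcribed for Python's re module.
# Assumption: these patterns use only constructs that Perl and Python share.
RULES = {
    "LOCAL_UNNECESSARY_UNESCAPE": r"""[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])""",
    "LOCAL_UNNECESSARY_STRCONCAT": r'[+=]\s*"[a-zA-Z0-9]+"\+"[a-zA-Z0-9]+"',
    "LOCAL_HIDE_FROMCHARCODE": r'=\s*String\.fromCharCode\b',
    "LOCAL_HIDE_URL": r'"h"\+"tt"\+"p:"\+"\/"',
}


def hits(text):
    """Return the sorted names of every rule whose pattern occurs in text."""
    return sorted(name for name, pattern in RULES.items() if re.search(pattern, text))


# Hypothetical samples, one per behaviour described in the message.
samples = [
    'x="asdf"+"jkl"',                          # needless concatenation
    's=unescape("%68%74%74%70")',              # escaped lowercase letters
    'var f=String.fromCharCode;',              # aliasing fromCharCode
    'u="h"+"tt"+"p:"+"/"+"/example.invalid"',  # split-up "http:/"
    'var s = "hello" + name;',                 # ordinary, benign concatenation
]

for sample in samples:
    print(sample, "->", hits(sample))
```

Under those assumptions the "asdf"+"jkl" sample trips only the concatenation rule, while the ordinary "hello" + name line trips none of the four.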


-- 
 PGP is optional: 4BA78604
 simon @ klunky  . org
 simon @ klunky  .   co.uk
I won't accept your confidentiality
agreement, and your Emails are kept.
   ~Ö¿Ö~



Re: Some rules I created for suspicious Javascript practices

2012-03-04 Thread LuKreme
On 16 Feb 2012, at 18:11 , neon_overload wrote:
> I have been hard at work on tweaking these rules and have come up with new
> versions which appear more effective.  Have not spent much time on
> performance though.

Curious how you arrived at the scoring. For example, I would think that
LOCAL_U_UNESCAPE would be scored much higher since, at least as it looks to me,
no one would ever do that legitimately.

-- 
You know, Calculus is sort of like measles. Once you've had it, you
probably won't get it again, and you're glad of it. -- W. Carr



Re: Some rules I created for suspicious Javascript practices

2012-02-16 Thread neon_overload


Adam Katz-10 wrote:
> 
> Thomas Rutter:  If you have any objections to what I did, complain now.
> 

That's fine.

I have been hard at work on tweaking these rules and have come up with new
versions which appear more effective.  Have not spent much time on
performance though.

New version follows:


# highly suspicious practices
rawbody LOCAL_U_UNESCAPE /[+=(]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])/
describe LOCAL_U_UNESCAPE Suspicious use of JS unescape function
score LOCAL_U_UNESCAPE 1.8

rawbody LOCAL_U_STRCONCAT /[+=(]\s*(["'])[a-zA-Z0-9\.]{1,16}\1 ?\+ ?\1[a-zA-Z0-9\.]{0,16}\1/
describe LOCAL_U_STRCONCAT Suspicious unnecessary string concatenation
score LOCAL_U_STRCONCAT 0.7

rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
describe LOCAL_HIDE_FROMCHARCODE Obfuscated use of JS fromCharCode function
score LOCAL_HIDE_FROMCHARCODE 0.6

rawbody LOCAL_HIDE_URL /[+=(]\s*(["'])(?!http)h(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?p(\1 ?\+ ?\1)?(?!:\/\/):(\1 ?\+ ?\1)?\/(\1 ?\+ ?\1)?\//
describe LOCAL_HIDE_URL Obfuscated HTTP link eg. 'ht'+'tp:'+'//'
score LOCAL_HIDE_URL 0.9

rawbody LOCAL_JS_REDIR1 /<[Ss][Cc][Rr][Ii][Pp][Tt]\s*(type="[^"]+"\s*)?>\s*(window|self|(var\s+)?([a-z]+)\s*=\s*window\s*;?\s*\4)?\.?(location|\[['"]location['"]\])(\.href)?\s*[=(]/
describe LOCAL_JS_REDIR1 Code for a JS redirect
score LOCAL_JS_REDIR1 0.5

body LOCAL_FILLER_TEXT /([A-Z][a-z]*(\s[a-z]+){4,6}\.?\s?){18}/
describe LOCAL_FILLER_TEXT Long sequence of 5-7 word sentences with capital only at start
score LOCAL_FILLER_TEXT 0.4
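
A quick way to see what the \1 backreference and the two negative lookaheads do in the revised rules is to run them over a few strings. This is only a sketch, assuming Python's re module treats these constructs the way Perl does. The helper name matches and every sample string are hypothetical.

```python
import re

# Revised rules transcribed for Python's re module.
# Assumption: \1 and the (?!...) lookaheads behave here as they do in Perl.
LOCAL_U_STRCONCAT = re.compile(
    r"""[+=(]\s*(["'])[a-zA-Z0-9\.]{1,16}\1 ?\+ ?\1[a-zA-Z0-9\.]{0,16}\1"""
)
LOCAL_HIDE_URL = re.compile(
    r"""[+=(]\s*(["'])(?!http)h(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?t(\1 ?\+ ?\1)?p"""
    r"""(\1 ?\+ ?\1)?(?!:\/\/):(\1 ?\+ ?\1)?\/(\1 ?\+ ?\1)?\/"""
)
LOCAL_FILLER_TEXT = re.compile(r"([A-Z][a-z]*(\s[a-z]+){4,6}\.?\s?){18}")


def matches(rule, text):
    """True if the compiled rule occurs anywhere in text."""
    return rule.search(text) is not None


# Hypothetical filler: eighteen five-word sentences, capital only at the start.
filler = "The quick brown fox jumps. " * 18

# \1 forces every quote in the concatenation to be the same character.
print(matches(LOCAL_U_STRCONCAT, "x='asdf' + 'jkl'"))
print(matches(LOCAL_U_STRCONCAT, "x='asdf'+\"jkl\""))

# (?!http) keeps an ordinary, unsplit URL from matching.
print(matches(LOCAL_HIDE_URL, "u='ht'+'tp:'+'//example.invalid'"))
print(matches(LOCAL_HIDE_URL, "u='http://example.invalid'"))

print(matches(LOCAL_FILLER_TEXT, filler))
```

The backreference is what lets one rule cover both single-quoted and double-quoted strings without also matching mixed quoting.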




Re: Some rules I created for suspicious Javascript practices

2012-02-16 Thread Adam Katz
On 02/15/2012 04:43 PM, Thomas Rutter wrote (as neon_overload):
> I have created some rules which I have found to be very effective so 
> far at identifying a certain type of spam that SpamAssassin 
> otherwise cannot detect.

> I hereby license them under the WTFPL which is GPL and Apache license
> compatible.

I am interpreting that license as "rename things and they're essentially
public domain."  Rules have been renamed, tweaked, and added to
subversion for testing.  After the next ruleqa run (probably tomorrow),
you can see how they perform on the SpamAssassin corpus at
http://ruleqa.spamassassin.org/?srcpath=neon_overload.cf

The new versions, which are Apache License 2.0, are attached.  Note that
attribution, though not requested, is present.

Thomas Rutter:  If you have any objections to what I did, complain now.
# "I hereby license them under the WTFPL which is GPL and Apache license
# compatible." -- Thomas Rutter/neon_overload to SA-users, 2012-02-16 00:43 UTC
# http://old.nabble.com/Some-rules-I-created-for-suspicious-Javascript-practices-tt3130.html
# 
# WTFPL 2.0 basically says "rename things and they're essentially public domain"
# Rules have been renamed and slightly tweaked

rawbody  JS_EXTRA_UNESCAPE  /[+=]\s{0,9}unescape\s{0,9}\(\s{0,9}["']%(?i:6[1-9A-F]|7[0-9A])/
describe JS_EXTRA_UNESCAPE  JavaScript: Unnecessary URI escaping
#score LOCAL_UNNECESSARY_UNESCAPE 1.7

rawbody  JS_EXTRA_CONCAT  /[+=]\s{0,9}["'][a-z0-9]{1,64}["']\+["'][a-z0-9]{1,64}["']/i
describe JS_EXTRA_CONCAT  JavaScript: Unnecessary string concatenation
#score LOCAL_UNNECESSARY_STRCONCAT 0.5

rawbody  JS_FROMCHARCODE  /=\s{0,9}String\.fromCharCode\b/
describe JS_FROMCHARCODE  JavaScript: function String.fromCharCode
#score LOCAL_HIDE_FROMCHARCODE 0.7

#rawbody  LOCAL_HIDE_URL  /"h"\+"tt"\+"p:"\+"\/"/
rawbody  JS_CONCATINATED_HTTP   m@(?!http:/)h["'+]{0,3}(?:t["'+]{0,3}){2}p['"+]{0,3}:['"+]{0,3}/@
describe JS_CONCATINATED_HTTP   Contains concatenated URI like "htt"+"p://..."
#score LOCAL_HIDE_URL 0.7
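
To see how the tweaked rules differ from the originals, the sketch below compares them on a few strings. It assumes Python's re module (3.6 or later, for the scoped (?i:...) group) treats these patterns as Perl does. The name ORIGINAL_UNESCAPE and all sample strings are hypothetical, and the remark about why the \s{0,9} bounds were added is a guess, since the thread does not say.

```python
import re

# Original rule from the first message, for comparison.
ORIGINAL_UNESCAPE = re.compile(
    r"""[+=]\s*unescape\s*\(\s*["']%(6[1-9A-F]|7[0-9A])"""
)
# Tweaked rules as committed for testing, transcribed for Python's re module.
JS_EXTRA_UNESCAPE = re.compile(
    r"""[+=]\s{0,9}unescape\s{0,9}\(\s{0,9}["']%(?i:6[1-9A-F]|7[0-9A])"""
)
JS_CONCATINATED_HTTP = re.compile(
    r"""(?!http:/)h["'+]{0,3}(?:t["'+]{0,3}){2}p['"+]{0,3}:['"+]{0,3}/"""
)

# (?i:...) makes only the hex digits case-insensitive, so %6f now counts.
lower_hex = 'x=unescape("%6f%6b")'
print(bool(ORIGINAL_UNESCAPE.search(lower_hex)))
print(bool(JS_EXTRA_UNESCAPE.search(lower_hex)))

# \s{0,9} caps the whitespace run (presumably to limit backtracking),
# so ten spaces after the "=" are no longer matched.
padded = 'x=' + ' ' * 10 + 'unescape("%61")'
print(bool(ORIGINAL_UNESCAPE.search(padded)))
print(bool(JS_EXTRA_UNESCAPE.search(padded)))

# The generalized HTTP rule accepts quotes and plus signs between letters,
# but the leading (?!http:/) skips a plain, unsplit URL.
print(bool(JS_CONCATINATED_HTTP.search('"htt"+"p://example.invalid"')))
print(bool(JS_CONCATINATED_HTTP.search('http://example.invalid')))
```

Note that the character class in JS_CONCATINATED_HTTP contains no whitespace, so a split such as "ht" + "tp://" with spaces around the plus sign would not match this version.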


