> Would a rule to calculate some kind of "special chars" vs
> "total chars" ratio be useful?
> Does anybody have that kind of rule already?
Doing that as a ratio would require an eval, I suspect. However, detecting obfuscated things is pretty easy. You need some new rules! :-)

Hie thee off to exit0 or rulesemporium or the like and get Matt's antidrug set, just for a start. Here are some results from the first few of your spam:

Content analysis details:   (27.1 points, 4.6 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.4 SARE_ALC               Some header matches /improve your/i
 0.6 RATWR10a_MESSID        Message-ID has ratware pattern (HEXHEX.HEXHEX@)
 1.8 LOCAL_OBFU_CELEXA      BODY: Obfuscated 'CELEXA' in body
 1.8 LOCAL_OBFU_XANAX       BODY: Obfuscated 'XANAX' in body
 1.8 LOCAL_OBFU_LEVITRA     BODY: Obfuscated 'LEVITRA' in body
 1.8 LOCAL_OBFU_PAXIL       BODY: Obfuscated 'PAXIL' in body
 1.8 LOCAL_OBFU_VIAGRA      BODY: Obfuscated 'VIAGRA' in body
 1.0 SARE_OBFUGIRLS         BODY: masked spam word(s)
 1.8 LOCAL_OBFU_MERIDIA     BODY: Obfuscated 'MERIDIA' in body
 0.1 TW_OC                  BODY: Odd Letter Triples with OC
 1.8 LOCAL_OBFU_CIALIS      BODY: Obfuscated 'CIALIS' in body
 1.8 LOCAL_OBFU_XENICAL     BODY: Obfuscated 'XENICAL' in body
 1.5 DRUGS_ERECTILE_OBFU    Obfuscated reference to an erectile drug
 1.0 DRUGS_ANXIETY_OBFU     Obfuscated reference to an anxiety control drug
 1.0 DRUGS_ERECTILE         Refers to an erectile drug
 0.0 DRUGS_ANXIETY          Refers to an anxiety control drug
 0.0 DRUGS_DEPRESSION       Refers to an antidepressant
 0.0 DRUGS_DIET             Refers to a diet drug
 1.0 DRUGS_DEPR_EREC        Refers to both an erectile and an antidepressant
 1.0 DRUGS_ANXIETY_EREC     Refers to both an erectile and an anxiety drug
 1.0 DRUGS_DIET_EREC        Refers to both an erectile and a diet drug
 1.0 DRUGS_MANYKINDS        Refers to at least four kinds of drugs

Content analysis details:   (31.0 points, 4.6 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.8 LOCAL_OBFU_XANAX       BODY: Obfuscated 'XANAX' in body
 1.8 LOCAL_OBFU_ZOLOFT      BODY: Obfuscated 'ZOLOFT' in body
 1.8 LOCAL_OBFU_LEVITRA     BODY: Obfuscated 'LEVITRA' in body
 1.8 LOCAL_OBFU_CELEBREX    BODY: Obfuscated 'CELEBREX' in body
 1.8 LOCAL_OBFU_PAXIL       BODY: Obfuscated 'PAXIL' in body
 2.8 LOCAL_OBFU_VICODIN     BODY: Obfuscated 'VICODIN' in body
 1.8 LOCAL_OBFU_VIAGRA      BODY: Obfuscated 'VIAGRA' in body
 1.8 LOCAL_OBFU_MERIDIA     BODY: Obfuscated 'MERIDIA' in body
 1.8 LOCAL_OBFU_VIOXX       BODY: Obfuscated 'VIOXX' in body
 1.8 LOCAL_OBFU_XENICAL     BODY: Obfuscated 'XENICAL' in body
-0.0 BAYES_44               BODY: Bayesian spam probability is 44 to 50%
                            [score: 0.4966]
 1.5 DRUGS_ERECTILE_OBFU    Obfuscated reference to an erectile drug
 1.0 DRUGS_ANXIETY_OBFU     Obfuscated reference to an anxiety control drug
 1.0 DRUGS_ERECTILE         Refers to an erectile drug
 0.0 DRUGS_ANXIETY          Refers to an anxiety control drug
 1.0 DRUGS_PAIN_OBFU        Obfuscated reference to a pain relief drug
 0.0 DRUGS_DEPRESSION       Refers to an antidepressant
 0.0 DRUGS_PAIN             Refers to a pain relief drug
 0.0 DRUGS_DIET             Refers to a diet drug
 1.0 DRUGS_DEPR_EREC        Refers to both an erectile and an antidepressant
 1.0 DRUGS_ANXIETY_EREC     Refers to both an erectile and an anxiety drug
 1.0 DRUGS_PAIN_EREC        Refers to both an erectile and a painkiller
 0.5 DRUGS_DIET_PAIN        Refers to both a diet drug and a pain drug
 1.0 DRUGS_DIET_EREC        Refers to both an erectile and a diet drug
 1.0 DRUGS_MANYKINDS        Refers to at least four kinds of drugs

Content analysis details:   (9.3 points, 4.6 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.6 RATWR10a_MESSID        Message-ID has ratware pattern (HEXHEX.HEXHEX@)
 3.3 SARE_SUB_ONLINE_OB     subject has obfuscated spammer topic
 1.7 BAYES_80               BODY: Bayesian spam probability is 80 to 90%
                            [score: 0.8257]
 1.7 SARE_SPEC_ANUMA        URI: Domain with ALPHAs NUMBERs APLHAs
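
For anyone who wants to play with the two ideas outside SpamAssassin, here is a rough Python sketch. It is not how the LOCAL_OBFU_* rules or the antidrug set are written, and it is not a SpamAssassin eval; the function names (special_char_ratio, obfuscation_pattern, is_obfuscated) and the "up to two filler characters" limit are assumptions made up for illustration. The first function is the ratio from the question; the other two look for a known word with punctuation or spaces wedged between its letters.

```python
import re


def special_char_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are neither letters nor digits."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    special = sum(1 for ch in visible if not ch.isalnum())
    return special / len(visible)


def obfuscation_pattern(word: str) -> re.Pattern:
    """Regex for `word` with 0-2 filler characters allowed between letters.

    Filler is anything that is not a letter or digit (punctuation,
    underscore, whitespace). The limit of two is an arbitrary assumption.
    """
    filler = r"[\W_]{0,2}"
    return re.compile(filler.join(re.escape(ch) for ch in word), re.IGNORECASE)


def is_obfuscated(word: str, text: str) -> bool:
    """True if `text` contains `word` with filler inserted, not just the plain word."""
    for match in obfuscation_pattern(word).finditer(text):
        if match.group(0).lower() != word.lower():
            return True
    return False


if __name__ == "__main__":
    sample = "Buy V.i.a.g.r.a and X_a-n*a x today"
    print(round(special_char_ratio(sample), 3))
    print(is_obfuscated("viagra", sample), is_obfuscated("xanax", sample))
```

This only catches inserted filler. Letter substitutions (digits or accented lookalikes standing in for letters) would need per-letter character classes, which is the part the published rule sets have already worked out.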