Hello Loren, Mario,

Wednesday, August 25, 2004, 12:39:23 PM, Loren wrote:

>>We are receiving in our server tons of Spam emails with Subjects like this:
>>M___crie as suas proprias estampas  391
>>As you can see, it has at least two underscore characters ( __ ) and I
>>think this would be the only way to scan them.

LW> You should have posted a complete spam, or at least complete headers.
LW>  There probably are a lot more spam signs available in a typical spam.
Agreed.  (The reason I don't use such a rule myself is that this kind of
spam is always flagged by other attributes.)

LW> The specific rule you  asked for would be written as
LW> header SUB_UNDERSCORES    Subject =~ /__/
LW> score    SUB_UNDERSCORES    0.1
LW> But don't use it, or at least not with any significant score.

Well, actually, a quick scan of my corpus, 24k ham and 46k spam, shows 40
spam hits and no ham hits. IMO that could warrant a SARE score as high as
0.777 (my email client often gives different results than mass-check
does, so don't take this as gospel). Expect to see this in my next SARE
mass-check request, so we can see if it works on other corpora.
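
For anyone who wants to try a similar quick count on their own mail, here is a minimal sketch in Python. It is not mass-check and not what my email client does; it only applies the same /__/ pattern to a list of Subject strings. The sample subjects and the count_hits helper are made up for illustration, and a real check would read the Subject headers from your own ham and spam folders.

```python
import re

# Same pattern as the SUB_UNDERSCORES rule quoted above.
DOUBLE_UNDERSCORE = re.compile(r"__")

def count_hits(subjects, pattern=DOUBLE_UNDERSCORE):
    """Return how many subjects contain a match for pattern."""
    return sum(1 for s in subjects if pattern.search(s))

# Hypothetical sample data, not a real corpus.
spam_subjects = [
    "M___crie as suas proprias estampas  391",
    "Gene**ric viagr__a 80% DISCOUNT",
    "Y0__ng prn0 archive exalting ornament",
]
ham_subjects = [
    "Re: rule for underscores in Subject",
    "Meeting notes for Thursday",
]

spam_hits = count_hits(spam_subjects)
ham_hits = count_hits(ham_subjects)
print(f"spam hits: {spam_hits}/{len(spam_subjects)}, "
      f"ham hits: {ham_hits}/{len(ham_subjects)}")
```

A count like this only says how often the pattern fires; it says nothing about what score the rule deserves next to the other rules that hit the same messages.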

LW> If I assume that the underscores always come in the middle of the first word
LW> of the subject, and the subject always ends with a number, ...
Not here.  Some of the subjects seen here:
Enlarge yo,ur`  p`eni*s  ` to;day        ,__"    smyblbaavs
[EMAIL PROTECTED]@ - We have something better!
Gene**ric viagr__a 80% DISCOUNT
longbeach__Investor_Update,_Don't_Miss_This.,,..ryooau
only exklusive [EMAIL PROTECTED] m00viez pikturez story romance coerces
Y0__ng prn0 archive exalting ornament
etc
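
To make the comparison concrete, here is a small Python sketch that runs the broad /__/ pattern and a narrower one against some of the subjects above. The NARROW regex is my own guess at what "underscores in the middle of the first word, subject ends with a number" might look like; it is not Loren's actual rule, which isn't quoted in full here. The subjects containing redacted addresses are left out.

```python
import re

BROAD = re.compile(r"__")
# Assumed stand-in for the narrower idea: the first word has two or more
# underscores between alphanumerics, and the subject ends with a digit.
NARROW = re.compile(r"^[^\W_]+_{2,}[^\W_]+\s.*\d\s*$")

original = "M___crie as suas proprias estampas  391"
others = [
    'Enlarge yo,ur`  p`eni*s  ` to;day        ,__"    smyblbaavs',
    "Gene**ric viagr__a 80% DISCOUNT",
    "longbeach__Investor_Update,_Don't_Miss_This.,,..ryooau",
    "Y0__ng prn0 archive exalting ornament",
]

broad_hits = [s for s in others if BROAD.search(s)]
narrow_hits = [s for s in others if NARROW.search(s)]

print("narrow matches original:", bool(NARROW.search(original)))
print(f"broad: {len(broad_hits)}/{len(others)}, "
      f"narrow: {len(narrow_hits)}/{len(others)}")
```

Under that assumption the narrow pattern matches the original example but none of these four, while the broad pattern matches all of them.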

Bob Menschel


