Hello Loren, Mario,

Wednesday, August 25, 2004, 12:39:23 PM, Loren wrote:
>>We are receiving in our server tons of Spam emails with Subjects like this:
>>M___crie as suas proprias estampas 391
>>As you can see, it has at least two underscore characters ( __ ) and I
>>think this would be the only way to scan them.

LW> You should have posted a complete spam, or at least complete headers.
LW> There probably are a lot more spam signs available in a typical spam.

Agreed. (The reason I don't use such a rule myself is because these
spam are always flagged by other attributes.)

LW> The specific rule you asked for would be written as

LW> header SUB_UNDERSCORES Subject =~ /__/
LW> score SUB_UNDERSCORES 0.1

LW> But don't use it, or at least not with any significant score.

Well, actually, a quick scan of my corpus, 24k ham and 46k spam, shows
40 spam hits and no ham hits. IMO that could warrant a SARE score as
high as 0.777 (my email client often gives different results than
mass-check does, so don't take this as gospel).

Expect to see this in my next SARE mass-check request, so we can see
if it works on other corpora.

LW> If I assume that the underscores always come in the middle of the first word
LW> of the subject, and the subject always ends with a number, ...

Not here.

Enlarge yo,ur` p`eni*s ` to;day ,__" smyblbaavs
[EMAIL PROTECTED]@ - We have something better!
Gene**ric viagr__a 80% DISCOUNT
longbeach__Investor_Update,_Don't_Miss_This.,,..ryooau
only exklusive [EMAIL PROTECTED] m00viez pikturez story romance coerces Y0__ng prn0 archive exalting ornament etc

Bob Menschel
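
For anyone who wants to sanity-check the /__/ pattern outside SpamAssassin before adding the rule, a few lines of Python will do. This is only a rough sketch: the name count_hits and the ham subjects are made up for illustration, and it looks at bare Subject strings rather than going through mass-check, so the counts may well differ from what a real corpus run reports.

```python
import re

# Same pattern as the SUB_UNDERSCORES rule: two consecutive underscores
# anywhere in the Subject.
DOUBLE_UNDERSCORE = re.compile(r"__")


def count_hits(subjects):
    """Return how many subjects contain two consecutive underscores."""
    return sum(1 for subject in subjects if DOUBLE_UNDERSCORE.search(subject))


# Spam subjects taken from the samples quoted in this thread.
spam_subjects = [
    "M___crie as suas proprias estampas 391",
    "Gene**ric viagr__a 80% DISCOUNT",
    "longbeach__Investor_Update,_Don't_Miss_This.,,..ryooau",
]

# Hypothetical ham subjects, invented for this sketch.
ham_subjects = [
    "Re: mass-check results",
    "Investor_Update for August",  # single underscores only, should not hit
]

if __name__ == "__main__":
    print("spam hits:", count_hits(spam_subjects))
    print("ham hits:", count_hits(ham_subjects))
```

Note that the third sample has underscores outside the first word and does not end in a number, so a rule anchored on those assumptions would miss it while the plain /__/ test still hits.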
