On Mon, 17 Jan 2005, Vincent wrote:

> I am new to perl, I receive some spam email with subject like "st0ck, 
> 0pportunities, gr0wth...", how can I match those words with number "0" 
> in

You don't.

I spent about a year doing pretty much what you're asking for here, 
though with Procmail rules instead of Perl. Same difference, in this 
case. Getting these patterns right is *really* hard to do -- as a simple 
example, what happens when someone sends you a legit mail with a zero in 
the subject line? 

  To: Vincent <[EMAIL PROTECTED]>
  From: Vincent's significant other <[EMAIL PROTECTED]>
  Subject: meet for drinks at 10 First St at 8:00 tonight?

A mail like that would trip a naive filtering rule. 

  To: Vincent <[EMAIL PROTECTED]>
  From: Vincent's significant other <[EMAIL PROTECTED]>
  Subject: meet for drinks at Pier01 at 8:00 tonight?

And if a place called "Pier01" really existed, mail that mentions it in 
the subject line would be caught even by a more careful filter.
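
To make the two examples above concrete, here is a rough sketch in Perl. 
Both patterns are made up for illustration (they aren't taken from any 
real filter): $naive flags any zero at all, while $better only flags a 
zero that touches a letter.

```perl
use strict;
use warnings;

# Hypothetical rules, for illustration only.
my $naive  = qr/0/;                # any zero anywhere in the subject
my $better = qr/[a-z]0|0[a-z]/i;   # a zero directly next to a letter

my @subjects = (
    'st0ck 0pportunities, gr0wth',
    'meet for drinks at 10 First St at 8:00 tonight?',
    'meet for drinks at Pier01 at 8:00 tonight?',
);

# Return 1 if the pattern matches the subject, 0 otherwise.
sub flags {
    my ($re, $subject) = @_;
    return $subject =~ $re ? 1 : 0;
}

for my $s (@subjects) {
    printf "naive=%d better=%d  %s\n",
        flags($naive, $s), flags($better, $s), $s;
}
```

The naive pattern flags all three subjects. The more careful one lets 
the "10 First St" mail through but still flags "Pier01", which is 
exactly the problem: no single pattern gets this right.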

This is the wrong way to attack spam.

You're much better off setting up something like SpamAssassin, which 
among other things builds up a collection of little rules like this 
which then collectively determine if a message probably is or is not 
spam. Additionally, it can look at messages you explicitly classify as 
spam or non-spam, and build up a statistical profile of what does or 
does not look like spam to *you*, using Bayesian statistics. This 
approach is far more effective than just about anything else, and is the 
general strategy that most other spam filters are using today.
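
The "lots of little rules" idea can be sketched in a few lines of Perl. 
This is only a toy to show the principle, not how SpamAssassin is 
actually implemented: the rule names, patterns, scores, and threshold 
below are all invented, and the Bayesian part is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical rules and weights; not SpamAssassin's real rule set.
my @rules = (
    { name => 'ZERO_IN_WORD',  score => 1.5,
      re   => qr/[a-z]0|0[a-z]/i },
    { name => 'MONEY_WORDS',   score => 2.0,
      re   => qr/\b(?:st[o0]ck|gr[o0]wth|[o0]pportunit)/i },
    { name => 'ALL_CAPS_WORD', score => 1.0,
      re   => qr/\b[A-Z]{5,}\b/ },
);

# Hypothetical cutoff: a total at or above this counts as spam.
my $threshold = 3.0;

# Add up the scores of every rule that matches the subject.
sub spam_score {
    my ($subject) = @_;
    my $total = 0;
    for my $rule (@rules) {
        $total += $rule->{score} if $subject =~ $rule->{re};
    }
    return $total;
}

for my $s ('st0ck 0pportunities, gr0wth',
           'meet for drinks at Pier01 at 8:00 tonight?') {
    my $score = spam_score($s);
    printf "%.1f %s  %s\n",
        $score, ($score >= $threshold ? 'SPAM' : 'ok'), $s;
}
```

With these made-up numbers the "Pier01" mail still trips the zero rule, 
but that one hit isn't enough to push it over the threshold, while the 
stock-tip subject trips two rules and gets flagged. No single rule has 
to be perfect.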

If you need help setting up SpamAssassin, go to their website at 
<http://spamassassin.apache.org/>. They have documentation and mailing 
lists that can help you get up and running.

SA is written in Perl, so once you're running with it, you can adjust it 
as you like. If the "digits in words" rule is important to you and there 
isn't one for it already (there probably is, but no problem if not), you 
can add one and set a score for it as needed. You can ask for help with 
that either here or on the SA lists.



-- 
Chris Devers
