Hi,

On Wed, 17 Mar 2004 11:19:44 +0000 Mat Harris <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 17, 2004 at 04:04:58 -0600, David B Funk wrote:
> > Would somebody please mass-check the following rule set
> > and let me know if there's any collateral damage?
> > I whipped them up to deal with a new flavor of spam that I'm
> > seeing more of these days.
> > 
> > 
> > rawbody L_FAKE_HREF     /\w\whref=http:/i
> > describe L_FAKE_HREF    Faked href to hide spammer URLs
> > score L_FAKE_HREF       1.0
> > 
> 
> I am probably just seeing things and being stupid, but what is
> invalid about the above href?

\w matches [a-zA-Z0-9_], so /\w\whref=http:/i matches 'href=http:' only
when it is immediately preceded by two word characters (letters, digits
or underscore). Meaning 'zzhref=http:' matches, but '<a href=http:'
doesn't, because the space before 'href' is not a word character.

See `perldoc perlre` for details.
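
If you want to poke at this outside of SpamAssassin, here is a minimal
sketch. It assumes Python's re module as a stand-in for the Perl regex
engine, with re.ASCII so that \w is limited to [a-zA-Z0-9_]. The helper
name is_fake_href and the sample strings are made up for illustration;
they are not part of the rule set.

```python
import re

# Same pattern as the rule, compiled case-insensitively (the /i flag).
# re.ASCII keeps \w to [a-zA-Z0-9_], which is the behavior described above.
FAKE_HREF = re.compile(r"\w\whref=http:", re.IGNORECASE | re.ASCII)


def is_fake_href(text):
    """Return True if 'href=http:' is directly preceded by two word characters."""
    return FAKE_HREF.search(text) is not None


if __name__ == "__main__":
    # Hypothetical samples: one obfuscated attribute and one ordinary link.
    for sample in ("<a zzhref=http://example.com/>", "<a href=http://example.com/>"):
        print(repr(sample), is_fake_href(sample))
```

The ordinary link is left alone because the character right before
'href' is a space, which \w does not match.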

Hrm. Does it hurt to change

  /\w\whref=http:/i

to

  /\w\whref="?https?:/i

or even

  /\w\whref="?[a-z]{4,8}:/i

?
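
For anyone who wants to compare the three variants before a real
mass-check, a rough sketch follows. Again this assumes Python's re module
with re.ASCII as an approximation of the Perl behavior; the variant
labels, the which_match helper and the sample strings are invented for
this example and say nothing about what shows up in real mail.

```python
import re

FLAGS = re.IGNORECASE | re.ASCII

# The original rule and the two proposed alternatives, under made-up labels.
VARIANTS = {
    "original": re.compile(r"\w\whref=http:", FLAGS),
    "https": re.compile(r'\w\whref="?https?:', FLAGS),
    "any_scheme": re.compile(r'\w\whref="?[a-z]{4,8}:', FLAGS),
}


def which_match(text):
    """Return the labels of all variants that match somewhere in text."""
    return [name for name, rx in VARIANTS.items() if rx.search(text)]


if __name__ == "__main__":
    # Hypothetical inputs covering quoting and a few URL schemes.
    samples = (
        "zzhref=http://x",
        'zzhref="https://x',
        "zzhref=mailto:a@b",
        "zzhref=ftp://x",
        '<a href="https://x">',
    )
    for sample in samples:
        print(repr(sample), which_match(sample))
```

One thing this makes visible: the {4,8} bound in the last variant means a
three-letter scheme such as 'ftp:' is not caught by any of the three.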

-- Bob
