At 10:29 AM 2/20/2004, Steven Manross wrote:
Is there a way to do a negated regex in an SA rule?

I don't know about negated regexps, but you ought to be able to tell it not to match certain characters:


/<A HREF[^<]*><\/A>/

ought to be equivalent to the previous expression, except for the string after HREF. "[^<]" will match any character except a <, so the rule won't fire when actual tags are nested inside a link.

I was originally going to suggest "[^<>]" but realized it would miss something like this:
<A HREF="unclickable">></A>


It can still be defeated trivially by this, though:
        <A HREF="unclickable"><b></b></A>
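
For anyone who wants to check these cases outside of SA, here is a minimal sketch using Python's re module rather than SpamAssassin itself. The constant names and the hits() helper are made up for illustration. The patterns are the ones discussed above, and they use only syntax that Perl and Python share, so the results should carry over, though that is an assumption worth verifying against a real rule.

```python
import re

# Patterns from the discussion; the names are hypothetical, chosen for this sketch.
LOOSE = re.compile(r'<A HREF[^<]*><\/A>')     # [^<]  : any character except "<"
STRICT = re.compile(r'<A HREF[^<>]*><\/A>')   # [^<>] : any character except "<" or ">"

def hits(pattern, html):
    """Return True if the pattern matches anywhere in the string."""
    return pattern.search(html) is not None

EMPTY = '<A HREF="unclickable"></A>'
STRAY_GT = '<A HREF="unclickable">></A>'
NESTED = '<A HREF="unclickable"><b></b></A>'
REAL = '<A HREF="http://www.example.com/">click here</A>'

# Columns: [^<] result, [^<>] result, sample text
for sample in (EMPTY, STRAY_GT, NESTED, REAL):
    print(hits(LOOSE, sample), hits(STRICT, sample), sample)
```

Both patterns hit the plain empty link, only the "[^<]" version hits the link with the stray >, and neither one catches the nested empty tags or the real link.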

What's really needed is something that will check for *any* set of empty tags (valid or otherwise), like this:

/<A HREF[^<]*>(\s*<[^<]+>)*\s*<\/A>/

This should catch any supposed "link" which doesn't contain anything clickable, even if tags are nested wrong. Only one problem: We're back to the <a><img/></a> case! (Although in this case, it only applies if the image is the only content.)
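
A quick way to see both the catch and the false positive is to run the pattern over a few samples. This is only a sketch in Python's re module, not an SA rule; the name EMPTY_LINK, the hits() helper, and the sample strings are invented for illustration.

```python
import re

# Pattern from the message above; EMPTY_LINK is a hypothetical name for this sketch.
EMPTY_LINK = re.compile(r'<A HREF[^<]*>(\s*<[^<]+>)*\s*<\/A>')

def hits(html):
    """Return True if the pattern matches anywhere in the string."""
    return EMPTY_LINK.search(html) is not None

samples = {
    'nested empty tags': '<A HREF="unclickable"><b></b></A>',
    'badly nested tags': '<A HREF="unclickable"><b><i></b></i></A>',
    'image only': '<A HREF="http://www.example.com/"><IMG SRC="logo.gif"/></A>',
    'image plus text': '<A HREF="http://www.example.com/">Home <IMG SRC="logo.gif"/></A>',
    'plain text link': '<A HREF="http://www.example.com/">click here</A>',
}

for label, html in samples.items():
    print(label, hits(html))
```

The first three samples match and the last two do not, so the image-only link is the one legitimate case that still gets flagged.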

Any suggestions on refining this further?


Kelson Vibber
SpeedGate Communications <www.speed.net>




