Larry Gilson <[EMAIL PROTECTED]> wrote:

>   / \w{1,7}<\/?[^<>]{0,150}>\w{1,7}/
> 
> This one seems to be working well so far.  It will catch any
> normal and funky stuff within the tags but makes sure it will
> not run over any subsequent tags.

Since '/' is already matched by the character class '[^<>]', 
there's not much point in having the '\/?' in there (all it 
adds is one extra character of allowed length when the tag 
starts with '/').  Also, since you're not looking for anything 
after the '\w{1,7}' at the end, you might as well change it to 
'\w': the '{1,7}' doesn't affect whether the rule matches.
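
A rough way to check the two simplifications is to run both 
forms of the pattern over a few strings.  This is only a sketch 
in Python, whose 're' module handles this pattern the same way 
Perl does; the sample strings and the names ORIGINAL, 
SIMPLIFIED, SAMPLES, EDGE and agree are made up for 
illustration, and the leading space is assumed to be part of 
the pattern as quoted.

```python
import re

# Pattern as quoted above (leading space assumed to be part of it).
ORIGINAL = re.compile(r" \w{1,7}<\/?[^<>]{0,150}>\w{1,7}")
# Simplified form: '\/?' dropped, trailing '\w{1,7}' reduced to '\w'.
SIMPLIFIED = re.compile(r" \w{1,7}<[^<>]{0,150}>\w")

# Hypothetical sample strings, not taken from real mail.
SAMPLES = [
    "see foo<b>bar",              # opening tag between word characters
    "see foo</b>bar",             # closing tag: '/' is covered by [^<>]
    "see foo <b> bar",            # whitespace around the tag: no match
    "a test<font color=red>ing",  # attributes inside the tag
    "foo<b>bar",                  # no leading space: no match
]

# The one case where the forms differ: a tag body of 151 characters
# starting with '/'.  '\/?' plus {0,150} covers it, {0,150} alone does not.
EDGE = "see x</" + "a" * 150 + ">y"


def agree(text):
    """Return True if both patterns give the same match/no-match answer."""
    return bool(ORIGINAL.search(text)) == bool(SIMPLIFIED.search(text))


if __name__ == "__main__":
    for sample in SAMPLES + [EDGE]:
        print(repr(sample[:40]), agree(sample))
```

The two forms should agree on everything except the 
151-character edge case, which is unlikely to matter for a 
spam rule.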

It seems to me that your rule is going to have a fair number of 
false positives, though.  For example, '<br>' often shows up 
between words with no intervening whitespace, and depending on 
what's used to produce the HTML I wouldn't be that surprised to 
find other tags, like '<p>' or '<li>', with words on both 
sides.  Are you not seeing FPs?
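
To make the false-positive concern concrete, here is a small 
sketch in Python that runs the rule against a few fragments of 
ordinary-looking HTML.  The fragments and the names RULE, LEGIT 
and false_positives are invented for illustration, and the 
leading space is again assumed to be part of the pattern.

```python
import re

# The rule as quoted above (leading space assumed to be part of it).
RULE = re.compile(r" \w{1,7}<\/?[^<>]{0,150}>\w{1,7}")

# Hypothetical fragments of legitimate HTML with tags jammed between words.
LEGIT = [
    "first line<br>second line",
    "one item<li>another item",
    "some text<p>More text",
]


def false_positives(samples):
    """Return the samples that the rule would flag."""
    return [s for s in samples if RULE.search(s)]


if __name__ == "__main__":
    for hit in false_positives(LEGIT):
        print("matched:", hit)
```

Each of these fragments has a short word directly before the 
tag and a word character directly after it, which is all the 
rule asks for.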

-- 
Keith C. Ivey <[EMAIL PROTECTED]>
Washington, DC


