YARQ (Yet Another Regexp Question)

Etienne Marcotte Tue, 13 Nov 2001 11:45:12 -0800

I saw a good regexp for removing HTML tags somewhere on the web, but I
can't find it again, and it needed some minor modifications anyway.


Let's say the $line is 'this is a <font size="2">large word</font>in
size 2';

I played around a little, but my regexp always removed everything between
the first < and the last > (and I know the tutorial on the web explained
how to avoid this).
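
To make the problem concrete, here is a minimal Perl sketch of the
behaviour described above. It assumes the pattern in question was a greedy
one along the lines of <.*> (the post does not quote the exact regexp), and
the variable names $greedy and $per_tag are made up for illustration.

```perl
use strict;
use warnings;

my $line = 'this is a <font size="2">large word</font>in size 2';

# Greedy: .* runs on to the LAST '>' in the string, so everything
# between the first '<' and the last '>' disappears in one match.
(my $greedy = $line) =~ s/<.*>//g;

# Negated character class: [^>]* cannot cross a '>', so each tag
# is matched and removed on its own.
(my $per_tag = $line) =~ s/<[^>]*>//g;

print "$greedy\n";    # this is a in size 2
print "$per_tag\n";   # this is a large wordin size 2
```

A non-greedy quantifier (<.*?>) would give the same result on this line.
Neither form copes with a '>' inside an attribute value, so this is only
a sketch for simple input, not a general HTML stripper.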

I'd like to build something like this (I know this one isn't right, but
please help me place the parentheses, [] and {} :)

   .*     < (.*) \s    .*    >     .*     </  \1  >    .*
this is a < font    size="2" > large word </ font > in size 2

The second line above shows what each part of the pattern should match.
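
One possible way to write the layout above as a working Perl pattern is
sketched below. This is an assumption about what is wanted, not a known-good
answer: the capture groups, the use of \w+ for the tag name, and the
variable names are all chosen for illustration. Note that the backreference
is \2 here rather than \1, because the text before the tag is captured
first.

```perl
use strict;
use warnings;

my $line = 'this is a <font size="2">large word</font>in size 2';

my ($before, $tag, $attrs, $content, $after);

# (.*?)      text before the tag (non-greedy, stops at the first tag)
# <(\w+)     tag name, captured as group 2
# ([^>]*)>   attributes, up to the '>' that closes the opening tag
# (.*?)      tag content, up to the nearest matching close tag
# </\2>      close tag with the same name as group 2
# (.*)       whatever is left
if ($line =~ m{^(.*?)<(\w+)([^>]*)>(.*?)</\2>(.*)$}s) {
    ($before, $tag, $attrs, $content, $after) = ($1, $2, $3, $4, $5);
    print "before:  [$before]\n";
    print "tag:     [$tag]\n";
    print "attrs:   [$attrs]\n";
    print "content: [$content]\n";
    print "after:   [$after]\n";
}
```

This handles one pair of tags per match and does not deal with nested tags
of the same name.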

Thanks for any help.

Also, is there a way to specify a list of allowed tags, or a list of
disallowed tags? For example, if the (.*) is foo or bar, delete the tag;
if it is anything else, keep it.
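
As a rough sketch of the allowed/disallowed idea, the tag names can be
joined into an alternation and interpolated into the pattern. The sample
line, the lists @unwanted and @allowed, and the other variable names are
hypothetical, chosen only to show both directions.

```perl
use strict;
use warnings;

my $line = 'a <b>bold</b> and <font size="2">large</font> <i>word</i>';

# Disallowed list: delete only these tags, keep every other tag.
my @unwanted = qw(font i);
my $drop     = join '|', map { quotemeta($_) } @unwanted;

# Matches an opening or closing tag whose name is in the list.
# \b stops "i" from also matching a longer name such as "img".
(my $without_unwanted = $line) =~ s{</?(?:$drop)\b[^>]*>}{}gi;

# Allowed list: keep only these tags, delete every other tag.
my @allowed = qw(b);
my $keep    = join '|', map { quotemeta($_) } @allowed;

# The negative lookahead rejects any tag whose name is in the list.
(my $only_allowed = $line) =~ s{<(?!/?(?:$keep)\b)[^>]*>}{}gi;

print "$without_unwanted\n";   # a <b>bold</b> and large word
print "$only_allowed\n";       # a <b>bold</b> and large word
```

Both substitutions remove the tags themselves and leave the text between
them in place.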

I don't think this is very clear, but I'll gladly give more details on
what I'm trying to accomplish if you need them.

Etienne
