Jack Gates wrote:
> On Friday 26 September 2008 01:20:29 pm Rob Dixon wrote:
>> Jack Gates wrote:
>>> s!(<|</)([^\!][A-Z0-9 ]+>)!$1\L$2\E!g;
>>> or
>>> s/(<|<\/)([^!][A-Z0-9 ]+>)/$1\L$2\E/g;
>>>
>>> The RE above captures and replaces all HTML tags with lowercase
>>> as desired except for any tag that has only one letter such as
>>> <P>, <B> or <I>
>>>
>>> It will get the </B>, </P> and </I>
>>>
>>> It properly ignores the <!DOCTYPE> tag
>>>
>>> What is the correct way to write the above RE?
>> HTML tag names can't contain spaces, so you want
>>
>> s|(</?)([A-Z][A-Z0-9]*)|$1\L$2|g;
>
> Thanks for the effort. Your RE does not work as well as what I have.
>
> HTML tags can contain spaces.
> You forgot about
> <p id="something" class="something">

No, I didn't. The tag name is 'p' and it has no spaces in it. The tag has two
attributes named 'id' and 'class', and neither of those contains a space either.

If you prefer your version, which fails on tags with single-character names,
just because it happens to also modify the first attribute name, then go ahead
and use it. I'm pretty sure there is no simple way to change the case of a
tag's name and all its attribute names while leaving the attribute values
intact.

Rob
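
To make the difference between the two substitutions concrete, here is a small sketch that runs both on the same input. The sub names `lc_tags_original` and `lc_tag_names` and the sample string are made up for illustration; only the two `s///` expressions come from the thread. In the original pattern, `[^!]` consumes one character and `[A-Z0-9 ]+` needs at least one more before the `>`, which is why a one-letter opening tag such as `<P>` cannot match while `</P>` can (the `/` is taken by `[^!]`).

```perl
use strict;
use warnings;

# The original pattern from the thread: needs at least two characters
# between '<' and '>', so <P>, <B> and <I> are skipped.
sub lc_tags_original {
    my ($html) = @_;
    $html =~ s/(<|<\/)([^!][A-Z0-9 ]+>)/$1\L$2\E/g;
    return $html;
}

# The suggested pattern: lowercases only the tag name that directly
# follows '<' or '</'; attributes and their values are left alone.
sub lc_tag_names {
    my ($html) = @_;
    $html =~ s|(</?)([A-Z][A-Z0-9]*)|$1\L$2|g;
    return $html;
}

# Hypothetical sample input, chosen to cover the cases discussed above.
my $sample = '<!DOCTYPE HTML><P ID="Main"><B>Hi</B></P><H1>T</H1>';

print lc_tags_original($sample), "\n";
# <!DOCTYPE HTML><P ID="Main"><B>Hi</b></p><h1>T</h1>

print lc_tag_names($sample), "\n";
# <!DOCTYPE HTML><p ID="Main"><b>Hi</b></p><h1>T</h1>
```

Neither version touches `<!DOCTYPE>`, and neither is a real HTML parser: a literal `<` followed by an uppercase letter in ordinary text would also be rewritten by the second pattern.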
