Re: Help understanding why the RE does not totally work

Rob Dixon Fri, 26 Sep 2008 12:01:11 -0700

Jack Gates wrote:
> On Friday 26 September 2008 01:23:23 pm John W. Krahn wrote:
>> Jack Gates wrote:
>>> s!(<|</)([^\!][A-Z0-9 ]+>)!$1\L$2\E!g;
>>> or
>>> s/(<|<\/)([^!][A-Z0-9 ]+>)/$1\L$2\E/g;
>>>
>>> The RE above captures and replaces all HTML tags with lowercase
>>> as desired except for any tag that has only one letter such as
>>> <P>, <B> or <I>
>>>
>>> It will get the </B>, </P> and </I>
>>>
>>> It properly ignores the <!DOCTYPE> tag
>>>
>>> What is the correct way to write the above RE?
>> Perhaps this is what you want?
>>
>> s{ ( < (?!!) /? [[:upper:]]{2,} > | < [[:upper:]]{2,} \s* /> ) }
>> {\L$1}xg;
> 
> Yours worked, with three exceptions: it missed all the single-letter
> tags (open and close), the H1-H6 tags, and the tags that had
> an element or attribute.
> 
> A little tweaking and it works. It gets only and all of what I want 
> it to get.
> 
> s{ ( < (?!!) /? [[:upper:][:digit:] ]{1,} >
>    | < [[:upper:][:digit:] ]{1,} \s* /> ) }
>  {\L$1}xg;


That won't modify tags that have attributes either. Are you sure you know what
you want?
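
To show what I mean, here is a minimal sketch. The first sub is the tweaked
substitution quoted above. The second is only my guess at what might be
wanted: it lowercases the tag name and leaves attributes untouched. The sub
names and the sample HTML are made up for illustration, and a regex is not a
real HTML parser.

```perl
use strict;
use warnings;

# The tweaked substitution from the quoted message. Under /x (not /xx) the
# space inside the bracketed class is still a literal space, so the class
# allows uppercase letters, digits and spaces, but not '=' or '"'.
sub lc_tags_tweaked {
    my ($html) = @_;
    $html =~ s{ ( < (?!!) /? [[:upper:][:digit:] ]{1,} >
                | < [[:upper:][:digit:] ]{1,} \s* /> ) }
              {\L$1}xg;
    return $html;
}

# Hypothetical alternative (an assumption about the goal): lowercase only
# the tag name after '<' or '</'. '<!DOCTYPE ...>' is skipped because '!'
# is not a letter. Attribute names and values are left as they are.
sub lc_tag_names {
    my ($html) = @_;
    $html =~ s{ ( </? ) ( [[:alpha:]] [[:alnum:]]* ) (?= [\s/>] ) }
              {$1\L$2}xg;
    return $html;
}

my $sample = '<P><A HREF="X">text</A></P>';

print lc_tags_tweaked($sample), "\n";   # <p><A HREF="X">text</a></p>
print lc_tag_names($sample),    "\n";   # <p><a HREF="X">text</a></p>
```

Note that the first version leaves <A HREF="X"> alone because '=' and '"'
are not in the character class, so the match never reaches the closing '>'.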

R

