Hello List,

I'm writing a little script that will convert some of my webpages from HTML to XHTML. This requires several substitutions. So far, I have been able to write regexps that properly close link, meta, br, and hr tags and either insert a <!DOCTYPE ...> tag or replace an existing one.
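
For context, a tag-closing substitution of the kind described above might look roughly like the sketch below. It is only an illustration, not the script's actual regexps: the name close_empty_tags is made up, and it assumes each tag sits entirely within the string it is given.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite <link>, <meta>, <br> and <hr> tags so that
# they end in " />", whether or not they were already self-closed.
sub close_empty_tags {
    my ($html) = @_;
    # $1 captures the tag name plus its attributes. Double-quoted values
    # are consumed as a unit so that a '>' or '/' inside them is harmless.
    $html =~ s{<((?:link|meta|br|hr)\b(?:"[^"]*"|[^">])*?)\s*/?>}{<$1 />}gi;
    return $html;
}

print close_empty_tags('<br><meta name="x" content="a/b">'), "\n";
```

The substitution leaves the case of the tag name alone, so it can run before or after any lowercasing step.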

Finally, I want it to take anything within a tag and make it lowercase, UNLESS it's inside double-quotes. However, I want it to ignore any tags that begin with '<!'.

The following does about 95% of the job:

    if (/<.*?>/ && !/<!/) {           # if the line has an HTML tag and is not a
                                      # <!DOCTYPE or comment line...
        # lowercase everything before the first double-quoted string
        s/^(.*?)"(.*?)"/\L$1\E"$2"/;
    }

However, if there are commented-out HTML tags with caps outside of double-quotes, they get ignored. For example:

    <!--<p><A HREF="sweet_nothings_site/index"></p>-->

will not come out with 'a href'.

Ignoring anything that is commented out is cool, and I might keep it this way, but I'm wondering if there is a way to get the stuff in tags in comments "lowercased" as well.
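
One possible approach, offered as a sketch rather than a finished solution: instead of testing the whole line for '<!', substitute tag by tag and skip only the tags that themselves begin with '<!'. The '<!--' opener is then passed over, but the tags after it are still matched. The names lc_tag and lc_tags are invented for this example, and the sketch assumes each tag fits on one line and that only double quotes delimit attribute values.

```perl
use strict;
use warnings;

# Lowercase one tag, leaving double-quoted attribute values untouched.
sub lc_tag {
    my ($tag) = @_;
    # The capturing group makes split() return the quoted pieces as well.
    my @parts = split /("[^"]*")/, $tag;
    for my $part (@parts) {
        $part = lc $part unless $part =~ /^"/;
    }
    return join '', @parts;
}

# Apply lc_tag() to every <...> that does not begin with '<!'.
sub lc_tags {
    my ($line) = @_;
    # (?!!) rejects a '<' that is immediately followed by '!'.
    # Quoted values are consumed whole, so a '>' inside them is safe.
    $line =~ s/(<(?!!)(?:"[^"]*"|[^">])*>)/lc_tag($1)/ge;
    return $line;
}

print lc_tags('<!--<p><A HREF="sweet_nothings_site/index"></p>-->'), "\n";
```

A <!DOCTYPE ...> declaration is left alone because it begins with '<!', and plain text inside a comment is never touched because it is not inside a tag.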

Thanks!
Adam
