Hello List,
I'm writing a little script that will turn some webpages I have from
HTML to XHTML. This requires several substitutions. So far, I have
been able to write regexps that properly close link, meta, br, and hr
tags and either insert a <!DOCTYPE ...> declaration or replace an existing one.
Finally, I want it to take anything within a tag and make it lowercase,
UNLESS it's inside double-quotes. However, I want it to ignore any tags
that begin with '<!'.
The following does about 95% of the job:
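    # If the line contains an HTML tag and is not a <!DOCTYPE or comment
    # line... (no /g on these tests: in scalar context /g would leave pos()
    # after the first tag, so the '<!' check would start too late)
    if (/<.*?>/ && !/<!/) {
        s/^(.*?)"(.*?)"/\L$1\E"$2"/g;
    }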
However, if there are commented-out HTML tags with caps outside of
double-quotes, they get ignored. For example:
<!--<p><A HREF="sweet_nothings_site/index"></p>-->
will not come out with 'a href'.
Ignoring anything that is commented out is cool, and I might keep it
this way, but I'm wondering if there is a way to get the stuff in tags
in comments "lowercased" as well.
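One possible direction, as a rough sketch rather than a finished solution: handle each tag on its own instead of the whole line, and skip only the pieces that actually open with '<!'. The helper names lowercase_tag and lowercase_tags are made up for illustration, and the sketch assumes a tag never spans more than one line and that attribute values contain no '<' or '>'.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase one tag, leaving "double-quoted" parts alone.
sub lowercase_tag {
    my ($tag) = @_;
    # Walk the tag in chunks: a quoted string is kept as-is,
    # any other run of characters is lowercased.
    $tag =~ s{("[^"]*")|([^"]+)}{ defined $1 ? $1 : lc $2 }ge;
    return $tag;
}

# Hypothetical helper: apply lowercase_tag to every tag found on a line.
sub lowercase_tags {
    my ($line) = @_;
    # (?!!) rejects anything that opens with '<!', so <!DOCTYPE ...> and the
    # '<!--' marker itself are left untouched, while tags that follow '<!--'
    # on the same line are still matched one by one.
    $line =~ s{(<(?!!)[^<>]*>)}{ lowercase_tag($1) }ge;
    return $line;
}

print lowercase_tags('<!--<p><A HREF="sweet_nothings_site/index"></p>-->'), "\n";
# prints: <!--<p><a href="sweet_nothings_site/index"></p>-->
```

Because the substitution only ever touches text between '<' and '>', anything outside tags (page text, the contents of a DOCTYPE line) should come through unchanged under the assumptions above.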
Thanks!
Adam