Re: [OT] HTML to XHTML conversion
On Fri 23-Aug-2002 at 11:07:35AM -0500, D. Hageman wrote:
> My suggestion would be to just use an XML parser module like
> XML::LibXML. Load the file up using the HTML loading functions and
> print it using the XML printing functions ... since the only
> difference I can see between HTML and XHTML is that optional ending
> tags are no longer optional (per XML spec) and single tags must be
> ended properly (per XML spec).

There's a lot more than that. <body><body></body></body> is not valid XHTML, for example. <input type="text" name="foo"></input> is not valid XHTML either. You have to be careful about block-level and inline elements, etc.

Besides, you cannot use an XML parser to parse HTML. You have to use something like HTML::TreeBuilder instead. It is part of HTML::Tree, an excellent module IMHO.

Cheers,
-- 
Jean-Michel Hiver - Software Director
Re: [OT] HTML to XHTML conversion
On Wed, 28 Aug 2002 10:07:07 +0100, Jean-Michel Hiver said:

JM> <body><body></body></body> is not valid XHTML for example.
JM> <input type="text" name="foo"></input> is not valid XHTML either.
JM> You have to be careful about block-level and inline elements.

Actually, <input type="text" name="foo"></input> is valid XHTML. Correct me if I'm wrong, but AFAIK <xxx></xxx> is exactly equivalent to <xxx/>. It is <input type="text" name="foo">something</input> that is not valid.

JM> etc. etc...
JM> Besides, you cannot use an XML parser to parse HTML. You have to use
JM> something like HTML::TreeBuilder instead. Part of HTML::Tree, excellent
JM> module IMHO.

XML::LibXML supports HTML too.

-- 
Ilya Martynov (http://martynov.org/)
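The equivalence Ilya describes is an XML-level one, so any XML parser can illustrate it. Below is a minimal sketch using Python's standard xml.etree.ElementTree, which is not one of the Perl modules discussed in this thread. The helper name `canonical` is made up for the example. It only shows that the two spellings parse to the same element; it does not validate anything against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    """Parse a fragment and re-serialize it, so equivalent spellings compare equal."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")


empty_pair = '<input type="text" name="foo"></input>'
self_closed = '<input type="text" name="foo"/>'
with_content = '<input type="text" name="foo">something</input>'

# Both empty spellings serialize identically after a round trip.
print(canonical(empty_pair) == canonical(self_closed))

# The third form is still well-formed XML, but the element now has content,
# which the XHTML DTD does not allow for <input>.
print(ET.fromstring(with_content).text)
```

An XML parser alone cannot reject the third form; that takes a validating parser with the XHTML DTD loaded.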
RE: [OT] HTML to XHTML conversion
Reviewing the "What is different between HTML and XHTML?" question, we have:

- All tags must close, as <X></X>, or be single tags like <X />
- Tag names are case sensitive
- All attributes must be name="value" (double quotes required; no more bare multiple, checked, selected)
- All tags must nest properly

XHTML also has rules about which elements can appear where (the XHTML DTD).

NOTE: There are two XHTML DTDs of interest, Strict and Transitional. Transitional is much more forgiving. I always View Source of www.w3.org to get the strange DOCTYPE syntax for Transitional, and the path to the DTD.

-----Original Message-----
From: D. Hageman
Sent: Friday, August 23, 2002 12:08 PM
To: Jonathan M. Hollin
Subject: Re: [OT] HTML to XHTML conversion

My suggestion would be to just use an XML parser module like XML::LibXML. Load the file up using the HTML loading functions and print it using the XML printing functions ... since the only difference I can see between HTML and XHTML is that optional ending tags are no longer optional (per XML spec) and single tags must be ended properly (per XML spec).

On Fri, 23 Aug 2002, Jonathan M. Hollin wrote:
> [OFF TOPIC]
> I am trying to find a module that can convert HTML to XHTML, but have
> drawn a blank on CPAN and GOOGLE. Is there anything out there to do
> this other than HTML TIDY?
>
> I am developing a mod_perl CMS application at the moment. All its
> output is compliant with XHTML Transitional. But its users can create
> content that isn't (and are likely to) and I'd like to parse this and
> convert it to XHTML before it goes into the RDBMS if possible.
>
> If nothing exists along these lines - would anyone like to collaborate
> on the development of a module for this purpose? HTML::XHTML anyone?

-- 
D. Hageman
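Most of the rules listed in this message are XML well-formedness rules, so a plain XML parser will flag violations of them. Here is a minimal sketch, again in Python with the standard xml.etree.ElementTree rather than the Perl modules named in the thread. The function name `is_well_formed` and the sample fragments are invented for illustration. It does not check the DTD rules about which elements may appear where.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, one per rule from the list above.
samples = {
    "unclosed tag": "<p>one<br></p>",
    "unquoted attribute": "<input type=text />",
    "minimized attribute": '<input type="checkbox" checked />',
    "improper nesting": "<b><i>text</b></i>",
    "well-formed": '<p>one<br /><input type="checkbox" checked="checked" /></p>',
}

for label, fragment in samples.items():
    print(label, is_well_formed(fragment))
```

Only the last fragment parses. Checking it against Strict or Transitional would additionally need a validating parser and the DTD itself.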
Re: [CGI] [OT] HTML to XHTML conversion
Complete automatic conversion is not possible, since someone could enter HTML code that omits or contains certain attributes that are either required or not allowed in XHTML Transitional, and the conversion program would either

1. not know what value to add for required but omitted attributes, or
2. seriously change the rendering of the page by removing the attributes that are not allowed.

Of course the simple mechanical rules -

1. all tag names in lower case
2. all tags closed
3. all attribute values quoted
4. proper tag nesting

etc. could be automated and may be sufficient for your purposes.

Regards
Roy

----- Original Message -----
From: Jonathan M. Hollin
Cc: CGI List
Sent: Friday, August 23, 2002 8:54 AM
Subject: [CGI] [OT] HTML to XHTML conversion

> [OFF TOPIC]
> I am trying to find a module that can convert HTML to XHTML, but have
> drawn a blank on CPAN and GOOGLE. Is there anything out there to do
> this other than HTML TIDY?
>
> I am developing a mod_perl CMS application at the moment. All its
> output is compliant with XHTML Transitional. But its users can create
> content that isn't (and are likely to) and I'd like to parse this and
> convert it to XHTML before it goes into the RDBMS if possible.
>
> If nothing exists along these lines - would anyone like to collaborate
> on the development of a module for this purpose? HTML::XHTML anyone?
>
> --
> Jonathan M. Hollin
> Co-ordinator: WYPUG (http://wypug.pm.org/)
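The four mechanical rules Roy lists can be sketched in a few dozen lines. The following uses Python's standard html.parser purely as an illustration; it is not HTML Tidy, XML::LibXML or HTML::TreeBuilder. The class name `MechanicalXHTML`, the function `to_mechanical_xhtml` and the `VOID` set are assumptions made for this example. It drops comments and DOCTYPEs, does not treat script or style content specially, knows nothing about implied end tags such as a second <p> closing the first, and does nothing about required or disallowed attributes, which is exactly the part Roy says cannot be automated.

```python
from html import escape
from html.parser import HTMLParser

# Elements that HTML 4 defines as empty; XHTML writes them as <br />.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class MechanicalXHTML(HTMLParser):
    """Apply only the mechanical rules: lower case, quoting, closing, nesting."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def _attrs(self, attrs):
        # HTMLParser already lower-cases names; a minimized attribute such as
        # `checked` arrives with value None and becomes checked="checked".
        return "".join(
            ' %s="%s"' % (name, escape(name if value is None else value, quote=True))
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.out.append("<%s%s />" % (tag, self._attrs(attrs)))
        else:
            self.out.append("<%s%s>" % (tag, self._attrs(attrs)))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.open_tags:
            return  # stray or redundant end tag: drop it
        # Close anything opened inside this element first, to repair nesting.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append("</%s>" % current)
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def to_mechanical_xhtml(html_text):
    parser = MechanicalXHTML()
    parser.feed(html_text)
    return parser.result()


print(to_mechanical_xhtml("<P>Hello<BR><INPUT TYPE=checkbox CHECKED><B><I>x</B></I>"))
```

The output is well-formed, but it is only as valid as the input's structure allows; a fragment that breaks the DTD's content rules stays invalid after this pass.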
Re: [OT] HTML to XHTML conversion
On Friday, August 23, 2002, at 04:54 pm, Jonathan M. Hollin wrote:
> [OFF TOPIC]
> I am trying to find a module that can convert HTML to XHTML, but have
> drawn a blank on CPAN and GOOGLE. Is there anything out there to do
> this other than HTML TIDY?
> [snip]
> If nothing exists along these lines - would anyone like to collaborate
> on the development of a module for this purpose? HTML::XHTML anyone?

Out of curiosity... why not tidy? It seems to do a pretty darn good job of it - I use it all of the time.

Adrian