I'd suggest Tidy (http://sourceforge.net/projects/tidy) or JTidy 
(http://sourceforge.net/projects/jtidy) from SourceForge.  Both can be 
configured to correct errors found in the source HTML and to write the 
result out as well-formed XHTML.
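
For the batch job, a minimal sketch of driving the Tidy command-line 
tool over a directory tree could look like the following.  It assumes a 
`tidy` executable is on the PATH and that the old content lives in 
`*.html` files; the function names, the directory layout and the 
`.xhtml` output suffix are hypothetical choices for illustration, not 
anything prescribed by Tidy or Cocoon.

```python
import subprocess
from pathlib import Path
from typing import List


def build_tidy_command(src: Path, dst: Path) -> List[str]:
    """Build a Tidy command line that writes src to dst as XHTML."""
    return [
        "tidy",
        "-asxhtml",           # convert HTML to well-formed XHTML
        "-quiet",             # suppress nonessential output
        "-output", str(dst),  # write the result here instead of stdout
        str(src),
    ]


def convert_tree(src_root: Path, dst_root: Path) -> List[Path]:
    """Run Tidy on every .html file under src_root.

    Returns the source files that Tidy reported errors for, so they
    can be checked by hand.
    """
    failed = []
    for src in sorted(src_root.rglob("*.html")):
        dst = dst_root / src.relative_to(src_root).with_suffix(".xhtml")
        dst.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(build_tidy_command(src, dst))
        # Tidy exits with 0 on success, 1 on warnings and 2 on errors.
        if result.returncode >= 2:
            failed.append(src)
    return failed
```

Calling `convert_tree(Path("old-content"), Path("converted"))` (both 
directory names made up here) would leave only the files Tidy could not 
repair for manual attention.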

Come to think of it: has anyone thought of integrating JTidy into the 
standard HTML serialiser to minimize the bytes sent to the browser?  JTidy 
can strip all the unnecessary whitespace from the HTML.  This can mean 
an average saving of 20% on the downloaded file size (read: roughly 20% 
less download time!)
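
To check what the saving would be on your own pages, a rough sketch 
like the one below compares byte counts before and after collapsing 
whitespace.  It is a deliberately naive stand-in, not what JTidy does 
internally: a regular expression flattens every run of whitespace, so it 
would also damage `<pre>` blocks and inline scripts.  The function names 
and the sample markup are made up for illustration.

```python
import re


def collapse_whitespace(html: str) -> str:
    """Naively collapse each run of whitespace into a single space."""
    return re.sub(r"\s+", " ", html).strip()


def saving_percent(original: str, reduced: str) -> float:
    """Return the size reduction in percent, measured in UTF-8 bytes."""
    before = len(original.encode("utf-8"))
    after = len(reduced.encode("utf-8"))
    return 100.0 * (before - after) / before if before else 0.0


sample = "<html>\n  <body>\n    <p>Hello,   world</p>\n  </body>\n</html>\n"
compact = collapse_whitespace(sample)
print(f"{saving_percent(sample, compact):.1f}% smaller")
```

The actual figure depends heavily on how the pages are indented, so it 
is worth measuring on real content before relying on the 20% estimate.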

Bert

At 18:03 28/02/2002 -0600, you wrote:
>Hi Guys,
>
>I was wondering whether anyone knows of an effective way of converting
>old HTML content from a content management system into valid XML /
>XHTML? I know it's slightly outside the brief of this list, but it
>relates to using old content within a Cocoon project, and I need to find
>the most cost-effective way of doing a batch job, to see whether a
>manual file-by-file migration can be avoided.
>Suggestions appreciated.
>
>All the best,
>
>ColmOR.
>
>--
>Colm O'Riordan | Director
>Communicraft
>mobile: 353 86 2225078
>web: www.communicraft.com
>
>
>
>---------------------------------------------------------------------
>Please check that your question has not already been answered in the
>FAQ before posting. <http://xml.apache.org/cocoon/faqs.html>
>
>To unsubscribe, e-mail: <[EMAIL PROTECTED]>
>For additional commands, e-mail: <[EMAIL PROTECTED]>


