I'd suggest Tidy (http://sourceforge.net/projects/tidy) or JTidy (http://sourceforge.net/projects/jtidy) from SourceForge. You can configure both to correct errors found in the source HTML and to write the result out as XHTML.
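For a batch job, something along these lines might do as a starting point. It is only a rough sketch, assuming a reasonably recent `tidy` binary is on the PATH; the class name `TidyBatch`, the helper names and the `html-in` / `xhtml-out` directory names are made up for illustration. It shells out to the command-line tool rather than calling the JTidy API, so it only needs the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: runs the tidy command-line tool over a directory of HTML files.
final class TidyBatch {

    /** Builds the tidy command line that converts one HTML file to XHTML. */
    static List<String> tidyCommand(Path input, Path output) {
        return Arrays.asList(
                "tidy",
                "-asxhtml",              // write well-formed XHTML
                "-numeric",              // numeric entities, so a plain XML parser accepts them
                "-quiet",                // suppress non-essential output
                "-o", output.toString(), // output file
                input.toString());
    }

    /** Maps page.html or page.htm to page.xhtml inside the output directory. */
    static Path outputFor(Path input, Path outDir) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".xhtml");
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumed directory names; pass your own as arguments.
        Path inDir = Paths.get(args.length > 0 ? args[0] : "html-in");
        Path outDir = Paths.get(args.length > 1 ? args[1] : "xhtml-out");
        Files.createDirectories(outDir);

        List<Path> inputs;
        try (Stream<Path> files = Files.list(inDir)) {
            // Case-sensitive match on .html / .htm
            inputs = files.filter(p -> p.toString().matches(".*\\.html?"))
                          .collect(Collectors.toList());
        }

        for (Path input : inputs) {
            Process process = new ProcessBuilder(tidyCommand(input, outputFor(input, outDir)))
                    .inheritIO()
                    .start();
            // tidy exit status: 0 = clean, 1 = warnings, 2 = errors
            if (process.waitFor() >= 2) {
                System.err.println("Needs manual attention: " + input);
            }
        }
    }
}
```

Files that tidy reports errors for get no output by default, so the list printed at the end is what would still have to be fixed by hand.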
Come to think of it, has anyone thought of integrating JTidy into the standard HTML serialiser to minimize the bytes sent to the browser? JTidy can filter out all the unnecessary whitespace in the HTML. This can mean an average saving of 20% on the downloaded file size (read: roughly 20% less download time!)

Bert

At 18:03 28/02/2002 -0600, you wrote:
>Hi Guys,
>
>I was wondering whether anyone knows of an effective way of converting
>old HTML content from a content management system into valid XML /
>XHTML? I know it's slightly outside the brief of this list, but it
>relates to using old content within a Cocoon project and I need to find
>the most cost-effective way of doing a batch job to see if there is a
>way of avoiding a manual file-by-file migration.
>Suggestions appreciated.
>
>All the best,
>
>ColmOR.
>
>--
>Colm O'Riordan | Director
>Communicraft
>mobile: 353 86 2225078
>web: www.communicraft.com
>
>---------------------------------------------------------------------
>Please check that your question has not already been answered in the
>FAQ before posting. <http://xml.apache.org/cocoon/faqs.html>
>
>To unsubscribe, e-mail: <[EMAIL PROTECTED]>
>For additional commands, e-mail: <[EMAIL PROTECTED]>