Hi Anna,

Since JTidy sometimes messes up the original file (like with scripts inside tables - I posted
a message about it a few weeks ago), I would want to preprocess the original HTML file before passing it to the HTMLGenerator.

In order to use a Cocoon pipeline you need to generate SAX events out of your HTML - this is what the HTMLGenerator does. You could use an external servlet to preprocess the HTML, but I suspect this would mean reproducing most of JTidy's functionality.

So, unless you're able to fix JTidy for your particular problem, you will need to use or write a different Generator. The wiki [1] mentions the NekoHTML parser [2] which apparently could be used instead of JTidy.

If NekoHTML works for you (you can probably try it at the command line), maybe writing a Generator that uses it would be the best solution? There has been some discussion about it on cocoon-dev, see [3].
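
To make the "generate SAX events" idea concrete, here is a minimal sketch in plain Java. It is not Cocoon code: the class and method names (SaxEventSketch, generate, RecordingHandler, collectEvents) are made up for illustration, and a real Generator would implement Cocoon's own Generator interface instead. The JDK's built-in XML parser stands in for the HTML parser, so the input must be well-formed. As far as I know, NekoHTML ships a SAX parser class (org.cyberneko.html.parsers.SAXParser) that could be passed in as the XMLReader instead, but please check that against its documentation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventSketch {

    /**
     * Hypothetical core of a generator: run the markup through a SAX parser
     * and deliver the resulting events to the consumer (the next pipeline step).
     */
    static void generate(XMLReader parser, String markup, ContentHandler consumer)
            throws Exception {
        parser.setContentHandler(consumer);
        parser.parse(new InputSource(new StringReader(markup)));
    }

    /** Stand-in consumer that records element and text events as strings. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    /** Parses the markup with the JDK parser (an assumption, standing in for an HTML parser). */
    static List<String> collectEvents(String markup) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        RecordingHandler handler = new RecordingHandler();
        generate(parser, markup, handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collectEvents("<html><body><p>Hello</p></body></html>"));
    }
}
```

The point of passing the XMLReader into generate() is that the parser becomes swappable: the rest of the pipeline only sees SAX events and does not care whether JTidy, NekoHTML or something else produced them.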

Hope this helps.
--
Bertrand Delacretaz (codeconsult.ch, jfor.org)

buzzwords: XML, java, XSLT, Cocoon, FOP, mentoring/teaching/coding.
blogspace http://www.codeconsult.ch/bertrand

[1] http://wiki.cocoondev.org/Wiki.jsp?page=HTMLGenerator
[2] http://www.apache.org/~andyc/neko/doc/html/index.html
[3] http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=102137565005025&w=2

