Ola Berg wrote:
> From: "Nicola Ken Barozzi" <[EMAIL PROTECTED]>
> 
>>HTMLGenerator uses JTidy directly, without making assumptions itself.
>>If you can use JTidy to work for you, it should work - or can be easily 
>>made to work - with HTMLGenerator too.
> 
> 
> What do you mean? I can use JTidy on my system; whether Cocoon utilizes it or not was 
> my question to you, dear community ;-)

I meant that if you can make JTidy generate the result you want from the 
command line, then Cocoon can do it too.

> Therefore I provided both the sitemap snippet as well as the test bhtml-document.
> I use the binary distribution of Cocoon 2.0.2 (where documentation says that this 
>feature is enabled by default). And if it is not enabled by default, I haven't been 
>able to find out how to enable it. 
> 
> Question restated: given my configuration and the bhtml document that fails, is it 
>safe to believe that HTMLGenerator utilizes JTidy and that JTidy fails, or is it safe 
>to believe that HTMLGenerator fails because it fails to utilize JTidy? 

I don't know, that's why I asked you that question.
Use JTidy outside of Cocoon to see if it works.
If it does, tell us how you did it, and we will patch the Cocoon 
HTMLGenerator to play nice.

> And if the latter is true, how could I tweak it so that JTidy will be utilized by 
>HTMLGenerator? 

This is what HTMLGenerator does:

             // Setup an instance of Tidy.
             Tidy tidy = new Tidy();
             tidy.setXmlOut(true);
             tidy.setXHTML(true);
             //Set Jtidy warnings on-off
             tidy.setShowWarnings(getLogger().isWarnEnabled());
             //Set Jtidy final result summary on-off
             tidy.setQuiet(!getLogger().isInfoEnabled());
             //Set Jtidy infos to a String (will be logged) instead of System.out
             StringWriter stringWriter = new StringWriter();
             PrintWriter errorWriter = new PrintWriter(stringWriter);
             tidy.setErrout(errorWriter);

             // Extract the document using JTidy and stream it.
             org.w3c.dom.Document doc = tidy.parseDOM(
                 new BufferedInputStream(this.inputSource.getInputStream()), null);
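
Note that in the excerpt above JTidy's messages go into a StringWriter instead of System.out, and warnings are only switched on when the logger has warnings enabled. So whatever JTidy says about your document should be looked for in the Cocoon log, not on the console. The capture pattern in isolation looks roughly like the sketch below. The class DiagnosticsCapture and the method fakeParse are hypothetical stand-ins I made up so the example runs without JTidy on the classpath; only the StringWriter/PrintWriter part mirrors the excerpt.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for illustration; not a Cocoon or JTidy class.
class DiagnosticsCapture {

    /** Pretends to be a parser that reports to a PrintWriter, as Tidy does after setErrout. */
    static void fakeParse(String html, PrintWriter errout) {
        if (html.indexOf('<') < 0) {
            errout.println("Warning: no markup found");
        }
        errout.println("Parsed " + html.length() + " characters");
    }

    /** Runs the stand-in parser and returns everything it wrote to its error writer. */
    static String captureDiagnostics(String html) {
        StringWriter stringWriter = new StringWriter();
        PrintWriter errorWriter = new PrintWriter(stringWriter);
        fakeParse(html, errorWriter);
        errorWriter.flush(); // make sure nothing is left pending before reading the text
        return stringWriter.toString();
    }

    public static void main(String[] args) {
        // The captured text can then be handed to a logger instead of the console.
        System.out.print(captureDiagnostics("<h1>Hello <p>How do you do"));
    }
}
```

In a standalone test with the real Tidy class you would pass the PrintWriter to setErrout in place of the fakeParse call, and print the captured string after parsing.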


If you know how to make JTidy produce the output you need, tell us and we 
will patch the HTMLGenerator.

> If the first is true ("HTMLGenerator can't handle the bhtml-snippet no matter what") 
>I really need to investigate another solution, such as:
> 
>>Look here, maybe it's the right time to ditch tidy entirely
>>
>>http://www.apache.org/~andyc/neko/doc/html/index.html
> 
> 
> ...sounds promising. I'll try to download and investigate. Hopefully I can provide a 
>CleaningHtmlGenerator soon, if it is needed.

Cool :-)

>>>BTW: the example I provided is actually cleaner than much of the code I need Cocoon 
>to deal with.
>>
>>:-O
> 
> 
> I could provide a list of test snippets that the tidying thing should handle, e.g.:
> ---
> <h1>Hello <p>How do you do 
> <table border="2 >thing1<td>thing2</table>
> Wondering<p>foo <b>bar <i>baz</b> garply</i>"
> --- should become something like ---
> <html>
> <head>
> </head>
> <body>
> <h1>Hello</h1>
> <p>How do you do
> </p>
> <table border="2">
> <tr><td>thing1</td><td>thing2</td></tr>
> </table>
> <p>Wondering
> </p>
> <p>foo <b>bar <i>baz</i></b> <i>garply</i>
> </p>
> </body>
> </html>
> ---


I tried it in the C version of Tidy; this is what I got:

<h1>Hello
<p>How do you do
<table border="2 &gt;thing1&lt;td&gt;thing2&lt;/table&gt; 
Wondering&lt;p&gt;foo &lt;b&gt;bar &lt;i&gt;baz&lt;/b&gt; garply&lt;/i&gt;">
</table>
</p>
</h1>

Maybe changing the rules..
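
Looking at that output, the missing closing quote in border="2 seems to be the culprit: everything up to the quote at the very end of the snippet was taken as the value of the border attribute, so the markup after it never got parsed as tags. A crude way to spot such cases before feeding a document to the tidier is sketched below. QuoteCheck and suspiciousValues are hypothetical names of mine, not part of Tidy, JTidy or Cocoon, and the check is only a heuristic: it looks at every =" in the text, inside a tag or not, and a legitimate attribute value may of course contain < or >.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Tidy, JTidy or Cocoon.
class QuoteCheck {

    /**
     * Returns every double-quoted value that contains '<' or '>'.
     * Each value is read from the opening quote up to the next double
     * quote, or up to the end of the input when no closing quote exists.
     */
    static List<String> suspiciousValues(String html) {
        List<String> found = new ArrayList<>();
        int pos = 0;
        while (true) {
            int open = html.indexOf("=\"", pos);
            if (open < 0) {
                break; // no further quoted values
            }
            int start = open + 2;
            int close = html.indexOf('"', start);
            int end = (close < 0) ? html.length() : close;
            String value = html.substring(start, end);
            if (value.indexOf('<') >= 0 || value.indexOf('>') >= 0) {
                found.add(value); // markup swallowed by the value
            }
            if (close < 0) {
                break; // value ran to the end of the input
            }
            pos = close + 1;
        }
        return found;
    }

    public static void main(String[] args) {
        String snippet = "<h1>Hello <p>How do you do \n"
                + "<table border=\"2 >thing1<td>thing2</table>\n"
                + "Wondering<p>foo <b>bar <i>baz</b> garply</i>\"";
        for (String value : suspiciousValues(snippet)) {
            System.out.println("Suspicious attribute value: " + value);
        }
    }
}
```

If a pre-check like this flags the document, repairing the quote first (or using a parser with different recovery rules) may give a result closer to what you expect.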

-- 
Nicola Ken Barozzi                   [EMAIL PROTECTED]
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)