On 28 Apr 2004, at 00:58, Joerg Heinicke wrote:

On 20.04.2004 11:02, Sjur Nørstebø Moshagen wrote:

All my XML documents are entirely in UTF-8, but Cocoon outputs entities for many non-ASCII characters. This does not produce badly formed pages, but it does increase the size of the output HTML file (most such UTF-8 characters take 2 bytes, whereas the entities regularly take 7 or more bytes), and it seems like unnecessary extra work in an all-UTF-8 context, for both the server and the client. As my site contains a lot of these characters, I would like to turn it off. But that doesn't seem to be possible:
After some searching I tracked down the following paragraph in the readme for XalanJ 2.6.0 (http://xml.apache.org/xalan-j/readme.html):
• For HTML output, Xalan-Java 2 outputs character entity references (&copy; etc.) for the special characters designated in Appendix A. DTDs of the XHTML 1.0: The Extensible HyperText Markup Language. Xalan-Java 1.x, on the other hand, outputs literal characters for some of these special characters.
That is, it seems to be the default behaviour, and I have found no Cocoon or other documentation or tips on how to change it. Can anyone help me with this?

I don't know any option to influence this behaviour.

Thanks for the answer. Given the lack of responses (apart from yours) and the general lack of documentation on this feature, I have accepted the behaviour as intended and not changeable. The "solution" would be to switch the output from HTML to XML (e.g. XHTML). On the other hand, the behaviour has the nice (most likely intended) side effect that even browsers and operating systems too old to support Unicode/UTF-8 will be able to render all non-ASCII characters that are encoded as entities. Not that this is very useful on my site, but it _does_ make it possible to read help/info pages that explain the character set issues involved for the site, and how possible browser problems can be resolved.
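
For anyone who wants to see the effect of switching the output method outside Cocoon, here is a minimal sketch using plain JAXP. It is not Cocoon configuration: the class name SerializerSketch, the helper serialize() and the sample string are made up for illustration, and it assumes the JAXP implementation bundled with the JDK (or Xalan-J) is the one on the classpath. It runs an identity transform over the same small document with the "xml" and "html" output methods, both encoded as UTF-8, and prints the size of each result. What the "html" method does with non-ASCII characters depends on the serializer in use, so no particular output is promised for it here.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SerializerSketch {

    /**
     * Hypothetical helper: copies the input XML to a byte array with an
     * identity transform, using the given output method ("xml" or "html")
     * and UTF-8 as the output encoding.
     */
    static byte[] serialize(String xml, String method) throws Exception {
        // newTransformer() without a stylesheet gives an identity transform.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, method);
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Three non-ASCII letters (e-acute, o-slash, a-ring), 2 bytes each in UTF-8.
        String doc = "<p>\u00e9\u00f8\u00e5</p>";

        for (String method : new String[] {"xml", "html"}) {
            byte[] bytes = serialize(doc, method);
            System.out.println(method + ": " + bytes.length + " bytes: "
                    + new String(bytes, StandardCharsets.UTF_8));
        }
    }
}
```

With the "xml" method and a UTF-8 encoding, the characters should be written literally, which is the size saving discussed above. Comparing the two byte counts on your own setup shows how much the entities cost.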


So for the time being I won't do anything to change the output, despite the increased size.

Sjur

