Hi All,

I have a Cocoon 2.1.8 application using Saxon, and after a very long time with no problems, we just stumbled upon a bug in the (now rather old) version of Saxon that we were using. So I downloaded the latest Saxon distribution and swapped out the Saxon JAR, and... problem solved! Except that now, I seem to have a new problem with character encoding...

Cocoon is serving a web page with a bunch of occurrences of the en dash character (&ndash;, U+2013, decimal 8211). These displayed correctly with the old Saxon, but with the new version they instead look like this:

        –

:-(  The HTMLSerializer is adding the correct meta element:

        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

The declaration for that component did not include any <encoding> element. I tried adding one like this:

        <encoding>utf-8</encoding>

which had no effect.  But then just for giggles I tried

        <encoding>iso-8859-1</encoding>

...and discovered that this "fixed" the bad characters. (It also changes the <meta http-equiv="Content-Type"> element generated by the serializer.) Which is interesting, but not really what I want; I want to be all UTF-8. So I reverted to 'utf-8' in the <encoding> of the HTMLSerializer configuration and kept fiddling around. I tried adding

        <xsl:output format="xml" encoding="utf-8"/>

to my stylesheets, but that had no effect. I then also tried adding 'encoding="UTF-8"' to the XML declaration (<?xml ...?>) of my source document, and that had no effect either.
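
For what it's worth, the three garbage characters are exactly what you get when the UTF-8 bytes of an en dash (E2 80 93) are decoded as windows-1252, which is how browsers generally treat iso-8859-1. Here is a minimal Python sketch of that check. It is not part of my Cocoon setup, the function names are made up for illustration, and it only shows how the garbage arises, not where in the pipeline the wrong decoding happens:

```python
def mojibake(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled, wrong_encoding="cp1252"):
    """Undo mojibake(): recover the original bytes, then decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


en_dash = "\u2013"
garbled = mojibake(en_dash)

print(en_dash.encode("utf-8").hex())      # the raw UTF-8 bytes of the en dash
print([hex(ord(c)) for c in garbled])     # code points of the garbage characters
print(repair(garbled) == en_dash)         # round trip back to the en dash
```

So somewhere between the transformer and the browser, bytes that are already UTF-8 seem to be getting read as a single-byte encoding.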

Anybody have any clues to share?

thx,
—ml—

