Engin,

I reproduced your problem, but with some differences.  Your call to
Source stylesheet = tFactory.getAssociatedStylesheet(new
StreamSource("x.xml"),media, title,charset);
gave me a null, so I changed your code to this:

package jan12;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Jan12 {
    public static void main(String[] args) {
        try {
            // Load the stylesheet directly instead of going through
            // getAssociatedStylesheet(), which returned null for me.
            TransformerFactory tFactory = TransformerFactory.newInstance();
            StreamSource ss = new StreamSource("jan12/x.xsl");
            Transformer transformer = tFactory.newTransformer(ss);

            // Create the input stream with the special encoding.
            FileInputStream fi = new FileInputStream("jan12/x.xml");
            InputStreamReader i = new InputStreamReader(fi, "ISO8859_9");
            StreamSource so = new StreamSource(i);

            // Create the output stream with the special encoding.
            FileOutputStream f = new FileOutputStream("xout.xml");
            OutputStreamWriter o = new OutputStreamWriter(f, "ISO8859_9");
            StreamResult s = new StreamResult(o);

            transformer.transform(so, s);

            // Closing the reader and the writer also closes the underlying
            // file streams; closing the writer flushes buffered output.
            i.close();
            o.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


The input x.xml was irrelevant, because I used this stylesheet for x.xsl:
<?xml version="1.0" encoding="ISO-8859-9"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="ISO-8859-9" />

<xsl:template match="/">
<out>char 287:&#287; char 350:&#350; Dotted capital I char 304 &#304;</out>
</xsl:template>

</xsl:stylesheet>


The behavior differs depending on whether the Java class
sun.io.CharToByteConverter is available.

I suspect that when you run on Windows the class is there, but on your
UNIX system the JRE is different and the class is not available.  You can
add this to your Java code:
    Class clazz = Class.forName("sun.io.CharToByteConverter");
and check whether it throws a ClassNotFoundException in one environment but
not the other (Class.forName never returns null for a missing class). I
suspect that when this class is available you get the correct output.
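
A minimal sketch of that availability check follows. Since Class.forName
signals a missing class by throwing ClassNotFoundException, the sketch
catches the exception and reports a boolean. The class name ConverterCheck
and the helper isClassAvailable are made up for illustration; only
Class.forName itself is the real API.

```java
// Hypothetical helper for illustration; not part of the test program above.
final class ConverterCheck {

    /** Returns true if the named class can be loaded in this JRE. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing, or it is present but cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "sun.io.CharToByteConverter";
        System.out.println(name + " available: " + isClassAvailable(name));
    }
}
```

Running this on both the Windows and the UNIX machine should show whether
the two JREs really differ on this point.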

When this class is not available, it looks like a configuration error is
exposed in Xalan's Encodings.properties file in the
org.apache.xml.serializer package.  The file describes the Turkish
character set in lines like this:
  ISO8859_9 ISO-8859-9 0x00FF
  ISO8859-9 ISO-8859-9 0x00FF
The third word on each line, 0x00FF, indicates the highest code point
used in the character set.  In base 10 this value is 255. But these
Turkish characters are 287, 350, and 304, all of which are greater than
255.  When writing the characters to the output file, the serializer thinks
the Unicode characters are out of range because they exceed the supposed
maximum code point value. So the serializer converts them to numeric
character references, e.g. the six characters &#304; rather than the
single Unicode character with a code point of 304.
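
To double-check that the three characters really are representable in
ISO-8859-9 even though their code points exceed 255, something like the
following sketch could be used. It relies only on the JDK's
java.nio.charset API and assumes the JRE ships an ISO-8859-9 charset. The
class name TurkishEncodeCheck and its canEncode method are made up for
illustration and are unrelated to Xalan's own range check.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration; it does not use any Xalan code.
final class TurkishEncodeCheck {

    /** Asks the JDK whether the charset can encode the given code point. */
    static boolean canEncode(String charsetName, int codePoint) {
        // Charset.forName throws UnsupportedCharsetException if the JRE
        // does not provide the requested charset.
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(new String(Character.toChars(codePoint)));
    }

    public static void main(String[] args) {
        int[] codePoints = {287, 350, 304};
        for (int cp : codePoints) {
            System.out.println(cp
                + " ISO-8859-9: " + canEncode("ISO-8859-9", cp)
                + ", ISO-8859-1: " + canEncode("ISO-8859-1", cp));
        }
    }
}
```

If the JDK reports the characters as encodable in ISO-8859-9, that supports
the idea that the 0x00FF limit, not the character set itself, is what
forces the character references.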

At this point I'm not sure what the correct maximal code point value is for
this character set, but I think that getting the value right might fix your
problem.
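
One rough way to get a candidate value is to ask the JDK's own ISO-8859-9
charset which code points it can encode and take the highest one. The
sketch below does that with java.nio.charset.CharsetEncoder. The class name
MaxCodePointScan and the method highestEncodable are made up for
illustration, and whether the resulting number is the right one to put in
Encodings.properties is an assumption that would need to be confirmed
against the serializer code.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration; it does not use any Xalan code.
final class MaxCodePointScan {

    /**
     * Returns the highest code point in the Basic Multilingual Plane that
     * the named charset can encode, or -1 if it can encode none. Only the
     * BMP is scanned, which is enough for a single-byte charset.
     */
    static int highestEncodable(String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        int highest = -1;
        for (int c = 0; c <= 0xFFFF; c++) {
            if (encoder.canEncode((char) c)) {
                highest = c;
            }
        }
        return highest;
    }

    public static void main(String[] args) {
        int max = highestEncodable("ISO-8859-9");
        System.out.println("Highest encodable code point: 0x"
            + Integer.toHexString(max).toUpperCase() + " (" + max + ")");
    }
}
```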

Please open a defect in JIRA ( http://issues.apache.org/jira/ ) against
XalanJ2.





----------
Brian Minchau
XSLT Development, IBM Toronto
e-mail:        [EMAIL PROTECTED]
