Redirect Extension produces invalid UTF8 from a valid UTF8 stream
-----------------------------------------------------------------
Key: XALANJ-2352
URL: http://issues.apache.org/jira/browse/XALANJ-2352
Project: XalanJ2
Issue Type: Bug
Components: Xalan-extensions
Affects Versions: 2.4, 2.6, The Latest Development Code
Environment: Mac OS X 10.4.x, Java 1.5
Reporter: Ian Boston
java -cp ~/.m2/repository/xalan/xalan/2.6.0/xalan-2.6.0.
org.apache.xalan.xslt.Process -XML -IN test.xml -XSL WriteOutput.xsl -OUT
testout.xml
test.xml contains some raw UTF-8 characters (not &# entities).
WriteOutput.xsl performs a transform with the redirect extension; the test.xml
input is copied to testout.xml.
test.xml and testout.xml are valid XML (checked with xmllint).
testredirectoutput.xml has broken UTF-8 encoding.
Command-line output from my box is below.
Files attached.
Any ideas? I've looked at the source in SVN and can't see anything that would
fix it, and the serializer there is the same for both output streams.
auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout test.xml
auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout testout.xml
auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout testredirectoutput.xml
testredirectoutput.xml:9: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xCA 0x5D 0x5B 0xD4
[?][?][?][?][?]
^
auto9:~/Caret/darwin/letters/xdocgen ieb$
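
The four bytes xmllint reports can be checked independently of any XML tooling. The sketch below is only an illustration (the helper name `first_utf8_error` is made up here, not part of Xalan or xmllint): it feeds the reported bytes to a strict UTF-8 decoder and compares them with the UTF-8 encoding of the character sequence from test.xml. In UTF-8, 0xCA is the lead byte of a two-byte sequence and must be followed by a byte in the range 0x80-0xBF, which 0x5D (`]`) is not.

```python
# Bytes reported by xmllint for line 9 of testredirectoutput.xml.
reported = bytes([0xCA, 0x5D, 0x5B, 0xD4])


def first_utf8_error(data: bytes):
    """Return (offset, reason) of the first UTF-8 decoding error, or None.

    Hypothetical helper for this illustration only.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# The reported bytes fail at offset 0: 0xCA has no valid continuation byte.
error = first_utf8_error(reported)
print("reported bytes:", error)

# The bracketed sequence from test.xml, encoded as UTF-8, decodes cleanly.
valid = "[\u2002][\u00a0][\u2018][\u2019][\u2013]".encode("utf-8")
print("input sequence:", first_utf8_error(valid))
```

This only confirms that the redirected file is not UTF-8 at that position; it says nothing yet about which encoding was actually written.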
auto9:~/Caret/darwin/letters/xdocgen ieb$ more test.xml
<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
<links>
<link>letters/339/letter1814.xml</link>
<output:write-file file="testredirectoutput.xml">
<document>
<source type="letter" lognum="1814" calendarnum="339"/>
<header>
<title>Here is a Valud UTF Char[<E2><80><82>]</title>
</header>
<body>
<section>
[<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
</section>
</body>
</document>
</output:write-file>
<source>
<document>
<source type="letter" lognum="1814" calendarnum="339"/>
<header>
<title>Here is a Valud UTF Char[<E2><80><82>]</title>
</header>
<body>
<section>
[<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
</section>
</body>
</document>
</source>
</links>
</documents>
auto9:~/Caret/darwin/letters/xdocgen ieb$
auto9:~/Caret/darwin/letters/xdocgen ieb$ more testout.xml
<?xml version="1.0" encoding="UTF-8"?><directoutput
xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output"
xmlns:xalan="http://xml.apache.org/xalan"
xmlns:redirect="http://xml.apache.org/xalan/redirect">
letters/339/letter1814.xml
<redirect:write file="testredirectoutput.xml">
<document>
<source type="letter" lognum="1814" calendarnum="339"/>
<header>
<title>Here is a Valud UTF Char[<E2><80><82>]</title>
</header>
<body>
<section>
[<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
</section>
</body>
</document>
</redirect:write>
<source>
<document>
<source type="letter" lognum="1814" calendarnum="339"/>
<header>
<title>Here is a Valud UTF Char[<E2><80><82>]</title>
</header>
<body>
<section>
[<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
</section>
</body>
</document>
</source>
</directoutput>
auto9:~/Caret/darwin/letters/xdocgen ieb$
auto9:~/Caret/darwin/letters/xdocgen ieb$ more testredirectoutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<document xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
<source type="letter" lognum="1814" calendarnum="339"/>
<header>
<title>Here is a Valud UTF Char[?]</title>
</header>
<body>
<section>
[?][<CA>][<D4>][<D5>][<D0>]
</section>
</body>
</document>
auto9:~/Caret/darwin/letters/xdocgen ieb$
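
One possible reading of the dump above (an assumption, not something the report confirms): the bytes in testredirectoutput.xml look like the same five characters encoded as MacRoman rather than UTF-8, which would fit a writer that uses the platform default encoding while the XML declaration still says UTF-8. The sketch below encodes the sequence from test.xml both ways using Python's standard `mac_roman` codec. The en space (U+2002) has no MacRoman mapping, so it is replaced with `?`, matching the `[?]` in the dump.

```python
# The five characters from the <section> element of test.xml:
# EN SPACE, NO-BREAK SPACE, LEFT/RIGHT SINGLE QUOTATION MARK, EN DASH.
text = "\u2002\u00a0\u2018\u2019\u2013"


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# What a correct UTF-8 serializer should write (as seen in testout.xml).
utf8_bytes = text.encode("utf-8")

# Assumption under test: the redirected stream was written as MacRoman.
# errors="replace" substitutes '?' for characters MacRoman cannot encode.
mac_bytes = text.encode("mac_roman", errors="replace")

print("UTF-8:   ", hex_dump(utf8_bytes))
print("MacRoman:", hex_dump(mac_bytes))
```

The MacRoman line comes out as `3F CA D4 D5 D0`, the same byte values shown in testredirectoutput.xml and, with the literal brackets in between, in the xmllint error (0xCA 0x5D 0x5B 0xD4). If that reading is right, checking how the redirect extension opens its output writer, and what the JVM's `file.encoding` is on the affected machine, would be the next things to look at.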