Redirect Extension produces invalid UTF8 from a valid UTF8 stream
-----------------------------------------------------------------

                 Key: XALANJ-2352
                 URL: http://issues.apache.org/jira/browse/XALANJ-2352
             Project: XalanJ2
          Issue Type: Bug
          Components: Xalan-extensions
    Affects Versions: 2.4, 2.6, The Latest Development Code
         Environment: Mac OS X 10.4.x, Java 1.5
            Reporter: Ian Boston


java -cp ~/.m2/repository/xalan/xalan/2.6.0/xalan-2.6.0. 
org.apache.xalan.xslt.Process -XML -IN test.xml -XSL WriteOutput.xsl -OUT 
testout.xml 


test.xml contains some raw UTF-8 characters (not &# entities).

WriteOutput.xsl performs a transform with the redirect extension; the test.xml 
input is copied to testout.xml.

test.xml and testout.xml are valid XML (checked with xmllint).
testredirectoutput.xml has broken UTF-8 encoding.



Command-line output from my box is below.
Files are attached.


Any ideas? I've looked at the source in SVN and can't see anything that would 
fix this, and the serializer is the same for both output streams.


auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout test.xml 
auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout testout.xml 
auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout 
testredirectoutput.xml 
testredirectoutput.xml:9: parser error : Input is not proper UTF-8, indicate 
encoding !
Bytes: 0xCA 0x5D 0x5B 0xD4
  [?][?][?][?][?]
      ^
auto9:~/Caret/darwin/letters/xdocgen ieb$ 


auto9:~/Caret/darwin/letters/xdocgen ieb$ more test.xml 
<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
<links>
<link>letters/339/letter1814.xml</link>
<output:write-file file="testredirectoutput.xml">
 <document>
 <source type="letter" lognum="1814" calendarnum="339"/>
 <header>
    <title>Here is a Valud UTF Char[<E2><80><82>]</title>
 </header>
<body>
  <section>
  [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
  </section>
</body>
</document>
</output:write-file>
<source>
 <document>
 <source type="letter" lognum="1814" calendarnum="339"/>
 <header>
    <title>Here is a Valud UTF Char[<E2><80><82>]</title>
 </header>
<body>
  <section>
  [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
  </section>
</body>
</document>
</source>
</links>
</documents>
auto9:~/Caret/darwin/letters/xdocgen ieb$ 


auto9:~/Caret/darwin/letters/xdocgen ieb$ more testout.xml 
<?xml version="1.0" encoding="UTF-8"?><directoutput 
xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output" 
xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:redirect="http://xml.apache.org/xalan/redirect">

letters/339/letter1814.xml
<redirect:write file="testredirectoutput.xml">
 <document>
 <source type="letter" lognum="1814" calendarnum="339"/>
 <header>
    <title>Here is a Valud UTF Char[<E2><80><82>]</title>
 </header>
<body>
  <section>
  [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
  </section>
</body>
</document>
</redirect:write>
<source>
 <document>
 <source type="letter" lognum="1814" calendarnum="339"/>
 <header>
    <title>Here is a Valud UTF Char[<E2><80><82>]</title>
 </header>
<body>
  <section>
  [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
  </section>
</body>
</document>
</source>

</directoutput>
auto9:~/Caret/darwin/letters/xdocgen ieb$ 


auto9:~/Caret/darwin/letters/xdocgen ieb$ more testredirectoutput.xml 
<?xml version="1.0" encoding="UTF-8"?>
 <document xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
 <source type="letter" lognum="1814" calendarnum="339"/>
 <header>
    <title>Here is a Valud UTF Char[?]</title>
 </header>
<body>
  <section>
  [?][<CA>][<D4>][<D5>][<D0>]
  </section>
</body>
</document>
auto9:~/Caret/darwin/letters/xdocgen ieb$ 
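
A possible reading of the bytes above, offered as a guess and not a confirmed 
diagnosis: 0xCA, 0xD4, 0xD5 and 0xD0 are the MacRoman encodings of U+00A0, 
U+2018, U+2019 and U+2013, and U+2002 has no MacRoman equivalent, which would 
account for the "?". That is the pattern one would expect if the redirected 
file were written in a MacRoman platform default encoding while still 
declaring UTF-8. The short Python sketch below only checks this byte 
arithmetic. It does not touch Xalan, the MacRoman step is an assumption, and 
the character list is transcribed from the test.xml dump.

```python
# Characters from the <section> line of test.xml, including the literal brackets.
text = "[\u2002][\u00a0][\u2018][\u2019][\u2013]"

# What a correct UTF-8 serialization of that line looks like.
utf8_bytes = text.encode("utf-8")

# Assumption: the redirected file was encoded as MacRoman instead.
# Characters with no MacRoman mapping (here U+2002) become "?".
macroman_bytes = text.encode("mac_roman", errors="replace")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex(" "))
print(macroman_bytes.hex(" "))
print(is_valid_utf8(utf8_bytes), is_valid_utf8(macroman_bytes))
```

Under that assumption the MacRoman bytes contain the run CA 5D 5B D4, the 
same four bytes xmllint reports for testredirectoutput.xml.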


