Yury Mikhienko <[EMAIL PROTECTED]> writes:

> in serializers section I have:
>      <map:serializer logger="sitemap.serializer.text" mime-type="text/plain" 
> name="UTF_16_text" src="org.apache.cocoon.serialization.TextSerializer">
>       <encoding>UTF-16</encoding>
>      </map:serializer>
>
>
> in pipeline section:
>
>    <map:match pattern="test">
>         <map:generate src="test.xml"/>
>         <map:transform src="test.xsl"/>
>         <map:serialize type="UTF_16_text"/>
>    </map:match>
>
>
> my test.xml file:
> <?xml version="1.0" encoding="KOI8-R"?>
> <text>
>  тест test
> </text>
>
> my test.xsl file:
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:template match="text">
>      <xsl:value-of select="."/>
>   </xsl:template>
> </xsl:stylesheet>
>
> after serializing I get the following document (binary dump)
> ff fe 0a 00 20 00 42 04 35 04 41 04 42 04 20 00 74 00 65 00 73 00 74 00 0a 00
> ^^^----- WHY? 
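Those are the magic bytes, the byte order mark (BOM), which say that this is
UTF-16 (i.e. Unicode) in little-endian byte order: U+FEFF written low byte
first gives ff fe.
The 16-bit (Unicode) code units that follow are: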
0x000a - '\n'
0x0020 - ' '
0x0442 - Cyrillic 'т'
0x0435 - Cyrillic 'е'
0x0441 - Cyrillic 'с'
0x0442 - Cyrillic 'т'
0x0020 - ' '
0x0074 - 't'
0x0065 - 'e'
0x0073 - 's'
0x0074 - 't'
0x000a - '\n'
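
If you want to check this decoding yourself, here is a minimal sketch in Python (not part of Cocoon; the variable names are just for illustration). It takes the hex dump from your message, confirms that the first two bytes are the little-endian BOM, and decodes the rest:

```python
import codecs

# Hex dump copied from the question.
dump = "ff fe 0a 00 20 00 42 04 35 04 41 04 42 04 20 00 74 00 65 00 73 00 74 00 0a 00"
raw = bytes.fromhex(dump)

# The first two bytes are the byte order mark for little-endian UTF-16.
has_le_bom = raw[:2] == codecs.BOM_UTF16_LE

# The "utf-16" codec reads the BOM to pick the byte order and then drops it.
text = raw.decode("utf-16")

# One entry per decoded character, formatted like the list above.
code_units = ["0x%04x" % ord(ch) for ch in text]

print(has_le_bom)
print(repr(text))
print(code_units)
```

The decoded string is the content of your text element, so nothing was lost or corrupted in the pipeline; only the two marker bytes were added in front.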

Try opening this file (the binary dump) in Notepad on Windows NT/XP; it will
display it nicely for you.
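
To see where the two extra bytes come from, here is a small Python sketch (again only an illustration, not what the Cocoon serializer does internally). It encodes the same text as little-endian UTF-16 without a BOM and then prepends the BOM by hand:

```python
import codecs

text = "\n тест test\n"

# "utf-16-le" fixes the byte order and writes no BOM.
body = text.encode("utf-16-le")

# Prepending the little-endian BOM gives the bytes from the dump.
with_bom = codecs.BOM_UTF16_LE + body

hex_dump = " ".join("%02x" % b for b in with_bom)
print(hex_dump)
```

Without the BOM a reader has to guess the byte order; with it, a program can tell from the first two bytes that the file is little-endian UTF-16.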

-- Ed

