Hello all,

I have the Russian word (taken as an example): *Основное*

which Axis encodes as UTF-8 as:

Основное (a)

I can transmit this kind of information in a well-formed XML packet using
Axis 1.4, from the server back to the client after a client request. There
is no problem; the deserialization works perfectly.


However, if I try to transmit it using WSS4J with encryption, signature and
timestamp, the following error arises:

org.apache.xml.security.encryption.XMLEncryptionException: An invalid XML
character (Unicode: 0x1e)
was found in the element content of the document.
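
For reference, 0x1e is a control character that the XML 1.0 specification
does not allow in document content. Below is a minimal sketch, assuming
plain Java with no Axis or WSS4J involved, of how a string could be checked
for such characters before it is placed in the packet. The class and method
names are hypothetical and only serve the illustration.

```java
public class XmlCharCheck {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Index of the first character that is not valid in XML 1.0, or -1.
    public static int firstInvalidIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXml10(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // The example word, written with \\u escapes so that the source
        // file encoding does not matter.
        String word = "\u041E\u0441\u043D\u043E\u0432\u043D\u043E\u0435";
        System.out.println(firstInvalidIndex(word));           // prints -1
        System.out.println(firstInvalidIndex("abc\u001Edef")); // prints 3
    }
}
```

The Cyrillic word itself passes this check, so the 0x1e presumably appears
somewhere between serialization and encryption rather than in the original
string.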

Therefore, in order to avoid invalid characters in the packet, I decided
to escape all XML characters
using org.apache.commons.lang.StringEscapeUtils.escapeXml [1].
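
To make the idea concrete, here is a minimal sketch of this kind of
escaping. It is a hypothetical stand-in written with plain Java, not the
Commons Lang implementation: it assumes that the five XML special
characters become named entities and that every character above 0x7F
becomes a decimal numeric character reference, so the result is pure ASCII.

```java
public class NumericEscape {

    // Hypothetical helper: escapes XML special characters and writes every
    // non-ASCII code point as a decimal reference such as &#1054;
    public static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (cp > 0x7F) {
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The example word, written with \\u escapes.
        String word = "\u041E\u0441\u043D\u043E\u0432\u043D\u043E\u0435";
        // prints &#1054;&#1089;&#1085;&#1086;&#1074;&#1085;&#1086;&#1077;
        System.out.println(escape(word));
    }
}
```

Because the escaped form contains only ASCII characters, it should survive
any byte-level charset confusion on the way, provided the escaping is
applied to the correct Unicode string in the first place.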



On the client, in order to recover the original word, I call
unescapeXml [1], which gives this Unicode string:

Основное (b)

First, I have to conclude that I am not getting back the same Unicode string
as at the beginning: (a) != (b).

I then wondered what kind of encoding I had ended up with.
I looked at this web site http://2cyr.com/decode/?lang=en to understand more,
and it looks like I got windows-1251 (see [2]),
which can be displayed in a browser with encoding="iso8859-1".
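
A minimal sketch of how a string like (b) can come about, assuming the
cause is that UTF-8 bytes are decoded with a single-byte charset somewhere
along the way. The class and method names are hypothetical. ISO-8859-1 is
used because every JVM is guaranteed to support it; a charset obtained with
Charset.forName("windows-1251") could be passed instead where the JVM
provides it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode as UTF-8, then decode the bytes with the wrong charset.
    public static String misdecode(String s, Charset wrong) {
        return new String(s.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Reverse the accident. This is lossless only if the wrong charset
    // maps every byte value to a character, as ISO-8859-1 does.
    public static String repair(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The example word, written with \\u escapes.
        String word = "\u041E\u0441\u043D\u043E\u0432\u043D\u043E\u0435";
        String garbled = misdecode(word, StandardCharsets.ISO_8859_1);

        // 8 Cyrillic letters take 2 bytes each in UTF-8, so the garbled
        // string has 16 characters.
        System.out.println(garbled.length());  // prints 16
        System.out.println(
            repair(garbled, StandardCharsets.ISO_8859_1).equals(word)); // prints true
    }
}
```

If the same round trip with the suspected charset turns (b) back into the
original word, that would support the theory that the bytes are correct
UTF-8 and only the decoding step uses the wrong charset.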

*My question is: why didn't I get UTF-8, and how is it possible that I got
(b)?*


Thank you for reading, and for any comments you might have.

José Ferreiro

Many thanks to Martin Gainty and Ognjen Blagojevic for already commenting and
helping in another thread I posted.


[1] -
http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html
[2] - http://en.wikipedia.org/wiki/CP1251


-- 
José Ferreiro
MSc in Communication Systems, EPFL.
