Ias,

Even allowing for systems that can't display the SOAP message properly because
they lack Unicode fonts, I think the default encoding should be as-it-is, not
&#x escaping.
 
The SOAP message is not meant for display, and from the web services toolkit's
point of view it is better to generate a more compact SOAP message.
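
To put rough numbers on the size difference, here is a minimal sketch. The
class and helper names are hypothetical and this is not the actual UTF8Encoder
code; it only mimics the escaping shown in the quoted snippet below and
compares UTF-8 byte counts. It assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

class EscapeSizeDemo {
    // Hypothetical helper: writes every char above 0x7F as an &#x...; reference,
    // in the same way as the escaping snippet quoted in this thread.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                sb.append("&#x")
                  .append(Integer.toHexString(c).toUpperCase())
                  .append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "\uD55C\uAE00"; // two Hangul syllables, U+D55C and U+AE00
        System.out.println(utf8Length(text));                 // prints 6
        System.out.println(utf8Length(escapeNonAscii(text))); // prints 16
    }
}
```

For Hangul text each character costs 3 bytes as-it-is in UTF-8 and 8 bytes
when escaped.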

For display, the application can convert the SOAP message to an appropriate
encoding. (As you know, here in Korea we use EUC-KR, and the conversion takes
only a few lines of Java code.)
Also, as far as I know, Axis wrote characters as-it-is in Axis 1.0 or 1.1.
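
For illustration, the conversion could look like the sketch below. The class
and method names are made up for this example, and it assumes the Java runtime
ships the EUC-KR charset (standard JDKs do, but a stripped-down runtime might
not).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EucKrConversionDemo {
    // Assumption: EUC-KR is available in this runtime; Charset.forName throws
    // UnsupportedCharsetException otherwise.
    static final Charset EUC_KR = Charset.forName("EUC-KR");

    // Decode the UTF-8 bytes of a message, then re-encode them as EUC-KR.
    static byte[] utf8ToEucKr(byte[] utf8Bytes) {
        String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(EUC_KR);
    }

    public static void main(String[] args) {
        byte[] utf8 = "\uD55C\uAE00".getBytes(StandardCharsets.UTF_8);
        byte[] eucKr = utf8ToEucKr(utf8);
        // Each Hangul syllable takes 3 bytes in UTF-8 and 2 bytes in EUC-KR.
        System.out.println(utf8.length + " -> " + eucKr.length);
    }
}
```

If the converted text is a whole XML document rather than a fragment, any
encoding declaration in its prolog has to be changed to match.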

As I remember it, &#x escaping was introduced in UTF8Encoder a few months ago
to handle French accents and German umlauts. This is reflected in the
test.encoding.TestString test case.

Any thoughts?

/Jongjin

----- Original Message ----- 
From: "Ias" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, December 29, 2004 1:53 AM
Subject: RE: UTF8Encoder question...


> 
> From: Jongjin Choi [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, December 28, 2004 11:56 AM
> To: [email protected]
> Subject: UTF8Encoder question...
> 
> 
> Dims and all, 
> 
> UTF8Encoder writes escaped string when the character is over 0x7F. 
> The escaping does not seem to be necessary because 
> the Writer (not OutputStream) is used. 
> 
> I think this could be just : (line 86)
> 
> writer.write(character);
> 
> instead of : (line 86 ~ 88)
> writer.write("&#x");
> writer.write(Integer.toHexString(character).toUpperCase());
> writer.write(";");
> 
> The escaping just increases the message size.
> 
ias> Yes, it does. However, I think representing a character of which codepoint
ias> is over 0x7F as a form of &#x XML entity is one of the aims of the encoder
ias> because some systems can't display that character properly due to no
ias> unicode-wide fonts built in there. In case it's 100% certain that every node
ias> in a messaging system has no problem with "as-it-is" character
ias> representation on a XML instance, it must be much more efficient to use a
ias> compact encoder as you pointed out instead of UTF8Encoder. Interestingly,
ias> AbstractXMLEncoder (which is not instantiable) works in such a way. In
ias> consequence, it would be a good idea to create a new encoder to optimize
ias> message size and use it with ease of configurability. (Yes, we can recommend
ias> it to users dealing with non-Latin character systems :-)
> 
> Happy new year,
> 
> Ias
> 
> P.S. I'm going to switch [EMAIL PROTECTED] to [EMAIL PROTECTED] (soon,
> very soon).
> 
> 
> If the OutputStream is used, the escaping or UTF-8 conversion (which
> existed in old UTF8Encoder.java) will be needed.
> 
> Thought?
> 
> /Jongjin
> 
>
