Hi David,

Thank you for your reply.
I think the sentence
"The recommended encoding scheme to use is UTF-8"
should really be understood as saying that UTF-8 is the recommended
encoding when the user does not specify one, because that sentence is
followed by
"However, for compatibility reasons, if an encoding is not specified,
then the default encoding of the platform is used."

I also think this recommendation applies when the URLEncoder class is
used on the 'User Agent' side. Appendix B.2 of the HTML 4.01 specification
http://www.w3.org/TR/html4/appendix/notes.html#h-B.2
describes how a User Agent should behave when it encounters an invalid URL
string, and UTF-8 is the recommended character encoding in that case.
(This does not apply to the server side.) I cannot find such a
recommendation anywhere else in the spec.

URIs (and therefore URL encoding) in HTML are bound to RFC 2396, which
does not recommend any particular character encoding. Section 1.6
reads:
   How a URI is
   represented in terms of bits and bytes on the wire is dependent upon
   the character encoding of the protocol used to transport it, or the
   charset of the document which contains it.

Now, does HTTP, the protocol used to transport HTML, say anything about
encoding? Sort of: in the charset parameter of the Content-Type header.

Thus, I think the right behavior is to make the encoding used for
URL encoding match that of the response (available through
ServletResponse#getCharacterEncoding).
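
As a rough sketch of what I mean: the class and method names below are
made up for illustration, and the charset name is passed in as a plain
string. In a real servlet I am assuming it would come from
ServletResponse#getCharacterEncoding instead. The only library call used
is the two-argument URLEncoder.encode(String, String).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the servlet API.
class UrlEncodingSketch {

    // Percent-encodes a value using the given charset name. In a servlet,
    // responseCharset would be response.getCharacterEncoding().
    static String encodeFor(String value, String responseCharset) {
        try {
            return URLEncoder.encode(value, responseCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(
                    "Unsupported charset: " + responseCharset, e);
        }
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // "cafe" with an e-acute as the last letter
        System.out.println(encodeFor(value, "UTF-8"));      // caf%C3%A9
        System.out.println(encodeFor(value, "ISO-8859-1")); // caf%E9
    }
}
```

The same string produces different escapes depending on the charset, so
a page served as ISO-8859-1 that carries UTF-8 escaped links would not
round-trip unless the receiving side knows to decode them as UTF-8.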

(The Javadoc comment is misleading enough to make you think that you
should encode every string using UTF-8. I think the Javadoc should be
fixed.)

-------------------
Yasuhiko Sakakibara
[EMAIL PROTECTED]
[EMAIL PROTECTED]

