You might find the text below useful. It is my standard text on character
encoding.

Mark

REQUESTS
========

A number of situations may require the use of non-US-ASCII characters in a
URI. These include:
- Parameters in the query string
- Servlet paths

There is a standard for encoding URIs
(http://www.w3.org/International/O-URL-code.html) but this standard is not
consistently followed by clients. This causes a number of problems.
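
To illustrate the kind of problem this causes, here is a minimal sketch using
only the JDK. It is not Tomcat code, and the class and method names are made up
for the example. It shows that the same character produces different
percent-encoded bytes depending on the charset the client picked, and that
decoding with the wrong charset silently corrupts the value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Tomcat or the servlet API.
class UriEncodingDemo {

    // Percent-encode a value using the named charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Percent-decode a value using the named charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Two clients, two charsets, two different encoded forms.
        System.out.println(encode(original, "UTF-8"));      // caf%C3%A9
        System.out.println(encode(original, "ISO-8859-1")); // caf%E9

        // Decoding UTF-8 bytes as ISO-8859-1 yields two wrong characters.
        String wrong = decode("caf%C3%A9", "ISO-8859-1");
        System.out.println(wrong.equals(original));         // false
    }
}
```

The server cannot tell from the URI alone which charset the client used, which
is why the connector settings described below exist.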

The functionality provided by Tomcat (4 and 5) to handle this less-than-ideal
situation is described below.

1. The Coyote HTTP/1.1 connector has a useBodyEncodingForURI attribute which,
if set to true, uses the request body encoding to decode the URI query
parameters.
  - The default value is true for TC4 (breaks the spec but gives consistent
    behaviour across TC4 versions).
  - The default value is false for TC5 (spec compliant, but there may be
    migration issues for some apps).
2. The Coyote HTTP/1.1 connector has a URIEncoding attribute which defaults to
ISO-8859-1.
3. The Parameters class (o.a.t.u.http.Parameters) has a QueryStringEncoding
field which defaults to the URIEncoding. It must be set before the parameters
are parsed to have any effect.
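
If the connector settings cannot be changed, a workaround often seen in
application code is to undo the ISO-8859-1 decoding and decode the same bytes
again as UTF-8. The sketch below is an assumption on my part rather than
anything Tomcat provides; the class and method names are hypothetical. It only
works when the client really sent UTF-8 and the container really decoded it as
ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Tomcat or the servlet API.
class ParameterRepair {

    // Recover the original bytes by re-encoding as ISO-8859-1 (a lossless
    // one-byte-per-char mapping), then decode those bytes as UTF-8.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What getParameter() might return for "caf%C3%A9" when the
        // query string is decoded as ISO-8859-1.
        String fromContainer = "caf\u00c3\u00a9";
        System.out.println(redecode(fromContainer).equals("caf\u00e9")); // true
    }
}
```

Applying this to a value that was already decoded correctly will corrupt it, so
setting the encoding on the connector is the more robust option where possible.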

Things to note regarding the servlet API:
1. HttpServletRequest.setCharacterEncoding() normally only applies to the
request body NOT the URI.
2. HttpServletRequest.getPathInfo() is decoded by the web container.
3. HttpServletRequest.getRequestURI() is not decoded by the container.
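
The difference between a decoded and an undecoded path can be seen with
java.net.URI, used here purely as an analogy. Note that java.net.URI always
decodes as UTF-8, whereas the container uses its configured URI encoding, so
this is a sketch of the distinction and not of Tomcat's behaviour. The class
and method names are made up for the example.

```java
import java.net.URI;

// Hypothetical demo class; an analogy only, not servlet API code.
class RawVersusDecodedPath {

    // The path exactly as sent, percent-escapes intact
    // (comparable to what getRequestURI() returns).
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    // The path with percent-escapes decoded
    // (comparable to what getPathInfo() returns).
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String uri = "/app/caf%C3%A9";
        System.out.println(rawPath(uri));                            // /app/caf%C3%A9
        System.out.println(decodedPath(uri).equals("/app/caf\u00e9")); // true
    }
}
```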

Other tips:
1. Use POST for forms that submit parameters, as the parameters are then part
of the request body and are decoded using the request body encoding.


RESPONSES
=========

HTML META tags are ignored by Tomcat. You may use
<%@ page pageEncoding="..." %> for JSPs.


