Since the topic of character encoding on requests came up on TOMCAT-USER
as well, I thought I'd forward a response I posted on TOMCAT-DEV
earlier.  Read Andrey's message (at the bottom) first for this to make any
sense.

Craig


---------- Forwarded message ----------
Date: Wed, 8 Aug 2001 09:52:53 -0700 (PDT)
From: Craig R. McClanahan <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: Tomcat-Dev <[EMAIL PROTECTED]>
Subject: Re: JSP encoding problem solution proposal

Andrey has done a good job of describing the problem with calling
request.setCharacterEncoding() in JSP pages.  As an alternative to adding
a parameter to the JSP servlet (which would not be portable to other
containers), I'd like to offer a standards-based approach to this.
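
For anyone who has not run into it, the symptom Andrey describes can be
reproduced outside a container. The following is only an illustration under
stated assumptions: the class and method names are made up, and "echo" merely
imitates a page that receives KOI8-R bytes, decodes them with some charset,
and writes the result back out as KOI8-R.

```java
import java.nio.charset.Charset;

// Stand-alone illustration; no servlet container is involved.
// EncodingMismatchDemo and its methods are hypothetical names.
final class EncodingMismatchDemo {

    // Encode text to bytes in the named charset.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Decode bytes to text using the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Imitates a round trip: the browser sends bytes in sentAs, the
    // container decodes them as readAs, and the page writes the
    // resulting string back out in sentAs.
    static String echo(String text, String sentAs, String readAs) {
        byte[] wire = encode(text, sentAs);
        String parameter = decode(wire, readAs);
        // Characters with no mapping in sentAs are replaced with '?'.
        return decode(encode(parameter, sentAs), sentAs);
    }

    public static void main(String[] args) {
        String typed = "\u043f\u0440\u0438\u0432\u0435\u0442"; // Cyrillic "privet"
        // Decoded with the wrong charset: every character is lost on output.
        System.out.println(echo(typed, "KOI8-R", "ISO-8859-1"));
        // Decoded with the charset the client actually used: text survives.
        System.out.println(echo(typed, "KOI8-R", "KOI8-R").equals(typed));
    }
}
```

The second call corresponds to what happens once setCharacterEncoding() has
been called with the right charset before any parameter is read.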

The basic approach I'm going to suggest is to use a Filter (also new in
the Servlet 2.3 API) to do the setCharacterEncoding() call for you.  This
has many advantages:

* It's portable across all 2.3-based containers

* You can use different rules for different portions of your webapp

* The rules you implement do not have to be as simple as "all requests
  use this character set" -- you can base the decision on other aspects
  of the request (like who the user is, objects in the session, what
  user agent they are using, and so on).

* The rules can be implemented (and updated later) without changing
  anything about your servlets and JSP pages, because it's done
  in a separate component.

As it happens, I recently added a simple Filter implementation that does
this to the examples webapp in Tomcat 4.  Check out
"/WEB-INF/classes/filters/SetCharacterEncodingFilter.java", which you are
free to use as the basis for setting the character encoding in your own
apps.  The implementation provided is very simple (use an initialization
parameter to set the encoding unconditionally), but you can easily
subclass to make more sophisticated policy decisions.
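
For anyone who wants a starting point without digging through the examples
webapp, here is a minimal sketch of such a filter. It is not the file shipped
with Tomcat: the class name SimpleEncodingFilter, the "encoding" init
parameter name, and the selectEncoding() hook are assumptions made for
illustration. It needs the Servlet 2.3 API (javax.servlet) on the classpath.

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Hypothetical sketch, not the class from the Tomcat examples webapp.
class SimpleEncodingFilter implements Filter {

    // Value of the (assumed) "encoding" init parameter; null if not set.
    private String encoding;

    public void init(FilterConfig config) throws ServletException {
        this.encoding = config.getInitParameter("encoding");
    }

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException {
        // This must run before anything reads parameters or the body,
        // which is why it is done here and not in the servlet or JSP.
        String chosen = selectEncoding(request);
        if (chosen != null) {
            request.setCharacterEncoding(chosen);
        }
        // Hand the request on to the rest of the chain.
        chain.doFilter(request, response);
    }

    public void destroy() {
        this.encoding = null;
    }

    // Policy hook: this version returns the configured encoding
    // unconditionally. A subclass can override it to decide per request,
    // for example from the session or the user agent.
    protected String selectEncoding(ServletRequest request) {
        return encoding;
    }
}
```

To use something like this, declare it with a <filter> element in web.xml
and map it with <filter-mapping> to the URL patterns that need it. A subclass
only has to override selectEncoding() to apply a different rule.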

Filters are your friend :-).

Craig



On Wed, 8 Aug 2001, Andrey Aristarkhov wrote:

> Hi All!
> 
> Support for different encodings in Servlet/JSP is an ancient, well-known
> problem. The setCharacterEncoding() method of HttpServletRequest allows
> changing the request encoding before parameters are read, so a servlet
> can set the encoding it needs. (Small lyrical digression: what does this
> encoding mean? I'll post my thoughts about it separately.)
> However, the problem still exists in JSP (there have been several postings
> about it on this mailing list). The purpose of this mail is to propose a
> solution for encoding support in JSP.
> 
> Problem description
> ===================
> A JSP programmer is not able to change the request encoding for an incoming
> JSP request, since "This method [setCharacterEncoding] must be called prior
> to parsing any post data or reading any input from the request. Calling
> this method once data has been read will not affect the encoding."
> (Servlet 2.3 Spec). This happens because request parameters are read
> inside org.apache.jasper.servlet.JspServlet, before the generated
> JSP servlet is called.
> As a result, compiled JSPs behave as follows in non-English environments:
> 1) the incoming request is read using 'ISO-8859-1'
> 2) getParameter() returns a value decoded as 'ISO-8859-1', but the JSP
>    servlet assumes the return value is in the JVM default encoding (say
>    "KOI8-R"), so the page shows ??????? instead of the real parameter
>    value.
> 
> Problem solution
> ================
> There should be a configurable optional parameter for JspServlet (say
> 'requestEncoding') to change the request encoding. When this parameter is
> set, JspServlet should call setCharacterEncoding() before processing the
> request. This does not conflict with the JSP 1.2 Spec, since it says
> nothing about the default encoding of an incoming request.
> 
> I have made the necessary changes to implement this feature in
> tomcat-4.0-20010807. It works fine with different Cyrillic encodings
> (I expect the same result for other non-Latin1 encodings).
> I clearly understand that the proposed solution is not a panacea and is
> open to discussion.
> 
> 
> Regards,
> Andrey Aristarkhov
> 
> 
> The diffs follow (also as attachments). I have also attached a sample JSP
> for encoding testing.
> 
> 
> file: org/apache/jasper/EmbededServletOptions.java
> 
> 147a148,152
> >      * Java platform encoding for incoming request.
> >      */
> >     private String requestEncoding;
> >
> >     /**
> 219a225,228
> >     public String getRequestEncoding() {
> >     return requestEncoding;
> >     }
> >
> 320a330
> >         this.requestEncoding = config.getInitParameter("requestEncoding");
> 
> file: org/apache/jasper/Options.java
> 
> 144a145,149
> >
> >     /**
> >      * Java platform encoding for incoming request.
> >      */
> >     public String getRequestEncoding();
> 
> file: org/apache/jasper/servlet/JspServlet.java
> 
> 422c422,426
> <             String includeUri
> ---
> >             // According to section 4.9 of Servlet 2.3 spec we have to
> >             // setCharacterEncoding() before reading any parameter
> >             if (options.getRequestEncoding()!=null)
> >               request.setCharacterEncoding(options.getRequestEncoding());
> >             String includeUri
> 
> 

