On Sat, 2003-11-01 at 12:58, Joerg Heinicke wrote:
> Now I'm confused ...
>
> With the container encoding all resources are read, i.e. my text files
> and the request.
Nope, these are two different encodings:

 * text files are read according to whatever encoding/locale is
   configured in your OS (unless you supply special parameters when
   starting the JVM)
 * request parameters are decoded by the container using ISO-8859-1
   (unless the client specified another encoding)

See also section 4.9 in the servlet 2.3 spec:

-- begin quote
Currently, many browsers do not send a char encoding qualifier with the
Content-Type header, leaving open the determination of the character
encoding for reading HTTP requests. The default encoding of a request
the container uses to create the request reader and parse POST data must
be ISO-8859-1, if none has been specified by the client request.
However, in order to indicate to the developer in this case the failure
of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.

If the client hasn't set character encoding and the request data is
encoded with a different encoding than the default as described above,
breakage can occur. To remedy this situation, a new method
setCharacterEncoding(String enc) has been added to the ServletRequest
interface. Developers can override the character encoding supplied by
the container by calling this method. It must be called prior to parsing
any post data or reading any input from the request. Calling this method
once data has been read will not affect the encoding.
-- end quote

Since the setCharacterEncoding method mentioned there hasn't been
supported for long (it only exists from servlet 2.3 on) and must be
called before any request parameter is read, Cocoon has its own
mechanism to fix this, which does something like:

    new String(value.getBytes(container_encoding), form_encoding);

container_encoding should always be ISO-8859-1 (unless you have a broken
servlet container), and form_encoding should be the same encoding as the
one configured on your serializer.

> The form encoding only recodes the request parameters
> to the expected (i.e. container) encoding. So it works like a servlet
> filter.
>
> Joerg
>
> On 01.11.2003 12:36, Bruno Dumon wrote:
>
> > On Sat, 2003-11-01 at 12:24, Joerg Heinicke wrote:
> >
> >>On 01.11.2003 12:08, Reinhard Poetz wrote:
> >>
> >>
> >>>>personally I think this patch should come together with a
> >>>>change to our
> >>>>web.xml so we rather change the default form-encoding to be
> >>>>also "utf-8"
> >>>
> >>>
> >>>sorry, I don't understand this. Does this mean the general encoding is
> >>>iso-8859-1 and the form encoding is UTF-8? If yes, why two different
> >>>encodings?
> >>
> >>These are two different things.
> >>
> >>On the one hand there is the container encoding. It defines with which
> >>encoding textfiles are read, e.g. properties files. It's about servlet
> >>container <=> file system.
> >>
> >
> >
> > The "container encoding" mentioned here is the encoding with which the
> > servlet container decoded request parameters. The servlet spec says that
> > this should always be ISO-8859-1 (unless the client specified another
> > encoding or, from 2.3, request.setCharacterEncoding is used). This
> > parameter has nothing to do with the encoding used to decode e.g. text
> > files, and should normally always be left to ISO-8859-1.
> >
> > Some more info about all this can be found on this wiki page:
> > http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding
> >
> >
> >>On the other hand there is the form encoding. It defines with which
> >>encoding requests are read. It's about servlet container <=> clients.
> >>
> >>I hope it's correct so.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]                          [EMAIL PROTECTED]
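
For illustration, the recoding step mentioned in the reply,
new String(value.getBytes(container_encoding), form_encoding), can be
tried out in plain Java roughly as follows. This is only a minimal
sketch and not Cocoon's actual code: the class name RecodeDemo and the
method recode are hypothetical, UTF-8 is assumed as the form encoding,
and the "container" is simulated by decoding the bytes by hand.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RecodeDemo {

    // Hypothetical helper (not a Cocoon API): turn the wrongly decoded
    // string back into the bytes the browser sent, then decode those
    // bytes with the encoding the form was really submitted in.
    static String recode(String value, Charset containerEncoding, Charset formEncoding) {
        if (value == null) {
            return null;
        }
        return new String(value.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        // What the user typed: "café" (written with an escape so the
        // example does not depend on the source file encoding).
        String typed = "caf\u00e9";

        // Assumption: the browser submits the form as UTF-8.
        byte[] onTheWire = typed.getBytes(StandardCharsets.UTF_8);

        // Simulates a container that decodes with its ISO-8859-1 default.
        String fromContainer = new String(onTheWire, StandardCharsets.ISO_8859_1);

        // Undo the wrong decoding.
        String fixed = recode(fromContainer, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("container value equals typed text: " + fromContainer.equals(typed));
        System.out.println("recoded value equals typed text:   " + fixed.equals(typed));
    }
}
```

The round trip only works because ISO-8859-1 maps every byte value to
exactly one character, so getBytes recovers the original bytes without
loss. If the container had decoded with some other encoding, passing
ISO-8859-1 as the container encoding here would corrupt the value
instead of repairing it.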
