Re: jasper: weird behaviour
You might be interested in this post from Tomcat-dev.

--jeff

---

Hi All!

Support for different encodings in Servlet/JSP is an ancient, well-known problem. The setCharacterEncoding() method of HttpServletRequest allows the request encoding to be changed before parameters are read. Thus, a servlet is able to change the encoding according to its needs. (Small lyrical digression: what does this encoding mean? I'll post my thoughts about it separately.)

However, the problem still exists in JSP (there have been several postings about it on this mailing list). The purpose of this mail is to propose a solution for encoding support in JSP.

Problem description
===

A JSP programmer is not able to change the request encoding for an incoming JSP request, since "This method [setCharacterEncoding] must be called prior to parsing any post data or reading any input from the request. Calling this method once data has been read will not affect the encoding." (Servlet 2.3 Spec). This happens because request parameters are read inside org.apache.jasper.servlet.JspServlet, before the generated JSP servlet is called. As a result, a compiled JSP behaves as follows in non-English environments:

1) The incoming request is read using 'ISO-8859-1'.
2) The getParameter() method returns a value in 'ISO-8859-1', but the JSP servlet assumes the return value is in the JVM default encoding (say "KOI8-R"), so ??? appears instead of the real parameter value.

That is the problem.

Problem solution
===

There should be an optional, configurable parameter for JspServlet (say 'requestEncoding') to change the request encoding. According to this parameter, JspServlet should call setCharacterEncoding() before processing the request. This does not conflict with the JSP 1.2 Spec, since the spec says nothing about the default encoding of an incoming request.

I have made the necessary changes to implement this feature in tomcat-4.0-20010807. It works fine with different Cyrillic encodings. (I suppose the result is the same for the other non-Latin1 encodings.)
I clearly understand that the proposed solution is not a panacea and that it is open to discussion.

Regards,
Andrey Aristarkhov

Diffs follow (also as attachments). I have also attached a sample JSP for encoding testing.

file: org/apache/jasper/EmbededServletOptions.java
147a148,152
>      * Java platform encoding for incoming request.
>      */
>     private String requestEncoding;
>
>     /**
219a225,228
>     public String getRequestEncoding() {
>         return requestEncoding;
>     }
>
320a330
>         this.requestEncoding = config.getInitParameter("requestEncoding");

file: org/apache/jasper/EmbededServletOptions.java
144a145,149
>
>     /**
>      * Java platform encoding for incoming request.
>      */
>     public String getRequestEncoding();

file: org/apache/jasper/servlet/JspServlet.java
422c422,426
<             String includeUri
---
>             // According to section 4.9 of Servlet 2.3 spec we have to
>             // setCharacterEncoding() before reading any parameter
>             if (options.getRequestEncoding()!=null)
>                 request.setCharacterEncoding(options.getRequestEncoding());
>             String includeUri
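
The effect of the patch can be illustrated without a servlet container. The following is only a minimal sketch, not the Jasper code: the class RequestEncodingSketch and the method decodeParameter are hypothetical stand-ins for the container's parameter decoding, and UTF-8 is used in place of KOI8-R purely because every JVM is guaranteed to support it. When no request encoding is configured, the raw bytes are decoded as ISO-8859-1 and the value is garbled; when one is configured, the value comes out intact.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestEncodingSketch {

    // Hypothetical stand-in for the container's parameter decoding:
    // use the configured request encoding if there is one,
    // otherwise fall back to ISO-8859-1.
    static String decodeParameter(byte[] rawBytes, String requestEncoding) {
        Charset charset = (requestEncoding != null)
                ? Charset.forName(requestEncoding)
                : StandardCharsets.ISO_8859_1;
        return new String(rawBytes, charset);
    }

    public static void main(String[] args) {
        // A Cyrillic parameter value, written with escapes so the source
        // file itself does not depend on any particular encoding.
        String original = "\u043f\u0440\u0438\u0432\u0435\u0442";

        // Bytes as they would arrive from a client that submits UTF-8
        // (an assumption made for this sketch).
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String withoutSetting = decodeParameter(wire, null);
        String withSetting = decodeParameter(wire, "UTF-8");

        System.out.println(withoutSetting.equals(original)); // false
        System.out.println(withSetting.equals(original));    // true

        // ISO-8859-1 maps every byte to exactly one character, so a value
        // that was decoded with it can still be repaired afterwards.
        String repaired = new String(
                withoutSetting.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
        System.out.println(repaired.equals(original));        // true
    }
}
```

The last step is the usual per-page workaround when the container cannot be configured. The proposed 'requestEncoding' parameter would make it unnecessary.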
jasper: weird behaviour
Hello there tomcat users ;)

I'm not sure if this is a bug, so I'm posting a description of an unusual problem, and I hope that if it is not a bug, somebody will show me what I'm missing here...

I have a JSP page that has static content (outside <% %> tags) in ISO-8859-2, and a few static HTML pages. When it is completely up to Tomcat to generate a page (for example a redirect or an internal server error), it outputs the Content-Type: header correctly (the whole Linux environment is set to pl_PL), like this:

    Content-Type: text/html;charset=ISO-8859-2

When it sends static HTML, it outputs:

    Content-Type: text/html

..but this is corrected by a <meta> tag. However, *EVERY TIME* Tomcat sends back the output of a JSP page, it sends this:

    Content-Type: text/html;charset=ISO-8859-1

which is OK (as defined in the JSP spec), but there's *NO* way to change it! I've tried nearly everything, including:

    <% response.setHeader("Content-Type", "text/html;charset=ISO-8859-2"); %>

or

    <%@ page contentType("text/html;charset=ISO-8859-2"); %>

All those tags make response.setHeader(...) appear at the top of __jspService (inside the proper .java file in $TOMCAT_HOME/work), but then... the header gets overwritten by Tomcat with ISO-8859-1, which scrambles all content and forces the user to pick ISO-8859-2 from the browser's encoding menu every time the document is generated, which is really annoying. Browsers seem to ignore the <meta> tag in favour of the server-generated Content-Type: header.

Thanks to Tomcat being open source, I can just play with share/org/apache/jasper/compiler/Compiler.java and break the spec by setting the default encoding to ISO-8859-2, but I feel that's not the way...

Is it a bug in Jasper, or am I missing something here?

--
Jacek Prucia
7bulls.com S.A.
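
The browser behaviour described above matches the precedence given in the HTML 4.01 specification: a charset parameter in the HTTP Content-Type header takes priority over a <meta> declaration in the document. The following is only a minimal sketch of that rule. The class ContentTypeCharsetSketch and its methods are hypothetical names made up for illustration, not part of Tomcat or of any browser.

```java
import java.util.Locale;

class ContentTypeCharsetSketch {

    // Returns the charset parameter of a Content-Type value,
    // or null if the value carries no charset parameter.
    static String charsetParam(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // The HTTP header wins when it names a charset; the <meta>
    // declaration is consulted only when the header is silent.
    static String effectiveCharset(String headerContentType,
                                   String metaContentType) {
        String fromHeader = charsetParam(headerContentType);
        return (fromHeader != null) ? fromHeader : charsetParam(metaContentType);
    }

    public static void main(String[] args) {
        String meta = "text/html;charset=ISO-8859-2";

        // Static HTML case: header has no charset, so <meta> applies.
        System.out.println(effectiveCharset("text/html", meta));
        // prints ISO-8859-2

        // JSP case: header says ISO-8859-1, so <meta> is ignored.
        System.out.println(
                effectiveCharset("text/html;charset=ISO-8859-1", meta));
        // prints ISO-8859-1
    }
}
```

This is why the <meta> tag is enough for the static pages but has no effect on the JSP output: the fix has to change the header itself.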