Hi, the question is: how do you create the servlet's output, that is, with which Writer or OutputStream?
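Why this matters can be seen without any servlet container. The following is only a minimal standalone sketch: the class name EncodingDemo and its encode() helper are made up for illustration and are not part of the servlet API. It writes the same string through an OutputStreamWriter once as UTF-8 and once as ISO-8859-1 and shows that the resulting bytes differ, so a client decoding them with the other charset sees garbled characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encodes text through an OutputStreamWriter
    // that is bound to an explicitly chosen charset.
    public static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } // closing the writer flushes its buffer into the byte stream
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Unicode escapes for "ä" and "ö" keep this source file
        // independent of the encoding the compiler assumes.
        String text = "H\u00e4llo W\u00f6rld.";

        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);

        // Each umlaut takes two bytes in UTF-8 but one in ISO-8859-1.
        System.out.println("UTF-8 length:      " + utf8.length);
        System.out.println("ISO-8859-1 length: " + latin1.length);

        // Decoding with the wrong charset does not give the text back.
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("Round trip intact: " + text.equals(misread));
    }
}
```

The explicit Charset argument is the whole point of the sketch: nothing in it depends on the platform's default encoding.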
If you do this:

    public void doGet( HttpServletRequest request, HttpServletResponse response )
            throws IOException {
        response.setCharacterEncoding( "UTF-8" );
        Writer writer = response.getWriter();
        writer.write( "Hällo Wörld." );
    }

then the Writer you obtain from response.getWriter() takes into account what you set by calling setCharacterEncoding(). This Writer will write strings to the output byte stream in UTF-8 encoding.

But if you obtain the servlet's output byte stream directly, i.e. by calling

    OutputStream outputStream = response.getOutputStream();

and you use this stream to output character data, then the call to response.setCharacterEncoding() does nothing for the bytes you send. Only what you write to this stream yourself counts.

Wrong would be:

    outputStream.write( "Hällo Wörld.".getBytes() );
    // who knows what encoding is used here: it is the
    // "platform's default encoding"

OK would be:

    Writer goodWriter = new java.io.OutputStreamWriter(
            response.getOutputStream(), "UTF-8" );

Only by using OutputStreamWriter explicitly with this constructor (or the newer ones, with the Charset and CharsetEncoder arguments) can you safely create character output with the intended encoding.

Hope this helps.

Georg

[EMAIL PROTECTED] wrote:
Hi all,

I noticed some encoding problems inside servlets when switching from Tomcat 5.5.20 to Tomcat 6.0.10. I searched the mailing lists, but didn't find anything appropriate.

Scenario: A custom servlet (that is, a class derived from HttpServlet) creates very simple HTML output containing, besides the necessary HTML tags (<html>, <body> etc.), just some German special characters (ä ö ü).

The Java source code is UTF-8, and the response instance is configured via

    response.setContentType( "text/html;charset=UTF-8" );

Just for safety I also added

    response.setCharacterEncoding( "UTF-8" );

The created HTML text contains a meta tag

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Nevertheless, when calling the corresponding URL with Tomcat 6, none of the special characters are displayed correctly in the browser (Firefox). If I switch the encoding of the displayed page to ISO-8859-1 in Firefox, the characters are displayed correctly. That is, it seems to me that everything is okay with the servlet, except that the encoding used for the response is ISO-8859-1 instead of UTF-8.

With Tomcat 5.5 everything is displayed correctly as UTF-8. JavaServer Pages do _not_ show similar behaviour.

Has anyone experienced similar problems?

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]