Hi,

the question is how you create the servlet's output,
that is, which Writer or OutputStream you use.

If you do this:

  public void doGet( HttpServletRequest request,
              HttpServletResponse response ) throws IOException {
    response.setCharacterEncoding( "UTF-8" );
    Writer writer = response.getWriter();
    writer.write( "Hällo Wörld." );
  }

Then the Writer you obtain from response.getWriter()
honours the encoding you set with setCharacterEncoding(),
provided you call it before getWriter(). This Writer encodes
strings as UTF-8 when it writes them to the output byte stream.

But if you just obtain the servlet's output byte stream,
i.e. by calling

  OutputStream outputStream = response.getOutputStream();

and you use this stream to output character data, then the
call to response.setCharacterEncoding() has no effect on the
bytes you write; it only changes the charset announced in the
Content-Type header. All that counts is what you write to this
stream yourself.
Wrong would be:

  outputStream.write( "Hällo Wörld.".getBytes() );
  // who knows what encoding is used here: it is the
  // "platform's default encoding"
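
To make the difference visible outside a servlet container, here is a
small sketch that uses only the JDK. The class and method names
(GetBytesDemo, encodedLength, roundTrip) are made up for this
illustration. It encodes the same string with two charsets and then
decodes the bytes with the wrong one, which is roughly what a browser
does when the header announces UTF-8 but the bytes are ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class GetBytesDemo {
    // Unicode escapes keep the literal independent of the source file's encoding.
    static final String TEXT = "H\u00e4llo W\u00f6rld.";

    /** Encodes TEXT with the given charset and returns the number of bytes. */
    static int encodedLength(Charset charset) {
        return TEXT.getBytes(charset).length;
    }

    /** Encodes TEXT with one charset and decodes it with another, like a misinformed reader. */
    static String roundTrip(Charset written, Charset assumedByReader) {
        return new String(TEXT.getBytes(written), assumedByReader);
    }

    public static void main(String[] args) {
        // This is what getBytes() without an argument would silently use.
        System.out.println("default charset:  " + Charset.defaultCharset());
        // 12 characters, two of them need 2 bytes in UTF-8: 14 bytes.
        System.out.println("UTF-8 bytes:      " + encodedLength(StandardCharsets.UTF_8));
        // One byte per character in ISO-8859-1: 12 bytes.
        System.out.println("ISO-8859-1 bytes: " + encodedLength(StandardCharsets.ISO_8859_1));
        // Written as ISO-8859-1 but read as UTF-8: the umlauts do not survive.
        System.out.println(roundTrip(StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

Whether getBytes() without an argument happens to produce the right
bytes depends on the JVM and its settings, so the first line of the
output may differ from machine to machine.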

Ok would be:

  Writer goodWriter = new java.io.OutputStreamWriter(
      response.getOutputStream(), "UTF-8" );

Only by specifying the encoding explicitly, as with this
OutputStreamWriter constructor (or the newer ones that take
Charset and CharsetEncoder arguments), can you safely produce
character data output in the intended encoding.
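
As a sketch of the Charset variant, the following uses a
ByteArrayOutputStream as a stand-in for the servlet's output stream so
that it runs without a container. The class Utf8WriterDemo and its
methods writeUtf8 and toUtf8Bytes are hypothetical names for this
example, not part of any API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class Utf8WriterDemo {
    /** Writes text to any byte stream, always encoded as UTF-8. */
    static void writeUtf8(OutputStream out, String text) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(text);
        // Flush the encoder's buffer; closing is left to whoever owns the stream.
        writer.flush();
    }

    /** Convenience wrapper that collects the written bytes in memory. */
    static byte[] toUtf8Bytes(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            writeUtf8(sink, text);
        } catch (IOException e) {
            // An in-memory stream should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Unicode escapes avoid any dependence on the source file's encoding.
        byte[] bytes = toUtf8Bytes("H\u00e4llo W\u00f6rld.");
        System.out.println(bytes.length + " bytes written");
    }
}
```

In a servlet you would pass response.getOutputStream() as the first
argument of writeUtf8() instead of the in-memory stream.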

Hope this helps.

Georg


[EMAIL PROTECTED] wrote:
Hi all,

I noticed some encoding problems in servlets when switching from
Tomcat 5.5.20 to Tomcat 6.0.10. I searched the mailing lists
but didn't find anything relevant.


Scenario:
A custom servlet (that is, a class derived from HttpServlet) creates
very simple HTML output containing (besides the necessary HTML tags
<html>, <body> etc.) just some German special characters (ä ö ü).

The Java source code is UTF-8, and the response instance is configured via
  response.setContentType( "text/html;charset=UTF-8" );
To be safe I also added
  response.setCharacterEncoding( "UTF-8" );

The created HTML text contains a meta tag
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Nevertheless, when I request the corresponding URL with Tomcat 6, none
of the special characters are displayed correctly in the browser
(Firefox). If I switch the encoding of the displayed page to
ISO-8859-1 in Firefox, the characters are displayed correctly. That is,
everything seems to be okay with the servlet except that the response
is encoded as ISO-8859-1 instead of UTF-8.

When using Tomcat 5.5, everything is displayed correctly as UTF-8.
JavaServer Pages do _not_ show similar behaviour.

Has anyone experienced similar problems?

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



