Bert Kemner wrote:
 I have a problem with a JavaScript form on a German website.
 (http://informationservices.swets.de/web/show/id=47553)

My IE browser says that this page is in UTF-8. Therefore, you can expect to get the form data back to the server in UTF-8 as well.


 The input of the form contains German characters.
 But the output (which is generated by submitting the form) does not
 display those characters (see example below). My first reaction to
 this problem is that Unicode somehow does not translate these German
 characters to Windows (Outlook).

As Doug said, "Unicode" does not translate text. What translates text here is most likely your web server. Your server appears to think that the form data should be encoded according to something like ISO-8859-1, but the data is actually encoded in UTF-8.
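To make the mismatch concrete, here is a small sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. The class name, method name, and sample word are made up for illustration; I have not seen your server code, so this only reproduces the symptom and may not match your actual code path.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the original site.
public class MojibakeDemo {

    // Encodes the text as UTF-8 (what the browser sends for a UTF-8 page),
    // then decodes those bytes as ISO-8859-1 (what the server seems to assume).
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "Gl\u00fcck"; // "Glück": u-umlaut is U+00FC
        String garbled = misdecode(original);

        // The u-umlaut becomes two bytes in UTF-8 (0xC3 0xBC), and each byte
        // turns into its own character when read as ISO-8859-1.
        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Each non-ASCII character turns into two or more Latin-1 characters, which is the typical pattern when umlauts come back as pairs of odd symbols.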


The trick is to find out how to get your web server to assume the same encoding/charset for form data returned from the browser as it uses to encode and send the original page to the browser. If you use UTF-8 for the page encoding, then you need to use UTF-8 to go from the form data byte stream to Java strings.

Hint: There are Java String constructors and other methods that turn a byte array into a String. If those methods do not provide any way to specify the encoding/charset, then they fall back to the platform default encoding, which is often ISO-8859-1 or a close relative such as windows-1252. Use instead a method that takes an encoding parameter. You may need to use a variation of an InputStreamReader constructed for the "UTF8" encoding.
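A minimal sketch of the two approaches from the hint, assuming the form data arrives as raw bytes or as an InputStream. The class and method names are hypothetical, and how you get at the raw bytes depends on your server framework. It uses StandardCharsets.UTF_8 (Java 7 and later); on older JVMs you would pass the charset name as a string.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
public class FormDecoding {

    // Byte array to String with an explicit charset, instead of the
    // one-argument constructor that uses the platform default.
    static String fromBytes(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Stream to String via an InputStreamReader constructed for UTF-8.
    static String fromStream(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Glück" as UTF-8 bytes: the u-umlaut is 0xC3 0xBC.
        byte[] raw = {0x47, 0x6C, (byte) 0xC3, (byte) 0xBC, 0x63, 0x6B};
        System.out.println(fromBytes(raw).length());
        System.out.println(fromStream(new ByteArrayInputStream(raw)).length());
    }
}
```

Both methods should give back the five-character word with a single u-umlaut, rather than six characters with two garbage symbols in the middle.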

See also http://www.unicode.org/faq/unicode_web.html

Viel Glück,
markus



