Hi,

I am in the process of converting my web site to be UTF-8 compliant, so that it can better handle the scripts used by various languages. To be clear, I am using a Java application server, which by default decodes POST request parameters as ISO-8859-1. To have it treat the request parameters as UTF-8, I have to call

  request.setCharacterEncoding("UTF-8")

on every request, before reading any parameters. I would have preferred an option telling my application server to treat all POST content as UTF-8 by default, but this was rejected on the basis of RFC 2616, sections 3.7.1 and 3.4.1.
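To illustrate why the decoding charset matters, here is a minimal sketch using only the JDK (no servlet API). It percent-encodes a two-character Japanese test string as UTF-8, the way a browser would for a UTF-8 page, and then decodes it under both charsets. The class and method names are hypothetical, and the sketch assumes Java 10 or later for the Charset overloads of URLEncoder and URLDecoder.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class FormDecodingSketch {

    // Percent-encode a form value as UTF-8 bytes.
    static String encodeUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Decode a percent-encoded form value with the given charset,
    // standing in for what the server does with the request body.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C"; // the two characters of "Nihon" (Japan)
        String onTheWire = encodeUtf8(original);
        System.out.println(onTheWire); // %E6%97%A5%E6%9C%AC

        // Decoding with the charset the browser used restores the text.
        String asUtf8 = decode(onTheWire, StandardCharsets.UTF_8);
        System.out.println(asUtf8.equals(original));

        // Decoding as ISO-8859-1 turns each of the six bytes into its own
        // character, so the two-character value comes back six characters long.
        String asLatin1 = decode(onTheWire, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.length());
    }
}
```

The second decode is roughly what happens when the server falls back to ISO-8859-1 for a body that was actually sent as UTF-8.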

I decided to try to specify the charset as part of the form's enctype attribute:

<form action="" method="post" enctype="application/x-www-form-urlencoded; charset=utf-8">

though having tested with Safari, Firefox and Opera, I found that only Opera included the "charset=utf-8" component in the content-type of the request. Additionally if I specify

<form action="" method="post" enctype="application/x-www-form-urlencoded; charset=utf-8" accept-charset="utf-8">

I get other strange results with Firefox and Safari. With Opera I just see question marks when I pass my Japanese character test case.
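One possible explanation for the question marks (an assumption, not something confirmed for Opera) is that somewhere along the way the text is being encoded into a charset that has no Japanese characters. The sketch below shows that effect in plain Java: String.getBytes replaces every unmappable character with the charset's replacement byte, which for ISO-8859-1 is '?'. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only.
class QuestionMarkSketch {

    // Encode text as ISO-8859-1 and read the bytes back with the same
    // charset. Characters outside ISO-8859-1 are lost in the first step.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Two Japanese characters, neither of which exists in ISO-8859-1.
        System.out.println(roundTripLatin1("\u65E5\u672C")); // ??

        // Characters that ISO-8859-1 does contain survive unchanged.
        System.out.println(roundTripLatin1("Andr\u00E9").equals("Andr\u00E9"));
    }
}
```

Once the characters have been replaced this way the original text cannot be recovered on the server, whichever charset is used to decode the request.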

Now to the questions:

- Should web browsers be honoring the charset parameter specified in the form's enctype, and sending it to the HTTP server?
- Is it considered wrong to force my application to treat all requests as UTF-8?

André

