The thing is, you've got an HTML form that you tell browsers is
ISO-8859-2, so when they post it to the form's target URL it gets sent
encoded as ISO-8859-2. It is then your responsibility to parse the
incoming parameters in the encoding you asked for.

Depending on your requirements, UTF-8 will fit the needs of most
languages, but there are cases where you want to store some languages in
specific charsets: converting from a specific charset to Unicode
is reasonably reliable, whereas converting from Unicode to a
specific charset can be tricky, because not every Unicode character
has a representation in the target charset.
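
To illustrate the asymmetry, here is a minimal sketch in plain Java (no
servlet involved). The class name LossyEncoding and the helpers fits and
roundTrip are made up for this example; only the java.nio.charset calls
are standard. It assumes the JVM ships the ISO-8859-2 charset, which is
common but not strictly guaranteed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncoding {
    // True if every character of text can be represented in the charset.
    static boolean fits(String text, Charset cs) {
        return cs.newEncoder().canEncode(text);
    }

    // Encode to bytes and decode again; unmappable characters are
    // silently replaced by the charset's replacement byte ('?' here).
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-2 is available in this JVM.
        Charset latin2 = Charset.forName("ISO-8859-2");
        String name = "Miko\u0142aj"; // contains U+0142 (l with stroke)

        System.out.println(fits(name, latin2));                           // true
        System.out.println(fits(name, StandardCharsets.ISO_8859_1));      // false
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // Miko?aj
    }
}
```

Going the other way (bytes in a known charset to a Java String) never
loses anything for the ISO-8859 family, which is why storing text as
Unicode is usually the safer default.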

However, in your case the data is posted in ISO-8859-2, so you'll need to
convert it if you want to manipulate it as Unicode, using
something similar to this:

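// Tomcat decodes parameters as ISO-8859-1 by default, so recover the raw
// bytes with that charset, then decode them as the charset the form used.
// (request.getCharacterEncoding() is often null, so name the charset.)
String value = request.getParameter("mytext");
if (value != null) {
    try {
        value = new String(value.getBytes("ISO-8859-1"), "ISO-8859-2");
    } catch (java.io.UnsupportedEncodingException ex) {
        System.err.println(ex);
    }
}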

But there might be some easier method and I'm not a JSP Guru ...
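
For what it's worth, the re-decoding trick can be checked outside a
container. This is only a sketch under assumptions: the class name
CharsetRepair and the helper redecode are hypothetical, and the
"container" is simulated by decoding the bytes as ISO-8859-1, which is
what Mikolaj describes as Tomcat's default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRepair {
    // Undo a wrong decode: turn the string back into the original bytes
    // using the charset that was wrongly assumed, then decode them with
    // the charset the sender actually used.
    static String redecode(String misdecoded, Charset assumed, Charset actual) {
        return new String(misdecoded.getBytes(assumed), actual);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-2 is available in this JVM.
        Charset latin2 = Charset.forName("ISO-8859-2");
        String original = "Miko\u0142aj";

        // What the browser sends for an ISO-8859-2 form.
        byte[] wire = original.getBytes(latin2);

        // What the application sees if the bytes are decoded as ISO-8859-1.
        String misdecoded = new String(wire, StandardCharsets.ISO_8859_1);

        String repaired = redecode(misdecoded, StandardCharsets.ISO_8859_1, latin2);

        System.out.println(original.equals(misdecoded)); // false
        System.out.println(original.equals(repaired));   // true
    }
}
```

This only works because ISO-8859-1 maps every byte value to exactly one
character, so the wrong decode loses no information. If the wrong decode
had been done with UTF-8, invalid byte sequences would already have been
replaced and the original bytes could not be recovered this way.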

- Joseph

On Mon, Mar 16, 2009 at 3:40 PM, Gregor Schneider <rc4...@googlemail.com> wrote:
> On Mon, Mar 16, 2009 at 3:10 PM, Mikolaj Rydzewski <m...@ceti.pl> wrote:
>>
>> It doesn't work for me. By default Tomcat uses ISO-8859-1 encoding. And it
>> will try this encoding to parse input parameters.
>>
>
> That's true, I'm doing the same here for German Umlaute, however:
>
> One link in the Wiki is pointing to HTTP specification section 3.4.1,
> however, there's something that I do not understand:
>
> The specs say in 3.4.1:
>
> <quote>
> HTTP/1.1 recipients MUST respect the
>   charset label provided by the sender; and those user agents that have
>   a provision to "guess" a charset MUST use the charset from the
>   content-type field if they support that charset, rather than the
>   recipient's preference, when initially displaying a document. See
>   section 3.7.1.
> </quote>
>
> So, for me as a non-native English speaker, I understand it in such a
> way that your content-encoding must be honoured - or do I get it wrong
> here? So, if in the content-encoding UTF-8 is specified, why isn't it
> accepted then?
>
> Rgds
>
> Gregor
> --
> just because your paranoid, doesn't mean they're not after you...
> gpgp-fp: 79A84FA526807026795E4209D3B3FE028B3170B2
> gpgp-key available @ http://pgpkeys.pca.dfn.de:11371
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>
