Konstantin Kolinko wrote:
2009/11/12 pramodpm <pramod_me...@satyam.com>:
We are getting the following error:
java.io.CharConversionException: Not an ISO 8859-1 character: <EF><BF><83>.
It is not just <83>. Sorry I missed those last time.

We are working with Java 6. With Tomcat 5.5.23 it works, but we
would like to use Tomcat 6.


Your 5.5 and 6.0 installations are probably running on different
computers, with different locale settings in their OSes.

There are places in programs where byte -> character conversion
occurs. In all those places you should explicitly specify which
encoding those bytes use.

If you do not specify the encoding explicitly (if you are lazy or do
not know how to do it), you will end up with the platform default
encoding, and that differs between locales.
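
As a rough illustration of the point above (not taken from the application in question), the sketch below decodes the three bytes from the error message twice, each time naming the charset explicitly. The class name ExplicitDecoding and the helper decode() are made up for this example; Charset.forName() is used so that it also compiles on Java 6.

```java
import java.nio.charset.Charset;

public class ExplicitDecoding {
    // Named charsets, so nothing depends on the platform default.
    static final Charset UTF_8 = Charset.forName("UTF-8");
    static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");

    // Hypothetical helper: decode bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The three bytes shown in the error message.
        byte[] raw = { (byte) 0xEF, (byte) 0xBF, (byte) 0x83 };

        String asUtf8 = decode(raw, UTF_8);
        String asLatin1 = decode(raw, ISO_8859_1);

        System.out.println(asUtf8.length());                        // 1
        System.out.println(Integer.toHexString(asUtf8.charAt(0)));  // ffc3
        System.out.println(asLatin1.length());                      // 3

        // new String(raw), without a charset, would use this value instead,
        // and it can differ from one machine to the next.
        System.out.println(Charset.defaultCharset());
    }
}
```

The same bytes give one character or three depending on the charset, which is why two machines with different defaults can behave differently on identical input.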


What Konstantin writes above is true. In addition:

If you were running Tomcat 6 on the same machine as Tomcat 5.5, with exactly the same environment, and retrieving the same external page, then the error (or absence of error) should be the same under both, because the Java servlet classes that you are using are the same. So, obviously, something other than Tomcat itself is different between your Tomcat 5.5 setup and your Tomcat 6 setup.

The key here is that you have, inside your application, some Unicode string containing characters that are valid in Unicode. You then try to output them to the ServletOutputStream, which is set for ISO-8859-1. That means Java must do a character set conversion, from the internal Unicode string to the external ISO-8859-1 output stream. And that is when it complains, because the string contains a Unicode character (which in UTF-8 looks like the sequence <EF><BF><83>) that has no valid representation in ISO-8859-1.
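
To make that conversion failure concrete, here is a small standalone sketch using only the JDK (no servlet classes), so it is an approximation of what happens on the way out, not the actual Tomcat code path. The class Latin1Check and its two methods are invented for this example. The character U+FFC3 is what the byte sequence <EF><BF><83> decodes to in UTF-8.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class Latin1Check {
    static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");

    // True if every character of s can be represented in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        return ISO_8859_1.newEncoder().canEncode(s);
    }

    // True if a strict encoder rejects s. A fresh CharsetEncoder reports
    // unmappable characters by throwing, instead of substituting '?'.
    static boolean strictEncodeFails(String s) {
        CharsetEncoder encoder = ISO_8859_1.newEncoder();
        try {
            encoder.encode(CharBuffer.wrap(s));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String ok = "caf\u00E9";      // e-acute is U+00E9, inside ISO-8859-1
        String bad = "abc\uFFC3";     // U+FFC3 is outside ISO-8859-1

        System.out.println(fitsLatin1(ok));          // true
        System.out.println(fitsLatin1(bad));         // false
        System.out.println(strictEncodeFails(bad));  // true
    }
}
```

A check like fitsLatin1() can be used to log the offending strings before they reach the output stream, which helps to find out where they come from.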

So you must either change your ServletOutputStream to be UTF-8 as well (and make sure you set everything else accordingly), or else filter the output characters before passing them to the output stream, taking out anything that is not ISO-8859-1 or replacing it with a placeholder character (like "?").
How that all fits with your application, we cannot tell.
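
The two options could look roughly like the sketch below. It is only an outline under assumptions: a plain OutputStream stands in for the servlet output, and the names OutputEncodingSketch, toLatin1Safe() and writeUtf8() are made up here. In a real servlet, the UTF-8 route would also involve declaring the charset on the response (for example with response.setCharacterEncoding("UTF-8") before calling getWriter()), so that the browser knows how to decode what it receives.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class OutputEncodingSketch {

    // Option 1: replace every char outside ISO-8859-1 (above U+00FF) by '?'.
    // Characters outside the BMP are two chars in Java and become "??".
    static String toLatin1Safe(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c <= 0xFF ? c : '?');
        }
        return sb.toString();
    }

    // Option 2: send the text through a Writer that encodes it as UTF-8.
    static void writeUtf8(OutputStream out, String s) throws IOException {
        Writer writer = new OutputStreamWriter(out, Charset.forName("UTF-8"));
        writer.write(s);
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00E9 \uFFC3";

        // Option 1 keeps the e-acute and replaces U+FFC3.
        System.out.println(toLatin1Safe(text).equals("caf\u00E9 ?")); // true

        // Option 2 keeps everything: 3 + 2 + 1 + 3 bytes.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeUtf8(buffer, text);
        System.out.println(buffer.size()); // 9
    }
}
```

Option 1 loses information but leaves the rest of an ISO-8859-1 application untouched. Option 2 loses nothing, but every page, form and header then has to agree on UTF-8.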

There is no quick-and-dirty solution to this kind of thing, and no single Tomcat or Java setting that will solve the problem. You are dealing with multilingual data at the input, so you need to handle that properly.



