charset problems coming up during runtime

Christoph Lechleitner Wed, 05 Nov 2003 05:32:57 -0800

Hello everyone.

I have a really weird problem with charset handling for special
characters such as the German umlauts (i.e. ä, ö, ü); it also affects
characters from French and other languages.


I have done extensive Google and list searches, but everything I found
deals with installations that cannot handle special characters at all.
My problem is a bit different:

When Tomcat (4.1.27) starts, my applications handle umlauts absolutely
correctly (i.e., an ä read from a file or a database is encoded correctly
as &auml; by my encoding methods).

But after some (mostly long) runtime this changes, and an ä is suddenly
detected as "something completely different", forcing my methods to
replace it with a ? or a space.
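
For readers who want something concrete to test against, here is a minimal
sketch of an encoding loop of the kind described above. It is not the
actual application code: the class name EntityEncoder, the method encode
and the choice of '?' as the fallback are assumptions for illustration
only, and it is written against a current JDK rather than 1.3/1.4.

```java
class EntityEncoder {
    // Hypothetical stand-in for the "encoding methods" mentioned in the post.
    // Umlauts are written as Unicode escapes so the result does not depend on
    // the encoding the compiler assumes for this source file.
    static String encode(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            switch (c) {
                case '\u00E4': out.append("&auml;"); break; // a-umlaut
                case '\u00F6': out.append("&ouml;"); break; // o-umlaut
                case '\u00FC': out.append("&uuml;"); break; // u-umlaut
                default:
                    // Plain ASCII passes through; anything unrecognized
                    // is replaced with '?', mirroring the symptom described.
                    out.append(c < 128 ? c : '?');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("B\u00E4r"));
    }
}
```

If the ä in the input has already been turned into some other character
before this loop runs, it falls into the default branch and comes out as
a '?'.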

Unfortunately, as the problem never occurs on a freshly started Tomcat,
it is impossible to reproduce it reliably ;-<<

My observation and research results so far: 

- The problem occurs before my encoding loop can do its work, i.e.
  an 'ä' in a String to be parsed no longer matches a constant char 'ä',
  or in other words ...
  somestring.charAt(someIndex) == 'ä'
  is false although the character is an 'ä'.
  This observation also means that no output filtering functionality
  (which AFAIK I do not use) can be the "evil".

- As it happens with strings read from files as well as with strings
  read from mysql databases, it seems to be a Tomcat or JRE(?) problem.

- As the problem does not exist with a "freshly" started Tomcat, the
  general environment (language settings and so on) seems to be correct.

- In most cases, the problem starts after days or even weeks without
  a tomcat restart, but sometimes it occurs only minutes or a few hours
  after tomcat's start.

- It happens with several Sun JDKs from 1.3.1 up to all major 1.4.x
  releases, i.e., 1.4.0, 1.4.1, 1.4.2.
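
The comparison from the first observation can be turned into a small
diagnostic. This is only a sketch under assumptions: the class and method
names (UmlautCheck, isAUmlaut, dumpCodeUnits) are hypothetical, and it
targets a current JDK. It writes the constant as a Unicode escape, so the
compiler's source-file encoding cannot influence it, and it dumps the
UTF-16 code units the JVM actually holds, which should show what the 'ä'
has turned into once the problem appears.

```java
class UmlautCheck {
    // 'ä' as a Unicode escape: independent of the source file encoding.
    static final char A_UMLAUT = '\u00E4';

    // Same test as somestring.charAt(someIndex) == 'ä', but with the
    // escape-based constant.
    static boolean isAUmlaut(String s, int index) {
        return s.charAt(index) == A_UMLAUT;
    }

    // Renders every UTF-16 code unit of s as U+XXXX, separated by spaces,
    // so the real content of a suspicious string can be logged.
    static String dumpCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "B\u00E4r";
        System.out.println(isAUmlaut(sample, 1));
        System.out.println(dumpCodeUnits(sample));
    }
}
```

Logging dumpCodeUnits(...) for a string that fails the comparison would
show whether the character arrived as U+00E4 or as something else, for
example a two-character sequence or U+FFFD.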

The software versions used are:
- Tomcat 4.1.x (currently I am using 4.1.27)
- SuSE Linux 8.1, 8.2, 9.0, kernels 2.4.*, optimized for Athlon family.
- I am not using any template-engine or filter-functions of tomcat
  (as far as I understand it ;->>)
- System, filesystems, and all applications are set to use ANSI, i.e.
  ISO-8859-1 / ISO-8859-15, which share the same codes at least for all
  legacy characters and German umlauts.
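
Since strings read from files are affected, one thing that could be ruled
out is any dependence on the JVM's default encoding. The following is a
sketch only, not a confirmed fix: the class name Latin1Read and the method
decode are hypothetical, and it uses a current JDK. It decodes bytes with
an explicitly named charset and prints the default charset the running
JVM reports, which can be logged at startup and again once the problem
shows up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1Read {
    // Decodes raw bytes as ISO-8859-1, never consulting the platform
    // default. For a real file, the ByteArrayInputStream would be
    // replaced by a FileInputStream.
    static String decode(byte[] raw) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.ISO_8859_1)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // What the JVM currently uses when no charset is given.
        System.out.println("default charset: " + Charset.defaultCharset());
        // 0xE4 is 'ä' in ISO-8859-1.
        byte[] raw = { 'B', (byte) 0xE4, 'r' };
        System.out.println((int) decode(raw).charAt(1));
    }
}
```

If the explicitly decoded string still fails the comparison after a long
uptime, the default encoding is probably not involved.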

I am not sure whether I should blame the JRE, SuSE, or perhaps the
compilers (jikes!?) instead of stealing your time, but if my problem is
caused by some kind of bug or perhaps an undetected feature in any of
this software, this list is by far my best hope of finding other
victims ;;-))

Any ideas?


Kind regards.

Christoph Lechleitner

------------------------------------------------------------------------
 IBCL - IT Bureau Dipl.-Ing. C. Lechleitner
 Defreggerstr. 24, A-6020 Innsbruck, Austria, Europe
 http://www.ibcl.net/
 Tel.: +43 512 390717, Fax: +43 512 390787, Mobile: +43 699 12353479
------------------------------------------------------------------------

