Re: Java and Unicode

Markus Scherer Wed, 15 Nov 2000 17:05:58 -0800
Please let's keep types for single characters and types for strings separate.

ICU used to be in the same situation as Java: everything character/string used 16-bit 
types.
When we extended it to UTF-16, we decided to keep the string base type at 16 bits for very 
good reasons like interoperability and memory consumption.
For single characters, ICU changed APIs from 16-bit to 32-bit types.

In the case of Java, the equivalent course of action would be to stick with a 16-bit 
char as the base type for strings. The int type could be used in _additional_ APIs for 
single Unicode code points, deprecating the old APIs with char.
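
To make the split concrete, here is a minimal sketch of what "char for strings, int for single code points" can look like. It relies on the int-based methods that Java gained later (Character.toChars and String.codePointAt, added in J2SE 5.0), which did not exist when this message was written. The class name CodePointSketch and its helper names are hypothetical, chosen only for illustration.

```java
public class CodePointSketch {
    // Encode one code point (held in an int) as UTF-16 code units.
    // Supplementary code points (above U+FFFF) yield two chars.
    static char[] toUtf16(int codePoint) {
        return Character.toChars(codePoint);
    }

    // Read the code point that starts at the given index of a 16-bit string.
    // A valid surrogate pair is combined into one int value.
    static int codePointAt(String s, int index) {
        return s.codePointAt(index);
    }

    public static void main(String[] args) {
        int cp = 0x10400; // a supplementary code point, chosen as an example
        char[] units = toUtf16(cp);
        String s = new String(units); // the string itself stays 16-bit

        System.out.println(units.length);                           // 2
        System.out.println(Integer.toHexString(codePointAt(s, 0))); // 10400
    }
}
```

The string storage is untouched; only the single-character entry points take or return an int.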

Whatever Sun decides to do with single characters, it will be most reasonable to keep 
the string encoding the same and just treat it as UTF-16 where that makes a difference.
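
As a rough illustration of "treat it as UTF-16 where that makes a difference": most code can keep indexing 16-bit units, and only code that cares about whole characters needs to look for surrogate pairs. The sketch below counts code points in an ordinary String. The class name Utf16Walk and the method countCodePoints are hypothetical, and the Character.isHighSurrogate and Character.isLowSurrogate helpers it uses were only added to Java later (J2SE 5.0).

```java
public class Utf16Walk {
    // Count code points in a UTF-16 string. A lead surrogate followed by a
    // trail surrogate counts once; an unpaired surrogate counts as one unit.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++; // skip the trail surrogate of a valid pair
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a\uD801\uDC00b"; // 'a', one supplementary character, 'b'
        System.out.println(s.length());         // 4
        System.out.println(countCodePoints(s)); // 3
    }
}
```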

For details, see my presentation at the IUC 17 Unicode conference (2000 September, 
session B2).
(See http://www.unicode.org/ - I am having some trouble with web access right now, so 
I cannot give you the URL...)

markus