Elliotte Rusty Harold wrote:
> For example, consider the charAt() method in java.lang.String:
> 
> public char charAt(int index)

Just for comparison, ICU added a method to its UnicodeString class equivalent to this:
    public int char32At(int index)
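As a rough illustration of what such a method has to do over UTF-16 storage, here is a minimal sketch. The class and helper names are hypothetical, and this is not ICU's implementation; it assumes the method should return the whole code point even when the index lands on the second half of a surrogate pair.

```java
// Hypothetical sketch, not ICU's or the JDK's code.
final class Char32Sketch {
    private Char32Sketch() {}

    /** Returns the code point that contains the UTF-16 code unit at index. */
    static int char32At(CharSequence s, int index) {
        char c = s.charAt(index);
        if (Character.isHighSurrogate(c) && index + 1 < s.length()) {
            // Lead surrogate: combine with the following trail surrogate, if any.
            char next = s.charAt(index + 1);
            if (Character.isLowSurrogate(next)) {
                return Character.toCodePoint(c, next);
            }
        } else if (Character.isLowSurrogate(c) && index > 0) {
            // Trail surrogate: look back for the matching lead surrogate.
            char prev = s.charAt(index - 1);
            if (Character.isHighSurrogate(prev)) {
                return Character.toCodePoint(prev, c);
            }
        }
        // BMP character or unpaired surrogate: return the code unit itself.
        return c;
    }
}
```

The index still counts 16-bit code units, so existing offsets stay valid; only the return type widens from char to int.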

The CharacterIterator was more difficult than the string class: beyond the lack of
UTF-16 support, it shared many more failings with its Java sibling, among them
forward-iteration semantics that are inefficient, unusual, and especially bad for a
variable-width encoding.
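To make the iteration point concrete: in java.text.CharacterIterator, next() first advances and then returns the character at the new position, and the end is signalled by the DONE sentinel value. With a variable-width encoding, a "return the current character, then advance" style is simpler, because the width of the character is known at the moment it is decoded. The following is only a sketch of that style; the class and method names are made up for illustration and are not the ICU API.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch of post-increment forward iteration by code point.
final class CodePointCursor {
    private final CharSequence text;
    private int pos = 0; // offset in UTF-16 code units

    CodePointCursor(CharSequence text) {
        this.text = text;
    }

    /** True while there is at least one more code unit to read. */
    boolean hasNext() {
        return pos < text.length();
    }

    /** Returns the code point at the current position, then steps past it. */
    int next32PostInc() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int cp = Character.codePointAt(text, pos);
        pos += Character.charCount(cp); // 1 or 2 code units
        return cp;
    }

    /** Current offset in code units. */
    int getIndex() {
        return pos;
    }
}
```

A loop then reads as `while (cursor.hasNext()) { int c = cursor.next32PostInc(); ... }`, with no sentinel value and no separate first() call.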

The ICU API was changed in this way over a few months this year. Some of the
higher-level implementations have yet to follow and should be done by next summer,
when there will be some 45000 CJK characters that will be infrequent but hard to
ignore - the Chinese and Japanese governments will insist on their support.

markus
