On Fri, 26 Sep 2008, Graeme Geldenhuys wrote:
On Fri, Sep 26, 2008 at 10:43 AM, Michael Schnell <[EMAIL PROTECTED]> wrote:
It's no different than UTF-16 if you want to do it properly. In both you
have to look out for surrogates.
Isn't the UTF-16 WideString in FPC (and Delphi 200x?) implemented by just
ignoring the surrogates?
Let's hope not, because then it would be UCS-2 and NOT UTF-16! As far
as I know D2009 (I think) handles this correctly, but I have no idea
how.
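
To make the UCS-2 versus UTF-16 distinction concrete, here is a minimal sketch in Python (not FPC or Delphi code, and not a description of how either compiler does it). The helper names utf16_code_units and count_code_points are made up for this illustration. Code that treats every 16-bit unit as one character is effectively UCS-2; code that pairs up surrogates is UTF-16.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def count_code_points(units):
    """Count code points, treating a high+low surrogate pair as one."""
    count = 0
    i = 0
    while i < len(units):
        if (is_high_surrogate(units[i]) and i + 1 < len(units)
                and is_low_surrogate(units[i + 1])):
            i += 2  # a surrogate pair encodes a single code point
        else:
            i += 1  # a BMP character (or an unpaired surrogate)
        count += 1
    return count


# 'a' followed by U+13000 (EGYPTIAN HIEROGLYPH A001)
units = utf16_code_units("a\U00013000")
print(len(units))                # a UCS-2 style count of 16-bit units
print(count_code_points(units))  # a UTF-16 aware count of code points
```

The two counts differ for this string, which is exactly the case a UCS-2-only implementation gets wrong.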
Let me put it like this: Someone writing a Russian/Arabic/Japanese spell
checker does not have to handle surrogates with UTF-16, but he does have to
handle multi-byte sequences with UTF-8, i.e. UTF-16 is much better for them
than UTF-8.
Someone writing a spell checker for old-Egyptian Hieroglyphs will have to
deal with surrogates. For those people UTF-16 has few advantages over
UTF-8 (although in practice it's still a bit easier to handle than UTF-8).
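
As a rough illustration of that difference, the following Python sketch prints how many UTF-8 bytes and UTF-16 code units a single character needs. The sample characters and the helper name encoded_sizes are chosen here for the example only.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


samples = {
    "CYRILLIC SMALL LETTER ZHE (U+0436)": "\u0436",
    "ARABIC LETTER BEH (U+0628)": "\u0628",
    "HIRAGANA LETTER A (U+3042)": "\u3042",
    "EGYPTIAN HIEROGLYPH A001 (U+13000)": "\U00013000",
}

for name, ch in samples.items():
    utf8_bytes, utf16_units = encoded_sizes(ch)
    print(f"{name}: {utf8_bytes} UTF-8 bytes, {utf16_units} UTF-16 unit(s)")
```

The first three characters are in the Basic Multilingual Plane, so each fits in one UTF-16 code unit while needing two or three bytes in UTF-8. Only the hieroglyph needs a surrogate pair.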
Russian, Arabic, Japanese are languages in daily use on computers,
countless electronic documents in these languages exist. There is a
huge interest in software handling it, and therefore it's worth spending
our valuable time on. Egyptian Hieroglyphs are not worth spending our
valuable time on.
Some UTF-16 support should come by default, like UTF-8 <-> UTF-16
conversion. In many situations it will not be necessary to bother with
surrogates at all. In some situations we may just accept patches if
someone is interested.
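
For what such a default UTF-8 <-> UTF-16 conversion amounts to, here is a small Python sketch using the standard codecs. It is not the FPC RTL interface; the function names utf8_to_utf16 and utf16_to_utf8 are hypothetical, and little-endian UTF-16 without a BOM is an assumption made for the example.

```python
def utf8_to_utf16(data):
    """Convert UTF-8 bytes to UTF-16 (little-endian, no BOM) bytes."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data):
    """Convert UTF-16 (little-endian, no BOM) bytes to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")


# Russian text plus one character outside the BMP (U+13000).
original = "Привет \U00013000".encode("utf-8")
as_utf16 = utf8_to_utf16(original)
round_trip = utf16_to_utf8(as_utf16)

print(len(original), len(as_utf16))
print(round_trip == original)
```

The caller never touches a surrogate here: the converter produces and consumes the pair itself, which is why code that only passes strings through can often ignore surrogates entirely.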
Daniël
_______________________________________________
fpc-devel maillist - [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel