On Fri, 26 Sep 2008, Graeme Geldenhuys wrote:

On Fri, Sep 26, 2008 at 10:43 AM, Michael Schnell <[EMAIL PROTECTED]> wrote:

It's no different than UTF-16 if you want to do it properly. In both you
have to look out for surrogates.


Isn't the UTF-16 WideString in FPC (and Delphi 200x?) implemented by simply
ignoring the surrogates?

Let's hope not, because then it would be UCS-2 and NOT UTF-16! As far
as I know, D2009 handles this correctly, but I have no idea how.
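For reference, the difference between UCS-2 and UTF-16 comes down to one step: a UTF-16 decoder combines a high surrogate (0xD800..0xDBFF) followed by a low surrogate (0xDC00..0xDFFF) into a single code point above U+FFFF, while UCS-2 treats every 16-bit unit as a character of its own. The following is only a minimal sketch of that step in Python. The function name is made up for illustration, and it says nothing about how FPC or D2009 actually implement it.

```python
def decode_utf16_units(units):
    """Turn a list of 16-bit code units into a list of Unicode code points.

    Hypothetical helper for illustration only; unpaired surrogates are
    passed through unchanged rather than reported as errors.
    """
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine the pair into one code point beyond the BMP.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # Plain BMP code unit (this is all UCS-2 would ever do).
            code_points.append(unit)
            i += 1
    return code_points


# Cyrillic Zhe (U+0416) followed by U+13000 encoded as a surrogate pair.
print([hex(cp) for cp in decode_utf16_units([0x0416, 0xD80C, 0xDC00])])
```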

Let me put it like this: someone writing a Russian/Arabic/Japanese spell checker does not have to handle surrogates with UTF-16, but does have to handle multi-byte sequences with UTF-8, i.e. UTF-16 is much better for them than UTF-8.

Someone writing a spell checker for ancient Egyptian hieroglyphs will have to deal with surrogates. For those people UTF-16 has few advantages over UTF-8 (although in practice it's still a bit easier to handle than UTF-8).
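To put rough numbers on the two cases, here is a small Python sketch that counts how many UTF-16 code units and how many UTF-8 bytes one character takes. The sample characters are my own picks (one each from the Cyrillic, Arabic and CJK ranges, plus U+13000 from the Egyptian Hieroglyphs block), and `unit_counts` is a hypothetical helper name, not anything from FPC.

```python
def unit_counts(ch):
    """Return (UTF-16 code units, UTF-8 bytes) needed for one character."""
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 2 bytes per code unit
    utf8_bytes = len(ch.encode("utf-8"))
    return utf16_units, utf8_bytes


# Illustrative sample characters, chosen for this sketch only.
samples = {
    "Russian": "\u0416",         # CYRILLIC CAPITAL LETTER ZHE
    "Arabic": "\u0639",          # ARABIC LETTER AIN
    "Japanese": "\u8a9e",        # CJK ideograph (the "go" in "nihongo")
    "Hieroglyph": "\U00013000",  # first code point of the Egyptian Hieroglyphs block
}

for name, ch in samples.items():
    utf16_units, utf8_bytes = unit_counts(ch)
    print(f"{name}: {utf16_units} UTF-16 unit(s), {utf8_bytes} UTF-8 byte(s)")
```

The first three fit in a single UTF-16 code unit each, so per-character indexing just works. Only the hieroglyph needs a surrogate pair, while in UTF-8 all four are multi-byte.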

Russian, Arabic and Japanese are languages in daily use on computers; countless electronic documents in these languages exist. There is a huge interest in software that handles them, and therefore they are worth spending our valuable time on. Egyptian hieroglyphs are not.

Some UTF-16 support should come by default, like UTF-8 <-> UTF-16 conversion. In many situations it will not be necessary to bother with surrogates at all. In some situations we may just accept patches if someone is interested.
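As an illustration of what such a conversion pair has to guarantee, here is a minimal sketch in Python using its built-in codecs. The two function names are hypothetical, and the choice of little-endian UTF-16 without a byte order mark is an assumption made for the example. A proper conversion must round-trip characters outside the BMP as well, even if the rest of the code never looks at surrogates.

```python
def utf8_to_utf16(data: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16 (little-endian, no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert UTF-16 (little-endian, no BOM) bytes to UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# U+0416 followed by U+13000, as UTF-8 bytes.
utf8_text = b"\xd0\x96\xf0\x93\x80\x80"
utf16_text = utf8_to_utf16(utf8_text)

print(utf16_text.hex())                        # code units 0416, D80C, DC00 in LE byte order
print(utf16_to_utf8(utf16_text) == utf8_text)  # round trip preserves the text
```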

Daniël
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
