RE: [fpc-devel] Unicode RTL

peter green Wed, 16 Nov 2005 14:21:09 -0800


> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Daniël
> Mantione
> Sent: 16 November 2005 21:58
> To: FPC developers' list
> Subject: RE: [fpc-devel] Unicode RTL
>
>
> On Wed, 16 Nov 2005, peter green wrote:
>
> >
> > > pos('ë','Daniël');
> > >
> > > ... has a different implementation for utf-8 and 8-bit code pages.
> > one little design feature of utf-8 is that it was carefully designed to be
> > friendly to byte-orientated code. No special precautions are needed for
> > substring matching in utf-8!
>
> Which is the "be ignorant about multibyte character sets" model. Nothing
> wrong with that model, but it has its limitations.
UTF-8, however, is far friendlier to that model than most legacy multibyte
character sets. Most importantly, you CAN'T get a false match when doing
byte-orientated substring matching on valid UTF-8 strings: ASCII bytes, lead
bytes and continuation bytes occupy disjoint ranges, so one character's
encoding never appears inside another's. And if some code does chop a UTF-8
string mid-character, only the chopped character is lost.
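
To make those two claims concrete, here is a minimal sketch in Python, working
on raw bytes throughout. Shift-JIS as the legacy code page and the katakana
character U+30BD are my own choices for illustration, not anything from the
thread; the variable names are likewise hypothetical.

```python
# Byte-orientated substring search on UTF-8, with no decoding step.
haystack = "Dani\u00ebl".encode("utf-8")   # b'Dani\xc3\xabl'
needle = "\u00eb".encode("utf-8")          # b'\xc3\xab'

# bytes.find compares raw bytes, much like a byte-wise pos().
# The result is a 0-based BYTE offset, not a character index.
byte_index = haystack.find(needle)

# Contrast with a legacy multibyte code page (assumed example: Shift-JIS).
# U+30BD encodes there as 0x83 0x5C, and 0x5C is also the ASCII backslash,
# so a byte-wise search for a backslash hits the trail byte: a false match.
sjis = "\u30bd".encode("shift_jis")
false_match = sjis.find(b"\\")

# In UTF-8 every byte of a multibyte sequence is >= 0x80, so an ASCII
# needle can never match inside another character.
utf8 = "\u30bd".encode("utf-8")            # b'\xe3\x82\xbd'
no_match = utf8.find(b"\\")

# Chopping mid-character: only the chopped character is lost.
chopped = haystack[:5]                     # b'Dani\xc3', cut inside the e-diaeresis
recovered = chopped.decode("utf-8", errors="ignore")

print(byte_index, false_match, no_match, recovered)
```

Note that the match position comes back as a byte offset, so code that needs
a character index still has to count characters up to that offset.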



