> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Daniël Mantione
> Sent: 16 November 2005 21:58
> To: FPC developers' list
> Subject: RE: [fpc-devel] Unicode RTL
>
> On Wed, 16 Nov 2005, peter green wrote:
>
> > > pos('ë','Daniël');
> > >
> > > ... has a different implementation for utf-8 and 8-bit code pages.
> >
> > one little design feature of utf-8 is that it was carefully designed to be
> > friendly to byte-orientated code. No special precautions are needed for
> > substring matching in utf-8!
>
> Which is the "be ignorant about multibyte character sets" model. Nothing
> wrong with that model, but it has its limitations.

UTF-8, however, is far more friendly to that model than most legacy
multibyte character sets. Most importantly, you CAN'T get a false match
when doing byte-orientated substring matching on utf-8 strings, because a
match can only start on a character boundary. And if some code does chop a
UTF-8 string mid-character, only the chopped character will be lost.
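
To make the two claims above concrete, here is a small sketch. It is in Python rather than Pascal, purely so the raw bytes are easy to see; it is not how the FPC RTL implements pos. The helper name is_boundary and the choice to silently drop an incomplete sequence on decoding are my own assumptions for illustration. It also assumes the search string is itself complete, valid UTF-8.

```python
# Byte-orientated substring search on UTF-8 data (illustrative sketch only).
name = "Dani\u00EBl".encode("utf-8")    # b'Dani\xc3\xabl', the e-diaeresis is 2 bytes
needle = "\u00EB".encode("utf-8")       # b'\xc3\xab'

# Plain byte search, no decoding and no knowledge of UTF-8 needed.
byte_index = name.find(needle)          # 0-based byte offset of the match


def is_boundary(data: bytes, i: int) -> bool:
    """Hypothetical helper: True if offset i starts a character (or is the end).

    UTF-8 continuation bytes always look like 10xxxxxx, and lead bytes never
    do, so a valid needle cannot match starting in the middle of a character.
    """
    return i == len(data) or (data[i] & 0xC0) != 0x80


# Chop the string in the middle of the two-byte character.
head = name[:5]                         # b'Dani\xc3'  (lead byte only)
tail = name[5:]                         # b'\xabl'     (continuation byte + 'l')

# Dropping the incomplete sequence loses only the chopped character.
recovered_head = head.decode("utf-8", errors="ignore")
recovered_tail = tail.decode("utf-8", errors="ignore")

print(byte_index, is_boundary(name, byte_index))
print(repr(recovered_head), repr(recovered_tail))
```

Note that the offset found is a byte offset (4 here, 0-based), not a character count; the two only happen to agree because everything before the match is ASCII.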