Besides, in the current implementation UTF8 might be at a disadvantage compared
with encodings that use two or more bytes per character. Those encodings are
held in WideString format, and conversion to the old string type can be done
either automatically or via special procedures (as seems to be the case on
Kylix). UTF8 is implemented as a plain string. That has some advantages (it is
easy to work with), but one big disadvantage: anyone working with strings in a
1-byte encoding assumes that one character is one byte.
Let's imagine an old Pascal/Delphi program that works heavily with Russian
words. It assumes that the length (number of letters) of a word contained in a
string can be obtained with the Length function. Besides, it may use fixed
lengths in the Copy function and so on.
When this software works with WideStrings, in simple situations the WideString
will be autoconverted to an AnsiString. In more complex situations the length
of the WideString will be calculated as the number of WideChars it contains,
which is right too. The current UTF8 string is a "type string", and its length
is currently the number of bytes...
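
To make the difference concrete, here is a small sketch of what I mean. It
assumes Free Pascal with AnsiString as the string type, and uses UTF8Decode
from the System unit only as a stand-in for whatever conversion is chosen in
the end. The variable names are made up for the example.

```pascal
program Utf8LengthDemo;

{$mode objfpc}{$H+}

var
  Utf8Word: AnsiString;   // holds raw UTF-8 bytes
  WideWord: WideString;   // holds 16-bit characters
begin
  // A three-letter Russian word, written as explicit UTF-8 bytes so the
  // result does not depend on the encoding of the source file.
  Utf8Word := #$D0#$BC#$D0#$B8#$D1#$80;

  // Length counts bytes here: each Cyrillic letter takes two bytes in UTF-8.
  WriteLn('Bytes in UTF-8 string:   ', Length(Utf8Word));

  // After decoding to a wide string, Length counts 16-bit characters,
  // which for Cyrillic text matches the number of letters.
  WideWord := UTF8Decode(Utf8Word);
  WriteLn('WideChars after decode:  ', Length(WideWord));

  // A fixed-length Copy on the UTF-8 string cuts by bytes, so a count of 1
  // returns only the first byte of the first letter, not the letter itself.
  WriteLn('Bytes in Copy(.., 1, 1): ', Length(Copy(Utf8Word, 1, 1)));
end.
```

On the UTF-8 string, Length should report 6 for a word of 3 letters, while the
decoded WideString reports 3. Old code that relies on Length and Copy counting
letters would therefore give wrong results on the UTF-8 string without any
compiler warning.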
