Besides, in the current implementation UTF8 may be at a disadvantage compared with the 2-byte+ encodings. Those encodings are in WideString format, and conversion to the old string type can be done either automatically or via special procedures (as seems to be the case on Kylix). UTF8 is implemented as a plain string. That has some advantages (it is easy to work with), but one big disadvantage: anyone working with strings in a single-byte encoding assumes that one character is one byte. Let's imagine an old Pascal/Delphi program that works heavily with Russian words. It assumes that the length (number of letters) of a word held in a string can be obtained with the Length function. It may also use fixed lengths in the Copy function, and so on. When this software works with WideStrings, in simple situations the WideString will be auto-converted to an AnsiString. In more complex situations the length of the WideString will be calculated as the number of WideChars it contains, which is right too. The current UTF8 string is a "type string", and its length is currently the number of bytes...
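
To make the difference concrete, here is a rough sketch of the three cases. It is written in Python rather than Pascal, purely as an illustration of the byte counts; it does not show how Free Pascal or Lazarus actually implements any of these string types. The word "привет", the cp1251 code page standing in for the old single-byte string, and the helpers copy_bytes and is_valid_utf8 are all assumptions made up for this example.

```python
# Illustration only: Python stands in for Pascal here, and the helper
# names below are hypothetical, not part of any Pascal RTL.
word = "привет"  # a Russian word with 6 letters

ansi = word.encode("cp1251")     # single-byte encoding, like an old AnsiString
wide = word.encode("utf-16-le")  # 2 bytes per code unit, like a WideString
utf8 = word.encode("utf-8")      # UTF-8 held in a plain byte string

ansi_len = len(ansi)       # bytes == letters in a single-byte encoding
wide_len = len(wide) // 2  # number of 16-bit units, matches the letters here
utf8_len = len(utf8)       # bytes, NOT letters: each Cyrillic letter takes 2


def copy_bytes(data: bytes, count: int) -> bytes:
    """Byte-based analogue of Copy(s, 1, count) on a byte string."""
    return data[:count]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Old code that wants "the first 3 letters" asks for 3 units.
first3_ansi = copy_bytes(ansi, 3)  # 3 whole letters
first3_utf8 = copy_bytes(utf8, 3)  # 1.5 letters: cut in the middle of a character

print(ansi_len, wide_len, utf8_len)
print(is_valid_utf8(first3_utf8))
```

So the same "take 3" that is correct for the single-byte string and for the wide string leaves a broken sequence in the UTF8 case, and a length check that expects 6 gets 12.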